
Minimum mathematics for future scientists and engineers

An inspiring self-study note

VINH PHU NGUYEN

2023
Copyright ©2023, Vinh Phu Nguyen. All rights reserved. No part of this material can be re-
produced, stored or transmitted without the written permission of the author. For information
contact: Vinh Phu Nguyen, Faculty of Engineering, Monash University, Wellington Rd, Clayton
VIC 3800, Australia.

About the author

Born in 1980 in Hue (Vietnam), Vinh Phu Nguyen enrolled in Hai Ba Trung high school in 1995 and graduated three years later. He then went to Hochiminh University of Technology (known as Bach Khoa Sai Gon among the Vietnamese) in 1998. He graduated in 2003 with a bachelor's degree in civil engineering. Seeing himself unfit for the civil engineering industry, he enrolled in the EMMC
(European Master in Mechanics of Construction) program, Liège
University (Belgium). Upon graduation in 2005, he got a CIFRE scholarship to go to the École
nationale d’ingénieurs de Saint-Étienne (ÉNISE; National Engineering School of Saint-Étienne)
in France to work with Prof Jean Michel Bergheau. The PhD topic proved so difficult for him that after one year he left France empty-handed.
He got a second chance working with Prof Bert Sluys at the Faculty of Civil Engineering and
Geosciences, Delft University of Technology (The Netherlands) in 2007. He graduated with a
PhD in computational mechanics in 2011 and went immediately to Johns Hopkins University
(USA) for a postdoc. After a six-month spell in the US, he went back to Vietnam to get married
and then in 2012 went to Cardiff University (UK) to work with Prof Stéphane Bordas, his master
supervisor back in 2004, till 2014. Since 2016, he has been a lecturer at the department of civil
engineering, Monash University (Australia).
Dr Nguyen has done some research on numerical methods for solving partial differential
equations in solid mechanics problems (e.g. isogeometric analysis, extended finite element
method, material point method) and on damage/fracture models (e.g. phase-field fracture models)
for solving fracture mechanics and large deformation mechanics problems.
Since 2017, he has been interested in the history of mathematics and science and in the problems of high school mathematics education.

Preface

There are several issues with the traditional mathematics education. First, it focuses too much on technical details. For example, students are asked to routinely apply the formula $\frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$ to solve many quadratic equations (e.g. $x^2 - 2x + 1 = 0$, $x^2 + 5x - 10 = 0$, etc., and the
list goes on). Second, the history of mathematics is completely ignored; textbook exposition usually presents a complete reversal of the actual order in which mathematics developed. The
main purpose of the textbooks is to present mathematics with its characteristic logical structure
and its incomparable deductive certainty. That's why in a calculus class students are taught what a function is, then what a limit is, then what a derivative is, and finally the applications. The truth is the reverse: Fermat implicitly used the derivative in solving maxima problems; Newton and Leibniz discovered it; Taylor, the Bernoulli brothers and Euler developed it; Lagrange characterized it; and only at the end of this long period of development, which spans about two hundred years, did Cauchy
and Weierstrass define it. Third, there is little opportunity for students to discover (rediscover to
be exact) the mathematics for themselves. Definitions and theorems are presented at the outset; the students study the proofs and do the applications.
Born and raised in Vietnam in the early 80s, I received such a mathematical education. Lacking books and guidance, I spent most of my time solving countless mathematical exercises.
Even though I remember enjoying some of them, admittedly the goal was always to get high marks in exams and, in particular, to pass the university entrance examination. Most of the time, what I learned were clever tricks, not the true meaning of the mathematical concepts or
their applications. Of course, why people came up with those concepts and why these concepts are defined the way they are was not discussed by the teachers (and unfortunately I did not ask these important
questions). After my bachelor's, I enrolled in a master's program. Again, I was on the same educational route: solving as many problems as possible. And as you could guess, after the master's came a PhD study in the Netherlands. Though I had the time, freedom and resources to do whatever I felt was needed, the focus was still to pass yet another form of examination: graduation. This time it was measured by the number of research papers published in peer-reviewed journals. To pursue an academic career, I took a postdoctoral job whose main aim was to have as many papers
as possible. As you can imagine, I became technically fluent in a narrow field but on a weak
foundation.
Eventually, I got a job in a university in 2016. For the first time in my life, I did not have to
‘perform’ but I was able to really learn things (staff in universities still need to perform to satisfy certain performance criteria, which are vital for probation and promotion). This was when I started
reading books outside my research field, and I found that very enjoyable.
The turning point was the book A Mathematician’s Lament by Paul Lockhart, a professional mathematician turned school teacher. Paul Lockhart describes how maths is incorrectly taught in schools and proposes better ways to teach it. He continues in Measurement by showing us how we should learn maths by ‘re-discovering’ it for ourselves. That made me decide to re-learn mathematics. But this time it had to be done in a (much) more fun and efficient way. A bit of research led me to the book Learning How to Learn by Barbara
Oakley and Terry Sejnowski. The biggest lesson taken from Oakley and Sejnowski’s book is
that you can learn any subject if you do it properly.
So, I started learning mathematics from scratch during my free time, probably starting in 2017. I have read many books on mathematics and physics and on the history of mathematics, and I wrote notes on my iPad recording what I have learned. Then came the COVID-19 pandemic, which locked down Melbourne, the city I live in. That was when I decided to put my iPad notes into a book format, to have a coherent story that is not only beneficial to me but hopefully also helpful to others.
This book is a set of notes covering (elementary) algebra, Euclidean geometry, trigonometry,
analytic geometry, calculus of functions of a single variable, and probability. This covers the main
content of the mathematics curriculum for high school students. These are followed by statistics,
calculus of functions of more than one variable, differential equations, variational calculus, linear
algebra and numerical analysis. These topics are for undergraduate college students majoring
in science, technology, engineering and mathematics. Very few such books exist, I believe, as
the two targeted audiences are too different. This one is different because it was written for me,
firstly and mainly. However, I do believe that high school students can benefit from the ‘advanced’ topics by seeing what the applications of high school mathematics can be and what its extensions or better explanations could be. On the other hand, college students who do not have a solid background in mathematics can use the elementary parts of this book as a review.
The style of the book, as you might guess, is informal, mostly because I am not a mathematician and also because I like a conversational tone. This is not a traditional mathematics textbook, so it does not include many exercises. Instead it focuses on the mathematical concepts: their origin (why we need them), their definition (why they are defined the way they are), and their extensions. The process leading to proofs and solutions is discussed, as most often it is the first step that is hard; the rest is mostly labor (usually involving algebra). And of course, the history of mathematics is included by presenting major figures in mathematics and their short biographies.
Of course there is no new mathematics in this book as I am not a mathematician; I do not
produce new mathematics. The maths presented is standard, and thus I do not cite the exact
sources. But I do mention all the books and sources from which I have learned the maths.
The title deserves a bit of explanation. The adjective minimum was used to emphasize that even though the book covers many topics, it also leaves out many. I do not discuss topology, graph theory, abstract algebra or differential geometry, simply because I do not know them (and plan to learn them when the time is right). But the book goes beyond studying mathematics just to apply it to science and engineering. However, it seems that no amount of mathematics is ever sufficient: Einstein, just hours before his death, pointed to his equations and lamented to his son, “If only I had more mathematics”.
And finally, influenced by the fact that I am an engineer, the book introduces programming from the beginning. Thus, young students can learn mathematics and programming at the same time! For now, programming is used just to automate some tedious calculations, or to compute an infinite series numerically before attacking it analytically; or, a little harder, to solve Newton’s equations to analyse the orbit of some planets. But an early exposure to programming is vital to their future career. Not least, coding is fun! All the code is put on GitHub* at this address.
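As a small illustration of the kind of computation meant here (a minimal sketch for illustration only, not code from the book’s GitHub repository), one might sum the Basel series of Section 2.21.4 in Julia and compare the partial sum with its exact limit $\pi^2/6$:

    # A minimal sketch: partial sum of the Basel series 1 + 1/4 + 1/9 + ...
    # (see Section 2.21.4); the exact limit is pi^2/6.
    function basel_partial_sum(n)
        s = 0.0
        for k in 1:n
            s += 1 / k^2
        end
        return s
    end

    println(basel_partial_sum(1_000_000))   # approximately 1.6449331
    println(pi^2 / 6)                        # approximately 1.6449341

Watching the partial sum creep towards $\pi^2/6$ is exactly the kind of numerical experiment one can do before attacking a series analytically.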

Acknowledgments
I was lucky to get help from some people. I would like to thank “anh Bé”, who tutored me, for free, on mathematics when I needed help most. To my secondary school math teacher “Thay Dieu”, who refused to receive any tutoring fee, I want to acknowledge his generosity. To my high school
math teacher “Thay Son”, whose belief in me made me more confident in myself, I would like to
say thank you very much. To my friend Phuong Thao, who taught me not to memorize formulas, I want to express my deepest gratitude, as this simple advice has completely changed the way I have studied since. And finally, to Prof Hung Nguyen-Dang, whose EMMC master program has changed the course of my life and of many other Vietnamese, “em cam on Thay rat nhieu” (thank you very much, my teacher).
In the learning process, I cannot say thank you enough to some amazing YouTube channels
such as 3Blue1Brown, Mathologer, blackpenredpen and Dr. Trefor Bazett. They provide animation-based explanations of many mathematics topics, from which I have learned a lot.
I have received encouragement along this journey, and I would like to thank Miguel Cervera at Universitat Politècnica de Catalunya, whom I have never met, Laurence Brassar at the University of Oxford, Haojie Lian at Taiyuan University of Technology, and Stéphane Bordas at the University of Luxembourg, who was my master supervisor back in 2013. A special thank you goes to Daniel Alves Paladim, a former colleague at Cardiff University who is now working for the American self-driving car company Cruise LLC. Daniel has shared with me all the fascinating books on various topics that he has read over the years. To my close friend Chi Nguyen-Thanh (Royal HaskoningDHV Vietnam), thank you very much for your friendship and encouragement for this project.
This book was typeset with LaTeX on a MacBook. The majority of figures in the book were created using the open source software Asymptote and TikZ. Other figures were generated using GeoGebra, Processing, Desmos, Julia and Python. I want to say thank you to Le Huy Tien, a lecturer at Vietnam National University, Hanoi, for his help and encouragement with Asymptote and TikZ.
I dedicate this book to all mathematicians who have created beautiful mathematics that helps make our world a better place to live.

Vinh Phu Nguyen


email nvinhphu@gmail.com
* GitHub is a website and cloud-based service that helps developers store and manage their code, as well as track and control changes to their code.
February 20, 2023
Clayton, Australia
Contents

1 Introduction 2
1.1 What is mathematics? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Axiom, definition, theorem and proof . . . . . . . . . . . . . . . . . . . . . . 9
1.3 Exercises versus problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.4 Problem solving strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.5 Computing in mathematics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.6 Mathematical anxiety or math phobia . . . . . . . . . . . . . . . . . . . . . . 17
1.7 Millennium Prize Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.8 How to learn mathematics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.8.1 Reading materials . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.8.2 Learning tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
1.9 Organization of the book . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

2 Algebra 29
2.1 Natural numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.2 Doing some algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.3 Integer numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.3.1 Negative numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.3.2 A brief history on negative numbers . . . . . . . . . . . . . . . . . . 38
2.3.3 Arithmetic of negative integers . . . . . . . . . . . . . . . . . . . . . 38
2.4 Playing with natural numbers . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.4.1 Divisibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.4.2 Math contest problem . . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.5 If and only if: conditional statements . . . . . . . . . . . . . . . . . . . . . . 46
2.6 Sums of whole numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
2.6.1 Sum of the first n whole numbers . . . . . . . . . . . . . . . . . . . . 47
2.6.2 Sum of the squares of the first n whole numbers . . . . . . . . . . . . 51
2.6.3 Sum of the cubes of the first n whole numbers . . . . . . . . . . . . . 52
2.7 Prime numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
2.7.1 How many primes are there? . . . . . . . . . . . . . . . . . . . . . . 54

2.7.2 The prime number theorem . . . . . . . . . . . . . . . . . . . . . . . 55
2.7.3 Twin primes and the story of Yitang Zhang . . . . . . . . . . . . . . 57
2.8 Rational numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
2.8.1 What is $5/2$? . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
2.8.2 Arithmetic with rational numbers . . . . . . . . . . . . . . . . . . . . 60
2.8.3 Decimal notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
2.9 Irrational numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
2.9.1 Diagonal of a unit square . . . . . . . . . . . . . . . . . . . . . . . . 64
2.9.2 Arithmetic of the irrationals . . . . . . . . . . . . . . . . . . . . . 65
2.9.3 Roots $\sqrt[n]{x}$ . . . . . . . . . . . . . . . . . . . . . . . . . . 66
2.9.4 Rationalizing denominators and simplifying radicals . . . . . . . . . 67
2.9.5 Golden ratio . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
2.9.6 Axioms for the real numbers . . . . . . . . . . . . . . . . . . . . . . 72
2.10 Fibonacci numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
2.11 Continued fractions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
2.12 Pythagoras’ theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
2.12.1 Pythagorean triples . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
2.12.2 Fermat’s last theorem . . . . . . . . . . . . . . . . . . . . . . . . . . 83
2.12.3 Solving integer equations . . . . . . . . . . . . . . . . . . . . . . . . 84
2.12.4 From Pythagorean theorem to trigonometry and more . . . . . . . . . 85
2.13 Imaginary number . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
2.13.1 Linear equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
2.13.2 Quadratic equation . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
2.13.3 Cubic equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
2.13.4 How Viète solved the depressed cubic equation . . . . . . . . . . . . 94
2.13.5 History about Cardano’s formula . . . . . . . . . . . . . . . . . . . . 95
2.14 Mathematical notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
2.14.1 Symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
2.15 Factorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
2.16 Word problems and system of linear equations . . . . . . . . . . . . . . . . . 103
2.17 System of nonlinear equations . . . . . . . . . . . . . . . . . . . . . . . . . . 108
2.18 Algebraic and transcendental equations . . . . . . . . . . . . . . . . . . . . . 111
2.19 Rules of powers (exponentiation) . . . . . . . . . . . . . . . . . . . . . . . . 111
2.19.1 Powers of 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
2.19.2 Power with an irrational index . . . . . . . . . . . . . . . . . . . . . 114
2.20 Inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
2.20.1 Simple proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
2.20.2 Inequality of arithmetic and geometric means . . . . . . . . . . . . . 119
2.20.3 Cauchy–Schwarz inequality . . . . . . . . . . . . . . . . . . . . . . 124
2.20.4 Inequalities involving the absolute values . . . . . . . . . . . . . . . 127
2.20.5 Solving inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
2.20.6 Using inequalities to solve equations . . . . . . . . . . . . . . . . . . 130
2.21 Infinity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
2.21.1 Arithmetic series . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
2.21.2 Geometric series . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
2.21.3 Harmonic series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
2.21.4 Basel problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
2.21.5 Viète’s infinite product . . . . . . . . . . . . . . . . . . . . . . . . . 140
2.21.6 Sum of differences . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
2.22 Sequences, convergence and limit . . . . . . . . . . . . . . . . . . . . . . . . 144
2.22.1 Some examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
2.22.2 Rules of limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
2.22.3 Properties of sequences . . . . . . . . . . . . . . . . . . . . . . . . . 148
2.23 Inverse operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
2.24 Logarithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
2.24.1 Rules of logarithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
2.24.2 Some exercises on logarithms . . . . . . . . . . . . . . . . . . . . . 152
2.24.3 Why logarithms are useful . . . . . . . . . . . . . . . . . . . . . . 153
2.24.4 How Henry Briggs calculated logarithms in 1617 . . . . . . . . . . . 154
2.24.5 Solving exponential equations . . . . . . . . . . . . . . . . . . . . . 155
2.25 Complex numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
2.25.1 Definition and arithmetic of complex numbers . . . . . . . . . . . . . 159
2.25.2 de Moivre’s formula . . . . . . . . . . . . . . . . . . . . . . . . . . 163
2.25.3 Roots of complex numbers . . . . . . . . . . . . . . . . . . . . . . . 164
2.25.4 Square root of i . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
2.25.5 Trigonometry identities . . . . . . . . . . . . . . . . . . . . . . . . . 168
2.25.6 Power of real number with a complex exponent . . . . . . . . . . . . 169
2.25.7 Power of an imaginary number with a complex exponent . . . . . . . 174
2.25.8 A summary of different kinds of numbers . . . . . . . . . . . . . . . 176
2.26 Combinatorics: The Art of Counting . . . . . . . . . . . . . . . . . . . . . . . 176
2.26.1 Product rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
2.26.2 Factorial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
2.26.3 Permutations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
2.26.4 Combinations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
2.26.5 Generalized permutations and combinations . . . . . . . . . . . . . . 184
2.26.6 The pigeonhole principle . . . . . . . . . . . . . . . . . . . . . . . . 185
2.26.7 Solutions to questions . . . . . . . . . . . . . . . . . . . . . . . . . . 187
2.27 Pascal triangle and the binomial theorem . . . . . . . . . . . . . . . . . . . . 188
2.27.1 Binomial theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
2.27.2 Sum of powers of integers and Bernoulli numbers . . . . . . . . . . . 190
2.27.3 Binomial theorem: a proof . . . . . . . . . . . . . . . . . . . . . . . 190
2.28 Compounding interest . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
2.28.1 How to compute e? . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
2.28.2 Irrationality of e . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
2.28.3 Pascal triangle and e number . . . . . . . . . . . . . . . . . . . . . . 196
2.29 Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
2.29.1 Arithmetic of polynomials . . . . . . . . . . . . . . . . . . . . . . . 199
2.29.2 The polynomial remainder theorem . . . . . . . . . . . . . . . . . . 200
2.29.3 Guessing roots of a polynomial the smart way . . . . . . . . . . . . . 201
2.29.4 Complex roots of $z^n - 1 = 0$ come in conjugate pairs . . . . . . . . . 202
2.29.5 The fundamental theorem of algebra . . . . . . . . . . . . . . . . . . 203
2.29.6 Polynomial evaluation and Horner’s method . . . . . . . . . . . . . . 203
2.29.7 Vieta’s formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
2.29.8 Girard-Newton’s identities . . . . . . . . . . . . . . . . . . . . . . . 207
2.30 Modular arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
2.30.1 Rules of modular arithmetic . . . . . . . . . . . . . . . . . . . . . . 211
2.30.2 Solving problems using modular arithmetic . . . . . . . . . . . . . . 212
2.30.3 A problem from a 2006 Hongkong math contest . . . . . . . . . . . . 213
2.30.4 Divisibility with modular arithmetic . . . . . . . . . . . . . . . . . . 217
2.31 Cantor and infinity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
2.31.1 Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
2.31.2 Finite and infinite sets . . . . . . . . . . . . . . . . . . . . . . . . . . 219
2.31.3 Uncountably infinite sets . . . . . . . . . . . . . . . . . . . . . . . . 221
2.32 Number systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
2.33 Graph theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
2.33.1 The Seven Bridges of Königsberg . . . . . . . . . . . . . . . . . . . 223
2.33.2 Map coloring and the four color theorem . . . . . . . . . . . . . . . . 226
2.34 Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
2.34.1 Euclidean algorithm: greatest common divisor . . . . . . . . . . . . . 227
2.34.2 Puzzle from Die Hard . . . . . . . . . . . . . . . . . . . . . . . . . . 229
2.35 Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230

3 Geometry and trigonometry 232


3.1 Euclidean geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
3.1.1 Basic concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
3.1.2 The structure of Euclid’s Elements . . . . . . . . . . . . . . . . . . . 237
3.1.3 Influence of The Elements . . . . . . . . . . . . . . . . . . . . . . . 239
3.1.4 Algebraic vs geometric thinking . . . . . . . . . . . . . . . . . . . . 240
3.1.5 The fifth postulate and consequences . . . . . . . . . . . . . . . . . . 241
3.1.6 Area of simple geometries . . . . . . . . . . . . . . . . . . . . . . . 242
3.1.7 Congruence and similarity . . . . . . . . . . . . . . . . . . . . . . . 245
3.1.8 Is a square a rectangle? . . . . . . . . . . . . . . . . . . . . . . . . . 248
3.1.9 Angles in convex polygons . . . . . . . . . . . . . . . . . . . . . . . 248
3.1.10 Circle theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
3.1.11 Tangents to circles . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
3.1.12 Geometric construction . . . . . . . . . . . . . . . . . . . . . . . . . 254
3.1.13 The three classical problems of antiquity . . . . . . . . . . . . . . . . 259
3.1.14 Tessellation: the Mathematics of Tiling . . . . . . . . . . . . . . . . . 260
3.1.15 Platonic solids . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
3.1.16 Euler’s polyhedra formula . . . . . . . . . . . . . . . . . . . . . . . 264
3.2 Area of curved figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
3.2.1 Area of the first curved plane: the lune of Hippocrates . . . . . . . . . 266
3.2.2 Area of a parabola segment . . . . . . . . . . . . . . . . . . . . . . . 266
3.2.3 Circumference and area of circles . . . . . . . . . . . . . . . . . . . 268
3.2.4 Calculation of $\pi$ . . . . . . . . . . . . . . . . . . . . . . . . . . 270
3.3 Trigonometric functions: right triangles . . . . . . . . . . . . . . . . . . . . . 276
3.4 Trigonometric functions: unit circle . . . . . . . . . . . . . . . . . . . . . . . 278
3.5 Degree versus radian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
3.6 Some first properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280
3.7 Sine table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282
3.8 Trigonometry identities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
3.9 Inverse trigonometric functions . . . . . . . . . . . . . . . . . . . . . . . . . 293
3.10 Inverse trigonometric identities . . . . . . . . . . . . . . . . . . . . . . . . . 293
3.11 Trigonometry inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
3.12 Trigonometry equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
3.13 Generalized Pythagoras theorem . . . . . . . . . . . . . . . . . . . . . . . . . 305
3.14 Graph of trigonometry functions . . . . . . . . . . . . . . . . . . . . . . . . . 306
3.15 Hyperbolic functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
3.16 Applications of trigonometry . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
3.16.1 Measuring the earth . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
3.16.2 Charting the earth . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
3.17 Infinite series for sine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 318
3.18 Unusual trigonometric identities . . . . . . . . . . . . . . . . . . . . . . . . . 320
3.19 Spherical trigonometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324
3.19.1 Area of spherical triangles . . . . . . . . . . . . . . . . . . . . . . . 325
3.19.2 Area of spherical polygons . . . . . . . . . . . . . . . . . . . . . . . 325
3.20 Analytic geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
3.20.1 Cartesian coordinate system . . . . . . . . . . . . . . . . . . . . . . 327
3.20.2 Lines and Thales theorem revisited . . . . . . . . . . . . . . . . . . . 329
3.20.3 Constructible numbers . . . . . . . . . . . . . . . . . . . . . . . . . 330
3.20.4 Wantzel’s solution on two classical problems of antiquity . . . . . . . 333
3.21 Solving polynomial equations algebraically . . . . . . . . . . . . . . . . . . . 334
3.21.1 Solving quadratic equations . . . . . . . . . . . . . . . . . . . . . . 335
3.21.2 Solving cubic equations . . . . . . . . . . . . . . . . . . . . . . . . . 335
3.21.3 Studying roots of cubic using its graph . . . . . . . . . . . . . . . . . 337
3.22 Non-Euclidean geometries . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
3.23 Computer algebra systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
3.24 Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339
4 Calculus 340
4.1 Conic sections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343
4.1.1 Circles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344
4.1.2 Ellipses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344
4.1.3 Parabolas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346
4.1.4 Hyperbolas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
4.1.5 General form of conic sections . . . . . . . . . . . . . . . . . . . . . 348
4.2 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
4.2.1 Even and odd functions . . . . . . . . . . . . . . . . . . . . . . . . . 352
4.2.2 Transformation of functions . . . . . . . . . . . . . . . . . . . . . . 354
4.2.3 Function of function . . . . . . . . . . . . . . . . . . . . . . . . . . 354
4.2.4 Domain, co-domain and range of a function . . . . . . . . . . . . . . 355
4.2.5 Inverse functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356
4.2.6 Parametric curves . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358
4.2.7 History of functions . . . . . . . . . . . . . . . . . . . . . . . . . . . 358
4.2.8 Some exercises about functions . . . . . . . . . . . . . . . . . . . . . 359
4.3 Integral calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360
4.3.1 Volumes of simple solids . . . . . . . . . . . . . . . . . . . . . . . . 360
4.3.2 Definition of an integral . . . . . . . . . . . . . . . . . . . . . . . . . 364
4.3.3 Calculation of integrals using the definition . . . . . . . . . . . . . . 365
4.3.4 Rules of integration . . . . . . . . . . . . . . . . . . . . . . . . . . . 367
4.3.5 Indefinite integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . 368
4.4 Differential calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368
4.4.1 Maxima of Fermat . . . . . . . . . . . . . . . . . . . . . . . . . . . 369
4.4.2 Heron’s shortest distance . . . . . . . . . . . . . . . . . . . . . . . . 370
4.4.3 Uniform vs non-uniform speed . . . . . . . . . . . . . . . . . . . . . 373
4.4.4 The derivative of a function . . . . . . . . . . . . . . . . . . . . . . . 375
4.4.5 Infinitesimals and differentials . . . . . . . . . . . . . . . . . . . . . 376
4.4.6 The geometric meaning of the derivative . . . . . . . . . . . . . . . . 378
4.4.7 Derivative of $f(x) = x^n$ . . . . . . . . . . . . . . . . . . . . . . . 380
4.4.8 Derivative of trigonometric functions . . . . . . . . . . . . . . . . . . 382
4.4.9 Rules of derivative . . . . . . . . . . . . . . . . . . . . . . . . . . . 383
4.4.10 The chain rule: derivative of composite functions . . . . . . . . . . . 385
4.4.11 Derivative of inverse functions . . . . . . . . . . . . . . . . . . . . . 385
4.4.12 Derivatives of inverses of trigonometry functions . . . . . . . . . . . 385
4.4.13 Derivatives of $a^x$ and number e . . . . . . . . . . . . . . . . . . . 386
4.4.14 Logarithm functions . . . . . . . . . . . . . . . . . . . . . . . . . . 388
4.4.15 Derivative of hyperbolic and inverse hyperbolic functions . . . . . . . 391
4.4.16 High order derivatives . . . . . . . . . . . . . . . . . . . . . . . . . 392
4.4.17 Implicit functions and implicit differentiation . . . . . . . . . . . . . 393
4.4.18 Derivative of logarithms . . . . . . . . . . . . . . . . . . . . . . . . 394
4.5 Applications of derivative . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395
4.5.1 Maxima and minima . . . . . . . . . . . . . . . . . . . . . . . . . . 395
4.5.2 Convexity and Jensen’s inequality . . . . . . . . . . . . . . . . . . . 398
4.5.3 Linear approximation . . . . . . . . . . . . . . . . . . . . . . . . . . 401
4.5.4 Newton’s method for solving $f(x) = 0$ . . . . . . . . . . . . . . . . 403
4.6 The fundamental theorem of calculus . . . . . . . . . . . . . . . . . . . . . . 406
4.7 Integration techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 411
4.7.1 Integration by substitution . . . . . . . . . . . . . . . . . . . . . . . 411
4.7.2 Integration by parts . . . . . . . . . . . . . . . . . . . . . . . . . . . 413
4.7.3 Trigonometric integrals: sine/cosine . . . . . . . . . . . . . . . . . . 415
4.7.4 Repeated integration by parts . . . . . . . . . . . . . . . . . . . . . 417
4.7.5 What is $\int_0^\infty x^4 e^{-x}\,dx$? . . . . . . . . . . . . . . . . 419
4.7.6 Trigonometric integrals: tangents and secants . . . . . . . . . . . . . 420
4.7.7 Integration by trigonometric substitution . . . . . . . . . . . . . . . . 421
4.7.8 Integration of $P(x)/Q(x)$ using partial fractions . . . . . . . . . . . 424
4.7.9 Tricks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 428
4.8 Improper integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 432
4.9 Applications of integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 433
4.9.1 Length of plane curves . . . . . . . . . . . . . . . . . . . . . . . . . 433
4.9.2 Areas and volumes . . . . . . . . . . . . . . . . . . . . . . . . . . . 435
4.9.3 Area and volume of a solid of revolution . . . . . . . . . . . . . . . . 437
4.9.4 Gravitation of distributed masses . . . . . . . . . . . . . . . . . . . . 442
4.9.5 Using integral to compute limits of sums . . . . . . . . . . . . . . . . 444
4.10 Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 444
4.10.1 Definition of the limit of a function . . . . . . . . . . . . . . . . . . . 445
4.10.2 Rules of limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 449
4.10.3 Continuous functions . . . . . . . . . . . . . . . . . . . . . . . . . . 452
4.10.4 Indeterminate forms . . . . . . . . . . . . . . . . . . . . . . . . . . . 456
4.10.5 Differentiable functions . . . . . . . . . . . . . . . . . . . . . . . . . 458
4.11 Some theorems on differentiable functions . . . . . . . . . . . . . . . . . . . 460
4.11.1 Extreme value and intermediate value theorems . . . . . . . . . . . . 460
4.11.2 Rolle’s theorem and the mean value theorem . . . . . . . . . . . . . . 461
4.11.3 Average of a function and the mean value theorem of integrals . . . . 462
4.12 Parametric curves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463
4.12.1 Tangents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 464
4.12.2 Length and area of parametric curves . . . . . . . . . . . . . . . . . . 464
4.12.3 Cycloid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 465
4.13 Polar coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 468
4.13.1 Polar coordinates and polar graphs . . . . . . . . . . . . . . . . . . . 468
4.13.2 Conic sections in polar coordinates . . . . . . . . . . . . . . . . . . . 470
4.13.3 Cardioid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473
4.13.4 Length and area of polar curves . . . . . . . . . . . . . . . . . . . . 474
4.14 Bézier curves: fascinating parametric curves . . . . . . . . . . . . . . . . . . 475
4.15 Infinite series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 479
4.15.1 The generalized binomial theorem . . . . . . . . . . . . . . . . . . . 479
4.15.2 Series of $1/(1+x)$ or Mercator’s series . . . . . . . . . . . . . . . 483
4.15.3 Geometric series and logarithm . . . . . . . . . . . . . . . . . . . . . 484
4.15.4 Geometric series and inverse tangent . . . . . . . . . . . . . . . . . . 485
4.15.5 Euler’s work on exponential functions . . . . . . . . . . . . . . . . . 486
4.15.6 Euler’s trigonometry functions . . . . . . . . . . . . . . . . . . . . . 486
4.15.7 Euler’s solution of the Basel problem . . . . . . . . . . . . . . . . . 488
4.15.8 Taylor’s series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 491
4.15.9 Common Taylor series . . . . . . . . . . . . . . . . . . . . . . . . . 493
4.15.10 Taylor’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 495
4.16 Applications of Taylor’s series . . . . . . . . . . . . . . . . . . . . . . . . 496
4.16.1 Integral evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . 496
4.16.2 Limit evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 499
4.16.3 Series evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 499
4.17 Bernoulli numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 499
4.18 Euler-Maclaurin summation formula . . . . . . . . . . . . . . . . . . . . . . 502
4.19 Fourier series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 505
4.19.1 Periodic functions with period $2\pi$ . . . . . . . . . . . . . . . . . 505
4.19.2 Functions with period 2L . . . . . . . . . . . . . . . . . . . . . . . . 508
4.19.3 Complex form of Fourier series . . . . . . . . . . . . . . . . . . . . . 510
4.20 Special functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 511
4.20.1 Elementary functions . . . . . . . . . . . . . . . . . . . . . . . . . . 511
4.20.2 Factorial of $1/2$ and the Gamma function . . . . . . . . . . . . . 512
4.20.3 Zeta function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 513
4.21 Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 513

5 Probability 515
5.1 A brief history of probability . . . . . . . . . . . . . . . . . . . . . . . . . . . 517
5.2 Classical probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 518
5.3 Empirical probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 520
5.4 Buffon’s needle problem and Monte Carlo simulations . . . . . . . . . . . . . 521
5.4.1 Buffon’s needle problem . . . . . . . . . . . . . . . . . . . . . . . . 521
5.4.2 Monte Carlo method . . . . . . . . . . . . . . . . . . . . . . . . . . 522
5.5 A review of set theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 524
5.5.1 Subset, superset and empty set . . . . . . . . . . . . . . . . . . . . . 524
5.5.2 Set operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 526
5.6 Random experiments, sample space and event . . . . . . . . . . . . . . . . . . 529
5.7 Probability and its axioms . . . . . . . . . . . . . . . . . . . . . . . . . . . . 530
5.8 Conditional probabilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 534
5.8.1 What is a conditional probability? . . . . . . . . . . . . . . . . . . . 534
5.8.2 $P(A|B)$ is also a probability . . . . . . . . . . . . . . . . . . . . . 535
5.8.3 Multiplication rule for conditional probability . . . . . . . . . . . . . 536
5.8.4 Bayes’ formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 537
5.8.5 The odds form of the Bayes’ rule . . . . . . . . . . . . . . . . . . . . 541
5.8.6 Independent events . . . . . . . . . . . . . . . . . . . . . . . . . . . 545
5.8.7 The gambler’s ruin problem . . . . . . . . . . . . . . . . . . . . . . 547
5.9 The secretary problem or dating mathematically . . . . . . . . . . . . . . . . 550
5.10 Discrete probability models . . . . . . . . . . . . . . . . . . . . . . . . . . . 553
5.10.1 Discrete random variables . . . . . . . . . . . . . . . . . . . . . . . 556
5.10.2 Probability mass function . . . . . . . . . . . . . . . . . . . . . . . . 557
5.10.3 Special distributions . . . . . . . . . . . . . . . . . . . . . . . . . . 559
5.10.4 Cumulative distribution function . . . . . . . . . . . . . . . . . . . . 570
5.10.5 Expected value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 572
5.10.6 Functions of random variables . . . . . . . . . . . . . . . . . . . . . 574
5.10.7 Linearity of the expectation . . . . . . . . . . . . . . . . . . . . . . . 576
5.10.8 Variance and standard deviation . . . . . . . . . . . . . . . . . . . . 578
5.10.9 Expected value and variance of special distributions . . . . . . . . . . 581
5.11 Continuous probability models . . . . . . . . . . . . . . . . . . . . . . . . . . 582
5.11.1 Probability density function . . . . . . . . . . . . . . . . . . . . . . 582
5.11.2 Expected value and variance . . . . . . . . . . . . . . . . . . . . . . 584
5.11.3 Special continuous distributions . . . . . . . . . . . . . . . . . . . . 584
5.12 Joint discrete distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 588
5.12.1 Two jointly discrete variables . . . . . . . . . . . . . . . . . . . . . . 588
5.12.2 Conditional PMF and CDF . . . . . . . . . . . . . . . . . . . . . . . 589
5.12.3 Independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 590
5.12.4 Conditional expectation . . . . . . . . . . . . . . . . . . . . . . . . . 590
5.12.5 Covariance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 593
5.12.6 Variance of a sum of variables . . . . . . . . . . . . . . . . . . . . . 594
5.12.7 Correlation coefficient . . . . . . . . . . . . . . . . . . . . . . . . . 595
5.12.8 Covariance matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . 596
5.13 Joint continuous variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . 598
5.14 Transforming density functions . . . . . . . . . . . . . . . . . . . . . . . . . 598
5.15 Inequalities in the theory of probability . . . . . . . . . . . . . . . . . . . . . 599
5.15.1 Markov and Chebyshev inequalities . . . . . . . . . . . . . . . . . . 599
5.15.2 Chernoff’s inequality . . . . . . . . . . . . . . . . . . . . . . . . . . 600
5.16 Limit theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 601
5.16.1 The law of large numbers . . . . . . . . . . . . . . . . . . . . . . . . 601
5.16.2 Central limit theorem . . . . . . . . . . . . . . . . . . . . . . . . . . 602
5.17 Generating functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 605
5.17.1 Ordinary generating function . . . . . . . . . . . . . . . . . . . . . . 605
5.17.2 Moment generating functions . . . . . . . . . . . . . . . . . . . . . . 609
5.17.3 Properties of moment generating functions . . . . . . . . . . . . . . . 611
5.17.4 Proof of the central limit theorem . . . . . . . . . . . . . . . . . . . 612
5.18 Multivariate normal distribution . . . . . . . . . . . . . . . . . . . . . . . . . 614
5.18.1 Random vectors and random matrices . . . . . . . . . . . . . . . . . 614
5.18.2 Functions of random vectors . . . . . . . . . . . . . . . . . . . . . . 616
5.18.3 Multivariate normal distribution . . . . . . . . . . . . . . . . . . . . 617
5.18.4 Mean and covariance of multivariate normal distribution . . . . . . . 618
5.18.5 The probability density function for $N(\mu, \Sigma)$ . . . . . . . . . 619
5.18.6 The bivariate normal distribution . . . . . . . . . . . . . . . . . . . . 619
5.19 Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 622

6 Statistics and machine learning 623


6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 624
6.1.1 What is statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 624
6.1.2 Why study statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . 624
6.1.3 A brief history of statistics . . . . . . . . . . . . . . . . . . . . . . . 624
6.2 A brief introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 624
6.3 Statistical inference: classical approach . . . . . . . . . . . . . . . . . . . . . 624
6.4 Statistical inference: Bayesian approach . . . . . . . . . . . . . . . . . . . . . 625
6.5 Least squares problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 625
6.5.1 Problem statement . . . . . . . . . . . . . . . . . . . . . . . . . . . 625
6.5.2 Solution of the least squares problem . . . . . . . . . . . . . . . . . . 626
6.6 Markov chains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 628
6.6.1 Markov chain: an introduction . . . . . . . . . . . . . . . . . . . . . 628
6.6.2 dd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 631
6.7 Principal component analysis (PCA) . . . . . . . . . . . . . . . . . . . . . . . 631
6.8 Neural networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 632

7 Multivariable calculus 633


7.1 Multivariable functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 635
7.1.1 Scalar valued multivariable functions . . . . . . . . . . . . . . . . . . 635
7.1.2 Level curves, level surfaces and level sets . . . . . . . . . . . . . . . 637
7.1.3 Multivariate calculus: an extension of univariate calculus . . . . . . . 638
7.1.4 Vector valued multivariable functions . . . . . . . . . . . . . . . . . 638
7.2 Derivatives of multivariable functions . . . . . . . . . . . . . . . . . . . . . . 639
7.3 Tangent planes, linear approximation and total differential . . . . . . . . . . . 641
7.4 Newton’s method for solving two equations . . . . . . . . . . . . . . . . . . . 642
7.5 Gradient and directional derivative . . . . . . . . . . . . . . . . . . . . . . . . 642
7.6 Chain rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 646
7.7 Minima and maxima of functions of two variables . . . . . . . . . . . . . . . 647
7.7.1 Stationary points and partial derivatives . . . . . . . . . . . . . . . . 647
7.7.2 Taylor’s series of scalar valued multivariate functions . . . . . . . . . 650
7.7.3 Multi-index notation . . . . . . . . . . . . . . . . . . . . . . . . . . 651
7.7.4 Quadratic forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 652
7.7.5 Constraints and Lagrange multipliers . . . . . . . . . . . . . . . . . . 654
7.7.6 Inequality constraints and Lagrange multipliers . . . . . . . . . . . . 657
7.8 Integration of multivariable functions . . . . . . . . . . . . . . . . . . . . . . 657
7.8.1 Double integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 658
7.8.2 Double integrals in polar coordinates . . . . . . . . . . . . . . . . . . 659
7.8.3 Triple integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 660
7.8.4 Triple integrals in cylindrical and spherical coordinates . . . . . . . . 660
7.8.5 Newton’s shell theorem . . . . . . . . . . . . . . . . . . . . . . . . . 662
7.8.6 Change of variables and the Jacobian . . . . . . . . . . . . . . . . . . 663
7.8.7 Masses, center of mass, and moments . . . . . . . . . . . . . . . . . 667
7.8.8 Barycentric coordinates . . . . . . . . . . . . . . . . . . . . . . . . . 674
7.9 Parametric surfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 676
7.9.1 Parametric representation of common solids . . . . . . . . . . . . . . 677
7.9.2 Practical implementation detail . . . . . . . . . . . . . . . . . . . . . 680
7.9.3 Tangent plane and normal vector . . . . . . . . . . . . . . . . . . . . 681
7.9.4 Surface area and surface integral . . . . . . . . . . . . . . . . . . . . 682
7.9.5 Bézier surfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 682
7.10 Newtonian mechanics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 683
7.10.1 Aristotle’s motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 684
7.10.2 Galileo’s motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 684
7.10.3 Kepler’s laws . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 685
7.10.4 Newton’s laws of motion . . . . . . . . . . . . . . . . . . . . . . . . 686
7.10.5 Dynamical equations: meaning and solutions . . . . . . . . . . . . . 687
7.10.6 Motion along a curve (Cartesian) . . . . . . . . . . . . . . . . . . . . 689
7.10.7 Motion along a curve (Polar coordinates) . . . . . . . . . . . . . . . 691
7.10.8 Newton’s gravitation . . . . . . . . . . . . . . . . . . . . . . . . . . 693
7.10.9 From Newton’s universal gravitation to Kepler’s laws . . . . . . . . . 695
7.10.10 Discovery of Neptune . . . . . . . . . . . . . . . . . . . . . . . . . . 697
7.10.11 Newton and the Great Plague of 1665–1666 . . . . . . . . . . . . . . 697
7.11 Vector calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 698
7.11.1 Vector fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 699
7.11.2 Central forces and fields . . . . . . . . . . . . . . . . . . . . . . . . 699
7.11.3 Work done by a force and line integrals . . . . . . . . . . . . . . . . 701
7.11.4 Work of gravitational and electric forces . . . . . . . . . . . . . . . . 705
7.11.5 Fluxes and Divergence . . . . . . . . . . . . . . . . . . . . . . . . . 706
7.11.6 Gauss’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 710
7.11.7 Circulation of a fluid and curl . . . . . . . . . . . . . . . . . . . . . . 711
7.11.8 Curl and Stokes’ theorem . . . . . . . . . . . . . . . . . . . . . . . . 713
7.11.9 Green’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 713
7.11.10 Curl free and divergence free vector fields . . . . . . . . . . . . . . . 715
7.11.11 Grad, div, curl and identities . . . . . . . . . . . . . . . . . . . . . . 715
7.11.12 Integration by parts . . . . . . . . . . . . . . . . . . . . . . . . . . . 718
7.11.13 Green’s identities . . . . . . . . . . . . . . . . . . . . . . . . . . . . 718
7.11.14 Kronecker and Levi-Civita symbols . . . . . . . . . . . . . . . . . . 720
7.11.15 Curvilinear coordinate systems . . . . . . . . . . . . . . . . . . . . . 722
7.12 Complex analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 723
7.12.1 Functions of complex variables . . . . . . . . . . . . . . . . . . . . . 723
7.12.2 Visualization of complex functions . . . . . . . . . . . . . . . . . . . 725
7.12.3 Derivative of complex functions . . . . . . . . . . . . . . . . . . . . 726
7.12.4 Complex integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . 728

8 Tensor analysis 729


8.1 Index notation and Einstein summation convention . . . . . . . . . . . . . . . 732
8.2 Why tensors are facts of the universe? . . . . . . . . . . . . . . . . . . . . . . 733
8.3 What is a tensor: some examples . . . . . . . . . . . . . . . . . . . . . . . . . 733
8.3.1 Tensor of inertia . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 733
8.3.2 Stress tensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 734
8.4 What is a tensor: more examples . . . . . . . . . . . . . . . . . . . . . . . . . 734
8.5 What is a tensor: definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . 734

9 Differential equations 735


9.1 Mathematical models and differential equations . . . . . . . . . . . . . . . . . 736
9.2 Models of population growth . . . . . . . . . . . . . . . . . . . . . . . . . . 738
9.3 Ordinary differential equations . . . . . . . . . . . . . . . . . . . . . . . . . . 740
9.3.1 System of linear first order equations . . . . . . . . . . . . . . . . . . 741
9.3.2 Exponential of a matrix . . . . . . . . . . . . . . . . . . . . . . . . . 743
9.4 Partial differential equations: a classification . . . . . . . . . . . . . . . . . . 747
9.5 Derivation of common PDEs . . . . . . . . . . . . . . . . . . . . . . . . . . . 747
9.5.1 Wave equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 748
9.5.2 Diffusion equation . . . . . . . . . . . . . . . . . . . . . . . . . . . 751
9.5.3 Poisson’s equation . . . . . . . . . . . . . . . . . . . . . . . . . . . 755
9.6 Linear partial differential equations . . . . . . . . . . . . . . . . . . . . . . . 755
9.7 Dimensionless problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 756
9.7.1 Dimensions and units . . . . . . . . . . . . . . . . . . . . . . . . . . 756
9.7.2 Power laws . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 757
9.7.3 Dimensional analysis . . . . . . . . . . . . . . . . . . . . . . . . . . 759
9.7.4 Scaling of ODEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 764
9.8 Harmonic oscillation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 765
9.8.1 Simple harmonic oscillation . . . . . . . . . . . . . . . . . . . . . . 766
9.8.2 Damped oscillator . . . . . . . . . . . . . . . . . . . . . . . . . . . . 771
9.8.3 Driven damped oscillation . . . . . . . . . . . . . . . . . . . . . . . 773
9.8.4 Resonance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 775
9.8.5 Driven damped oscillators with any periodic forces . . . . . . . . . . 776
9.8.6 The pendulum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 777
9.8.7 RLC circuits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 779
9.8.8 Coupled oscillators . . . . . . . . . . . . . . . . . . . . . . . . . . . 780
9.9 Solving the diffusion equation . . . . . . . . . . . . . . . . . . . . . . . . . . 784
9.10 Solving the wave equation: d’Alembert’s solution . . . . . . . . . . . . . . . . 786
9.11 Solving the wave equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 790
9.12 Fourier series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 793
9.12.1 Bessel’s inequality and Parseval’s theorem . . . . . . . . . . . . . . . 793
9.12.2 Fourier transforms (Fourier integrals) . . . . . . . . . . . . . . . . . 795
9.13 Classification of second order linear PDEs . . . . . . . . . . . . . . . . . . . 795
9.14 Fluid mechanics: Navier-Stokes equation . . . . . . . . . . . . . . . . . . 795

10 Calculus of variations 796


10.1 Introduction and some history comments . . . . . . . . . . . . . . . . . . . . 798
10.2 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 798
10.3 Variational problems and Euler-Lagrange equation . . . . . . . . . . . . . . . 802
10.4 Solution of some elementary variational problems . . . . . . . . . . . . . . . 805
10.4.1 Euclidean geodesic problem . . . . . . . . . . . . . . . . . . . . . 805
10.4.2 The Brachistochrone problem . . . . . . . . . . . . . . . . . . . . . 806
10.4.3 The brachistochrone: history and Bernoulli’s genius solution . . . . . 807
10.5 The variational $\delta$ operator . . . . . . . . . . . . . . . . . . . . . . 809
10.6 Multi-dimensional variational problems . . . . . . . . . . . . . . . . . . . . . 811
10.7 Boundary conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 812
10.8 Lagrangian mechanics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 815
10.8.1 The Lagrangian, the action and the EL equations . . . . . . . . . . . 816
10.8.2 Generalized coordinates . . . . . . . . . . . . . . . . . . . . . . . . 817
10.8.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 818
10.9 Ritz’ direct method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 820
10.10 What if there is no functional to start with? . . . . . . . . . . . . . . . . . . . 824
10.11 Galerkin methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 827
10.12 The finite element method . . . . . . . . . . . . . . . . . . . . . . . . . . . . 830
10.12.1 Basic idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 830
10.12.2 FEM for 1D wave equation . . . . . . . . . . . . . . . . . . . . . . . 831
10.12.3 Shape functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 835
10.12.4 Role of FEM in computational sciences and engineering . . . . . . . 835

11 Linear algebra 837


11.1 Vectors in $\mathbb{R}^3$ . . . . . . . . . . . . . . . . . . . . . . . . . . 839
11.1.1 Addition and scalar multiplication . . . . . . . . . . . . . . . . . . . 840
11.1.2 Dot product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 843
11.1.3 Lines and planes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 847
11.1.4 Projections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 848
11.1.5 Cross product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 849
11.1.6 Hamilton and quaternions . . . . . . . . . . . . . . . . . . . . . . . 855
11.2 Vectors in Rn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 857
11.3 System of linear equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 860
11.3.1 Gaussian elimination method . . . . . . . . . . . . . . . . . . . . . . 863
11.3.2 The Gauss-Jordan elimination method . . . . . . . . . . . . . . . . . 864
11.3.3 Homogeneous linear systems . . . . . . . . . . . . . . . . . . . . . . 866
11.3.4 Spanning sets of vectors and linear independence . . . . . . . . . . . 867
11.4 Matrix algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 870
11.4.1 Matrix operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 870
11.4.2 The laws for matrix operations . . . . . . . . . . . . . . . . . . . . . 872
11.4.3 Transpose of a matrix . . . . . . . . . . . . . . . . . . . . . . . . . . 873
11.4.4 Partitioned matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . 874
11.4.5 Inverse of a matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . 876
11.4.6 LU decomposition/factorization . . . . . . . . . . . . . . . . . . . . 880
11.5 Subspaces, basis, dimension and rank . . . . . . . . . . . . . . . . . . . . . . 881
11.6 Introduction to linear transformation . . . . . . . . . . . . . . . . . . . . . . . 887
11.7 Linear algebra with Julia . . . . . . . . . . . . . . . . . . . . . . . . . . . . 894
11.8 Orthogonality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 894
11.8.1 Orthogonal vectors & orthogonal bases . . . . . . . . . . . . . . . . 894
11.8.2 Orthonormal vectors and orthonormal bases . . . . . . . . . . . . . . 897
11.8.3 Orthogonal matrices . . . . . . . . . . . . . . . . . . . . . . . . . . 898
11.8.4 Orthogonal complements . . . . . . . . . . . . . . . . . . . . . . . . 899
11.8.5 Orthogonal projections . . . . . . . . . . . . . . . . . . . . . . . . . 901
11.8.6 Gram-Schmidt orthogonalization process . . . . . . . . . . . . . . . 902
11.8.7 QR factorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 903
11.9 Determinant . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 904
11.9.1 Defining the determinant in terms of its properties . . . . . . . . . . . 904
11.9.2 Determinant of elementary matrices . . . . . . . . . . . . . . . . . . 906
11.9.3 A formula for the determinant . . . . . . . . . . . . . . . . . . . . . 908
11.9.4 Cramer’s rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 909
11.10 Eigenvectors and eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . 911
11.10.1 Angular momentum and inertia tensor . . . . . . . . . . . . . . . . . 912
11.10.2 Principal axes and eigenvalue problems . . . . . . . . . . . . . . . . 916
11.10.3 Eigenvalues and eigenvectors . . . . . . . . . . . . . . . . . . . . . . 917
11.10.4 More on eigenvectors/eigenvalues . . . . . . . . . . . . . . . . . . . 919
11.10.5 Symmetric matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . 920
11.10.6 Quadratic forms and positive definite matrices . . . . . . . . . . . . . 922
11.11 Vector spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 925
11.11.1 Vector spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 926
11.11.2 Change of basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 929
11.11.3 Linear transformations . . . . . . . . . . . . . . . . . . . . . . . . . 932
11.11.4 Diagonalizing a matrix . . . . . . . . . . . . . . . . . . . . . . . . . 936
11.11.5 Inner product and inner product spaces . . . . . . . . . . . . . . . . . 938
11.11.6 Complex vectors and complex matrices . . . . . . . . . . . . . . . . 941
11.11.7 Norm, distance and normed vector spaces . . . . . . . . . . . . . . . 942
11.11.8 Matrix norms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 943
11.11.9 The condition number of a matrix . . . . . . . . . . . . . . . . . . . 945
11.11.10 The best approximation theorem . . . . . . . . . . . . . . . . . . . 946
11.12 Singular value decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . 947
11.12.1 Singular values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 948
11.12.2 Singular value decomposition . . . . . . . . . . . . . . . . . . . . . 948
11.12.3 Matrix norms and the condition number . . . . . . . . . . . . . . . . 951
11.12.4 Low rank approximations . . . . . . . . . . . . . . . . . . . . . . . . 952

12 Numerical analysis 954


12.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 955
12.2 Numerical differentiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 957
12.2.1 First order derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . 958
12.2.2 Second order derivatives . . . . . . . . . . . . . . . . . . . . . . . . 959
12.2.3 Richardson’s extrapolation . . . . . . . . . . . . . . . . . . . . . . . 959
12.3 Interpolation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 959
12.3.1 Polynomial interpolations . . . . . . . . . . . . . . . . . . . . . . . . 960
12.3.2 Chebyshev polynomials . . . . . . . . . . . . . . . . . . . . . . . . . 966
12.3.3 Lagrange interpolation: efficiency and barycentric forms . . . . . . . 969
12.4 Numerical integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 971
12.4.1 Trapezoidal and mid-point rule . . . . . . . . . . . . . . . . . . . . . 972
12.4.2 Simpson’s rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 974
12.4.3 Gauss’s rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 976
12.4.4 Two and three dimensional integrals . . . . . . . . . . . . . . . . . . 980
12.5 Solving nonlinear equations . . . . . . . . . . . . . . . . . . . . . . . . . . . 980
12.5.1 Analysis of the fixed point iteration . . . . . . . . . . . . . . . . . . . 980
12.5.2 subsubsection name . . . . . . . . . . . . . . . . . . . . . . . . . . . 983
12.6 Numerical solution of ordinary differential equations . . . . . . . . . . . . . . 983
12.6.1 Euler’s method: 1st ODE . . . . . . . . . . . . . . . . . . . . . . . . 983
12.6.2 Euler’s method: 2nd order ODE . . . . . . . . . . . . . . . . . . . . 984
12.6.3 Euler-Aspel-Cromer's method: better energy conservation . . . . . . 985
12.6.4 Solving Kepler’s problem numerically . . . . . . . . . . . . . . . . . 986
12.6.5 Three body problems and N body problems . . . . . . . . . . . . . . 988
12.6.6 Verlet’s method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 989
12.6.7 Analysis of Euler’s method . . . . . . . . . . . . . . . . . . . . . . . 991
12.7 Numerical solution of partial differential equations . . . . . . . . . . . . . . . 993
12.7.1 Finite difference for the 1D heat equation: explicit schemes . . . . . . 993
12.7.2 Finite difference for the 1D heat equation: implicit schemes . . . . . . 994
12.7.3 Implicit versus explicit methods: stability analysis . . . . . . . . . . . 996
12.7.4 Analytical solutions versus numerical solutions . . . . . . . . . . . . 998
12.7.5 Finite difference for the 1D wave equation . . . . . . . . . . . . . . . 999
12.7.6 Solving ODE using neural networks . . . . . . . . . . . . . . . . . . 1000
12.8 Numerical optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1000
12.8.1 Gradient descent method . . . . . . . . . . . . . . . . . . . . . . . . 1001
12.9 Numerical linear algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1005
12.9.1 Iterative methods to solve a system of linear equations . . . . . . . . 1005
12.9.2 Conjugate gradient method . . . . . . . . . . . . . . . . . . . . . . . 1006
12.9.3 Iterative methods to solve eigenvalue problems . . . . . . . . . . . . 1007

Appendix A Codes 1008


A.1 Algebra and calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1009
A.2 Recursion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1012
A.3 Numerical integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1014
A.4 Harmonic oscillations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1014
A.5 Polynomial interpolation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1015
A.6 Probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1015
A.7 N body problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1019
A.8 Working with images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1020
A.9 Reinventing the wheel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1021
A.10 Computer algebra system . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1022
A.11 Computer graphics with processing . . . . . . . . . . . . . . . . . . . . . . 1022

Appendix B Data science with Julia 1026


B.1 Introduction to DataFrames.jl . . . . . . . . . . . . . . . . . . . . . . . . . . 1026

Bibliography 1028

Index 1034



Chapter 1
Introduction

Contents
1.1 What is mathematics? . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Axiom, definition, theorem and proof . . . . . . . . . . . . . . . . . . . . 9
1.3 Exercises versus problems . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.4 Problem solving strategies . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.5 Computing in mathematics . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.6 Mathematical anxiety or math phobia . . . . . . . . . . . . . . . . . . . 17
1.7 Millennium Prize Problems . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.8 How to learn mathematics . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.9 Organization of the book . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

Without doubt this is a very difficult chapter to write. It is my attempt to explain what
mathematics really is (Section 1.1). All learners should know at least about the big picture of
the topic they’re going to study. In Section 1.2, common terminologies in mathematics such as
axiom, definition, theorem and proof are introduced. Next, the differences between mathematical
exercises and problems are discussed (Section 1.3). The message is to focus on problems rather
than on routine exercises. Section 1.4 is then devoted to some problem solving strategies. The
role of computers in teaching and learning mathematics is treated in Section 1.5.
"I’m just not a math person”. We hear it all the time. And we’ve had enough. I tried in
Section 1.6 to uncover this myth. You can be better at maths than you’re thinking.
Mathematics is not a dead, complete subject, created thousands of years ago. In fact, it is
a living subject with many unsolved problems, and every day new mathematics is being created
or discovered. Section 1.7 presents some of these problems. It is these problems that keep
mathematicians working late at night.
Finally, the organization of the book is outlined in Section 1.9.
You do not have to try to understand everything in this chapter if your maths is not solid yet.
Just skim through it and sometimes get back to it to see the big picture.


1.1 What is mathematics?


In The Mathematical Experience–a National Book Award in Science–Philip Davis and Reuben
Hersh wrote

Mathematics consists of true facts about imaginary objects

Mathematicians study imaginary objects which are called mathematical objects. Some examples
are numbers, functions, triangles, matrices, groups and more complicated things such as vector
spaces and infinite series. These objects are said imaginary or abstract as they do not exist in our
physical world. For instance, in geometry a line does not have thickness and a line is perfectly
straight! And certainly mathematicians don’t care if a line is made of steel or wood. There are
no such things in the physical world. Similarly we cannot hold and taste the number three. When
we write 3 on a beach and touch it, we only touch a representation of the number three.
Why is working with abstract objects useful? One example from geometry is provided as a
simple answer. Suppose that we can prove that the area of a (mathematical) circle is π times the
square of the radius; then this fact would apply to the area of a circular field, the cross section of
a circular tree trunk or the floor area of a circular temple.
Having now in their hands some mathematical objects, how do mathematicians deduce
new knowledge? As senses, experimentation and measurement are not sufficient, they rely on
reasoning. Yes, logical reasoning. This started with the Greek mathematicians. It is obvious that
we cannot use our senses to estimate the distance from the Earth to the Sun. It would be more
tedious to measure the area of a rectangular region directly than to measure just its sides and use
mathematics to get the area. And it is very time consuming and error prone to design structures by pure
experimentation. If a bridge is designed in this way, it would only be fair that the designer be
the first to cross this bridge.
What are mathematicians really trying to get from their objects? Godfrey Hardy answered
this best:

A mathematician, like a painter or poet, is a maker of patterns. If his patterns are


more permanent than theirs, it is because they are made with ideas.

Implied by Hardy is that mathematics is a study of patterns of mathematical objects. Let's
confine ourselves to natural numbers as the mathematical object. The following is one example of how
mathematicians play with their objects. They start with a question: what is the sum of the first n
natural numbers? This sum is mathematically written as

S(n) = 1 + 2 + 3 + ⋯ + n
(Philip J. Davis (1923–2018) was an American academic applied mathematician. Reuben Hersh (1927–2020) was an American mathematician and academic, best known for his writings on the nature, practice, and social impact of mathematics; his work challenges and complements mainstream philosophy of mathematics. Godfrey Harold Hardy (February 1877 – December 1947) was an English mathematician, known for his achievements in number theory and mathematical analysis; in biology, he is known for the Hardy–Weinberg principle, a basic principle of population genetics.)


For example, if n = 3, then the sum is S(3) = 1 + 2 + 3, which is 6, and if n = 4 then the
sum is S(4) = 1 + 2 + 3 + 4, and so on. Now, mathematicians are lazy creatures; they do not
want to compute the sums, again and again, for different values of n. They want to find a single
formula for the sum that works for any n. To achieve that they have to see through the problem
or to see the pattern. Thus, they compute the sum for a few special cases: for n = 1, 2, 3, 4, the
corresponding sums are

n = 1:  S(1) = 1
n = 2:  S(2) = 1 + 2 = 3 = (2 × 3)/2
n = 3:  S(3) = 1 + 2 + 3 = 6 = (3 × 4)/2
n = 4:  S(4) = 1 + 2 + 3 + 4 = 10 = (4 × 5)/2

A pattern emerges and they guess the following formula

S(n) = 1 + 2 + 3 + ⋯ + n = n(n + 1)/2        (1.1.1)

What is more interesting is how they prove that their formula is true. They write S(n) in the
usual form, and they also write it in reverse order (they can do this because of the commutative
property of addition: changing the order of addends does not change the sum), and then they add the two:

 S(n) = 1 + 2 + ⋯ + (n − 1) + n
 S(n) = n + (n − 1) + ⋯ + 2 + 1
2S(n) = (n + 1) + (n + 1) + ⋯ + (n + 1) + (n + 1) = n(n + 1)        (n terms of (n + 1))

That little narrative is a (humbling) example of the mathematician's art: asking simple and
elegant questions about their imaginary abstract objects, and crafting satisfying and beautiful
explanations. Now how did mathematicians know to write S(n) in reverse order and add
the two? How does a painter know where to put his brush? Experience, inspiration, trial and
error, dumb luck, as Paul Lockhart puts it [42]. (The number in brackets refers to the number of
the book quoted in the Bibliography at the end of this book.) That is the art of it. There is no
systematic approach to maths problems. And that's why it is interesting; if we do the same thing
over and over again, we get bored. In mathematics, you won't get bored.
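If you want to play with this result on a computer, here is a minimal Julia sketch (the function names are my own and are not taken from the book's appendix code) that compares the brute-force sum with the closed formula:

```julia
# Brute-force sum of the first n natural numbers
triangular_sum(n) = sum(1:n)

# The closed-form formula S(n) = n(n+1)/2
triangular_formula(n) = n * (n + 1) ÷ 2

# The two agree for every n we test
for n in (1, 2, 3, 10, 100, 1_000_000)
    @assert triangular_sum(n) == triangular_formula(n)
end
println("formula verified for the tested values of n")
```

Of course, checking a few values of n is evidence, not a proof; the proof is the reverse-order argument above.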
Why was there a pattern in the sum of the first n natural numbers? Are mathematicians the
only people that look for patterns? Philip Ball in his fascinating book Patterns in Nature answers
these questions well. He wrote:
The world is a confusing and turbulent place, but we make sense of it by finding order.
We notice the regular cycles of day and night, the waxing and waning of the moon
and tides, and the recurrence of the seasons. We look for similarity, predictability,
regularity: those have always been the guiding principles behind the emergence of
science. We try to break down the complex profusion of nature into simple rules,
to find order among what might at first look like chaos. This makes us all pattern
seekers.


All high school students know that in mathematics we have different territories: algebra,
geometry, analysis, combinatorics, probability and so on. What they usually do not know is that
there is a connection between different branches of mathematics. Quite often it is a connection that
we least expect. To illustrate the idea, let us play with circles and see what we can get. Here
is the game and the question: roll a circle with a marked point around another circle of the
same radius, this point traces a curve. What is the shape of this curve? In Fig. 1.1a we roll the
dashed circle around the black circle, and we follow point P and this point traces out a beautiful
heart-shaped curve, which is called a cardioid. This beautiful heart-shaped curve shows up in
some of the most unexpected places.
Got your coffee? Turn on the flashlight feature of your phone and shine the light into the cup
from the side. The light reflects off the sides of the cup and forms a caustic on the surface of the
coffee. This caustic is a cardioid (Fig. 1.1b). Super interesting, isn’t it ?

Figure 1.1: The cardioid occurs in geometry and in your coffee: (a) P traces out a cardioid; (b) the cardioid caustic on the surface of the coffee.

So far, the cardioid appears in geometry and in real life. Where else? How about the times table?
We all know that 2 × 1 = 2, 2 × 2 = 4, 2 × 3 = 6 and so on. Let's describe this geometrically
and a cardioid will show up! Begin with a circle (of any radius) and mark a certain number
(designated symbolically by N) of evenly spaced points around the circle, and number them
consecutively starting from zero: 0, 1, 2, ..., N − 1. Then for each n, draw a line between points
n and 2n mod N. For example, for N = 10, connect 1 to 2, 2 to 4, 3 to 6, 4 to 8, 5 to 0 (this is
similar to a clock: after 12 hours the hour hand returns to where it was pointing), 6 to 2, 7 to
4, 8 to 6, 9 to 8. Fig. 1.2 shows the results for N = 10, 20, 200, respectively. The envelope of these
lines is a cardioid, clearly visible for large N. (For more detail on the cardioid, check https://divisbyzero.com/2018/04/02/i-heart-cardioids/.)
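If you want to generate the chords of Fig. 1.2 yourself, the following Julia sketch (the helper name `times_table_chords` is my own; the book's own graphics are done with processing, see the appendix folders) computes the endpoints of each line n → 2n mod N on the unit circle, ready to be drawn with any plotting package:

```julia
# Endpoints of the "times-two" chords on the unit circle:
# point k sits at angle 2πk/N; we connect point n to point 2n mod N.
function times_table_chords(N)
    point(k) = (cos(2π * k / N), sin(2π * k / N))
    return [(point(n), point(mod(2n, N))) for n in 0:N-1]
end

chords = times_table_chords(200)   # for large N the envelope is a cardioid
println(chords[1:3])               # a quick look at the first few segments
```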
Let's enjoy another unexpected connection in mathematics. The five most important numbers
in mathematics are 0 and 1 (which are foundations of arithmetic); π = 3.14159..., which is the
most important number in geometry; e = 2.71828..., which is the most important number in
calculus; and the imaginary number i, with i² = −1. And they are connected via the following
simple relation:

e^{iπ} + 1 = 0

which is known as Euler's equation and it is the most beautiful equation in mathematics! Why
is an equation considered beautiful? Because the pursuit of beauty in pure mathematics is a

Figure 1.2: A cardioid emerges from the times table of two when N is big: (a) N = 10; (b) N = 20; (c) N = 200.

tenet. Neuroscientists in Great Britain discovered that the same part of the brain that is activated
by art and music was activated in the brains of mathematicians when they looked at math they
regarded as beautiful.
I think these unexpected connections are sufficient for many people to spend time playing
with mathematics. People who do mathematics just for fun are called pure mathematicians. To
get an insight into the mind of a working pure mathematician, there is probably no book better
than Hardy’s essay A Mathematician’s Apology. In this essay Hardy offers a defense of the
pursuit of mathematics. Central to Hardy’s "apology" is an argument that mathematics has value
independent of possible applications. He located this value in the beauty of mathematics.
Below is a mathematical joke that reflects well on how mathematicians think of their field:

Philosophy is a game with objectives and no rules. Mathematics is a game with


rules and no objectives.

But, if you are pragmatic, you will only learn something if it is useful. Mathematics is super
useful. With it, physicists unveil the secrets of our universe; engineers build incredible machines
and structures; biologists study the geometry, topology and other physical characteristics of DNA,
proteins and cellular structures. The list goes on. People who do mathematics with applications
in mind are called applied mathematicians.
And a final note on the usefulness of mathematics. In the 1800s, mathematicians worked on
wave equations for fun. And in 1864, James Clerk Maxwell, a Scottish physicist, used them to
predict the existence of electrical waves. In 1888, Heinrich Rudolf Hertz, a German physicist,
confirmed Maxwell's predictions experimentally, and in 1896, Guglielmo Giovanni Marconi, an
Italian electrical engineer, made the first radio transmission.
Is the above story of radio wave unique? Of course not. We can cite the story of differential
geometry (a mathematical discipline that uses the techniques of differential calculus, integral
calculus, linear algebra and multilinear algebra to study problems in geometry) by the German
mathematician Georg Friedrich Bernhard Riemann in the 19th century, which was used later by
the German-born theoretical physicist Albert Einstein in the 20th century to develop his general


relativity theory. And the Greeks studied the ellipse more than a millennium before Kepler used
their ideas to predict planetary motions.
The Italian physicist, mathematician, astronomer, and philosopher Galileo Galilei once wrote:

Philosophy [nature] is written in that great book which ever is before our eyes – I
mean the universe – but we cannot understand it if we do not first learn the language
and grasp the symbols in which it is written. The book is written in mathematical
language, and the symbols are triangles, circles and other geometrical figures, without
whose help it is impossible to comprehend a single word of it; without which
one wanders in vain through a dark labyrinth.

And if you think mathematics is dry, I hope that Fig. 1.3 will change your mind. These
images are Newton fractals obtained from considering the equation of one single complex
variable f(z) = z⁴ − 1 = 0. There are four roots corresponding to four colors in the images. A
grid of 200 × 200 points on a complex plane is used as initial guesses in the Newton method of
finding the solutions to f(z) = 0. The points are colored according to the color of the root they
converge to. Refer to Section 4.5.4 for detail.
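A minimal Julia sketch of this computation follows; the grid bounds [−2, 2] × [−2, 2], the fixed 50 iterations, and all names are my own choices, not necessarily those used to produce Fig. 1.3:

```julia
# Newton's method for f(z) = z^4 - 1 on a grid of complex starting points.
f(z)  = z^4 - 1
df(z) = 4z^3
roots = [1.0 + 0im, -1.0 + 0im, 0.0 + 1im, 0.0 - 1im]

function basin(z; maxiter = 50)
    for _ in 1:maxiter
        z -= f(z) / df(z)                       # one Newton step
    end
    return argmin([abs(z - r) for r in roots])  # index (1..4) of the nearest root
end

# 200 x 200 grid; 'colors' holds the root index of every starting point
xs = range(-2, 2, length = 200)
ys = range(-2, 2, length = 200)
colors = [basin(complex(x, y)) for y in ys, x in xs]
```

Colouring each pixel by its entry in `colors` reproduces the kind of picture shown in Fig. 1.3.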

Figure 1.3: Newton fractals for z⁴ − 1 = 0: (a) un-zoomed image; (b) zoomed-in image.

And who said mathematicians are boring? Please look at Fig. 1.4. And Fig. 1.5, where we
start with an equilateral triangle, subdivide it into four smaller congruent equilateral triangles
and remove the central triangle, and then repeat this step with each of the remaining smaller triangles
infinitely. What we obtain are Sierpiński triangles.
Let's now play the "chaos game" and we shall meet Sierpiński triangles again. The process is
simple: (1) draw an equilateral triangle on a piece of paper and draw a random initial point, (2)
draw the next point midway to one of the vertices of the triangle, chosen randomly, (3) repeat
step 2 ad infinitum. What is amazing is that when the number of points is large, a pattern emerges,
and it is nothing but Sierpiński triangles (Fig. 1.6)! If you are interested in making these stunning
images (and those in Fig. 1.7), check Appendix A.11; a small sketch is also given below.
(The Polish mathematician Wacław Sierpiński (1882–1969) described the Sierpiński triangle in 1915, but similar patterns already appeared in the 13th-century Cosmati mosaics in the cathedral of Anagni, Italy.)
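The book's own animation of the chaos game uses processing; the sketch below is a plain Julia version (with names of my choosing) that only generates the point cloud, which any scatter plot will turn into Fig. 1.6:

```julia
# The chaos game: repeatedly jump halfway towards a randomly chosen vertex
# of an equilateral triangle; the visited points fill a Sierpiński triangle.
function chaos_game(npoints)
    vertices = [(cos(θ), sin(θ)) for θ in (π/2, π/2 + 2π/3, π/2 + 4π/3)]
    points = Vector{NTuple{2,Float64}}(undef, npoints)
    p = (0.1, 0.1)                                  # arbitrary starting point
    for i in 1:npoints
        v = rand(vertices)                          # pick a vertex at random
        p = ((p[1] + v[1]) / 2, (p[2] + v[2]) / 2)  # move halfway towards it
        points[i] = p
    end
    return points
end

pts = chaos_game(50_000)   # scatter-plot these points and the fractal appears
```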


Figure 1.4: I love you by mathematicians: (a) y = 1000x; (b) (x(t), y(t)); (c) y = x⁶/16. The parametric equation for the heart in (b) is: x(t) = 16 sin³(t), y(t) = 13 cos(t) − 5 cos(2t) − 2 cos(3t) − cos(4t).

Figure 1.5: Sierpiński triangles.

Figure 1.6: Chaos game and Sierpiński triangles. Processing source: check folder chaos_game_pde in my github account mentioned in the preface.

To know what mathematics is, there is no better way than to see how mathematicians think
and act. And for that I think mathematical jokes are one good way. Mathematicians Andrej and
Elena Cherkaev from the University of Utah have provided a collection of these jokes at
Mathematical humor, and I use the following one

An engineer, a physicist and a mathematician are staying in a hotel. The engineer


wakes up and smells smoke. He goes out into the hallway and sees a fire, so he
fills a trash can from his room with water and douses the fire. He goes back to


Figure 1.7: Chaos game, pentagon and fractals. Processing source: check folder chaos_game_2.

bed. Later, the physicist wakes up and smells smoke. He opens his door and sees a
fire in the hallway. He walks down the hall to a fire hose and after calculating the
flame velocity, distance, water pressure, trajectory, etc. extinguishes the fire with the
minimum amount of water and energy needed. Later, the mathematician wakes up
and smells smoke. He goes to the hall, sees the fire and then the fire hose. He thinks
for a moment and then exclaims, "Ah, a solution exists!" and then goes back to bed.
to demonstrate that sometimes showing that something exists is just as important as finding it.
With just pen and paper and reasoning, mathematics can help us uncover hidden secrets of
many, many things, from giant objects such as planets to minuscule objects such as bacteria and
everything in between. Let's study this fascinating language; the language of our universe.
Hey, but what if someone does not want to become an engineer or scientist, does he/she still
have to learn mathematics? I believe he/she should, for the following reasons. According
to the Greek, mathematics is learning, and according to the Hebrew, it is thinking. So learning mathematics
is to learn how to think, how to reason, logically. René Descartes once said "I think, therefore I am".
Before delving into the world of mathematics, we first need to get familiar with some common
terminologies: terms such as axioms, theorems, definitions and proofs. The next section is
for those topics.

1.2 Axiom, definition, theorem and proof


Axioms are assumptions that all agree to be true. No proof is needed for axioms as they are
the rules of the game. Actually we cannot prove axioms and that is why we have to accept
them. Then come definitions. A definition is to define a word. For example, to define an even
function, mathematicians write: 'A function f(x): ℝ → ℝ is an even function if for any
x ∈ ℝ: f(−x) = f(x)'. Mathematicians define something when they need it for their work.
The following joke is a good example about this:
One day a farmer called up an engineer, a physicist, and a mathematician and
asked them to fence off the largest possible area with the least amount of fence. The


engineer made the fence in a circle and proclaimed that he had the most efficient
design. The physicist made a long, straight line and proclaimed "We can assume the
length is infinite..." and pointed out that fencing off half of the Earth was certainly
a more efficient way to do it. The Mathematician just laughed at them. He built a
tiny fence around himself and said "I declare myself to be on the outside."

Unlike scientists and engineers, who study real things in our real world and are therefore
restricted by the laws of nature, mathematicians study objects such as numbers and functions,
which live in a mathematical world. Thus, mathematicians have more freedom.
Next come theorems. A theorem is a statement about properties of one or more objects.
One can have this theorem regarding even functions: 'If f(x) is an even function, then its
derivative is an odd function'. We need to provide a mathematical proof for a mathematical
statement to become a theorem.
The word "proof" comes from the Latin probare (to test). The development of mathematical
proof is primarily the product of ancient Greek mathematics, and one of its greatest achievements.
Thales and Hippocrates of Chios gave some of the first known proofs of theorems in geometry.
Mathematical proof was revolutionized by Euclid (300 BCE), who introduced the axiomatic
method still in use today. Starting with axioms, the method proves theorems using deductive
logic: if A is true, and A implies B, then B is true. Or “All men smoke weed; Sherlock Holmes
is a man; therefore, Sherlock Holmes smokes weed”.
As a demonstration of mathematical proofs, let's consider the following problem. Given
a ≥ b ≥ c ≥ 0 and a + b + c ≤ 1, prove that a² + 3b² + 5c² ≤ 1.

Proof. We first rewrite the term a² + 3b² + 5c² as (why? how do we know to do this step?)

a² + 3b² + 5c² = a² + b² + c² + 2b² + 2c² + 2c²

Then using the data that a ≥ b ≥ c ≥ 0, we know that 2b² = 2b·b ≤ 2ab, and similarly 2c² ≤ 2ca and 2c² ≤ 2cb; thus

a² + 3b² + 5c² ≤ a² + b² + c² + 2ab + 2ca + 2cb

Now, we recognize that the RHS is nothing but (a + b + c)² because of the well known identity
(a + b + c)² = a² + b² + c² + 2ab + 2bc + 2ca. Thus, we have

a² + 3b² + 5c² ≤ (a + b + c)²

And if we combine this with the data that a + b + c ≤ 1, we have proved the problem. ∎
(Common Era (CE) and Before the Common Era (BCE) are alternatives to the Anno Domini (AD) and Before Christ (BC) notations used by the Christian monk Dionysius Exiguus in 525. The two notation systems are numerically equivalent: "2022 CE" and "AD 2022" each describe the current year; "400 BCE" and "400 BC" are the same year.)
(The expression on the right side of the "=" sign is the right-hand side of the equation and the expression on the left of the "=" is the left-hand side. For example, in x + 5 = y + 8, x + 5 is the left-hand side (LHS) and y + 8 is the right-hand side (RHS).)


To indicate the end of a proof several symbolic conventions exist. While some authors
still use the classical abbreviation Q.E.D., which is an initialism of the Latin phrase quod erat
demonstrandum, meaning "which was to be demonstrated", it is relatively uncommon in modern
mathematical texts. Paul Halmos pioneered the use of a solid black square at the end of a proof
as a Q.E.D symbol, a practice which has become standard (and followed in this text), although
not universal.
The proof is simple because this is a problem for grade 7/8 students. But how about a proof
with shapes? See Fig. 1.8 for such a geometry-based proof. Terms like a² should be seen as the
area of a square of side a. Inside the big square of side a + b + c, of which the area is smaller
than or equal to 1, we have many smaller squares including one a², three b² and five c², and it is
quite obvious that we have the inequality a² + 3b² + 5c² ≤ 1. Essentially, this geometry-based
proof is similar to the previous proof, but everyone would agree it is easier to understand. I
recommend the book Proofs without words by Roger Nelsen [53] for such elegant proofs.

Figure 1.8: A proof of an algebra problem using geometry.

I present another problem. Let's take the case of a triangle inside a semicircle. If we play
with it long enough, we will see one remarkable thing: no matter
Mathematics is full of surprises like this. But is it true? We need a proof to answer the question
How do we know it? In the same figure, I present a proof commonly given in high school
geometry classes. A complete proof would be more verbose than what I present here. Does it
exist a better proof? See Fig. 1.9b: ACBC 0 is a rectangle and thus ABC is a right triangle! This
geometric proof is said to be elegant because it allows us to see why the theorem is true.

Figure 1.9: The angle inscribed in a semicircle is always a right angle (90°): two proofs. (a) In triangle ABC, 2(α + β) = π: the angles of the same marking are equal, and thus we have 2(α + β) = π, or α + β = π/2. The key to the proof is to draw the line OC, which does not initially exist in the problem. (b) ACBC′ is a rectangle: rotate ABC by 180° counterclockwise and we get a new triangle ABC′. Alternatively, draw OC (as in the first proof) but extend it to meet the circle at C′ (to have symmetry). Now the box ACBC′ is a parallelogram. But it is a special parallelogram: the two diagonals are equal (both are diameters of the circle). So ACBC′ must be a rectangle. The second proof is considered to be better than the first because we can "see" the result.


Not all proofs are as simple as the above ones. For example, in number theory, Fermat's Last
Theorem states that no three positive integers a, b, and c satisfy the equation aⁿ + bⁿ = cⁿ
for any integer value of n greater than 2. This theorem was first stated as a theorem by Pierre
de Fermat around 1637 in the margin of a copy of Arithmetica; Fermat added that he had a
proof that was too large to fit in the margin. After 358 years of effort by a countless number of
mathematicians, the first successful proof was released only very recently, in 1994, by Andrew
Wiles (born 1953), an English mathematician. Wiles' proof is 192 pages long.
Proofs are what separate mathematics from all other sciences. In other sciences, we accept
certain laws because they conform to the real physical world, but those laws can be modified if
new evidence presents itself. One famous example is Newton’s theory of gravity was replaced
by Einstein’s theory of general relativity. But in mathematics, if a statement is proved to be true,
then it is true forever. For instance, Euclid proved, over two thousand years ago, that there are
infinitely many prime numbers (a prime number is a number that can only be divided by itself
and 1 without remainders), and there is nothing that we can do that will ever contradict the truth
of that statement.
It is impossible to live in our society without taking some body of knowledge on authority
(i.e., without asking why). No one has the energy and capacity to check everything. But in
mathematics, the “why” question is asked at every stage, with the expectation of a clear and
indisputable answer via proofs. So, in mathematics we have only black and white. Not entirely
true...
In mathematics, a conjecture is a conclusion or a proposition which is suspected to be true
due to preliminary supporting evidence, but for which no proof or disproof has yet been found.
For example, on 7 June 1742, the German mathematician Christian Goldbach wrote a letter to
Leonhard Euler in which he proposed the following conjecture: every even integer greater than 2 can
be written as the sum of two primes. The guess sounds true: 8 = 5 + 3, 24 = 19 + 5, 64 = 23 + 41,
etc. Furthermore, no one has yet found an even number for which this statement does not
work out. But no one so far has proven that exceptions cannot exist. Is the pattern real or
imaginary? If it is real, why should numbers have this property? Such mysteries are what motivate
mathematicians. This became Goldbach’s conjecture and is one of the oldest and best-known
unsolved problems in number theory and all of mathematics.
If we can find one counterexample, just one, to a conjecture then we disprove it. To illustrate
this point, I present the following example taken from the interesting book Proofs: A Long-Form
Mathematics Textbook by Jay Cummings [16]. The problem is: suppose we are investigating
how many regions are formed if one places n dots randomly on a circle and then connects them
with lines. We start with n = 2, 3, 4, 5 and count the regions. The results given in Fig. 1.10
suggest that the number of regions could be 2^(n−1), which forms our conjecture. It turns out this
conjecture is, however, wrong. When we put 6 points on a circle and count the number of
regions, we find that there are only 31 regions, not 32 as suggested by 2^(n−1); thus disproving the
conjecture.
This was an easy situation because we could find a counterexample for such a small value
of n. Things become much harder when the conjecture is correct for all values of n that we can
verify but we just don’t know how to prove it. That is the situation with Goldbach’s conjecture.
This example also demonstrates why mathematicians are proof-obsessed. We cannot simply


Figure 1.10: A circle with n random points on it; every two points are connected by a line. These lines make regions (space bounded by closed curves): n = 2, 3, 4, 5 give 2, 4, 8, 16 regions, respectively.

believe in something just because it looks correct or because it holds in certain cases. We need a
proof.
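The counts 1, 2, 4, 8, 16, 31, 57, ... can be checked with a short Julia script. It uses the known closed-form count C(n,4) + C(n,2) + 1 for points in general position (a formula that is not derived in this chapter; it is quoted here only to show how quickly the pattern breaks away from 2^(n−1)):

```julia
# Number of regions formed by the chords between n points in general position
# on a circle: binomial(n,4) + binomial(n,2) + 1 (Moser's circle problem).
regions(n) = binomial(n, 4) + binomial(n, 2) + 1

for n in 1:8
    println("n = $n: regions = $(regions(n)), 2^(n-1) = $(2^(n-1))")
end
# The two columns agree up to n = 5 and then diverge: 31 ≠ 32 at n = 6.
```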
In addition to the above-mentioned terminologies, we also have proposition, lemma and
corollary. A proposition is a less important but nonetheless interesting true statement. A lemma
is a true statement used in proving other true statements (that is, a less important theorem that is
helpful in the proof of other more important theorems). And finally, a corollary denotes a true
statement that is a simple deduction from a theorem or proposition. For example, the sum of
the interior angles of any triangle is always 180 degrees. One corollary to that theorem: Each
interior angle of an equilateral triangle is 60 degrees.

1.3 Exercises versus problems


It is vital to differentiate exercises and problems, as we should not spend too much time on
the former. It is the latter that is fun and usually leads to interesting things. Roughly speaking,
exercises are problems or questions for which you know the procedure to solve them (even though
you might not be able to finish it). For example, a typical exercise is using the formula
x = (−b ± √(b² − 4ac))/(2a)
to solve many quadratic equations, e.g. x² − 2x + 1 = 0, x² + 5x − 10 = 0. We think that you
should solve only three quadratic equations: (i) (x − 1)(x − 2) = 0 (two real solutions), (ii)
(x − 1)(x − 1) = 0 (two repeated solutions) and (iii) x² + 1 = 0 (no real solutions), and plot their
graphs. That's it. Solving fifty more quadratic equations would not bring you any joy
or anything useful.
On the other hand, problems are questions for which we do not know beforehand the procedure
to solve them. There are two types of mathematics questions: ones with known solutions and ones
with unknown solutions (which may or may not exist). The latter problems arise in mathematical research. Herein, we focus on
the former, problems with known solutions that we do not know yet. For example, consider
this problem of finding the roots of the following fourth-order equation:

x⁴ − 3x³ + 4x² − 3x + 1 = 0        (1.3.1)

How can we solve this equation? There is no formula for x. After many attempts, we have found
that dividing this equation by x² is a correct direction (actually this was used by Lagrange some
200 years ago):

(x² + 1/x²) − 3(x + 1/x) + 4 = 0        (1.3.2)


Due to symmetry, we do a change of variable u = x + 1/x, and thus we obtain

u² − 3u + 2 = 0  ⟹  u = 1, u = 2        (1.3.3)

If we allow only real solutions, then with u = 2, we have x + 1/x = 2, which gives x = 1.
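As a quick sanity check (a sketch in base Julia, with no claim that this is how the book's appendix code does it), we can confirm that x = 1 solves Eq. (1.3.1) and that the other case, u = 1, i.e. x² − x + 1 = 0, only has complex roots:

```julia
# The quartic of Eq. (1.3.1)
p(x) = x^4 - 3x^3 + 4x^2 - 3x + 1

@assert p(1) == 0                        # the real root found via u = 2

# u = 1 corresponds to x + 1/x = 1, i.e. x^2 - x + 1 = 0, whose roots are complex:
x1 = (1 + sqrt(3) * im) / 2
x2 = (1 - sqrt(3) * im) / 2
println(abs(p(x1)), "  ", abs(p(x2)))    # both ≈ 0 up to round-off
```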

1.4 Problem solving strategies


Whenever we have solved (successfully) a math problem, should we celebrate our success a
bit and move on to another problem as quickly as possible? Celebration yes, but moving on to
another problem, no. Instead we should perform the fourth step recommended by the Hungarian
mathematician George Pólya (1887 – 1985) in his celebrated book 'How to solve it' [55]. That
is looking back, by answering the following questions:

• Can we check the result? Substituting x = 1 into the LHS of Eq. (1.3.1) indeed yields zero;

• Can we guess the result? Can we solve it differently? We can, by trial and error, see that x = 1 is a solution and factor the LHS as (x − 1)(x³ − 2x² + 2x − 1), and proceed from there.

• Can we use the method for some other problem? Yes, we can use the same technique for equations of the form ax⁴ + bx³ + cx² + bx + a = 0.

This step of looking back is actually similar to reflection in our lives. We all know that once in a
while we should stop doing what we are supposed to do to think about what we have done.
Another useful strategy is to get familiar with the problem before solving it. For example,
consider these two simultaneous equations:

127x + 341y = 274
218x + 73y = 111        (1.4.1)

There is a routine method for solving such equations, which I do not bother you with here. What
I want to say here is that if we're asked to solve the following equations by hand, should we
just apply that routine method?

6 751x + 3 249y = 26 751
3 249x + 6 751y = 23 249        (1.4.2)

No, we leave that for computers. We're better. Let's spend time with the problem first, and we
see something special now:

6 751x + 3 249y = 26 751
3 249x + 6 751y = 23 249        (1.4.3)


We see a symmetry in the coefficients of the equations. This guides us to perform operations that
maintain this symmetry: if we sum the two equations we get x + y = ..., and if we subtract the
second from the first we get x − y = ... (we can do the inverse to get y − x = ...). Now, the
problem is very easy to solve.
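A short Julia check of this shortcut (a sketch of mine, using the numbers of Eq. (1.4.2)): the symmetric combinations give x + y and x − y directly, and the answer matches a routine linear solve.

```julia
# Eq. (1.4.2): exploit the symmetry of the coefficients
A = [6751.0 3249.0;
     3249.0 6751.0]
b = [26751.0, 23249.0]

s = sum(b) / sum(A[1, :])                  # x + y = 50000 / 10000 = 5
d = (b[1] - b[2]) / (A[1, 1] - A[2, 1])    # x - y = 3502 / 3502 = 1
x, y = (s + d) / 2, (s - d) / 2
println((x, y))                            # (3.0, 2.0)

@assert A \ b ≈ [x, y]                     # matches the routine (computer) solution
```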
As another example of exploiting the symmetry of a problem, consider this geometry problem:
a square is inscribed in a circle that is inscribed in a square. Find the ratio of the area of the
smaller square over that of the large square. We can introduce symbols to the problem and use
the Pythagorean theorem to solve this problem (Fig. 1.11a): that ratio is 1/2. But we can also
use symmetry: if we rotate (counterclockwise) the smaller square 45 degrees with respect to the
center of the circle, we get a new problem shown in Fig. 1.11b. And it is obvious that the ratio
that we're looking for is 1/2.


Figure 1.11: Using symmetry to solve problems.

For problem solving skills, I recommend reading Pólya's book and the book by Paul Zeitz [72].
The latter contains more examples at a higher level than Pólya's book. Another book is 'Solving
mathematical problems: a personal perspective' by the Australian-American mathematician
Terence Tao (born 1975). He is widely regarded as one of the greatest living mathematicians. If you
want to learn 'advanced' mathematics, his blog is worth checking.

1.5 Computing in mathematics


Herein I discuss the role of computers in learning mathematics and solving mathematical problems.
First, we can use computers to plot complex functions (Fig. 1.12). Second, a computer can
help us understand difficult mathematical concepts such as limits. To demonstrate this point, let's
consider the geometric series S = 1/2 + 1/4 + 1/8 + ⋯. Does this series converge (i.e., when
enough terms are used one gets the same result) and what is the sum? We can program a small
code shown in Fig. 1.13b to produce the table shown in Fig. 1.13a. The data shown in this table
clearly indicates that the geometric series does converge and its sum is 1.
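The actual program of Fig. 1.13b is not reproduced here, but a Julia sketch in the same spirit (with my own variable names) looks like this:

```julia
# Partial sums of the geometric series 1/2 + 1/4 + 1/8 + ...
function geometric_partial_sum(nterms)
    s = 0.0
    for k in 1:nterms
        s += 1 / 2^k
    end
    return s
end

for n in (5, 10, 20, 50)
    println("n = $n   S_n = $(geometric_partial_sum(n))")
end
# The partial sums approach 1, suggesting the series converges to 1.
```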
Third, when it comes to applied mathematics, computers are an invaluable tool. In applied
mathematics, problems are not solved exactly by hand, but approximately, using algorithms
which are tedious for hand calculations but suitable for computers. To illustrate what


Figure 1.12: Computers are used to plot complex functions: (a) z = x² + 10xy + y²; (b) the polar curve r = sin²(1.2θ) + cos³(6θ).

Figure 1.13: A small Julia program to compute the geometric series: (a) the output table; (b) the code.

applied mathematics is about, let's solve the equation f(x) = cos x − x = 0; i.e., find all
values of x such that f(x) = 0. Hey, there is no formula similar to x = (−b ± √(b² − 4ac))/(2a) for this equation.
That's why Newton developed a method to get approximate solutions. Starting from an initial
guess x₀, his method iteratively generates better approximations:

x_{n+1} = x_n + (cos x_n − x_n)/(1 + sin x_n)

With only four such calculations, we get x = 0.73908513, which is indeed the solution to
cos x − x = 0.
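Here is the iteration written out as a few lines of Julia (a sketch; the starting guess x0 = 1 is my arbitrary choice, and the function name is mine):

```julia
# Newton's method for f(x) = cos(x) - x
function newton_cos(x0; steps = 4)
    x = x0
    for n in 1:steps
        x += (cos(x) - x) / (1 + sin(x))   # one Newton update
        println("x_$n = $x")
    end
    return x
end

x = newton_cos(1.0)        # after four steps x ≈ 0.7390851332
println(cos(x) - x)        # essentially zero
```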
And finally, computers are used to build amazing animations to explain mathematics, see
for example this YouTube video. Among various open source tools to create such animations,


processing is an easy-to-use tool based on Java, a common programming language (processing is
available for free at https://processing.org). Fig. 1.6 was made using processing.
I have introduced two tools for programming, namely Julia and processing. This is be-
cause the latter is better suited for making animations while the former is for scientific computing.
For the role of computers in doing mathematics, I refer to the great book Mathematics by
Experiment: Plausible Reasoning in the 21st Century by Jonathan Borwein and David Bailey
[9].
But if you think that computers can replace mathematicians, you are wrong. Even for arithmetic
problems, computers are not better than humans. One example is the computation of a sum
like this (containing 10^12 terms)

S = 1/1 + 1/4 + 1/9 + ⋯ + 1/10^24

Even though a powerful computer can compute this sum by adding term by term, it takes
a long time (on my MacBook Pro, Julia crashed when computing this sum!). The result is
S = 1.6449340668482264 (this number is exactly π²/6; why π appears here is super interesting,
check this YouTube video for an explanation). Mathematicians developed smarter ways to compute this sum; for
example, this is how Euler computed this sum in the 18th century:

S ≈ 1/1 + 1/4 + 1/9 + 1/16 + 1/25 + 1/36 + 1/49 + 1/64 + 1/81 + 1/10 + 1/200 + 1/6000 − 1/(3 × 10^6)

a sum of only 13 terms, and got 1.644934064499874, a result which is correct up to eight decimals!
The story is that while solving the Basel problem (i.e., what is S = 1 + 1/4 + 1/9 + 1/16 +
⋯ + 1/k² + ⋯; Section 2.21.4), Euler discovered/developed the so-called Euler-Maclaurin
summation formula (Section 4.18).
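To see how effective Euler's trick is, a small Julia sketch of mine compares his 13-term approximation with a direct summation of a million terms (far fewer than the 10^12 terms above, yet still less accurate):

```julia
# Euler's 13-term approximation of sum_{k=1}^∞ 1/k^2 = π²/6
euler_13 = sum(1 / k^2 for k in 1:9) + 1/10 + 1/200 + 1/6000 - 1/3e6

direct_1e6 = sum(1 / k^2 for k in 1:10^6)   # one million terms, added directly

println("Euler (13 terms):    ", euler_13)
println("direct (10^6 terms): ", direct_1e6)
println("π²/6:                ", π^2 / 6)
```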
Computers can be valuable assistants, but only when a lot of human thought has gone into
setting up the computations.

1.6 Mathematical anxiety or math phobia


Mathematical anxiety, also known as math phobia, is anxiety about one's ability to do mathematics.
Math phobia, according to some researchers, is gained not from personal experience but from parents,
teachers and the textbooks they used. So, to the kids who think they have math phobia: don't
panic. That's not your fault.
To illustrate the problem of textbooks and teachers, I quote the prologue of the classic 1910
text named Calculus Made Easy by the eccentric British electrical engineer Silvanus Phillips
Thompson (1851 – 1916):


Considering how many fools can calculate, it is surprising that it should be thought
either a difficult or a tedious task for any other fool to learn how to master the same
tricks. Some calculus-tricks are quite easy. Some are enormously difficult. The fools
who write the textbooks of advanced mathematics — and they are mostly clever
fools — seldom take the trouble to show you how easy the easy calculations are. On
the contrary, they seem to desire to impress you with their tremendous cleverness
by going about it in the most difficult way. Being myself a remarkably stupid fellow,
I have had to unteach myself the difficulties, and now beg to present to my fellow
fools the parts that are not hard. Master these thoroughly, and the rest will follow.
What one fool can do, another can.
And Thompson’s view on textbooks was shared by Cornelius Lanczos (1893-1974), a
Hungarian-American mathematician and physicist, who wrote in the preface of his celebrated
book The Variational Principles of Mechanics these words:
Many of the scientific treatises of today are formulated in a half-mystical language,
as though to impress the reader with the uncomfortable feeling that he is in the
permanent presence of a superman. The present book is conceived in a humble
spirit and is written for humble people.
Paul Lockhart in A Mathematician’s Lament wrote the following words
THERE IS SURELY NO MORE RELIABLE WAY TO KILL enthusiasm and interest
in a subject than to make it a mandatory part of the school curriculum. Include it
as a major component of standardized testing and you virtually guarantee that the
education establishment will suck the life out of it. School boards do not understand
what math is; neither do educators, textbook authors, publishing companies, and,
sadly, neither do most of our math teachers.
Talking about teachers, the Nobel-winning physicist Richard Feynman once said "If you find
science boring, you are learning it from the wrong teacher". He implied that if you have a good
teacher you can learn any topic.
Let me get back to those kids who thought they fell behind the math curriculum. What should
you do? I have some tips for you. First, read A Mathematician's Lament by Paul Lockhart. After
you have finished that book, you will be confident that if you study maths properly you can
enjoy mathematics. Second, spend lots of time (I spent one summer when I fell behind in the
9th grade) to learn maths from scratch. Lockhart's other books (see Section 1.8.1) will surely
help. And this book (Chapters 1/2/3 and Appendices A/B) could be useful.
Is math ability genetic? Yes, to some degree. Essentially none of us could ever be as good at
math as Terence Tao, no matter how hard we tried or how well we were taught. But here's the
thing: We don't have to! For high-school and college math, inborn talent is much less important
than hard work, preparation, and self-confidence.
Ok. What one fool can do, another can. What a simple sentence, but it has a tremendous
impact on people who come across it. It has motivated many people to start learning calculus, including
Feynman. And we can start learning maths with it.
(If you're in the middle of a semester, then spend less time on other topics. You cannot have everything!)


1.7 Millennium Prize Problems


Is mathematics complete, as the way it is presented in textbooks suggests? On the contrary, far from it.
There are many mathematical problems whose solutions are still elusive to even the brightest
mathematicians on Earth. The most famous unsolved problems are the Millennium Problems.
They are a set of seven problems for which the Clay Mathematics Institute offered a US $ 7
million prize fund ($ 1 million per problem) to celebrate the new millennium in May 2000. The
problems all have significant impacts on their field of mathematics and beyond, and were all
unsolved at the time of the offering of the prize.
The Riemann hypothesis–posed by the German Bernhard Riemann in 1859 in his paper
"Ueber die Anzahl der Primzahlen unter einer gegebenen Grösse" (On the number of primes
less than a given magnitude)–is perhaps the most famous unsolved problem in mathematics. It
concerns the nontrivial zeroes of the Riemann zeta function, which is defined for Re s > 1 by
the infinite sum

ζ(s) = Σ_{n=1}^{∞} 1/n^s

This function has trivial zeroes (a zero is any s such that ζ(s) = 0) on the negative real line, at
s = −2, −4, −6, ... The location of its other zeroes is more mysterious; the conjecture is that

The nontrivial zeroes of the zeta function lie on the line Re s = 0.5

Yes, the problem statement is that simple, but its proof is elusive to all mathematicians to date.
In 1900 at the International Congress of Mathematicians in Paris, the German mathematician
David Hilbert gave a speech which is perhaps the most influential speech ever given to
mathematicians, given by a mathematician, or given about mathematics. In it, Hilbert outlined 23
major mathematical problems to be studied in the coming century. And the Riemann hypothesis
was one of them. Hilbert once remarked:

If I were to awaken after having slept for a thousand years, my first question would
be: Has the Riemann hypothesis been proven?

Judging by the current rate of progress (on solving the hypothesis), Hilbert may well have to
sleep a little while longer.
It is usually while solving unsolved mathematical problems that mathematicians discover
new mathematics. The new maths also helps us understand the old maths and provides better
solutions to old problems. Some new maths is also discovered by scientists, especially physicists,
while they are trying to unravel the mysteries of our universe. Then, after about 100 or 200 years,
some of the new maths comes into the mathematics curriculum to train the general public. And
the educators, whoever they are, hope that our kids can understand these maths–the mathematics
that was once only within the grasp of a few of the greatest mathematicians!


1.8 How to learn mathematics


High school students should probably start this book with this section. Herein I provide some
reading materials (Section 1.8.1) and some tips on how to learn maths (Section 1.8.2).

1.8.1 Reading materials


When you’re solving problems, working through textbooks, getting into the nitty-gritty details
of each topic, it’s so easy to lose the forest for the trees and forget why you even became inspired
to study the topic that you're learning in the first place. If you read only the textbooks, you will
find the subject dull. Textbooks on mathematics are written for people who already possess a
strong desire to study mathematics: they are not written to create such a desire. Do not begin
by reading the subject. Instead, begin by reading around the subject. This is where really, really
good (and non-speculative) books on that topic come in handy: they inspire, they encourage, and
they help you understand the big picture. For mathematics and physics, the following are among
the best (at least to me):

• A Mathematician's Lament by Paul Lockhart;
• Measurement by Paul Lockhart;
• The Joy of x by Steven Strogatz;
• The Feynman Lectures on Physics;
• An Imaginary Tale by Paul Nahin;
• Character of Physical Law by Richard Feynman;
• Evolution of Physics by Einstein and Infeld;
• Letters to a young mathematician by Ian Stewart

In A Mathematician’s Lament Paul Lockhart describes how maths is incorrectly taught in schools
and he provides better ways to teach maths. He continues in Measurement by showing us how we
should learn maths by 're-discovering maths' for ourselves. Of course, what Paul suggested works
only for self-studying people. What if you are a high school student? There are two possibilities.
First, if you fortunately have a great teacher, then just stick with her/him. Second, if you do not
have such luck, you can ignore her/him and self-study maths at your own pace. Do not forget
that marks are not important for deep understanding. Having said that, marks are vital for getting
scholarships, sadly.
The Joy of x by Steven Strogatz belongs to a family of maths books that aim to popularize
mathematics. In this family you can also find equally interesting books such as Journey through
Genius by William Dunham, or 17 equations that changed the world by Ian Stewart etc. It is
beneficial at a young age to read these books to realize that mathematics is not a dry, boring
topic. On the contrary, it is interesting. Similarly, An Imaginary Tale: The story of square root of


−1 by Paul Nahin is a popular maths book which tells the fascinating story of √−1. In the book,
I have referred to many other popular math books (see the Reference list).
The Feynman Lectures on PhysicsŽ by the Nobel-winning physicist Richard Feynman is
probably the best way to learn college-level mathematics by studying physics. Bill Gates once said
'Feynman is the best teacher I never had'. In these lectures Feynman beautifully introduces
various physics topics and the mathematics required to describe them. He also describes how
physicists think about problems. Another reason to read these lectures is that it is good to
read books at a level higher than your knowledge: the Feynman lectures were written for Caltech
(California Institute of Technology) undergraduates.
Evolution of Physics by the greatest physicist Einstein teaches us how to imagine. Through
imaginary thought experiments the book explains the basic concepts of physics. It is definitely a
must read for all students who want to learn physics.
And if you want to become a professional mathematician, read Letters to a young mathemati-
cian by Ian Stewart [61]. Ian Stewart (born 1945) is a British mathematician who is best-known
for engaging the public with mathematics and science through his many bestselling books, news-
paper and magazine articles, and radio and television appearances.
And don’t forget to read the history of mathematics. Here are some books on this topic:
 A history of Mathematics: an introduction by Victor Katz [33];
 A short account of the history of mathematics by W. W. Rouse Ball [6];
 Mathematics and Its History by John Stillwell [66];
 Men of Mathematics: The Lives and Achievements of the Great Mathematicians from
Zeno to Poincaré by E. T. Bell [7];
If you prefer watching the history of maths unfold, the BBC Four series The Story of Maths is excellent.
You can find it on YouTube.

How should we read a mathematics textbook? Of course the first thing to notice is that we
cannot read a math book like reading a novel. The second thing is that we should not read it
page-by-page, word-by-word from the beginning to the end in one go. The third thing is that
maths textbooks are usually many times longer than necessary because they have to include a
lot of exercises (at the end of each section or chapter). Why so? Mostly to please the publishers
who aim for financial targets not educational ones! As discussed in Section 1.3, it is better to
spend time solving problems rather than exercises. It is certain that we first still have to do a few
exercises to understand a concept/method. But that’s it.
Here is one suggestion on how we should read a math book (based on many recommendations
that I have collected from various sources). It is clear that something that works for one person
might not work for others, but it can be a start:

 1st read: skim through a section/chapter first. The idea is to see the forest, not the trees.
Knowing all the trees in the first go would be too much;
Ž
The lectures are freely available at https://www.feynmanlectures.caltech.edu.


 2nd read: read slowly (with paper/pencil) to get to know the trees; focus on the motivation,
the definition, the theorem;

 3rd read: read around; read the history of the concept;

 4th read: pay attention to the proofs; study them carefully and reproduce a proof for
yourself.

1.8.2 Learning tips


One way to learn a lot of mathematics is by reading the first chapters of many books
(Paul Halmos)

It is not a surprise that many of us have studied many topics naturally i.e., without understand-
ing how the brain works. We can compensate for that lack of knowledge by reading Learning
How to Learn by Barbara Oakley and Terry Sejnowski. I do not repeat their advice here, because
they're the experts and I am not. Instead, I provide my own tips that I have learned and developed
over the years (I do not claim they are the best practices; I just feel that I should share what I
think is useful; I wish I had known them when I was in school):

 If you have a bad teacher, simply ignore his/her class. There are excellent math teachers
online. Learn from them instead. You can listen to the story of Steven Strogatz at https:
//www.youtube.com/watch?v=SUMLKweFAYk to see how a teacher can change your love
for mathematics and then your life;

 If you have questions (any) on maths, you can post them to https://math.
stackexchange.com and get answers;

 The best way to learn is to teach. If you do not have such an opportunity, you can write about
what you know, similar to this note. Or you can write a blog on maths. Writing is one of
the best ways to consolidate your understanding of what you have learned (not only maths)ŽŽ .
You might wonder 'but writing is time consuming'. That is not true if you write just one
page per day and do so consistently every day;

 LATEX is the best tool (as of now) for writing mathematics. So it is not a bad idea to learn
it and use it (for Mathematics Stack Exchange you have to use LATEX anyway). This book
was typeset using LATEX; If you do not know where to start with LATEX, check this youtube
video out;

 While learning maths, it is a good habit to keep in mind that mathematics is about ideas
not formula or numbers. So, first you should be able to express the idea in your own
ŽŽ
As Dick Guindon once said Writing is nature’s way of letting you know how sloppy your thinking is.


speaking language. Then, translate that to the language of maths. For example, here is the idea
of convergence of a sequence expressed in both English and mathematics (a small numerical
illustration in Julia is given right after this list):

∀ε > 0 ∃N ∈ ℕ such that ∀n > N: |a_n − a| < ε

In words: however small ε is, there is a point N in the sequence such that beyond that point all
the terms a_n are within ε of a.

 Just like learning any spoken language, to speak the language of maths you have to
study its vocabulary. You should get familiar with symbols like ε, δ, ∀ etc.;

 And as Euclid told Ptolemy 1st Soter, the first king of Egypt after the death of Alexander
the Great ‘there is no royal road to geometry’, you have to do mathematics. Just as to
enjoy swimming you have to jump into the water, by just watching others swimming you
will never understand the excitement;

 Knowing the name of something doesn't mean you understand itŽŽ . There is a way to
test whether you understand something or only know the name/definition. It’s called the
Feynman Technique, and it works like this: “Without using the new word which you have
just learned, try to rephrase what you have just learned in your own language.”;

 As no single book can cover everything about a topic, it is better to have
a couple of good books on each topic;

 Read mathematics books very slowly; do not lose the forest for the trees. Study the defini-
tions carefully, why we need them. Then, play with the definitions to see what properties
they might possess. Only then study the theorems. And finally the proofs. If you just want
to be a scientist or engineer, then focus less on the proofs;

 Study the history of mathematics. Not only does it tell you interesting stories, it also reveals
that great mathematicians are also human: they had to struggle, and they failed many times
before succeeding in developing a sound mathematical idea;

 If you fall behind in maths, physics, chemistry (I used to in 8th grade), just focus on
improving your maths. Being better at math, you will do fine with physics and chemistry.
Remember that math is the language God talks;

 It is impossible to understand algebra if you have not mastered arithmetic. It is impossible


to understand calculus if you have not mastered algebra. So, do not rush, it is important to
go back to an early stage;

 Maths is a huge subject and it is impossible to be good at everything in maths. If you’re


not good at geometry it does not mean that you're not good at maths. You might not be
good at pure maths, but you can excel in applied maths;
ŽŽ
Feynman's father once told him "See that bird? It's a brown-throated thrush, but in Germany it's called a
halzenfugel, and in Chinese they call it a chung ling and even if you know all those names for it, you still know
nothing about the bird.”


 Be aware of focused vs diffuse mode of thinking. Check the book Learning How to Learn
for details. In short, diffuse mode is when your mind is relaxed and free. You’re thinking
about nothing in particular. You’re in diffuse mode when you’re walking in a park (without
a phone of course), having a bath etc. And usually it is when you're in a diffuse mode that
you find solutions to problems that you have been struggling to solve. And one of the best
ways to get into a diffuse mode is walking. It's not a coincidence that many of the finest
thinkers in history were enthusiastic walkers. An old example is Aristotle, the famous
Greek philosopher, empiricist, who conducted his lectures while walking the grounds of
his school in Athens;
 How long should you fight before giving up and looking at the solutions? We admit it is very
tempting to look at the solutions when we're stuck. But don't! The best is to play with
the exercises for a while (2 hours?ŽŽ ); if still no luck, then forget it and do something else.
Come back to it later and do the same thing. After one or two days, if you are still stuck, then look at
the solution, but only its first step; then solve the problem and do some self-reflection. If you have
to look at the entire solution, then make sure you can repeat all the steps by yourself later.
Only then is the material yours. Don't fool yourself by just looking at the solutions and
thinking that you understand the math. No! That is the illusion of competence–a mental situation
where you think you've mastered a set of material but you really haven't. We all watch
Messi scoring a goal from a free kick: he just puts the ball into the high left corner so that
the goalkeeper cannot reach it. But can we repeat that?;
 Do more self-reflection. What is the place that you learn most effectively? When is the
time you’re most productive? After solving every math question, ask questions like: why
the method works, why that answer, is the answer reasonable, does the method work if we
modify the question? Why could I not see the solution? Are there other ways to solve the
same problem? Only after having answered all these questions should you move to a new math
problem;
 Facing a math problem, you should do something: loosen up yourself, draw something,
write down something ... And in your head say that “I can solve it, I can solve it”. This is
called a growth mindset, a term coined by psychologist Dr. Carol Dweck of Stanford
University;
 To keep a sharp mind and body we do exercise. Similarly, your maths will get rusty if
you do not use it. I heard that Zdeněk Bažant–a Professor of Civil Engineering and

Archimedes has gone down in history as the guy who ran naked through the streets of Syracuse shouting
"Eureka!" — or "I have it!" in Greek. The story behind that event was that Archimedes was charged with proving
that a new crown made for Hieron, the king of Syracuse, was not pure gold as the goldsmith had claimed. Archimedes
thought long and hard but could not find a method for proving that the crown was not solid gold until he took a
bath.
ŽŽ
Of course how long before giving up is a personal decision. But I want to use Polya’s words about the pleasure
of finding something out for yourself: “A great discovery solves a great problem but there is a grain of discovery
in the solution of any problem. Your problem may be modest; but if it challenges your curiosity and brings into
play your inventive faculties, and if you solve it by your own means, you may experience the tension and enjoy the
triumph of discovery”.


Materials Science at Northwestern University–keeps solving a partial differential equation
every week! Note that he is not a mathematician, but he needs maths for his work;

 If you plan to become an engineer or scientist and you were not born with drawing abilities,
then practice drawing. Many figures in this book were drawn manually and this was
intentional as it is a good way for me to practice drawing;

 Finally I have collected some learning tips into a document which can be found here.
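As promised in the tip on translating ideas into the language of maths, here is a small numerical
illustration of the ε-N definition of convergence, written in Julia (my own sketch; the sequence
a_n = 1/n with limit 0 is just an assumed example, and the code uses only standard Julia):

# For the sequence a(n) = 1/n with limit 0: given a tolerance ε, the definition
# promises a point N beyond which every term stays within ε of the limit.
a(n) = 1 / n
for ε in (0.1, 0.01, 0.001)
    N = ceil(Int, 1 / ε)                # candidate N: 1/n < ε whenever n > 1/ε
    ok = all(abs(a(n) - 0) < ε for n in N+1:N+1000)
    println("ε = $ε, N = $N, next 1000 terms within ε? $ok")
end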

Feynman’s Epilogue. At the end of his famous physics course at Caltech, Feynman said the
following words, I quote

Well, I’ve been talking to you for two years and now I’m going to quit. In some ways
I would like to apologize, and other ways not. I hope—in fact, I know—that two or
three dozen of you have been able to follow everything with great excitement, and
have had a good time with it. But I also know that “the powers of instruction are of
very little efficacy except in those happy circumstances in which they are practically
superfluous.” So, for the two or three dozen who have understood everything, may I
say I have done nothing but shown you the things. For the others, if I have made you
hate the subject, I’m sorry. I never taught elementary physics before, and I apologize.
I just hope that I haven’t caused a serious trouble to you, and that you do not leave
this exciting business. I hope that someone else can teach it to you in a way that
doesn’t give you indigestion, and that you will find someday that, after all, it isn’t
as horrible as it looks.
Finally, may I add that the main purpose of my teaching has not been to prepare
you for some examination—it was not even to prepare you to serve industry or the
military. I wanted most to give you some appreciation of the wonderful world and
the physicist’s way of looking at it, which, I believe, is a major part of the true culture
of modern times. (There are probably professors of other subjects who would object,
but I believe that they are completely wrong.)

This was probably an ideal learning environment, one that cannot easily be repeated by other teachers.
What then is the solution? Self-studying! With a computer connected to the world wide web,
some good books (those books that I’ve used to write this note are good in my opinion), and
amazing free teachers (e.g. 3Blue1Brown, Mathologer, blackpenredpen, Dr. Trefor Bazett), you
can learn mathematics (or any topic) in a fun and productive way.

1.9 Organization of the book


Some wise person has said ‘writing is learning’. While this is not true for many authors, it cannot
be more true for me. I have written this note first for myself then for you–our readers (whoever
you are). The ultimate goal is to learn mathematics not for grades and examinations but for
having a better understanding of the subject and then of the physical world. This goal cannot be


achieved if we proceed with haste and shortcuts. So, we will go slowly, starting from the basic
concepts of numbers and arithmetic operations all the way up to college-level mathematics.
In Chapter 2 I discuss all kinds of numbers: natural numbers, integer numbers, rational
numbers, irrational numbers and complex numbers. The presentation is such that these concepts
emerge naturally. Next, arithmetic operations on these numbers are defined. Then, linear/quadrat-
ic/cubic equations are treated, with emphasis on cubic equations, which led to the discovery of
the imaginary number i = √−1. Inverse operations are properly introduced: addition/subtraction,
multiplication/division, exponential/logarithm.
Chapter 3 presents a concise summary of Euclidean geometry and trigonometry. Both
trigonometry functions such as sin x, cos x, tan x and cot x and inverse trigonometry func-
tions e.g. arcsin x are introduced. A comprehensive table of trigonometry identities is provided
with derivations of all identities. A few applications of this fascinating branch of mathematics to
measuring large distances indirectly are also discussed. It is a short chapter as other applications
of trigonometry are discussed in other chapters.
Chapter 4 is all about calculus of functions of single variable. Calculus is “the language God
talks” according to Richard Feynman–the 1964 Nobel-winning theoretical physicist. Feynman
was probably referring to the fact that physical laws are written in the language of calculus.
Steven Strogatz in his interesting book Infinite Powers [69] wrote ‘Without calculus, we wouldn’t
have cell phones, computers, or microwave ovens. We wouldn’t have radio. Or television. Or
ultrasound for expectant mothers, or GPS for lost travelers. We wouldn’t have split the atom,
unraveled the human genome, or put astronauts on the moon.’ Thus it is not surprising that
calculus occupies an important part in the mathematics curriculum in both high schools and
universities. Sadly, as being taught in schools, calculus is packed with many theorems, formulae
and tricks. This chapter attempts to present calculus in an intuitive way by following as much as
possible the historical development of the subject. It’s a long chapter of more than one hundred
pages. This is unavoidable as calculus deals with complex problems. But it mainly concerns two
big concepts: the derivative (f'(x) and those dy, dx) and the integral (∫_a^b f(x) dx).
Chapter 5 presents a short introduction to the mathematical theory of probability. Probability
theory started when mathematicians turned their attention to games of chance (e.g. dice rolling).
Nowadays it is used widely in areas of study such as statistics, mathematics, science, finance,
gambling, artificial intelligence, machine learning, computer science, game theory, and philoso-
phy to, for example, draw inferences about the expected frequency of events. Probability theory
is also used to describe the underlying mechanics and regularities of complex systems.
Chapter 6 discusses some topics of statistics, such as least squares and Markov chains.
After calculus of functions of single variable is calculus of functions of multiple variables
(Chapter 7). There are two types of such functions: scalar-valued multivariate functions and
vector-valued multivariate functions. An example of the former is T = g(x, y, z),
which represents the temperature at a point in the Earth. An example of the latter is the velocity
of a fluid particle. We first introduce vectors and vector algebra (rules to do arithmetic with
vectors). Certainly dot product and vector product are the two most important concepts in vector
algebra. Then I present the calculus of these two families of functions. For the former, we will
have partial derivatives and double/triple integrals. The calculus of vector-valued functions is
called vector calculus, which was firstly developed for the study of electromagnetism. Vector


calculus then finds applications in many problems: fluid mechanics, solid mechanics etc. In
vector calculus, we will meet divergence, curl, line integral and Gauss’s theorem.
I present tensor analysis in Chapter 8. Tensors are what Albert Einstein used in 1915 to
write his famous field equations G_μν + g_μν Λ = (8πG/c⁴) T_μν. These equations are the core of
his theory of general relativity that changed forever our understanding of the universe. This
chapter discusses tensors such as the metric tensor g_μν, its properties, its algebra and calculus.
Similar to vectors, tensors are ubiquitous in mathematics, science and engineering. Thus, a solid
understanding of them is essential.
In Chapter 9, I discuss what probably is the most important application of calculus: differen-
tial equations. These equations are those that describe many physical laws. The attention is on
how to derive these equations more than on how to solve them. Derivations of the heat equation
∂u/∂t = α² ∂²u/∂x² and the wave equation ∂²u/∂t² = c² ∂²u/∂x² are presented. Also discussed is
the problem of mechanical vibrations.
I then discuss in Chapter 10 the calculus of variations, which is a branch of mathematics that
allows us to find a function y = f(x) that minimizes a functional I = ∫_a^b G(y, y', y'', x) dx.
For example it provides answers to questions like ‘what is the plane curve with maximum area
with a given perimeter’. You might have correctly guessed the answer: in the absence of any
restriction on the shape, the curve is a circle. But calculus of variation provides a proof and
more. One notable result of variational calculus is variational methods such as Ritz-Galerkin
method which led to the finite element method. The finite element method is a popular method
for numerically solving differential equations arising in engineering and mathematical modeling.
Typical problem areas of applications include structural analysis, heat transfer, fluid flow, mass
transport, and electromagnetic potential.
Chapter 11 is about linear algebra. Linear algebra is central to almost all areas of mathematics.
Linear algebra is also used in most sciences and fields of engineering. Thus, it occupies a vital
part in the university curriculum. Linear algebra is all about matrices, vector spaces, systems of
linear equations, eigenvectors, you name it. It is common that a student of linear algebra can
do the computations (e.g. compute the determinant of a matrix, or the eigenvector), but he/she
usually does not know the why and the what. This chapter hopefully provides some answers to
these questions.
Chapter 12 is all about numerical methods: how to compute a definite integral numerically,
how to interpolate given data, how to solve numerically and approximately an ordinary differ-
ential equation. The basic idea is to use the power of computers to find approximate solutions to
mathematical problems. This is how Katherine Johnson–the main character in the movie Hidden
Figures– helped put a man on the moon. She used Euler’s method (a numerical method discussed
in this chapter) to do the calculation of the necessary trajectory from the Earth to the Moon for
the US Apollo space program. Except that she did it by hand.
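To give a first taste of what Chapter 12 covers, here is a minimal Julia sketch of Euler's method
(my own illustration, not the actual code of the appendix): it marches dy/dt = f(t, y) forward in
small steps via y_{n+1} = y_n + h f(t_n, y_n).

# Euler's method: approximate y(t1) for dy/dt = f(t, y), y(t0) = y0, using n steps
function euler(f, y0, t0, t1, n)
    h = (t1 - t0) / n                   # step size
    t, y = t0, y0
    for _ in 1:n
        y += h * f(t, y)                # one Euler step
        t += h
    end
    return y
end

# Example: dy/dt = y with y(0) = 1; the exact value at t = 1 is e ≈ 2.71828
println(euler((t, y) -> y, 1.0, 0.0, 1.0, 1000))    # ≈ 2.7169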
The book also contains one appendix. In Appendix A I provide some Julia code that is
used in the main text. The idea is to introduce young students to programming as early as
possible.
When we listen to a song or look at a painting we really enjoy the song or the painting
much more if we know just a bit about the author and the story about her/his work. In the same
manner, mathematical theorems are poems written by mathematicians who are human beings.


Behind the mathematics are the stories. To enjoy their poems we should know their stories.
The correspondence between Ramanujan–a 23-year-old Indian clerk on a salary of only £20
per annum–and Hardy–a world-renowned British mathematician at Cambridge–is a touching story.
Or the story about the life of Galois who said these final words Ne pleure pas, Alfred ! J’ai
besoin de tout mon courage pour mourir à vingt ans (Don’t cry, Alfred! I need all my courage
to die at twenty) to his brother Alfred after being fatally wounded in a duel. His mathematical
legacy–Galois theory and group theory, two major branches of abstract algebra–remains with us
forever. Because of this, in the book biographies and some stories of leading mathematicians
are provided. But I am not a historian. Thus, I recommend readers to consult MacTutor History
of Mathematics Archive. MacTutor is a free online resource containing biographies of nearly
3000 mathematicians and over 2000 pages of essays and supporting materials.

How should this book be read? For those who do not know where to start, this is how you could
read this book. Let’s start with this chapter. Then proceed with Chapter 2, Chapter 3 and
Chapter 4. That covers more than the high school curriculum. If you’re interested in using
the maths to do some science projects, check out Chapter 12 where you will find techniques
(easy to understand and program) to solve simple harmonic problems (spring-mass or pendu-
lum) and N -body problems (e.g. Sun-Earth problem, Sun-Earth-Moon problem). If you get
up to there (and I do not see why you cannot), then feel free to explore the rest of the book.

Conventions. Equations, figures, tables, theorems are numbered consecutively within each sec-
tion. For instance, when we’re working in Section 2.2, the fourth equation is numbered (2.2.4).
And this equation is referred to as Equation (2.2.4) in the text. Same conventions are used for
figures and tables. I include many code snippets in the appendix, and the numbering convention
is as follows. For instance Listing B.5 refers to the fifth code snippet in Appendix B. Asterisks
(*), daggers (Ž) and similar symbols indicate footnotes.
Without further ado, let’s get started and learn maths in the spirit of Richard Feynman:

I wonder why. I wonder why


I wonder why I wonder
I wonder why I wonder why
I wonder why I wonder!

Because a curious mind can lead us far. After all, you see, millions saw the apple fall, but only
Newton asked why.
And don't forget that ability (intelligence) is malleable via effort. If a guy of nearly 40 years
old, married with two kids and a full-time job, can learn math, you all can too. And you will do
it better.



Chapter 2
Algebra

Contents
2.1 Natural numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.2 Doing some algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.3 Integer numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.4 Playing with natural numbers . . . . . . . . . . . . . . . . . . . . . . . . 40
2.5 If and only if: conditional statements . . . . . . . . . . . . . . . . . . . . 46
2.6 Sums of whole numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
2.7 Prime numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
2.8 Rational numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
2.9 Irrational numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
2.10 Fibonacci numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
2.11 Continued fractions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
2.12 Pythagoras’ theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
2.13 Imaginary number . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
2.14 Mathematical notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
2.15 Factorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
2.16 Word problems and system of linear equations . . . . . . . . . . . . . . . 103
2.17 System of nonlinear equations . . . . . . . . . . . . . . . . . . . . . . . . 108
2.18 Algebraic and transcendental equations . . . . . . . . . . . . . . . . . . 111
2.19 Rules of powers (exponentiation) . . . . . . . . . . . . . . . . . . . . . . 111
2.20 Inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
2.21 Infinity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
2.22 Sequences, convergence and limit . . . . . . . . . . . . . . . . . . . . . . 144


2.23 Inverse operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148


2.24 Logarithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
2.25 Complex numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
2.26 Combinatorics: The Art of Counting . . . . . . . . . . . . . . . . . . . . 176
2.27 Pascal triangle and the binomial theorem . . . . . . . . . . . . . . . . . . 188
2.28 Compounding interest . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
2.29 Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
2.30 Modular arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
2.31 Cantor and infinity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
2.32 Number systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
2.33 Graph theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
2.34 Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
2.35 Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230

Algebra is generous: she often gives more than is asked for. (Jean d’Alembert)

Algebra is one of the broad parts of mathematics, together with number theory, geometry
and analysis. In its most general form, algebra is the study of mathematical symbols and the
rules for manipulating these symbols; it is a unifying thread of almost all of mathematics. It
includes everything from elementary equation solving to the study of abstractions such as groups,
rings, and fields. Elementary algebra is generally considered to be essential for any study of
mathematics, science, or engineering, as well as such applications as medicine and economics.
This chapter discusses some topics of elementary algebra. By elementary we mean the
algebra in which the commutativity-of-multiplication rule a × b = b × a holds. There exist other
algebras which violate this rule. There is also matrix algebra, which deals with groups of numbers
(called matrices) instead of single numbers.
Our starting point is not the beginning of the history of mathematics; instead we start with the
concept of positive integers (or natural numbers) along with the two basic arithmetic operations
of addition and multiplication. Furthermore, we begin immediately with the decimal, also called
Hindu-Arabic, or Arabic, number system that employs 10 as the base and requires 10 different
numerals, the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. And finally, we take for granted the liberal use of
symbols such as x, y and write x(10 − x) rather than

If a person puts such a question to you as: ‘I have divided ten into two parts, and
multiplying one of these by the other the result was twenty-one;’ then you know that
one of the parts is thing and the other is ten minus thing.

from al-Khwarizmi's "Algebra" (ca. 820 AD).


Our approach is reasonable given the fact that we have a limited lifespan and thus it is impos-
sible to trace the entire history of mathematics.


2.1 Natural numbers


The practice of counting objects led to the development of natural numbers such as 1, 2, 3, ...
With these numbers, we can do two basic operations of arithmetic, namely addition (+) and
multiplication (× or ·). Addition such as 3 + 5 = 8 (8 is called the sum) and multiplication such
as 4 × 5 = 4·5 = 20 (where 20 is called the product), are easy to understand.
It can be seen that the addition and multiplication operations have the following properties

3 + 5 = 5 + 3,   3 × 5 = 5 × 3,   3 × (2 + 4) = 3 × 2 + 3 × 4 (= 18)     (2.1.1)

One way to understand why 3 × 5 = 5 × 3 is to use visual representations. Fig. 2.1 provides two
such representations; for the one on the left, the area of the rectangle does not care whether we
place it on its long or short side, thus 3 × 5 must be equal to 5 × 3Ž . For the one on the right,
we use a block of rocks (plotted as dots): we can either have a block of 5 rows, each row having
3 rocks, thus in total there are 5 × 3 rocks, or have a block of 3 rows, each row having 5 rocks,
thus in total there are 3 × 5 rocks. For 3 × (2 + 4) = 3 × 2 + 3 × 4, see Fig. 2.2.

Figure 2.1: Visual demonstration of the commutativity of multiplication 3 × 5 = 5 × 3.

Figure 2.2: Visual demonstration of the distributivity of multiplication over addition: (2 + 4) × 3 = 2 × 3 + 4 × 3.


Try to draw a similar diagram to see that (4 − 2) × 3 = 4 × 3 − 2 × 3.

As there is nothing special about the numbers 3, 5, 2, 4 in Eq. (2.1.1)–other natural numbers obey
these rules too–one can define the following arithmetic rules for natural numbers a, b and
Ž
If you do not know what area is and the formulas for the areas of simple shapes, check out Section 3.1.6.


c ŽŽ :

(a1) commutative:   a + b = b + a
(a2) commutative:   ab = ba
(b1) associative:   (a + b) + c = a + (b + c)            (2.1.2)
(b2) associative:   (ab)c = a(bc)
(c1) distributive:  a(b ± c) = ab ± ac
where ab means a × b, but 23 means twenty three, not 2 × 3 = 6: the rule is that the multiplication
sign is omitted between letters and between a number and a letter (mathematicians had to
write these a lot and did not want to spend time writing these boring signs; they always
focus on the ideas and patterns). We should pause and appreciate the power of these rules. For
example, the b1 rule allows us to put the parentheses anywhere we like (or even omit them).
Once we have recognized how numbers behave, we can take advantage of that. For example,
to compute 571 × 36 + 571 × 64 the naive way we need two multiplications and one addition.
Using the distributive property we can do 571 × (36 + 64) = 571 × 100–one addition and
one easy multiplication. That's a humbling example of the power of recognizing patterns in
mathematics. As another example, to compute the sum 1 + 278 + 99, we can use the a1 and b1 rules to
proceed as (1 + 99) + 278 = 100 + 278 = 378. Note also that the distributive rule can be written
as (b + c)a = ba + ca, and this is the rule we implicitly use when we write 5a + 7a = 12a.
We must note here that the introduction of a symbol (say a) to label any natural number is
a significant achievement in mathematics, done in about the 16th century. Before that mathe-
maticians only worked with specific concrete numbers (e.g. 2 or 10). With symbols comes the
power of generalization; Eq. (2.1.2) covers all natural numbers in one go! Note that we have
infinitely many such numbers and just one short equation can state a property of all of them. But if
we think deeply we see that we do it all the time in our daily life. We use "man" and "woman"
to represent any man and woman, whereas "John" and "Mary" describe one particular man and
woman!
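Since the book encourages picking up a little programming (Appendix A uses Julia), here is a
small Julia sketch of my own that spot-checks the rules in Eq. (2.1.2) on random integers and
reproduces the 571 × 36 + 571 × 64 shortcut; it is only an illustration, not a proof.

for _ in 1:5
    a, b, c = rand(1:100, 3)
    @assert a + b == b + a                  # a1: commutativity of addition
    @assert a * b == b * a                  # a2: commutativity of multiplication
    @assert (a + b) + c == a + (b + c)      # b1: associativity of addition
    @assert a * (b + c) == a*b + a*c        # c1: distributivity
end
println(571*36 + 571*64 == 571*(36 + 64) == 57_100)   # prints true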
Simply put, we can say that in algebra we are doing arithmetic with just one new feature: we
use letters to represent numbers. Thus, instead of focusing on 2 + 3 we pay attention to x + 3,
or x + y. As these letters (e.g. x, y, z, a, b, c) represent numbers, and we can do arithmetic
with numbers, we can do arithmetic with letters as well.
It should be emphasized that arithmetic is not mathematics. The fact that 3 + 5 is eight is neither
important nor interesting; what is more interesting is that 3 + 5 = 5 + 3. Professional mathematicians
are usually bad at arithmetic, as the following true story testifiesŽ :

Ernst Eduard Kummer (1810-1893), a German algebraist, was sometimes slow at


calculations. Whenever he had occasion to do simple arithmetic in class, he would
get his students to help him. Once he had to find 7 × 9. "Seven times nine," he began,
ŽŽ
Note that we do not attempt to prove these rules. We feel that they are reasonable to accept (using devices such
as Fig. 2.1). You can consider them as the rules of the game if we want to play with natural numbers. Of course
some mathematicians are not happy with that, and they came up with other axioms (rules) and from those axioms
they can indeed prove the rules we are discussing now. If interested you can google for Peano axioms.
Ž
Source: https://www.math.utah.edu/~cherk/mathjokes.html.


"Seven times nine is er – ah — ah – seven times nine is. . . ." "Sixty-one," a student
suggested. Kummer wrote 61 on the board. "Sir," said another student, "it should
be sixty-nine." "Come, come, gentlemen, it can’t be both," Kummer exclaimed. "It
must be one or the other."

However, as future engineers and scientists you have to be very exact with your calculations.
The main purpose of my telling this story is to emphasize that arithmetic is not mathematics.

Abstraction and representation. As kids we were introduced to natural numbers so early that
most of the time we take them for granted. When we're old enough, we should question
them by asking: where do these numbers come from? why do we write 5 for five? etc. From
concrete things in life such as five trees, five fishes, five cows etc. human beings have developed
number five to represent the five-ness. This number five is an abstract entity in the sense that we
never see, hear, feel, or taste it. And yet, it has a definite existence for the rest of our lives. Do
not confuse number five and its representation (5 in our decimal number system) as there are
many representations of a number (e.g. V in the Roman number system).
We observed a pattern (five-ness) and we created an abstract entity from it. This is called
abstraction. And this abstract entity is very powerful. While it is easy to explain a collection
of five or six objects (using your fingers), imagine how awkward it would be to explain a set of
thirty-five objects without using the number 35.
Now that we have the concept of natural numbers, how are we going to represent them?
People used dots to represent numbers; tallies were also used. But it was soon realized that all
these methods are bad at representing large numbers. Just think of how you would write the
number "one hundred" using dots and you will understand what I mean. Only after a long period
did we develop the decimal number system with only 10 digits (0, 1, 2, 3, 4, 5, 6, 7, 8, 9) that
can represent any number you can imagine!
Is the decimal number system the only one? Of course not: computers use only two digits,
0 and 1. Is it true that we're comfortable with the decimal number system because we have ten
fingers? I do not know. I posed this question just to demonstrate that even for something as
simple as counting numbers, that we have taken for granted, there are many interesting aspects
to explore. A curious mind can lead us far.

History of the equal sign. The inventor of the equal sign ‘=’ was the Welsh physician and
mathematician Robert Recorde (c. 1512 – 1558). In 1557, in The Whetstone of Witte, Recorde
used two parallel lines (he used an obsolete word gemowe, meaning ‘twin’) to avoid tedious
repetition of the words ‘is equal to’. He chose that symbol because ‘no two things can be more
equal’. Recorde chose well. His symbol has remained in use for 464 years.

2.2 Doing some algebra


With these rules, we can start doing some algebra. What I mean by that is to manipulate algebraic
expressions. An algebraic expression is an expression involving numbers, parentheses, operation
signs (+, −, ×, ÷ etc.) and variables (e.g. a, b, x, y). Examples of algebraic expressions are


3x + 1 and 5(x² + 3x). Note that the multiplication sign is omitted between letters and between
a number and a letter: so we write 2x instead of 2 × x. This is not an algebraic expression: 3x+,
as it does not make sense: 3x plus what?
As the first example of manipulating algebraic expressions consider this problem: what is
the square of a + b, which is (a + b)(a + b)? (Think of a square of side a + b; then its area is
(a + b)(a + b).) Mathematicians are lazy, so they use the notation (a + b)² for (a + b)(a + b).
For a given natural number a, its square is a² = a × a, its cube is a³ = a × a × a; they are
examples of powers of a. In Section 2.19 we will talk more about powers.
Getting back to (a + b)², we proceed asŽŽ

(a + b)² = (a + b)(a + b)              (definition)
        = a(a + b) + b(a + b)          (distributive c1)
        = a² + ab + ba + b²            (distributive c1)          (2.2.1)
        = a² + ab + ab + b²            (commutative for multiplication a2)
        = a² + 2ab + b²

What is the significance of this? It tells us that the two seemingly different expressions (a + b)²
and a² + 2ab + b² are the same! A geometric proof of this so-called identity (it holds for any
numbers) is shown in Fig. 2.3a. This was how ancient Greek mathematicians thought of (a + b)².
They thought in terms of geometry: any quadratic term of the form ab is associated with the area
of a certain shape. And this way of geometric thinking is very useful, as we will see in this book.
I am against memorizing any formula (including this identity); understanding is what matters.

Figure 2.3: Geometric visualization of (a + b)² (panel a) and (a − b)² (panel b).
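If you have a computer at hand, you can also let Julia spot-check the identity (a + b)² = a² + 2ab + b²
over a range of integers (a small sketch of my own; a finite check is of course not a proof, the
algebra above is):

println(all((a + b)^2 == a^2 + 2a*b + b^2 for a in -20:20, b in -20:20))   # prints true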

What we have done in Eq. (2.2.1) was to expand and simplify an expression. That is, we
multiplied out the brackets and then simplified the resulting expression by collecting the like
terms (i.e., ab and ba).
ŽŽ
One more exercise to practice: (3a)² = (3a)(3a)–this is just the definition. Now (3a)(3a) = (3)(3)(a)(a)
because of the associative property b2 (together with commutativity a2). Finally, (3a)² = 9a².


Let's pause for a moment and think more about (a + b)² = a² + 2ab + b².
What else does this tell us? Surprisingly a lot! We can think of (a + b)²
as a hamster (in our mathematical world). If we do not touch it or talk to it,
it does not talk back. And that's why we just see it as (a + b)². However,
when we talk to it by massaging it, it talks back by revealing its secret: it has
another name and it is a² + 2ab + b². So, we can think of mathematicians
as magicians (but without a trick): while magicians can get a rabbit out of an empty hat with a
trick, mathematicians can too: they poke their numbers and out pop many interesting facts.
But hey, why is knowing another name for that hamster useful? First, mathematicians–as
human beings–are curious by nature: they want to know everything about mathematical objects.
Second, probably not a very good example, but the more you know about your enemy, the
better, don't you think?
The next thing to do is (a + b)³, which can be computed as

(a + b)³ = (a + b)(a + b)(a + b)                    (definition)
        = (a + b)²(a + b)                           (definition of (a + b)²)
        = (a² + 2ab + b²)(a + b)                    (Eq. (2.2.1))          (2.2.2)
        = a³ + a²b + 2a²b + 2ab² + b²a + b³         (distributive c1)
        = a³ + 3a²b + 3ab² + b³                     (collect like terms)

Of course, a geometric interpretation of this expression is available (volumes instead of areas),
but I do not think such a diagram would help here. It was by playing with expressions such as (a + b)²,
(a + b)³ etc. that our ancestors came up with the so-called Pascal triangle and the binomial
theorem (see Section 2.27).
How about (a + b + c)²? Of course we can proceed the same way by writing this term as
[(a + b) + c]², then seeing a + b = d as just a number, then using Eq. (2.2.1) for (d + c)².
However, there is a better way: guessing the result! Our guess is as follows

(a + b + c)² = a² + 2ab + b² + c² + 2bc + 2ca          (2.2.3)

The red terms are present when c = 0; the blue terms are due to the fact that a, b, c play equal
roles: if there is a², there must be a c². By proceeding this way we're gradually developing a
feeling for mathematics.

The FOIL rule of algebra.

In many textbooks on algebra we have seen this identity:

(a + b)(c + d) = ac + ad + bc + bd

And to help students memorize it someone invented the FOIL rule (First-Outer-Inner-Last).
I'm against this way of teaching mathematics. This identity is very natural as it
comes from the arithmetic rules given in Eq. (2.1.2). Let's denote c + d = e (the sum of
two natural numbers is a natural number), so we can write

(a + b)(c + d) = (a + b)e = ae + be          (distributive rule)
             = a(c + d) + b(c + d)           (replace e with c + d)
             = ac + ad + bc + bd             (again, distributive rule)

We have, so far, expanded brackets; it's time to do the opposite. The problem is: write a² − b²
in another form. We can proceed using geometry as shown in Fig. 2.4. From that, we have the
following identity

a² − b² = (a + b)(a − b)

which can be verified by expanding (a + b)(a − b): it is a² − ab + ba − b² = a² − b².
This identity can help us, for example, in computing 100 002² − 99 998² without a calculator
nearby. Squaring and subtracting would take quite a while, but the identity is of tremendous
help: 100 002² − 99 998² = (100 002 + 99 998)(100 002 − 99 998) = (200 000)(4).
Writing a² − b² as (a − b)(a + b) is called factorizing the term. In mathematics, factorization
or factoring consists of writing a number or another mathematical object as a product of several
factors, usually smaller or simpler objects of the same kind. For what? For a better understanding of
the original object. For example, 3 × 5 is a factorization of the integer 15, and (x − 2)(x + 2) is a
factorization of the polynomial x² − 4. We have more to say about factorization in Section 2.15.
And when we meet other mathematical objects (e.g. matrices) later in the book, we shall see that
mathematicians do indeed spend a significant amount of time just factoring matrices.
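A quick Julia check of the 100 002² − 99 998² example above (just a sanity check; the identity is
what does the real work):

a, b = 100_002, 99_998
println(a^2 - b^2)               # 800000
println((a + b) * (a - b))       # (200 000)(4) = 800000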
Figure 2.4: Geometric interpretation of a² − b². We have two squares, one of side a and the other of side
b. By our arrangement of these two squares, the term a² − b² is the area of the gray shaded region shown
in the left diagram. This figure can be divided into two rectangles, one of area (a − b)b and one of area
(a − b)a. Adding these two areas we obtain (a + b)(a − b).

2.3 Integer numbers


This section discusses integer numbers, that is, whole numbers which can be positive or negative
(Section 2.3.1). A short introduction to negative numbers is given in Section 2.3.2. Then, in
Section 2.3.3, the arithmetic of negative numbers is presented; the answer to questions such as why
(−1) × (−1) = 1 can be found therein.


2.3.1 Negative numbers


So far so good: addition and multiplication of natural numbers are easy. But what is more im-
portant is this observation: adding (or multiplying) two natural numbers gives us another natural
number. Mathematicians say that natural numbers are closed under addition and multiplication.
Why do they care about this? Because it ensures security: we never step outside of the familiar
world of natural numbers, until... it comes to subtraction. What is 3 − 5? Well, we can take
3 from 3 and we have nothing (zero). How can we then take away two more from nothing? It seems impos-
sible. Shall we only allow subtractions of the form a − b when a ≥ b (this is how mathematicians
say a is larger than or equal to b)? That sounds restrictive, doesn't it?

[number line: ... −2 −1 0 1 2 3 4 ...]
4 − 1 = 3
4 − 2 = 2
4 − 3 = 1          (2.3.1)
4 − 4 = 0
4 − 5 = ?

Imagine a line on which we put zero at a certain place, and on the right of zero we place
1, 2, 3 and so on (see the figure in Eq. (2.3.1)). The English mathematician John Wallis (1616 -
1703) is credited with giving some meaning to negative numbers by inventing the number line,
which is what I am presenting here. Now, when we do a subtraction, let's say 4 − 1, we start from
4 on this line and walk towards zero one step: we end up at three. Similarly, when we do 4 − 2 we
walk towards zero two steps. Eventually we reach zero when we have walked four steps: 4 − 4 = 0.
What happens then if we walk past zero one step? That is exactly what 4 − 5 means. We should
now be at the position marked by the number one but in red (to indicate that this position is on the
left side of zero). So, we have solved the problem: 4 − 5 = 1 (in red). Nowadays people write −1
(read 'negative one') instead of using a different color. Thus, 4 − 5 = −1.
Now we have two kinds of numbers: the ones on the right-hand side of zero (e.g. 1, 2, ...)
and the ones on the left-hand side (e.g. −1, −2, ...). The former are called positive inte-
gers and the latter negative integers; together with zero they form the so-called integers:
{..., −3, −2, −1, 0, 1, 2, 3, ...}.
The number line is kind of a two-way street: starting from zero, if we go to the right we
go in the positive direction (for we see positive integers), and if we go to the left, we follow
the negative direction. For every positive integer a, we have a negative counterpart −a. We
can think of − as an operation that flips a to the other side of zero. Why do we have to start with
a positive integer (if all numbers should be treated equally)? If we start with a negative number,
let's say −b (b > 0), then flipping it to the other side of zero gives −(−b), which is b. So we have
−(−b) = b for any integer–positive or negative. If b > 0 you can think of this as: taking away
a debt is an asset.

Sometimes we also write positive integers as +2.


2.3.2 A brief history on negative numbers


Negative numbers appeared for the first time in history in the Nine Chapters on the Mathematical
Art, which in its present form dates from the period of the Han Dynasty (202 BC – AD 220).
The mathematician Liu Hui (c. 3rd century) established rules for the addition and subtraction of
negative numbers. The Nine Chapters used red counting rods to denote positive numbers and
black rods for negative numbers. During the 7th century AD, negative numbers were used in
India to represent debts. The Indian mathematician Brahmagupta, in Brahma-Sphuta-Siddhanta
(written c. AD 630), gave rules regarding operations involving negative numbers and zero, such
as "A debt cut off from nothingness becomes a credit; a credit cut off from nothingness becomes
a debt." He called positive numbers "fortunes", zero "a cipher", and negative numbers "debts".
While we have no problems accepting positive numbers, it is mentally hard to grasp negative
numbers. What is negative four cookies? This is because negative numbers are more abstract
than positive ones. For a long time, negative solutions to problems were considered "false". In
Hellenistic Egypt, the Greek mathematician Diophantus, in his book Arithmetica, referring
to the equation 4x + 20 = 4 (which has the negative solution −4), said that the equation was
absurd. This is because Greek mathematics was founded on geometrical ideas: a number is a
certain length or area or volume of something; thus number is always positive.

2.3.3 Arithmetic of negative integers


How do arithmetical operations work for negative numbers? The first thing to notice is that the
subtraction symbol is also used to indicate negative integers. Why? One way to explain it is:
0 − 2 = −2 (taking 2 away from zero results in a debt of 2, or going from zero two steps to its
left results in −2).
Using the number line (the one with two directions), or the concept of debts, it is not hard to
do the following arithmetical problems:

(a) adding two negative integers:        (−2) + (−3) = −5,   (−3) + (−1) = −4
(b) adding a negative and a positive:    (−2) + 1 = −1,      (−2) + 3 = 1          (2.3.2)
(c) subtracting two negative integers:   (−3) − (−1) = −2,   (−3) − (−4) = 1

For (c), we use the rule −(−b) = b for any integer–positive or negative. When b > 0, this rule
can be understood as "taking away a debt is an asset".
Let's now study the arithmetic rules for multiplication of negative numbers. We should
always start simple: multiplication of a negative integer and a positive integer. It is obvious to
see that
(−1) + (−1) + (−1) = −3
as, after all, if I borrow one dollar from you each week for three weeks, then I owe you three dollarsŽ .
This immediately results in the following

(−1) × 3 := (−1) + (−1) + (−1) = −3   (= 3 × (−1))
Ž
If you prefer thinking in terms of geometry, the number line is very useful: (−1) + (−1) + (−1) is walking three
steps in the negative direction from zero; we must end up at −3.


So, when a positive number a is multiplied with 1 it is flipped to the other side of zero on the
number line at a. That is, a D . 1/  a D . 1/a for integer a. And with that we know how
to handle 2  10, which is 20, and so on. The general rule is . a/.b/ D .ab/, for positive
integers a and b. In English, it reads "the multiplication of a positive and a negative number
yields a negative number whose numerical value is the product of the two given numerical
values".
But what maths has to do with debts? Can we deduce the rules without resorting to debts,
which are very negative. Ok, let’s compute 5  .3 C . 3// in two ways. First, as 3 C . 3/ D 0,
we have 5  .3 C . 3// D 0. But from Eq. (2.1.2), we also have (distributive rule)

5  .3 C . 3// D 5  3 C 5  . 3/ D 0 H) 5  . 3/ D 15

Thus, if we insist that the usual arithmetic rules apply also for negative numbers, we have deduced
a rule that is consistent with daily experience. From a mathematical viewpoint, mathematicians
always try to have a set of rules that works for as many objects as possible. They have the rules in
Eq. (2.1.2) for positive integers, now they gave birth to negative integers. To make positive and
negative integers live happily together the negative integers must follow the same rules. (They
can have their own rules, that is fine, but they must obey the old rules).
But how about (−1) × (−1)? One way to figure out the result is to look at the following

(−1) × 3 = −3
(−1) × 2 = −2
(−1) × 1 = −1      ⟹   (−1) × (−1) = 1
(−1) × 0 = 0

and observe that, going down from the top, the numbers on the right-hand side increase by one.
Thus (−1) × 0 = 0 should lead to (−1) × (−1) = 0 + 1 = 1. This is certainly not a proof, for we're not sure that
the pattern will repeat. It was just one short explanation. If you were not happy with that,
then (−1) × (−1) = 1 is a consequence of our choice to maintain the arithmetic rules, in particular the
distributive rule, in Eq. (2.1.2):

1 + (−1) = 0  ⟹  [1 + (−1)] × (−1) = 0:  1 × (−1) + (−1) × (−1) = 0  ⟹  (−1) × (−1) = 1

Coincidentally, it is similar to the ancient proverb the enemy of my enemy is my friend. If you are
struggling with this, it is OK as the great Swiss mathematician Euler (who we will meet again
and again in this book) also struggled with the fact that . 1/. 1/ D 1 too.
Knowing the arithmetic rules of negative integers, we can know study the expression .a b/2 .
We proceed as follows

.a b/2 D .a b/.a b/ (definition)


D .a b/a .a b/b (distributive c1)
D a2 ba .ab b 2 / (distributive c1) (2.3.3)
D a2 ba ab C b 2 (open bracket)
2 2
Da 2ab C b


A geometry-based proof of it is given in Fig. 2.3b. The open-bracket rule is: −(a − b) = −a + b.
That is, if a bracket is preceded by a minus sign, change positive signs within it to negative and
vice versa when removing the bracket. Need a proof? Here it is:

−(a − b) = (−1) × (a − b)
        = (−1) × a + (−1) × (−b)
        = −a + (−1) × [(−1) × b]                    (2.3.4)
        = −a + [(−1) × (−1)] × b = −a + b

Question 1. How many integer numbers are there?

2.4 Playing with natural numbers


Once humans had created numbers, they started playing with them and discovered many interest-
ing properties. For example, some numbers are even and some are odd (Fig. 2.5). Even natural
numbers are 2, 4, 6, 8, ..., which can be written as 2, 2 × 2, 2 × 3, ... Then a generalization can
be made: an even number is one of the form 2k–it is divisible by 2. And an odd number is of the form
2k + 1–it is not divisible by 2. To speak about the oddness or evenness of a number, mathematicians
use the term parity.

Figure 2.5: Even and odd numbers. Using this visualization, an odd number is obviously written as 2k + 1–
it is not divisible by 2.

To demonstrate the importance of definition in mathematics, below are three definitions of


even numbers. Which one do mathematicians use?

 Definition 1: An even number is any integer that can be divided exactly by 2;

 Definition 2: A number that is divisible by 2 and generates a remainder of 0 is called an


even number.

 Definition 3: An even number is an integer of the form n = 2k, where k is an integer.

They are equivalent, but definition 3 is the one that mathematicians adopt.
Why? Because they can use this definition to deduce the behavior of even numbers. Precisely,


they can manipulate the expression n = 2k in many ways. I present some examples now so that
you can see the point. The other definitions are useless here as they are purely verbal.
Now that we have two groups of even and odd numbers, questions about their relation arise. For
instance, is there any relation between even and odd numbers? Yes, for example: even times even is
even (e.g. 2 × 6 = 12, 4 × 6 = 24) and odd times odd is odd (e.g. 3 × 5 = 15, 3 × 7 = 21). But
why is this so?

Proof that the product of two even numbers is always an even number. According to the defini-
tion of even numbers, we write 2n for the first even number and 2k for the second even
number, where n and k are integers. Their product is then (2n)(2k). But, according to the asso-
ciativity rule of multiplication, Eq. (2.1.2), we can write this product as 2(2nk), which is nothing
but 2m, where m = 2nk is an integer (why?). We use the definition (of even numbers) again:
the product is of the form 2m, thus it is even. 

This proof is a bit verbal, so I will try to have a shorter proof:

Proof that the product of two even numbers is always an even number. Assume that a and b are
two even numbers. According to the definition 3 (of even numbers), we then have a D 2n and
b D 2k for integers n; k. Their product is ab D .2n/.2k/, which can be rewritten as 2m, where
m D 2nk is an integer. Therefore, by definition 3, ab is even. 

Here is another property of numbers: if we multiply two consecutive odd numbers (or two consecutive even numbers) we get a number that is one less than a perfect square (e.g., $5\times 7 + 1 = 36 = 6^2$, $10\times 12 + 1 = 121 = 11^2$). Is this always the case, or can we find two consecutive odd or even numbers for which this does not occur? We could check examples forever, or we can prove it once and for all. Mathematicians are lazy, so they prefer the latter. The plan of the proof is: (i) translate the English phrase "two consecutive even numbers" into mathematical symbols: if the first even number is $2k$, then the next even number is $2k+2$ (think of concrete examples such as 2, 4 or 6, 8, and you will see this), and (ii) translate "multiply two consecutive even numbers and add 1" into $(2k)(2k+2) + 1$. The rest is simply algebra:

Proof. Let's call the first even number $2k$; then the next even number is $2k+2$, where $k = 1, 2, 3, \ldots$ The sum of their product and one is
\[
(2k)(2k+2) + 1 = (2k)^2 + 4k + 1 = (2k)^2 + 2(2k) + 1 = (2k+1)^2
\]
which is obviously a perfect square. You should completely understand the above sequence of steps: the first one is the distributive property c1 in Eq. (2.1.2) (i.e., $a(b+c) = ab + ac$ with $a = 2k$, $b = 2k$ and $c = 2$); the second step makes $(a+b)^2 = a^2 + 2ab + b^2$ appear, with $a = 2k$ and $b = 1$. And why is $4k = 2(2k)$? Because $4k = (2\times 2)k$, which is also $2(2k)$ due to property b2, $(ab)c = a(bc)$, in Eq. (2.1.2). $\blacksquare$


Why do mathematicians care about perfect squares? One reason: it is super easy to compute the square root of a perfect square.

Here is yet another property of integers: multiplying three successive integers and adding the middle integer to the product always yields a perfect cube! For example, $2\times 3\times 4 + 3 = 27 = 3^3$. Why?

Proof. Let's denote three successive integers by $k$, $k+1$, $k+2$; then we can write
\[
k(k+1)(k+2) + (k+1) = (k+1)\underbrace{[k(k+2)+1]}_{(k+1)^2} = (k+1)^3
\]
which is a perfect cube. $\blacksquare$
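Both of these little facts are easy to test numerically. Below is a quick Julia sketch (Julia is also the language used later in this chapter, e.g. with the Primes.jl package); it is an illustration only, not a replacement for the proofs above, and the helper names and the range of $k$ are arbitrary choices.

\begin{verbatim}
# Numerical illustration (not a proof) of the two properties above.
issquare(m) = isqrt(m)^2 == m               # m is a perfect square
iscube(m)   = round(Int, cbrt(m))^3 == m    # m is a perfect cube (fine for these sizes)

for k in 1:1000
    @assert issquare(2k*(2k + 2) + 1)            # consecutive evens: product + 1
    @assert issquare((2k + 1)*(2k + 3) + 1)      # consecutive odds:  product + 1
    @assert iscube(k*(k + 1)*(k + 2) + (k + 1))  # three successive integers + middle one
end
println("both properties hold for k = 1..1000")
\end{verbatim}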

2.4.1 Divisibility
Divisibility is the ability of a number to be evenly divided by another number. For example, four divided by two is equal to two, an integer, and therefore we say four is divisible by two. Mathematicians also say that two divides four, which means that "2 divides 4 into 2 parts". I introduce some terminology now: the number which is getting divided is called the dividend. The number which divides a given number is the divisor. And the number which we get as a result is known as the quotient. So, in $6 : 3 = 2$, 6 is the dividend, 3 is the divisor and 2 is the quotient. Mathematicians write $3\mid 6$ to say that 3 divides 6, or equivalently 6 is divisible by 3. (Note that $a\mid b$ does not mean the same thing as $a/b$: the latter is a number, the former is a statement about two numbers.) Thus, $a\mid b$ means that there exists a whole number $k$ such that $ak = b$.

Here is another property of counting numbers regarding divisibility: a number is divisible by 9 if and only if the sum of its digits is divisible by 9. For example, 351 is divisible by 9 and $3+5+1 = 9$ is clearly divisible by 9. To see why, we write 351 as follows
\begin{align*}
351 &= 3\times 100 + 5\times 10 + 1\times 1\\
    &= 3\times 10^2 + 5\times 10^1 + 1\times 10^0\\
    &= 3\times(99+1) + 5\times(9+1) + 1\\
    &= 3\times 99 + 5\times 9 + \underbrace{3 + 5 + 1}_{\text{sum of the digits}}\\
    &= 9(3\times 11 + 5 + 1)
\end{align*}
Therefore, it is divisible by 9. But this does not make a proof! It does, however, give us some hints for a proof. Now, we write a counting number in the form
\[
\overline{a_n a_{n-1}\cdots a_1 a_0}
\]


which is a number having $n+1$ digits, whatever $n$ is. (How did mathematicians come up with this notation? Start simple with a two-digit number, e.g. 24 or 99: it can be written as $\overline{ab}$. A three-digit number is then $\overline{abc}$. But there are only 26 letters in the alphabet! The solution is to use subscripts $a_0, a_1, a_2, \ldots$; since there are infinitely many counting numbers, we never run out of labels.) Then, we proceed as we did for the number 351:
\begin{align*}
\overline{a_n a_{n-1}\cdots a_1 a_0} &= a_n\times 10^n + a_{n-1}\times 10^{n-1} + \cdots + a_1\times 10 + a_0\\
&= a_n(10^n - 1 + 1) + a_{n-1}(10^{n-1} - 1 + 1) + \cdots + a_0\\
&= (a_n + a_{n-1} + \cdots + a_1 + a_0) + 9(\underbrace{a_n a_n\cdots a_n}_{n\text{ digits}} + \underbrace{a_{n-1}a_{n-1}\cdots a_{n-1}}_{n-1\text{ digits}} + \cdots + a_1)
\end{align*}
And that concludes the proof of divisibility by 9: the number is divisible by 9 when $a_n + a_{n-1} + \cdots + a_1 + a_0 = 9k$. Note that, in the above equation, we have used
\[
10^n - 1 = \underbrace{99\cdots 9}_{n\text{ nines}} \Longrightarrow a_n(10^n - 1) = a_n\times\underbrace{99\cdots 9}_{n\text{ nines}} = a_n\times 9\times\underbrace{11\cdots 1}_{n\text{ ones}} = 9\times\underbrace{a_n a_n\cdots a_n}_{n\text{ digits}}
\]
So, you see that once the property has been discovered, the proof might not be so difficult, especially when we have worked out a proof for a special case.

Remark. If you think about my 'proof' of the statement 'a number is divisible by 9 if and only if the sum of its digits is divisible by 9', you might realize that it is not complete. We come back to this issue in Section 2.5.
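As a sanity check (and to tie in with the computer experiments later in this chapter), here is a short Julia sketch that tests the digit-sum rule on a range of numbers; the helper name digitsum and the range are arbitrary, and digits is a built-in function returning the decimal digits.

\begin{verbatim}
# Empirical check of the rule: 9 | n exactly when 9 | (sum of the digits of n).
digitsum(n) = sum(digits(n))   # e.g. digits(351) == [1, 5, 3], so digitsum(351) == 9

for n in 1:100_000
    @assert (n % 9 == 0) == (digitsum(n) % 9 == 0)
end
println("digit-sum rule verified for n = 1..100000")
\end{verbatim}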
A good question is how the property was discovered in the first place. It is simple: by playing with numbers very carefully. For example, we all know the times table for 9. If we look not just at the multiplication, but also at its inverse, the division, we see this:
\begin{align*}
9\times 1 &= 9  & 9 : 9 &= 1\\
9\times 2 &= 18 & 18 : 9 &= 2\\
9\times 3 &= 27 & 27 : 9 &= 3\\
9\times 4 &= 36 & 36 : 9 &= 4
\end{align*}
Then, by looking at the products 9, 18, 27, 36 (whose digits always sum to 9), the rule for divisibility by 9 was discovered. The lesson is always to look at a problem from different angles. For example, if you see the word 'Rivers', it can be the name of a person, not just rivers.

Without doubt there are rules regarding the divisibility of a number by $2, 3, 4, 5$ and so on. Try to find them for yourself. We shall get back to this problem when we know more about mathematics (Section 2.30.4).
These are only a few interesting facts about natural numbers; there are tons of other interesting results. If you find them interesting, study them! The study of natural numbers has gained the reputation of being the "queen of mathematics" according to Gauss, the famous German mathematician, and many of the greatest mathematicians have devoted study to numbers. You could become a number theorist (a mathematician who studies natural numbers), or you could work for a bank in the field of information protection, known as "cryptography". Or you


could become an amateur mathematician like Pierre de Fermat, who was a lawyer but studied mathematics in his free time for leisure.

If you do not enjoy natural numbers, that is of course also totally fine. For science and engineering, where real numbers are dominant, a good knowledge of number theory is not needed. Indeed, before writing this book, I knew just a little about natural numbers and the relations between them.

One of the amazing things about pure mathematics (mathematics done for its own sake, rather than out of an attempt to understand the "real world") is that sometimes purely theoretical discoveries turn out to have practical applications. This happened, for example, when the non-Euclidean geometries described by the mathematicians Carl Friedrich Gauss and Bernhard Riemann turned out to provide a model for the relation between space and time, as shown by Albert Einstein.

Taxicab number 1729. The name is derived from a conversation in about 1919 involving
British mathematician G. H. Hardy and Indian mathematician Srinivasa Ramanujan. As
told by Hardy:

I remember once going to see him [Ramanujan] when he was lying ill at
Putney. I had ridden in taxi-cab No. 1729, and remarked that the number
seemed to be rather a dull one, and that I hoped it was not an unfavorable
omen. "No," he replied, "it is a very interesting number; it is the smallest
number expressible as the sum of two [positive] cubes in two different ways.

Ramanujan meant that $1729 = 1^3 + 12^3 = 9^3 + 10^3$!
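If you want to rediscover Ramanujan's observation yourself, a small brute-force search is enough. The following Julia sketch is only an illustration; the function name and the bound of 20 on the cube roots are my own arbitrary choices.

\begin{verbatim}
# Brute-force search for the smallest sum of two positive cubes in two different ways.
function smallest_taxicab(limit)
    ways = Dict{Int,Int}()    # value => number of representations a^3 + b^3 with a <= b
    for a in 1:limit, b in a:limit
        s = a^3 + b^3
        ways[s] = get(ways, s, 0) + 1
    end
    return minimum(k for (k, v) in ways if v >= 2)
end

println(smallest_taxicab(20))   # 1729
\end{verbatim}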

Let's see some math magic, which, unlike other kinds of magic, can be explained.

Magic numbers.

This magic trick is taken from the interesting book Alex's Adventures in Numberland by Alex Bellos [8]. The trick is: "I ask you to name a three-digit number whose first and last digits differ by at least two. I then ask you to reverse that number to give a second number. After that, I ask you to subtract the smaller number from the larger number. I then ask you to add this intermediary result to its reverse. The result is 1089, regardless of the number you have chosen." For instance, if you choose 214, the reverse is 412. Then, 412 - 214 = 198. Adding this intermediary result to its reverse gives 198 + 891, which equals 1089.

The question is why? See the hint in the footnote below.

Footnote: The proof starts with a three-digit number $\overline{abc}$ where $a, b, c\in\mathbb{N}$ and $a\neq 0$. Its reverse is $\overline{cba}$, and the difference between the larger and the smaller of the two is $99(c - a)$ (taking the larger to be $\overline{cba}$, so $c > a$; since $a\geq 1$ we have $c - a\in\{2, 3, 4, 5, 6, 7, 8\}$). Thus the intermediary number can only be one of $\{198, 297, 396, 495, 594, 693, 792\}$. Writing this number as $\overline{xyz}$, we always have $x + z = 9$ and $y = 9$, and its reverse is $\overline{zyx}$. Now, adding $\overline{xyz}$ to $\overline{zyx}$ gives $100(x+z) + 20y + (x+z) = 100\times 9 + 20\times 9 + 9 = 1089$.


2.4.2 Math contest problem


Let $a_1, a_2, \ldots, a_n$ represent an arbitrary arrangement of the numbers $1, 2, \ldots, n$ (if you're not sure what this sentence means, take $n = 3$: we have the three integers $1, 2, 3$, and arrangements of them are $(1,2,3)$, $(1,3,2)$, $(2,1,3)$ and so on). Prove that if $n$ is odd, the product $(a_1-1)(a_2-2)(a_3-3)\cdots(a_n-n)$ is an even number. First, note that we do not know what $a_1$, $a_2$ and so on are. Second, the question is about the parity of the product of a bunch of integers.

Starting from the fact that odd times odd is odd (to sound like mathematicians, we say that the set of odd integers is closed under multiplication), we can generalize to a new fact: the product of a bunch of odd numbers is odd. And the product of a bunch of integers is even if at least one of those integers is even. Using this fact, the problem amounts to proving that among the integers $a_1-1, a_2-2, a_3-3, \ldots, a_n-n$ there is at least one even number. We have transformed the given problem into an easier one: instead of dealing with a product of numbers we do not know, we just need to find one even number.

Let's make the problem concrete so that it is easier to deal with. Consider the case $n = 5$. We have to prove that among the numbers
\[
a_1-1,\; a_2-2,\; a_3-3,\; a_4-4,\; a_5-5
\]
there exists at least one even number. Proving this directly is hard (because it is not clear which one is even), so we transform the problem into proving that it is impossible for all those numbers to be odd. If we can prove that, then at least one of them is even. This technique is called proof by contradiction.

If we assume that all the numbers $a_1-1, a_2-2, a_3-3, a_4-4, a_5-5$ are odd, then $a_1$ is even, $a_2$ is odd, $a_3$ is even, $a_4$ is odd and $a_5$ is even. Thus, there would be three even numbers and two odd ones. But in $1, 2, 3, 4, 5$ there are two evens and three odds! We arrive at a contradiction, thus our assumption is wrong. We have proved the statement, at least for $n = 5$.

Nothing is special about 5; the same argument works for $7, 9, \ldots$ Indeed, $1, 2, 3, \ldots, n$ starts with 1, an odd number, and since $n$ is odd this list contains more odd numbers than even ones. But if every $a_i - i$ were odd, the arrangement would have to supply more even numbers than odd ones, which is impossible.

It was a good proof, but what do you think of the following one? Even though the problem concerns a product, let's consider the sum of $a_1-1, a_2-2, \ldots, a_n-n$:
\[
S = (a_1-1) + (a_2-2) + (a_3-3) + \cdots + (a_n-n) = (a_1 + a_2 + \cdots + a_n) - (1 + 2 + \cdots + n)
\]
Why bother with this sum? Because it is zero whatever the values of $a_1, a_2, \ldots$ are. Now, the fact that the sum of an odd number of integers is zero (an even number) leads to the conclusion that at least one of the numbers must be even. (Otherwise, the sum would be odd; think of $3 + 5 + 7$, which is odd.) This sum is called an invariant of the problem; the problem-solving technique here is to look for invariants in the problem. Check the book by Paul Zeitz [72] for more.

Why did mathematicians know to look at the sum $S$ instead of the product? I do not know the exact answer. One thing is that sums and products are familiar things to think of. But if that does not convince you, the following joke tells it best:


A man walking at night finds another on his hands and knees, searching for something under a streetlight. "What are you looking for?", the first man asks; "I lost a quarter," the other replies. The first man gets down on his hands and knees to help, and after a long while asks, "Are you sure you lost it here?". "No," replies the second man, "I lost it down the street. But this is where the light is."

2.5 If and only if: conditional statements


When reading about mathematics, one phrase that regularly shows up is "if and only if." This phrase particularly appears within statements of mathematical theorems. For example, "an integer is divisible by 9 if and only if the sum of its digits is divisible by 9" is one such theorem. But what, precisely, does this statement mean?

To understand "if and only if," we must first know what is meant by a conditional statement. A conditional statement is one that is formed from two other statements, which we will denote by A and B. To form a conditional statement, we say "if A then B." Here is one example from our daily experience: "If it is raining outside, then I stay inside." By the way, in "if A then B," A is called the hypothesis and B is called the conclusion.

Consider the following two mathematical statements:

• (S1): A number is divisible by 5 if it ends in 5.

• (S2): A number is divisible by 5 only if it ends in 5.

These two statements are very similar but differ in one key word, "only". Interestingly, one is true and the other is false. Here's why. Some of the multiples of 5 are $5, 10, 15, 20, \ldots$, which convinces us that statement (S2), 'a number is divisible by 5 only if it ends in 5', is incorrect: the number 10 ends in 0, yet it is divisible by 5.

We now consider the statement "an integer is divisible by 9 if the sum of its digits is divisible by 9". This statement is true (as proved in Section 2.4). As remarkable as it is, this theorem does not answer the question: what if the sum of the digits of a number is NOT divisible by 9? So, to have a complete answer about the divisibility of a number by 9, we have to consider the converse: if a number is divisible by 9 then the sum of its digits is divisible by 9.

Given a conditional statement "if A then B", we're also interested in the converse: "if B then A". It is easy to see that the converse is not always true. For example, the statement "if n is divisible by 4, then n is divisible by 2" is true, but its converse, "if n is divisible by 2, then n is divisible by 4", is not: the number six is divisible by 2, but it is not divisible by four. When the converse is also true, we have a biconditional statement:

"Something is an A if and only if (iff) it is a B" \quad $(A \Longleftrightarrow B)$

"If and only if" is sometimes abbreviated to "iff" and symbolized by $\Longleftrightarrow$. For example, the statement "A triangle is equilateral iff its angles all measure 60°" means both "If a triangle is equilateral then its angles all measure 60°" and "If all the angles of a triangle measure 60° then the triangle is equilateral".


To prove a theorem stated in the form $A \Longleftrightarrow B$, we have to prove two things: first, $A \Longrightarrow B$; second, $A \Longleftarrow B$. Using the divisibility by 9 as an example, we now need to prove 'if a number is divisible by 9 then its digit sum is divisible by 9', as the other direction was proved previously in Section 2.4.

Proof. We know that $z = \overline{a_n a_{n-1}\cdots a_1 a_0}$ can be written as
\[
\overline{a_n a_{n-1}\cdots a_1 a_0} = (a_n + a_{n-1} + \cdots + a_0) + 9(\underbrace{a_n a_n\cdots a_n}_{n\text{ digits}} + \underbrace{a_{n-1}a_{n-1}\cdots a_{n-1}}_{n-1\text{ digits}} + \cdots + a_1)
\]
If $9\mid z$, then $z = 9k$, and thus
\[
(a_n + a_{n-1} + \cdots + a_0) + 9(\underbrace{a_n a_n\cdots a_n}_{n\text{ digits}} + \cdots + a_1) = 9k
\]
which results in $a_n + a_{n-1} + \cdots + a_0 = 9l$ for some integer $l$; that is, the sum of the digits is divisible by 9. $\blacksquare$

Remark. For the statement $A \Longleftrightarrow B$ we say that $A$ is a necessary and sufficient condition for $B$, and vice versa. Note that the hypothesis that a number ends in 5 is sufficient for it to be divisible by 5, but it is not necessary, because numbers ending in 0 are also divisible by 5.

2.6 Sums of whole numbers


In this section we discuss sums involving the first $n$ whole numbers, e.g. what is $1 + 2 + 3 + \cdots + 1000$? In the process, we introduce proof by induction, which is a common mathematical proof method when an arbitrary $n$ is involved. Also introduced is the sigma notation for sums, i.e., $\sum_{i=1}^{n} i$.

2.6.1 Sum of the first n whole numbers


The sum of the first $n$ integers is written as
\[
S(n) = 1 + 2 + 3 + \cdots + n \tag{2.6.1}
\]
The notation $S(n)$ indicates that this is a sum whose value depends on $n$. The ellipsis $\ldots$, also known informally as dot-dot-dot, is a series of (usually three) dots that indicates an intentional omission of a word, sentence, or whole section from a text without altering its original meaning. The word (plural ellipses) originates from the Ancient Greek élleipsis, meaning 'leave out'. In the above equation, an ellipsis $\cdots$ (raised to the center of the line) used between two operation symbols (+ here) indicates the omission of values in a repeated operation.

There are different ways to compute this sum. I present three of them to demonstrate that there is usually more than one way to solve a mathematical problem. And the more solutions you can find, the better. Among the different ways to solve a problem, one that can be applied to many different problems is a powerful technique which should be studied.

The first strategy is simple: get your hands dirty by calculating this sum manually for some cases $n = 1, 2, 3, 4, \ldots$ and try to find a pattern. Then, we propose a formula and if we


can prove it, we have discovered a mathematical truth (if it is significant then it will be called a theorem, and your name is attached to it forever). For $n = 1, 2, 3, 4$, the corresponding sums are
\begin{align*}
n = 1:&\quad S(1) = 1\\
n = 2:&\quad S(2) = 1 + 2 = 3 = \frac{2\times 3}{2}\\
n = 3:&\quad S(3) = 1 + 2 + 3 = 6 = \frac{3\times 4}{2}\\
n = 4:&\quad S(4) = 1 + 2 + 3 + 4 = 10 = \frac{4\times 5}{2}
\end{align*}
From that (the fractions on the right) we can guess the following formula
\[
S(n) = 1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2} \tag{2.6.2}
\]
You should now double check this formula for other $n$, and only when you're convinced that it might be correct, prove it. Why bother? Because if you do not prove this formula for arbitrary $n$, it remains only a conjecture: it may be correct for all the $n$ you have manually checked, but who knows whether it holds for the others. How are we going to prove this? Mathematicians do not want to prove Eq. (2.6.2) $n$ times; they are very lazy, which is actually good as it forces them to come up with clever ways. A technique suitable for this kind of proof is proof by induction. The steps are: (1) check that $S(1)$ is correct; this is called the basis step, (2) assume $S(k)$ is correct; this is known as the induction hypothesis, and (3) prove that $S(k+1)$ is correct: the induction step. So, the fact that $S(1)$ is valid leads to $S(2)$ being correct, which in turn leads to $S(3)$ and so on. This is similar to the familiar domino effect.

Proof by induction of Eq. (2.6.2). It is easy to see that $S(1)$ is true (Eq. (2.6.2) is simply $1 = 1$). Now, assume that it holds for $k$, a natural number; thus we have
\[
S(k) = 1 + 2 + 3 + \cdots + k = \frac{k(k+1)}{2}
\]
Now, consider $S(k+1)$, which is $1 + 2 + \cdots + k + (k+1)$, that is, $S(k) + (k+1)$. If we can show that $S(k+1) = \frac{1}{2}(k+1)(k+1+1)$, then we're done. Indeed, we have
\[
S(k+1) = S(k) + (k+1) = \frac{k(k+1)}{2} + (k+1) = \frac{(k+1)(k+1+1)}{2}
\]
$\blacksquare$
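For the skeptical reader, Eq. (2.6.2) can also be checked by computer for many values of $n$; this convinces, but again does not prove. A minimal Julia sketch (function names are my own):

\begin{verbatim}
# Compare the brute-force sum with the closed form n(n+1)/2.
S(n) = sum(1:n)
formula(n) = n*(n + 1) ÷ 2

@assert all(S(n) == formula(n) for n in 1:10_000)
println(S(100))   # 5050
\end{verbatim}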

I now present another way, discovered by the ten year old Gauss (who would later become the prince of mathematics and one of the three greatest mathematicians of all time, along with Archimedes and Newton). The story goes like this. As a young student, Gauss and his classmates were asked one day to add up all the integers from 1 to 100. We can imagine the groans. But only a few seconds later Gauss gave the correct answer. How could he possibly have performed such a feat? He had not seen the problem before, nor was he a phenomenal calculator. He was, however, an extraordinary pattern recognizer. What he recognized is this: if we add 1 to 100, or 2 to 99, or 3 to 98, etc., we always get 101, and there are 50 such pairs, so the sum must be $101\times 50 = 5050$:
\begin{align}
S &= 1 + 2 + 3 + \cdots + 100\nonumber\\
S &= 100 + 99 + 98 + \cdots + 1\nonumber\\
2S &= 101 + 101 + \cdots + 101 = 101\times 100\tag{2.6.3}\\
S &= \frac{100\times 101}{2}\nonumber
\end{align}
What a great idea! A geometric illustration of Gauss' clever idea is given in the figure: our sum is a triangle of dots, and by adding to this triangle another equal triangle we get a rectangle in which it is easier to count the dots. Why does $1 + 2 + 3 + \cdots$ make a triangle? See Fig. 2.6 for the reason. The lesson here is to try to have different views (or representations) of the same problem. In this problem, we move away from the abstract (numbers $1, 2, 3, \ldots$) back to the concrete (rocks or dots), and by playing with the dots we can see the way to solve the problem.

Figure 2.6: Triangular numbers are those of the form $1 + 2 + 3 + \cdots + n$; the first few are 1, 3, 6, 10, 15.

Triangular numbers and factorials. The first four triangular numbers are
\begin{align*}
1 &= 1\\
3 &= 1 + 2\\
6 &= 1 + 2 + 3\\
10 &= 1 + 2 + 3 + 4
\end{align*}
If we replace the plus sign by multiplication, we get factorials (refer to Section 2.26.2 for details on the factorial): $1\times 2 = 2!$, $1\times 2\times 3 = 3!$ and so on. It is super interesting.

The power of a formula. What is significant about Eq. (2.6.2)? First, it simplifies computation by reducing a large number of additions to three fixed operations: one of addition, one of


multiplication and one of division. Second, as we have at our disposal a formula which produces a number whenever we plug in a number, we can, in theory, compute $S(5/2)$; it is $35/8$. Of course it does not make sense to ask for the sum of the first $5/2$ integers. Still, the formula extends the scope of the original problem to values of the variable other than those for which it was originally defined. (Believe me, this is what mathematicians do and it has led to many interesting and beautiful results; one of them is the factorial of 0.5, $(1/2)! = \sqrt{\pi}/2$. Why $\pi$ here? See Section 4.20.2.)

The third way is to write the sum as follows
\[
S(n) = 1 + 2 + 3 + \cdots + n = \sum_{k=1}^{n} k \tag{2.6.4}
\]
The notation $\sum_{k=1}^{n} k$ reads "sigma of $k$ for $k$ ranging from 1 to $n$"; $k$ is called the index of summation. It is a dummy variable in the sense that it does not appear in the actual sum. Indeed, we can use any letter we like; we could write $\sum_{i=1}^{n} i$. Here 1 is the starting point of the summation, or the lower limit of the summation; $n$ is the stopping point, or upper limit of the summation. And $\sum$ is the capital Greek letter sigma, corresponding to S for sum. This summation notation was introduced by Fourier in 1820. You will see that mathematicians introduce weird symbols all the time. Usually they use Greek letters for this purpose. There is no reason to be scared of them; just as with any human language, we need time to get used to these symbols.
Now comes the art. Out of the blue (if you are really wondering about the origin of this magical step, Section 2.21.6 provides one answer), mathematicians consider the identity $(k-1)^2 = k^2 - 2k + 1$ to get
\[
(k-1)^2 = k^2 - 2k + 1 \Longrightarrow \boxed{k^2 - (k-1)^2 = 2k - 1} \tag{2.6.5}
\]
The boxed equation is an identity, i.e., it holds for $k = 1, 2, 3, \ldots$ Now, we substitute $k = 1, 2, \ldots, n$ into the boxed identity to get $n$ equations, and if we add these $n$ equations we're led to the following, which involves $S(n)$:
\[
\sum_{k=1}^{n}[k^2 - (k-1)^2] = \sum_{k=1}^{n}(2k - 1) = 2\sum_{k=1}^{n} k - n = 2S(n) - n \tag{2.6.6}
\]
(To see why $\sum_{k=1}^{n}(2k-1) = 2\sum_{k=1}^{n} k - n$, go slowly: $\sum_{k=1}^{n}(2k-1) = \sum_{k=1}^{n} 2k - \sum_{k=1}^{n} 1$. Now, $\sum_{k=1}^{n} 1 = 1 + 1 + \cdots + 1 = n$, with $n$ terms, and $\sum_{k=1}^{n} 2k = 2\times 1 + 2\times 2 + \cdots + 2\times n = 2(1 + 2 + \cdots + n) = 2\sum_{k=1}^{n} k$.)

Now if the sum on the left hand side can be found, we're done. As it turns out, it is super easy to compute this sum; to see that, we just need to write out $\sum_{k=1}^{n}[k^2 - (k-1)^2]$ explicitly:
\[
\sum_{k=1}^{n}[k^2 - (k-1)^2] = (1^2 - 0^2) + (2^2 - 1^2) + (3^2 - 2^2) + \cdots + (n^2 - (n-1)^2) = n^2
\]


This sum is known as a sum of differences, and it has a telescoping property: its value depends only on the first and the last term, because the intermediate terms cancel each other in pairs (e.g. $-1^2$ with $+1^2$, $-2^2$ with $+2^2$, and so on). We will discuss sums of differences more later, where we will see that they are a powerful technique (as the sum is so easy to compute). Introducing the above result into Eq. (2.6.6), we can compute $S(n)$, and the result is identical to the one we obtained using Gauss' idea and induction.

2.6.2 Sum of the squares of the first n whole numbers


The sum of the squares of the first $n$ whole numbers is expressed as (note that we use the same symbol $S(n)$, but this time a different sum is concerned; hopefully this should not cause any confusion)
\[
S(n) = 1^2 + 2^2 + 3^2 + \cdots + n^2 = \sum_{k=1}^{n} k^2 \tag{2.6.7}
\]
Among the previous three ways, which one can be used now? Obviously, Gauss's clever trick is out of luck here. The tedious way of computing the sum for a few cases, finding the pattern, guessing a formula and proving it might work, but the step of finding the formula is hard. So, we adopt the telescoping sum technique, starting with the identity $(k-1)^3 = k^3 - 3k^2 + 3k - 1$:
\[
(k-1)^3 = k^3 - 3k^2 + 3k - 1 \Longrightarrow k^3 - (k-1)^3 = 3k^2 - 3k + 1
\]
It follows that
\[
\sum_{k=1}^{n}[k^3 - (k-1)^3] = 3\sum_{k=1}^{n} k^2 - 3\sum_{k=1}^{n} k + n
\]
But the telescoping sum on the left hand side is $n^3$, i.e., $\sum_{k=1}^{n}[k^3 - (k-1)^3] = n^3$. Thus, we can write
\[
3S(n) = n^3 + 3\frac{n(n+1)}{2} - n = \frac{n(n+1)}{2}(2n+1) \tag{2.6.8}
\]
where we have used the result from Eq. (2.6.2) for $\sum_{k=1}^{n} k$. Can we understand why the result is what it is? Consider the case $n = 4$, i.e., $S(4) = 1 + 4 + 9 + 16$. We can express this sum as a triangle of numbers, shown in the first picture in Fig. 2.7. As the sum does not change if we rotate this triangle, we consider two rotations (the first rotation is an anti-clockwise rotation of 120 degrees about the center of the triangle), shown in the two remaining triangles. If we add these three triangles, i.e., $3S(4)$, we get a new triangle in which every entry is 9, shown in Fig. 2.7b. What is the sum of this triangle? It is $9(1 + 2 + 3 + 4)$, and $9 = 2(4) + 1$, so this triangle gives $(2\times 4 + 1)(4)(5)/2$, which is the RHS of Eq. (2.6.8). How did we know that a rotation would solve this problem? Because any triangle in Fig. 2.7 is rotationally symmetric.

Figure 2.7: $S(n) = 1^2 + 2^2 + 3^2 + \cdots + n^2$: pictorial explanation of the result $3S(n) = \frac{n(n+1)}{2}(2n+1)$.

2.6.3 Sum of the cubes of the first n whole numbers


The sum of the cubes of the first $n$ whole numbers is expressed as
\[
S(n) = 1^3 + 2^3 + 3^3 + \cdots + n^3 = \sum_{k=1}^{n} k^3 \tag{2.6.9}
\]
At this point, you certainly know how to tackle this sum. We start with $(k-1)^4$:
\[
(k-1)^4 = k^4 - 4k^3 + 6k^2 - 4k + 1 \Longrightarrow k^4 - (k-1)^4 = 4k^3 - 6k^2 + 4k - 1 \tag{2.6.10}
\]
So,
\[
\sum_{k=1}^{n}[k^4 - (k-1)^4] = 4\sum_{k=1}^{n} k^3 - 6\sum_{k=1}^{n} k^2 + 4\sum_{k=1}^{n} k - n \tag{2.6.11}
\]
We know the LHS ($\sum_{k=1}^{n}[k^4 - (k-1)^4] = n^4$), and we know the second and third sums on the RHS (from the previous problems); the only unknown is the sum we are looking for, so we can compute it:
\begin{align}
4S(n) &= n^4 + n(n+1)(2n+1) - 2n(n+1) + n\nonumber\\
\Longrightarrow S(n) &= \frac{n[n^3 + (n+1)(2n+1) - 2(n+1) + 1]}{4}\tag{2.6.12}\\
\Longrightarrow S(n) &= \frac{n^2(n+1)^2}{4} = \left[\frac{n(n+1)}{2}\right]^2\nonumber
\end{align}
Using Eq. (2.6.2) we can see that the sum of the first $n$ cubes is the square of the sum of the first $n$ natural numbers. Actually, we can see this relation geometrically, as shown in the figure below for the case $n = 3$: $S(3) = 1 + 8 + 27 = (1 + 2 + 3)^2$.
Let's summarize all the results we have, to see if a pattern exists (if you know calculus, Eq. (2.6.13) is the younger brother of $\int x^n\,dx = \frac{x^{n+1}}{n+1} + C$):
\begin{align}
\sum_{k=1}^{n} k   &= \frac{n(n+1)}{2} = \frac{n^2}{2} + \frac{n}{2}\nonumber\\
\sum_{k=1}^{n} k^2 &= \frac{n(n+1)(2n+1)}{6} = \frac{n^3}{3} + \frac{3n^2 + n}{6}\tag{2.6.13}\\
\sum_{k=1}^{n} k^3 &= \frac{n^2(n+1)^2}{4} = \frac{n^4}{4} + \frac{2n^3 + n^2}{4}\nonumber
\end{align}
Clearly, we can see a pattern which allows us to write, for any whole number $p$ (we believe in the pattern, that it will hold for $p = 4, 5, \ldots$; yes, many times following your gut is the best way to go, and in maths patterns are everywhere),
\[
\sum_{k=1}^{n} k^p = \frac{n^{p+1}}{p+1} + R(n) \tag{2.6.14}
\]
where the ratio of $R(n)$ to $n^{p+1}$ approaches zero when $n$ is infinitely large; see Section 2.22 for a discussion on sequences and limits. This result will become useful in the development of calculus (precisely, in the problem of determining the area under the curve $y = x^p$).
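The closed forms in Eq. (2.6.13), and the leading-term claim in Eq. (2.6.14), are easy to explore numerically. A small Julia sketch; the choice of $p = 4$ and the values of $n$ and $m$ are arbitrary illustrations.

\begin{verbatim}
# Check Eq. (2.6.13) exactly, then watch the ratio in Eq. (2.6.14) approach 1 for p = 4.
n = 1000
@assert sum(k   for k in 1:n) == n*(n + 1) ÷ 2
@assert sum(k^2 for k in 1:n) == n*(n + 1)*(2n + 1) ÷ 6
@assert sum(k^3 for k in 1:n) == (n*(n + 1) ÷ 2)^2

p = 4
for m in (10, 100, 1000, 10_000)
    ratio = sum(big(k)^p for k in 1:m) / (big(m)^(p + 1) / (p + 1))
    println(m, "   ", ratio)   # tends to 1 as m grows
end
\end{verbatim}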
PnAll the sums in Eq. (2.6.13) contain two terms, and we can see why by looking at Fig. 2.8. For
kD1 k , the term n =2 is the area of the cyan
Pn triangle OAB. And the term n=2 is the area of the
1 2

pink staircases (Fig. 2.8a). Similarly, for kD1 k , the term n3 =3 is the volume of the pyramid
2

(Fig. 2.8b). If you’re good at geometry you should bePable to compute this sum geometrically
following this pyramid interpretation. However, for nkD1 k p p  3, it is impossible to use
geometry while algebra always gives you the result, albeit more involved.

1
B

1
4
2

9
3

4 16

5
A 25
O
(a) 1 C 2 C    C n for n D 5 (b) 12 C 22 C    C n2 for n D 5

Figure 2.8: Geometric interpretation of some of Eq. (2.6.13).

Question 2. We have found the sums of integral powers up to power of three. One question
arises naturally: is there a general formula that works for any power?

2.7 Prime numbers


Again, by playing with natural numbers long enough and paying attention, we see this:
\begin{align*}
4 &= 2\times 2; & 6 &= 2\times 3; & 8 &= 2\times 4; & 9 &= 3\times 3\\
2 &= 1\times 2; & 3 &= 1\times 3; & 5 &= 1\times 5; & 7 &= 1\times 7
\end{align*}


So, we have two groups of natural numbers as far as factorizing them (expressing a number as a product of other numbers) is concerned. In one group $(2, 3, 5, 7, \ldots)$, the numbers can only be written as a product of one and themselves. Such numbers are called prime numbers, or just primes. The other group $(4, 6, 8, 9, \ldots)$ contains the non-prime, or composite, numbers. Primes are central in number theory because of the fundamental theorem of arithmetic, which states that every natural number greater than one is either a prime itself or can be factorized as a product of primes that is unique up to their order. For example,
\[
328\,152 = 2\times 2\times 2\times 3\times 11\times 11\times 113 = 2^3\times 3^1\times 11^2\times 113
\]
where each of the numbers 2, 3, 11, 113 is a prime. And this prime factorization is unique (the order of the factors does not count). That's why mathematicians decided that 1 is not a prime: if 1 were a prime then we could write $6 = 1\times 2\times 3 = 2\times 3$ and the factorization would not be unique! Just as matter is made of atoms, numbers are made of prime numbers!

Theorem 2.7.1: Fundamental Theorem of Arithmetic
Every integer greater than 1 can be written in the form
\[
p_1^{n_1} p_2^{n_2}\cdots p_k^{n_k}
\]
where $n_1, n_2, \ldots$ are positive integers, and $p_1, p_2, \ldots$ are primes. The factorization is unique, except possibly for the order of the factors.

This theorem tells us two things: (1) any integer greater than 1 can be written in the form $p_1^{n_1} p_2^{n_2}\cdots p_k^{n_k}$; this is about the existence of such a factorization, and (2) this form is unique; this is about the uniqueness of the factorization, no other factorization being possible.
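In practice, factorizations like the one above are obtained with a computer; the Primes.jl package (the same one used for the prime-counting experiments below) provides a factor function. A minimal sketch, assuming the package is installed:

\begin{verbatim}
using Primes

println(factor(328152))   # prints the prime factorization, i.e. 2^3 * 3 * 11^2 * 113
\end{verbatim}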
In this introductory exposition on primes, we start with the question of how many primes there are and the beautiful proof of Euclid (Section 2.7.1). Then, we discuss the prime number theorem, which is about the distribution of primes (Section 2.7.2). Finally, we talk about twin primes and the fascinating story of Yitang Zhang (Section 2.7.3).

2.7.1 How many primes are there?


Now that we have discovered a new type of number, the primes, we ask questions about them, and usually we discover interesting things. And we create new maths. One question is: how many prime numbers are there? How do we answer this question? The easiest way is to count them. We count the number of primes from 1 to $N$ for $N = 100, 1\,000, 10\,000, \ldots$, designated by $\pi(N)$ (the Greek letter $\pi$ makes a "p" sound, and stands for "prime"), then we divide it by $N$ to get the density of prime numbers. The result of this analysis is given in Table 2.1. Of course a computer program was used to get this result; I used the package Primes.jl, which provides the function isprime(n) to check whether a given n is prime or not. The result shows that $\pi(100) = 25$; that is, there are 25 primes among the first 100 integers. Among the first 1\,000 integers there are 168 primes, so $\pi(1000) = 168$, and so on. Note that as we considered the first 100, 1000 and 10\,000 integers, the percentage of primes went from 25% to 12.3%.


Table 2.1: The density of prime numbers.

    N            π(N)       π(N)/N
    100          25         0.25
    1 000        168        0.168
    10 000       1 229      0.123
    100 000      9 592      0.096
    1 000 000    78 498     0.079

These examples suggest, and the prime number theorem confirms, that the density of prime numbers at or below a given number decreases as the number gets larger.
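Table 2.1 can be reproduced with a few lines of Julia, using the isprime function from Primes.jl mentioned above. This is only a sketch of the kind of program referred to in the footnote, not necessarily the exact one used.

\begin{verbatim}
using Primes

for N in (100, 1_000, 10_000, 100_000, 1_000_000)
    piN = count(isprime, 1:N)                 # the prime-counting function π(N)
    println(N, "   ", piN, "   ", round(piN/N, digits = 3))
end
\end{verbatim}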
But if we keep counting for bigger $N$ we see that the list of prime numbers goes on. Indeed, there are infinitely many primes, as proved by Euclid more than 2000 years ago. His proof is one of the most famous, most often quoted, and most beautiful proofs in all of mathematics. Why? Because the largest prime that Euclid knew was probably a small number, yet with reasoning alone he could prove that there are infinitely many primes.

His proof is now known as a proof by contradiction (also known as the method of reductio ad absurdum, Latin for "reduction to absurdity"). To use this technique, we assume the negation of the statement we are trying to prove and use it to arrive at something absurd. So, we assume that there are finitely many prime numbers, namely $p_1, p_2, \ldots, p_n$, and from this assumption we do something to arrive at something absurd, thus invalidating our starting point.

Euclid considered this number $p$:
\[
p = p_1\times p_2\times\cdots\times p_n + 1
\]
Because we have assumed there are only $n$ primes, $p$ cannot be a prime. Thus, according to the fundamental theorem of arithmetic, $p$ must be divisible by some $p_i$ ($1\leq i\leq n$), but the above equation says that $p$ divided by any $p_i$ always leaves a remainder of 1. A contradiction! So the assumption that there are finitely many primes is wrong, and thus there are infinitely many prime numbers. (A common misconception is that $p$ itself is always a prime. One counterexample: $2\times 3\times 5\times 7\times 11\times 13 + 1 = 30031 = 59\times 509$, not a prime.)

2.7.2 The prime number theorem


The prime number theorem states that among the first $n$ integers, there are about $n/\log(n)$ primes. This theorem was proved at the end of the 19th century by Hadamard and de la Vallée Poussin; here $\log(n)$ is the natural logarithm of $n$, see Section 2.24.

How did mathematicians come up with this amazing theorem? Because they're lazy. Instead of counting the primes one by one, which is tediously slow, this theorem tells us in one go approximately how many primes there are from 1 to $n$. But exactly how did mathematicians discover this


theorem? A good thing to know is that it took them lots of time to do so. And it was the work of
many great men.
The first step is to have a table of all the primes from 1 to n when n is as big as possible
(keep in mind that back then there was no computer). An early figure in tabulating primes is
John Pell, an English mathematician who dedicated himself to creating tables of useful numbers.
Thanks to his efforts, the primes up to 100 000 were widely circulated by the early 1700s. By
1800, independent projects had tabulated the primes up to 1 million.
Having the tables (or data) is one thing and getting something out of it is another. And we
need a genius to do that. And that genius was Gauss. In a letter to his colleague Johann Encke
about prime numbers, Gauss claimed merely to have looked at the data and seen the pattern; his
complete statement reads "I soon recognized that behind all of its fluctuations, this frequency is
on the average inversely proportional to the logarithm."

Figure 2.9: Plot of the prime counting function $\pi(N)$ for $N = 10^2, 10^3, 10^4$.

We are not Gauss, so we need to visualize the data. We can regard $\pi(N)$ as a function and call it the prime counting function. It is a function because when we feed it a number it returns another number. In Fig. 2.9 the plot of $\pi(N)$ is given for $N = 10^2, 10^3, 10^4$ (created using the step function of matplotlib). What can we get from these plots? It is clear that as $N$ gets larger and larger, $\pi(N)$ can be considered a smooth function. Among all the functions we know of, it is $N/\log N$ that best approximates $\pi(N)$.

But why log? See Table 2.2 and its last column: each entry is essentially $\log 10$. In this table, the third column is $N/\pi(N)$, and each entry of the fourth column is the difference between consecutive entries of the third column. Let $f(N)$ be the mysterious function behind $N/\pi(N)$; then we have $f(10N) = f(N) + 2.3$. A function that turns a product into a sum! That can be a logarithm. Indeed, $\log(10N) = \log N + \log 10$, and $\log 10 \approx 2.3$. This table was probably the one that Gauss merely looked at to guess the function correctly. Without doubt, he was a genius.

However, Gauss did not prove his conjecture (I am not sure why; maybe the mathematics of his time was not sufficient). The theorem was proved independently by Jacques Hadamard and Charles Jean de la Vallée Poussin in 1896 using ideas introduced by Bernhard Riemann (guess what, Riemann was Gauss's student), in particular the Riemann zeta function (Section 4.20.3).


Table 2.2: The density of prime numbers. The fourth column is the difference between consecutive entries in the third column.

    N                π(N)          N/π(N)     Δ
    1 000 000        78 498        12.7392    2.30794
    10 000 000       664 579       15.0471    2.30961
    100 000 000      5 761 455     17.3567    2.30991
    1 000 000 000    50 847 534    19.6666    -
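The pattern in the last column can be checked directly: consecutive values of $N/\pi(N)$ differ by roughly $\log 10 \approx 2.3026$. A quick Julia sketch using primes(n) from Primes.jl (which returns all primes up to n); the helper name r is my own.

\begin{verbatim}
using Primes

r(N) = N / length(primes(N))     # N / π(N)
println(r(10^7) - r(10^6), "   vs   log(10) = ", log(10))
\end{verbatim}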

2.7.3 Twin primes and the story of Yitang Zhang


It would be incomplete if we did not mention twin primes. The first 25 prime numbers (all the prime numbers less than 100) are
\[
\{2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97\}
\]
Mathematicians call the prime pairs $(3,5)$, $(5,7)$, $(11,13)$, etc. twin primes. Thus, we have the following definition:

Definition 2.7.1
A couple of primes $(p, q)$ are said to be twins if $q = p + 2$.

Note that, except for the pair $(2,3)$, 2 is the smallest possible distance (or gap) between two primes. Mathematicians then ask the same old question: how many twin primes are there? It is unknown whether there are infinitely many twin primes (the so-called twin prime conjecture) or if there is a largest pair. The breakthrough work of Yitang Zhang in 2013, as well as work by James Maynard, Terence Tao and others, has made substantial progress towards proving that there are infinitely many twin primes, but at present this remains unsolved. For a list of unsolved maths problems check here.
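For concreteness, here is a short Julia sketch listing the twin-prime pairs below 100 with Primes.jl (illustration only):

\begin{verbatim}
using Primes

twins = [(p, p + 2) for p in primes(100) if isprime(p + 2) && p + 2 <= 100]
println(twins)   # (3,5), (5,7), (11,13), (17,19), (29,31), (41,43), (59,61), (71,73)
\end{verbatim}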
It is usually while solving unsolved mathematical problems that mathematicians discover
new mathematics. The new maths also helps us understand the old maths and provides better solutions to old problems. Then, after about 100 or 200 years, some of the new maths comes into
the mathematics curriculum to train the general public.

Yitang Zhang (born February 5, 1955). On April 17, 2013, a paper arrived in the inbox of Annals of Mathematics, one of the discipline's top journals. Written by a mathematician virtually unknown to the experts in the field, a 58-year-old lecturer at the University of New Hampshire named Yitang Zhang, the paper claimed to have taken a huge step forward in solving the twin primes conjecture, one of mathematics' oldest problems. ("No mathematician should ever allow himself to forget that mathematics, more than any other art or science, is a young man's game," Hardy wrote. He also wrote, "I do not know of an instance of a major mathematical advance initiated by a man past fifty.") Just three weeks later Zhang's paper


was accepted. Rumors swept through the mathematics community that a great advance had been made by an unknown mathematician, someone whose talents had been so overlooked after he earned his doctorate in 1991 that he had found it difficult to get an academic job, working for several years as an accountant and even in a Subway sandwich shop. (The pursuit of tenure requires an academic to publish frequently, which often means refining one's work within a field, a task that Zhang has no inclination for. He does not appear to be competitive with other mathematicians, or resentful about having been simply a teacher for years while everyone else was a professor. As he did not have to publish many papers, he had all the time to focus on big problems. I think his situation is somehow similar to Einstein being a clerk in the Swiss patent office.)

"Basically, no one knows him," said Andrew Granville, a number theorist at the Université de Montréal. "Now, suddenly, he has proved one of the great results in the history of number theory." For Zhang's story, you can watch this documentary movie.

There are many more interesting stories about primes, but we stop here; see Fig. 4.74 for a prime spiral.

2.8 Rational numbers


This section expands our system of numbers to include the so-called rational numbers, those numbers such as 3/3, 7/3 and so on.

Section 2.8.1 presents a definition of rational numbers, including a geometric construction using straightedge and compass. Their arithmetic is discussed in Section 2.8.2, where the rules for addition/subtraction and multiplication/division of rational numbers are given. Fractions whose denominators are $10, 100, 1000, \ldots$ can be written using decimals, e.g. $1/10 = 0.1$; a discussion of decimals is given in Section 2.8.3.

2.8.1 What is 5/2?


Sharing is not easy, whether you are a kid or an adult. And this is also the case in mathematics. While it is straightforward to recognize that $6/3 = 2$ (six candies equally shared by 3 kids) or $8/2 = 4$, what is $5/2$? From an algebraic point of view, we can say that while the equation $3x = 6$ has the solution $x = 2$, the equation $2x = 5$ has no integer solution. In other words, the integers are not closed under division. Again, we needed to expand our system of numbers one more time. This is the modern view of how rational numbers (numbers such as $5/2$) are defined. Historically, they were developed very practically.

While counting objects resulted in the development of natural numbers, it was the practical problem of measurement (measuring length and area) that led to the birth of rational numbers. As with counting discrete objects (one bird, two carrots, etc.), one needs to define a unit before measurement can be done. For example, how long is a rod? We can define a unit of length to which we assign the value 1, and then the rod length is expressed in terms of this unit. If the unit is the meter, the rod is 5 meters. If the unit is the yard, the rod is 5.46807 yards.

One problem arises immediately. Not all quantities can be expressed as integral multiples of a unit. A rod can be one meter and a bit long. To handle this, we define a sub-unit. For


example, we can divide 1 meter into 100 equal parts, and each part (which we call a centimeter, by the way) is now a new unit. The rod length is now 120 centimeters, or $120\times(1/100)$ meters. We can generalize this by dividing 1 into $m$ equal parts to obtain $1/m$, the measure of our new sub-unit. Any length can then be expressed as an integral multiple of $1/m$, that is $n/m$, a ratio. And that's how mathematicians defined rational numbers.

Definition 2.8.1
A rational number is a number that can be written in the form $p/q$ where $p$ and $q$ are integers and $q$ is not equal to zero; $p$ is called the numerator and $q$ the denominator.

The requirement that $q$ not be zero comes from the fact that division by zero is meaningless: if we allowed it, we would get absurd results. For instance, since $0\times 1 = 0\times 2$, dividing both sides by 0 would give $1 = 2$, which is nonsense. It is quite obvious that the fraction $p/p$ is nothing but one.

Cut a line into n equal segments. Now we discuss how to get $1/n$ geometrically using compass and straightedge. A straightedge is simply a guide for the pencil when drawing straight lines. In most cases you will use a ruler for this, since it is the most likely to be available, but you must not use the markings on the ruler during constructions. Why not? Because we're trying to define rational numbers (e.g. $1/2$) using only what we have so far: the whole numbers. So, at this stage, we do not actually have rulers! (Rational here does not mean logical or reasonable; it refers to a ratio of two integers.)

The steps are (illustrated for the division of a segment $AB$ into three equal parts):

• Draw a line from the start point $A$, heading somewhat upwards, i.e., $AC$;

• Use the compass to divide it into 3 equal segments;

• Use the compass to create a parallel line heading backwards and down from the end point $B$; that is the line $BD$. Note that $D$ is the intersection of two circles: one centered at $A$ with radius $BC$, and the other centered at $B$ with radius $AB$;

• Use the compass to divide $BD$ into 3 equal segments;

• Connect the corresponding division points of the two new lines, and where these connecting lines cross the original segment it will be neatly subdivided.


2.8.2 Arithmetic with rational numbers


We now need to define addition, multiplication and division for rational numbers. We first present the rules here (explanations follow immediately):
\[
\frac{a}{b}\times\frac{c}{d} = \frac{ac}{bd};\qquad \frac{a}{b}+\frac{c}{d} = \frac{ad+bc}{bd};\qquad \frac{a/b}{c/d} = \frac{a}{b}\times\frac{d}{c} \tag{2.8.1}
\]
Surprisingly, the rule for multiplication is easier to grasp than that for addition. We refer to Fig. 2.10 for an illustration. Imagine a square wooden plate with unit sides, so the area of this plate is 1 (whatever the unit). Now we divide one side into 3 equal parts, so each part is 1/3. Similarly, we chop the adjacent side into two halves, so each part is 1/2. Now, the area of one piece is $1/3\times 1/2$, and it must be equal to $1/6$, as there are six equal rectangular pieces which in total make up the unit square whose area is one. If we take two pieces, the area is $2/6$ and they make a rectangle of sides $2/3$ and $1/2$; so $2/3\times 1/2 = 2/6$. That's why we have the rule for multiplying two rationals: multiply the numerators and multiply the denominators, then take the ratio.

Figure 2.10: Multiplication of two rational numbers: $\frac{1}{3}\times\frac{1}{2} = \frac{1}{6}$ and $\frac{2}{3}\times\frac{1}{2} = \frac{2}{6}$.

Figure 2.11: Equality of two rational numbers: $\frac{1}{2} = \frac{2}{4} = \frac{3}{6}$, and in general $\frac{a}{b} = \frac{c}{d} \Longrightarrow ad = bc$. The rational $1/2$ is said to be in its lowest terms as it is impossible to simplify it. On the other hand, $2/4$ is not in lowest terms. How do we know when $a/b$ is in lowest terms? It is when the greatest common divisor of $a$ and $b$ is 1. See Section 2.34.1 for a discussion of how to find the greatest common divisor of two integers.

We now move to the problem of adding two rationals. First, it is not hard to add two rational numbers when they have the same denominator:
\[
\frac{1}{2} + \frac{3}{2} = \frac{1+3}{2} = \frac{4}{2}
\]
This is because one half plus three halves is certainly four halves, which is $4/2$. This is similar to one carrot plus two carrots being three carrots; the unit is just a half instead of 1 carrot. For rational numbers having different denominators, the rule is then to convert them to have the same denominator:
\[
\frac{1}{2} + \frac{4}{3} = \frac{1\times 3}{2\times 3} + \frac{4\times 2}{3\times 2} = \frac{3}{6} + \frac{8}{6} = \frac{11}{6}
\]
The conversion is based on the equality of two rational numbers explained in Fig. 2.11.
Finally, we need to define division. What is $(a/b) \div (c/d)$? We use a trick here:
\[
\frac{a/b}{c/d} = \frac{\dfrac{a}{b}\times\dfrac{d}{c}}{\dfrac{c}{d}\times\dfrac{d}{c}} = \frac{\dfrac{a}{b}\times\dfrac{d}{c}}{1} = \frac{a}{b}\times\frac{d}{c}
\]
That is, we multiplied both the numerator and the denominator by $d/c$ to make the denominator equal to one. The rule is: to divide by a fraction, multiply by its inverse. For example, we have
\[
\frac{1}{1/2} = 1\times\frac{2}{1} = 2;\qquad \frac{2/3}{5/7} = \frac{2}{3}\times\frac{7}{5} = \frac{14}{15}
\]
How do we know this division rule is correct? We can check. If $6 : 3 = 2$ then $6 = 2\times 3$. So, we do the same check:
\[
\frac{a/b}{c/d} = \frac{ad}{bc} \Longrightarrow \frac{a}{b} = \frac{ad}{bc}\times\frac{c}{d} = \frac{a}{b}\;\checkmark
\]

Percentage. In mathematics, a percentage (from Latin per centum, meaning "by a hundred") is a ratio expressed as a fraction of 100. It is often denoted using the percent sign ("%"), although the abbreviations "pct.", "pct" and sometimes "pc" are also used. For example, 50% is $50/100 = 1/2$. As a ratio, a percentage is a dimensionless number (a pure number); it has no unit of measurement.

Arithmetic is important, but this is more important: we have to check whether the rules for integers, stated in Eq. (2.1.2), still hold for the new numbers, the rationals. It turns out that the rules do hold. For example, addition is still commutative:
\[
\frac{a}{b} + \frac{c}{d} = \frac{ad+bc}{bd} = \frac{bc+ad}{bd} = \frac{bc}{bd} + \frac{ad}{bd} = \frac{c}{d} + \frac{a}{b}
\]
Note that in the proof we have used $ad + bc = bc + ad$, as these numbers are integers. Why is this important? Because mathematicians want to see $2 = 2/1$, that is, an integer is a rational number. Thus, the arithmetic of the rationals must obey the same rules as that of the integers.
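Incidentally, Julia has exact rational numbers built in (written with //), so the rules of Eq. (2.8.1) can be played with directly; the snippet below is only an illustration of the arithmetic discussed above.

\begin{verbatim}
println(1//3 * 1//2)                   # 1//6
println(1//2 + 4//3)                   # 11//6
println((2//3) / (5//7))               # 14//15, i.e. 2/3 times 7/5
println(1//2 + 1//3 == 1//3 + 1//2)    # commutativity: true
println(50//100)                       # 1//2, automatically in lowest terms
\end{verbatim}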


2.8.3 Decimal notation


Decimals are a convenient and useful way of writing fractions whose denominators are powers of ten, e.g. $10, 100, 1000, \ldots$ For example, $3/10$ is written as $0.3$. We can understand this notation as follows. Consider, for example, the number 351:
\[
351 = 3\times 100 + 5\times 10 + 1\times 1 = 3\times 10^2 + 5\times 10^1 + 1\times 10^0
\]
which means that the units are in position 0, the tens in position 1 and the hundreds in position 2, and the position decides the power of ten. Now, $3/10 = 3\times 10^{-1}$ is zero units and three tenths, thus the digit 3 must be placed in position $-1$, which is just after the units: 03; but we need something to separate the two digits, otherwise it would be mistaken for 3. So, $3/10$ is written as $0.3$. The decimal point separates the units column from the tenths column. Similarly, $3/100$, which is three hundredths, is written as $0.03$: the digit 3 is in position $-2$. The number $351.3$ is understood as
\[
351.3 = 3\times 10^2 + 5\times 10^1 + 1\times 10^0 + 3\times 10^{-1}
\]

The Flemish mathematician Simon Stevin (1548–1620), sometimes called Stevinus, first used a decimal point to represent a fraction with a denominator of ten in 1585. While decimals had been used by both the Arabs and Chinese long before this time, Stevin is credited with popularizing their use in Europe. An English translation of Stevin's work was published in 1608 and titled Disme, The Arts of Tenths or Decimal Arithmetike, and it inspired the third president of the United States, Thomas Jefferson, to propose a decimal-based currency for the United States (for example, one tenth of a dollar is called a dime).
If we do long division for rationals we see the following decimals

1 1 1
D 0:25; D 0:3333 : : : ; D 0:142857142857 : : : (2.8.2)
4 3 7

First I introduce some terminologies. In decimals the number of places filled by the digits after
(to the right of) the decimal point are called the decimal places. Thus, 0:25 has 2 decimal places
and 0.2 has 1 decimal place. That’s boring (but we need to know the term to understand other
people). What’s more interesting lies in Eq. (2.8.2): we can see that there are two types of
decimals for rational numbers. The decimal 0:25 is a terminating decimal. The (long) division
process terminates. On the other hand, 1=3 D 0:3333 : : : with infinitely many digits 3 as the
division does not terminate. The decimal 0.3333... is called a recurring decimal. How about 1=7?
Is it a recurring decimal? Of course it is, you might say. But think about this: how can you sure
that the red digits repeat forever? It could be like this: 1=7 D 0:142857142857 : : : 142857531 : : :
But things are not that complicated for rational numbers. Any recurring decimal has the pattern


forever. And the reason is not hard to see. Let’s look at the following division of integers by 7:

\begin{align*}
0 &= 0\times 7 + 0; & 6 &= 0\times 7 + 6; & 12 &= 1\times 7 + 5\\
1 &= 0\times 7 + 1; & 7 &= 1\times 7 + 0; & 13 &= 1\times 7 + 6\\
2 &= 0\times 7 + 2; & 8 &= 1\times 7 + 1; & 14 &= 2\times 7 + 0\\
3 &= 0\times 7 + 3; & 9 &= 1\times 7 + 2; & 15 &= 2\times 7 + 1\\
4 &= 0\times 7 + 4; & 10 &= 1\times 7 + 3; & 16 &= 2\times 7 + 2\\
5 &= 0\times 7 + 5; & 11 &= 1\times 7 + 4; & 17 &= 2\times 7 + 3
\end{align*}

Look at the remainders: excluding 0, there are only six possible remainders out of $\{0, 1, 2, 3, 4, 5, 6\}$. That's why $1/7 = 0.142857142857\ldots$ has a cycle of six, the length of the repeating block of digits.

Sometimes you're asked to find the fraction corresponding to a recurring decimal. For example, what is the fraction for $0.2272727\ldots = 0.2\overline{27}$, where the bar over 27 indicates the repeated digits? To this end, we write $0.2\overline{27} = 0.2 + 0.0\overline{27}$. Now, we plan to find the fraction for $0.0\overline{27}$. We start with $y = 0.\overline{27}$; then, taking advantage of the repeating pattern, we find a linear equation in terms of $y$ and solve for it:
\[
100y = 27.\overline{27} \Longrightarrow 99y = 27 \Longrightarrow y = \frac{27}{99} \Longrightarrow 0.0\overline{27} = \frac{27}{990} \Longrightarrow 0.2\overline{27} = \frac{2}{10} + \frac{27}{990} = \frac{5}{22}
\]
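The computation can be double-checked with Julia's exact rationals (again, just an illustration):

\begin{verbatim}
y = 27 // 99                # y = 0.272727... as an exact fraction (reduces to 3//11)
println(2//10 + y//10)      # 5//22, so 0.2272727... = 5/22
println(5/22)               # 0.22727272727272727
\end{verbatim}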
Is 0.9999... equal to 1? We all know that $1/3 = 0.\overline{3}$; multiplying both sides by 3, we obtain $1 = 0.\overline{9} = 0.9999\ldots$ And there are many other proofs of this. For example, the following proof is common and easy to follow:
\begin{align*}
x &= 0.999\ldots\\
100x &= 99.999\ldots\\
99x &= 99 \Longrightarrow x\,(= 0.999\ldots) = 1
\end{align*}
But what is going on here? The problem lies in the equals sign and the never-ending 9999. To fully understand this we need to go to infinity, and this will be postponed until Section 2.22.

2.9 Irrational numbers


So we have integers and rational numbers. It is easy to see that the average of two rational numbers is also a rational number. Thus, for any two rational numbers, even if they are really close to each other, we can always find another rational number in between them. We might be tempted to conclude that all numbers are rational. And the ancient Greeks believed it for a while. But, surprisingly, that's not true. There exists another type of number, and this section is devoted to the discussion of those numbers that are not rational. They are the ... irrationals.
Our story starts with a square and its diagonal (Section 2.9.1): it shows that the diagonal of a square of side $a$ has length $a\sqrt{2}$, which is an irrational number. We then move to the arithmetic of the irrationals (Section 2.9.2). From the square root $\sqrt{a}$ we generalize to cube roots $\sqrt[3]{a}$ and to $\sqrt[n]{a}$ (Section 2.9.3). Section 2.9.4 is about writing $1/\sqrt{2}$ as $\sqrt{2}/2$ and why $\sqrt{4 + 2\sqrt{3}} = 1 + \sqrt{3}$: this is about rationalizing denominators and simplifying radicals. Next in our discussion is a special irrational, the golden ratio $\phi = (1+\sqrt{5})/2$ (Section 2.9.5).

Section 2.9.6

2.9.1 Diagonal of a unit square


Our journey through the evolution of the number systems continues with a simple but perplexing problem. Consider a unit square: what is the length of its diagonal $d$? Let's assume that we know the Pythagorean theorem; then it is obvious that $d^2 = 1^2 + 1^2 = 2$ (see Fig. 2.12 for a geometric explanation that does not resort to the Pythagorean theorem). But what exactly is $d$? It turns out that $d$ cannot be expressed as $a/b$ where $a$ and $b$ are integers. In other words, there are no integers $a$ and $b$ such that $a^2/b^2 = 2$. Nowadays, we call such a number an irrational number.

It would be inconvenient to refer to $d$ as "a number such that $d^2$ is two". We need a name for it and a symbol as well. Nowadays, we say $d$ is the square root of 2, and write it as $d = \sqrt{2}$. We will talk more about roots later in Section 2.9.3.

Figure 2.12: Proof that the diagonal of a unit square has length $\sqrt{2}$: starting with one unit square, we add three more unit squares to the problem, and we suddenly get a symmetric geometric object. The area of the square $ABCD$ is $d^2$, and this square is twice as large as the unit square; thus $d^2 = 2$ (a). A geometric construction of a line segment of length $\sqrt{2}$ (b): we start with the right triangle $OAB$ with $AO = AB = 1$. The Pythagorean theorem then tells us that $OB = \sqrt{2}$. Now, using a compass, draw a circle centered at $O$ with $OB$ as radius; we get a point $C$ with $OC = \sqrt{2}$. And that point $C$ is where the irrational number $\sqrt{2}$ lives.

p
How are we going to prove that 2 is irrational? The only information we have is the
definition of an irrational number–the number which is not a=b. So, the goal is to prove that

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 65

p p
2 ¤ a=b. Where do we begin? It seems easier if we start with 2 D a=b, and play with this
to see if somethingp come up. We’re trying to use
p proof by contradiction. Let’s do it.
Assume that 2 is a rational number i.e., 2 D a=b or a2 =b 2 D 2 where a; b are not both
even (if they are, one can always cancel out the factor 2). So, a2 D 2b 2 which is an even number
(since it is 2 multiplied by some number). Thus, a is an even number (even though this is rather
obvious, as always, prove it). Since a is even, we can express it as a D 2c where c D 1; 2; 3; : : :

a D 2c H) a2 D 4c 2 H) 4c 2 D 2b 2 H) b 2 is even, or b is even

So, we are led to the fact that both a; b are even, which is in contradiction with a; b being not
both even. So, the square root of two must be irrational. We used proof by contradiction. To
use this technique, we assume the negate of the statement we are trying to prove and use that to
arrive at something impossibly correct.
Examples
p of irrational numbers include square rootsp
of integers that are not complete squares
e.g. 10, cube roots of integers that are not cubes, like 3 7, and so on. Multiplying an irrational
number by a rational coefficient or adding a rational number to it produces again an irrational
numberŽŽ . The most famous irrational number is –the ratio of a circle circumference to its
diameter–  D 3:14159265 : : : The decimal portion of  is infinitely long and never repeats
itself.

2.9.2 Arithmetic of the irrationals


Now that we have discovered the irrationals, how shouldp wep do arithmetic
p with them? That’s the
question we try to answer now. Should we say that 2 C 7 D 9–that is we’re saying that
the sum of square roots is the square root of the sum. It sounds a nice rule. To know whether
this rule p
is reasonable,
p pwe go back (always) to our old friend: the whole numbers. It is easy to
see that 4 C 9 ¤ 13 . We havep foundpone counterexample pthus pwe have disproved the
aforementioned rule. But then what is 2 C 7? The panswerp is 2 C p7!
Moving now to multiplication,
p p wepobserve that 4  9 D 6 D 36. And we can use
a calculator to see that a  b D ab. For a mathematical proof, check Section p p
2.19. As
division is related to multiplication, we should have the same rule. One example: = 9 D 2=3 D
p
4
4=9.

In practical applications (in science and engineering for


p instance) we often approximate irra-
tional numbers by fractions or decimals (e.g. we replace 2 by 1:414) becausep actual physical
objects cannot be constructed exactly anyway. Thus, we p are happy with 2  1:414, and if we
need more accuracy, we can use a better approximation 2  1:414213.
p
But then
p why in mathematics courses, students are asked to compute, let say, =1C 2 without
1

replacing 2 with 1:414? There are many reasons. One is that mathematicians love patterns
not the answer. For example, the Basel problem asked mathematicians to compute the sum of
p p
ŽŽ
For example, assume that 2 C r1 D r2 where r1 ; r2 are two rationals, then we get 2 D r2 r1p . But
rationals are closed under
p subtraction i.e., r 2 r 1 is a rational. Thus we arrive at the absurd conclusion that 2 is
rational. Therefore, 2 C r1 must be irrational.

Because the LHS is 5, and square of 5 is 25 not 13.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 66

infinite terms:
1 1 1
S D1C C C C 
4 9 16
Anyone knows that the answer is 1:6449, approximately. But Euler and many other mathemati-
cians were not happy with that: they wanted to find out a formula/expression for S. That problem
defied all mathematicians except Euler. He eventually found out that the exact answer is  2=6.
Not only this is a beautiful result in itself, Euler had discovered other mathematical results while
working on this problem.

p
n
2.9.3 Roots x
A square root of a number x is a number y such that y 2 D x; in other words, a number y whose
square (the result of multiplying the number by itself, or y  y) is x. For example, 4 and 4 are
square roots of 16, because 42 D . 4/2 D 16. Every nonnegative real number xphas a unique
nonnegative square root, called the principal square root, which is denoted by x where the
p
symbol is called the radical sign. The term (or number) whose square root is being considered
is known as the radicand. In other words, the radicand is the number or expression underneath
the radical sign. The radical symbol was first used in print in 1525, in Christoph Rudolff’s CossŽ .
It is believed that this was because it resembled a lowercase "r" (for "radix"). The fact that the
p
symbol of square root is is not as important as the concept of square root itself. However, for
the communication of mathematics, we have to get to know and use this symbol when it has
become standard.
The definition of a square root of x as a number y such that y 2 D x has been generalized
p in
the following way. A cube root of x is a number y such that y D x; it is denoted by x. We
3 3

need a cube root when we know the volume of a box and need to determine its side. Extending
to other roots is straightforward. If n is
pan integer greater than two, an nth root of x is a number
y such that ypn D x; it is denoted by n x.
What is 4? It is a number y such that y 2 D 4 which is absurd. So, we only compute
square roots of positive numbers, at least for now.
p
Calculation of square roots. What is the value of 5? And you have to find that value with-
out using a calculator. Why bothering with this? Because you could develop an algorithm for
calculating a square root of any positive number by yourself. Itself is a big achievement (even
though someone had done it before you). Furthermore, this activity is important if you later on
follow a career in applied mathematics, sciences and engineering. In these areas people often
use approximate methods to solve problems; for example they solve the equation x D sin x
approximately using algorithms similar (in nature) to the one we are discussing in this section.
If you are lazy and just use a calculator, you would learn
p nothing!
Perhaps the first algorithm used for approximating x is known as the Babylonian method.
The method is also known as Heron’s method, named after the first-century Greek mathematician
Hero of Alexandria who gave the first explicit description of the method in his AD 60 work
Metrica. So, what is exactly the algorithm? It starts with an initial guess of the square root x0
Ž
Christoph Rudolff (1499-1545) was the author of the first German textbook on algebra "Coss". Check this.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 67

and this observation: if x0 is smaller than the true square root of S , then S=x0 is larger than the
root of S. So, an average of these two numbers might be a better approximation:
 
1 S
x1 D x0 C (2.9.1)
2 x0

And we use x1 to compute x2 D 0:5.x1 C S=x1 /. The process is repeated until we get the
value that we aim for. How good is it algorithm? Using Juliap(see Listing A.1) I wrote a small
function implementing this algorithm. Using it I computed 5 (which is about 2:236067977)
with x0 D 2 and the results are given in Table 2.3.
p
Table 2.3: Calculation of 5 with starting value x0 D 2.
p
n xn error e D xn 5

1 2.25 1.00e-2
2 2.2361111 4.31e-5
3 2.2360680 2.25e-8

The performance of the algorithm is so good, with three iterations and simple calculations
we get a square root of 5 with 6 decimals. However, there are many questions to be asked. For
example, where did Eq. (2.9.1) come from? p
One derivation of Eq. (2.9.1) is as follows. Assume that x0 is close to S , and e is the error
in that approximation, then we have .x0 C e/2 D S . We can solve for e from this equation:

S x0
.x0 C e/2 D S H) x02 C 2x0 e C e 2 D S H) e D (2.9.2)
2x0 2

where e 2 was omitted as it is negligible. Having obtained e, adding e to x0 we will get Eq. (2.9.1).
Actually, the Babylonian method is an example of a more general method–the Newton method
for solving f .x/ D 0–see Section 4.5.4.
p
How about the calculation of n x? Does the Newton method still work? If so, what should
be the initial guess? Is the Newton method fast? Using a small program you can investigate all
these questions, and discover for yourselves some mathematics.

2.9.4 Rationalizing denominators and simplifying radicals


p p
Do you remember that when you wrote 1= 2 and your strict teacher corrected it to 2=2? They
are the same,pso why bother? I think that the reason is historical. Before calculators, it is easier
to compute 2=2 (as approximately 1:4142135=2) than to compute 1=1:4142135. And thus it
has become common to not write radicals in the denominators. Now, we know the why, let’s
move to the how.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 68

p
How to rationalize the denominatorp
of this term
p 1=.1 C 2/? The secret lies in the identity
.aCb/.a b/ D a2 b 2 , and thus .1C p2/.1 2/ D 1, the radical is gone. So, we
pmultiply
the nominator and denominator by 1 2, which is the conjugate radicalŽŽ of 1 C 2:
p p
1 1 1 1 2 1 2 p
p D p 1D p  p D D 2 1
1C 2 1C 2 1C 2 1 2 1
And it is exactly the same idea when we have to divide two complex numbers .a C bi/=.c C d i/.
We multiply the nominator and denominator by c d i , which is the complex conjugate of c Cd i .
This time, doing so eliminates i in the denominator
p as i 2p
D 1.
In general the radical conjugate of a C b c is a b c. When multiplied together it gives
us a2 b 2 c. The principle of rationalizing denominators is as simple as that. But, let’s try this
problem: simplify the following expression
1 1 1 1 1 1
SD p C p p Cp p Cp p Cp C p
3C2 2 2 2C 7 7C 6 6C 5 5C2 2C 3
A rush application of the technique would work, but in a tedious way. Let’s spend time with the
expression and we see something special, a pattern (in the red terms we have 7; 6 then 6; 5):
1 1 1 1 1 1
SD p C p p Cp p Cp p Cp C p
3C2 2 2 2C 7 7C 6 6C 5 5C2 2C 3
So, we rewrite the expression as
1 1 1 1 1 1
SDp p Cp p Cp p Cp p Cp p Cp p
9C 8 8C 7 7C 6 6C 5 5C 4 4C 3
p p p p
Now, we apply the trick to, say, 1=. 9 C 8/ and get a nice result of 9 8. Doing the same
for other terms, and add them altogether gives us:
p p p p p p p p p p p p p
SD 9 8C 8 7C 7 6C 6 5C 5 4C 4 3D3 3
p
where all terms, except the first and last, are canceled leaving us a neat final result of 3 3.
This is called a telescoping sum and we see this kind of sum again and again in mathematics,
for instance Section 2.21.4. The name comes from the old collapsible telescopes you see in
pirate movies, the kind of spyglass that can be stretched out or contracted at will. The analogy
is the original sum appears in its stretched form, and it can be telescoped down to a much more
compact expression.
Another common
p type of exercise about square/cube roots is to simplify radicals. For exam-
p §
ple, what is 4 Cp2 3 . As we p know that the radicand should be a perfect squareŽ , we thus
assume that 4 C 2 3 D .a C 3/2 , and we’re going to find a:
p p
4 C 2 3 D .a2 C 3/ C 2a 3
ŽŽ
The word conjugate comes from Latin and means (literally) "to yoke together", and the idea behind the word
is that the things that are conjugate are somehow bound to each other.
§
Without using a computer. Again, the purpose is not about the result.
Ž
The whole exercise would be silly if the radicand was not a perfect square!

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 69

From that we have two equations


p by p equating the red and blue terms: 4 D a2 C 3 and 2 D 2a,
p
which gives us a D 1. So 4 C 2 3 D 1 C 3. This technique is called the method of
undetermined coefficients.
Let’s challenge ourselves a bit further. Can we simplify the following radical?
q
p p p
104 6 C 468 10 C 144 15 C 2006

The solution is based on the belief that the radicand must be a perfect square i.e., it is of the form
.   /2 . And this radicand has 4 terms, we think
p of the
p identity
p .x C y C z/ D x C    , and this
2 2

leads to the beautiful compact answer of 13 2C4 3C18 5. Well, I leave the details for youŽ .

Some exercises on square roots.

1. Simplify the following


q q
6 p 6 p
26 C 15 3 26 15 3

2. China Mathematical Olympiad, 1998, evaluate the following square root:


r
1998  1999  2000  2001 C 1
AD
4

3. Simplify the following


p p p p p p
10 C 1 C 10 C 2 C    C 10 C 99
p p p p p p
10 1 C 10 2 C    C 10 99

The answer is ... 1:41421356::: for the first


p question. For the second question, usepthe
strategy of solving a simpler problem e.g. 1  2  3  4 C 1,pwhich is nothing but 52 ,
to see the pattern. For the third question, the answer is 1 C 2, which can be guessed
using a short Julia script.

Common errors in algebraic expression manipulations. Understanding the rules of rational


numbers, we can avoid the following mistakes:
p p
2
3x C 6x 4 4 3 x 2
C 3 x 4
C 3 x x2 C x4 C x
D 6x ; D
  
3x2
 3 x 2 x2
p p p p p p
Ž
Details: we wish 104 6 C 468 10 C 144 15 C 2006 to be of the form .x a C y b C z c/2 . The question
is: what are a; b; c? Look at 6; 10; 15 and ask why not 6; 10; 14, then you’ll see that apD 2; bpD 3; cpD 5. For
x; y; z we have xz D 234, yz D 72, xy D 52. A teacher proceeds the reverse with .x a C y b C z c/2 , and
thus she can generate infinitely many problems of this type. But, as a student you just need to do just one.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 70

The correct answer for the first expression is 1 C 2x 2 . It is clear that .6 C 3/=6 is definitely not
3! If you’re not sure, one example can clarify the confuse.p
About the second expression, due to the square root in 3x,p it is incorrect to cancel 3 inside
the square root. This is clear if you think of the last term as 3=3, forget the x, and this is
definitely not 1!

2.9.5 Golden ratio


Suppose you want to divide a segment of a line into two parts, so that the ratio of the larger part
(a units long) to the smaller part (b units long) is the same as the ratio of the whole (c units long)
to the larger part. This ratio is known as the Golden Ratio (also known as the Golden Section,
Golden Mean, Divine Proportion). Let’s find its value first, which is quite simple:

a c aCb 1
D D H)  D 1 C or 2  1D0 (2.9.3)
b a a 

where  D a=b. Solving the above quadratic equation for  (Section 2.13.2 discusses quadratic
equations), we get
  p
1 2 1 1C 5
 D 1 C H)  D D 1:618033988 (2.9.4)
2 4 2

The number  is irrational§ . It exhibits many amazing properties. Euclid (325-265 B.C.) in his
classic book Elements gave the first recorded definition of . His own words are ‘A straight
line is said to have been cut in extreme and mean ratio when, as the whole line is to the greater
segment, so is the greater to the lesser’. The German astronomer and mathematician Johannes
Kepler once said ‘Geometry has two great treasures: one is the theorem of Pythagoras, the other
the division of a line into extreme and mean ratio. The first we may compare to a mass of gold,
the second we may call a precious jewel.’

Golden rectangles and mathematical spiral. Let’s start with a square of any side, say x (the
pink square in Fig. 2.13a), then construct a rectangle by stretching the square horizontally by a
factor of  (what else?). What obtained is a golden rectangle (b). If we put the (original pink)
square over the green rectangle so that the left edges are aligned, you get two areas following the
golden ratio (c). In (c), the right (green) rectangle is also a golden rectangle as the ratio of the
sides is 1= 1 D . Now, we split it into a square and a rectangle, then we get another rectangle,
and we repeat this infinitely (d). Starting from the left most square, let’s draw a circular arc, then
another arc for the next square etc. What you obtain is a spiral which appears in nature again
and again (Fig. 2.14).
The golden ratio appears in a pentagon as shown in Fig. 2.15. In this figure, ABCDE is a
pentagon of which the sides are one (i.e., jABj D 1, the notation jABj is to denote the length of
§
p
Because 5 is irrational.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 71

golden rectangle x (φ − 1)x

(φ − 1)x
x x2 x φx2 x x2 (φ − 1)x2

x φx φx
a) b) c) d)

Figure 2.13: From golden ratio  to golden rectangles (green colored) and mathematical spirals.

Figure 2.14: Spirals occur in various forms in nature.

edge AB), and the diagonals are d (noting that all diagonals of the pentagon are of equal length,
see Fig. 2.15a). Looking now at Fig. 2.15b, as the two triangles AH C and EHD (shaded)
are similar (using the AA test for they have the same marked anglesŽ ), we have the ratio of
corresponding sides is 1=d . But, we have jCH j D 1 (as the triangle CDH is isosceles), thus we
must have jHEj D 1=d . Now, jCEj D d D jCH j C jHEj D 1=d C 1: the short portion of the
diagonal CE plus the longer portion equals the diagonal itself. So, d D . The flake in Fig. 1.7
is also related to the golden ratio. It’s super cool, isn’t it?ŽŽ .

B B
1
1
d d
C A C ˛ A
O
ˇ
1
d H
1=d
˛ ˇ
D E D 1 E
a/ b/

Figure 2.15: The ratio of a diagonal over a side of a pentagon is the golden ratio.

Ž
Check Section 3.1 for a presentation of Euclidean geometry.
ŽŽ
Check this wikipedia for detail.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 72

2.9.6 Axioms for the real numbers


Now that we have rational numbers and irrational numbers, it is time to introduce a new term–
real numbers. The real numbers include all the rationalpnumbers, such as the integer 5 and
the fraction 2=3, and all the irrational numbers, such as 2,  and so on. The adjective real in
this context was introduced in the 17th century by René Descartes, who distinguished between
real and imaginary roots of polynomials. The set of all real numbers is denoted by R. To do
arithmetic with real numbers, we use the following axioms (accepted with faith) for a; b; c being
real numbers:
Axiom 1: aCb DbCa (Commutative law for addition)
Axiom 2: ab D ba (Commutative law for multiplication)
Axiom 3: .a C b/ C c D a C .b C c/ (Associative law for addition)
Axiom 4: .ab/c D a.bc/ (Associative law for multiplication)
Axiom 5: aC0D0Ca Da (Existence of additive identity) (2.9.5)
Axiom 6: a C . a/ D 0 (Existence of additive inverse)
Axiom 7: a1D1a Da (Existence of multiplicative identity)
Axiom 8: a  a1 D 1; a ¤ 0 (Existence of multiplicative inverse)
Axiom 9: a.b C c/ D ab C ac (Distributivelaw law)
We use these axioms all the time without realizing that we are actually using them. As an
example, below are three results which are derived from the above axioms:
a D . 1/a
. a/ D Ca D a (2.9.6)
.a b/ D a C b
The third is known as a rule saying that if a bracket is preceded by a minus sign, change positive
signs within it to negative and vice-versa when removing the bracket.ŽŽ
Proof. First we prove a D . 1/a using the axioms in Eq. (2.9.5):
aD aC0 (Axiom 5)
aD aC0a (a  0 D 0)
aD a C .1 C . 1//  a (Axiom 6)
aD a C a C . 1/  a (Axiom 9)
aD. 1/  a (Axiom 6)
With that result, it is not hard to get . a/ D . 1/. a/ D . 1/. 1/.a/ D 1a D a (axiom 4).
For .a b/ D a C b, we do:
.a b/ D . 1/.a b/ ( x D . 1/x)
D. 1/a C . 1/. b/ (Axiom 9)
D a C . 1/Œ. 1/b ( x D . 1/x)
D a C . 1/. 1/b D aCb

ŽŽ
Always use one example to check: .5 2/, which is 3, is equal to 5 C 3, which is 2. So the rule is ok.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 73

You might be thinking: are mathematicians crazy? About these proofs of obvious things
George Pólya once said

Mathematics consists of proving the most obvious thing in the least obvious way
(George Pólya)

But why they had to do that? The answer is simple: to make sure the axioms selected are
minimum and yet sufficient to provide a foundation for the theory they’re trying to build.

2.10 Fibonacci numbers


There is a special relationship between the Golden Ratio and the Fibonacci Sequence to be
discussed in this section. The original problem that Fibonacci investigated (in the year 1202)
was about how fast rabbits could breed in ideal circumstances. Suppose a newly-born pair of
rabbits (one male, one female) are put in a field. Rabbits are mature at the age of one month.
And after one month, a mature female can produce another pair of rabbits (male and female).
Furthermore, it is assumed that our rabbits never die. The puzzle that Fibonacci posed was: How
many pairs will there be in one year?
At the end of the first month, they mate, but there is still one only one pair. At the end of the
second month the female produces a new pair, so now there are two pairs of rabbits in the field.
At the end of the third month, the original female produces a second pair, making three pairs in
all in the field. At the end of the fourth month, the original female has produced yet another new
pair, the female born two months ago produces her first pair also, making five pairs (Fig. 2.16).

Figure 2.16: Fibonacci’s rabbit problem.

This led to the Fibonacci sequence 1; 1; 2; 3; 5; 8; 13; 21; 34; : : : which can be defined as
follows.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 74

Definition 2.10.1
The Fibonacci sequence starts with 1,1 and the next number is found by adding up the two
numbers before it:
Fn D Fn 1 C Fn 2 ; n  2 (2.10.1)

For example, F2 D F1 C F0 D 1 C 1 D 2, F3 D F1 C F2 D 1 C 2 D 3 and so on. Now


comes the first surprise: there is a connection between the Fibonacci numbers and the golden
ratio. To see the relation between Fibonacci sequence and the golden ratio, we computed some
Fibonacci numbers and computed the ratio between two consecutive Fibonacci numbers. The
data shown in Table 2.4 indicates that far along in the Fibonacci sequence the ratios approach
.Ž . But why? Let’s denote by x the ratio of consecutive Fibonacci numbers:

Table 2.4: Ratios of two consecutive Fibonacci numbers approach the golden ratio .

n Fn FnC1 =Fn

2 2 -
3 3 1.50000000
4 5 1.66666667
:: :: ::
: : :
19 6765 -
20 10946 1.61803400
21 28657 1.61803399

FnC1 FnC2 FnC3


xD D D D 
Fn FnC1 FnC2
Hence, we can write FnC2 in terms of x and FnC1 , and FnC1 in terms of x and Fn to finally get
FnC2 in terms of x and Fn :
FnC2 D xFnC1 D x 2 Fn
Now, in the above equation, we replace FnC2 D FnC1 C Fn , and again replace FnC1 by xFn ,
we get

Fn C FnC1 D x 2 Fn ; Fn C xFn D x 2 Fn

Now, divide the last equation by Fn and we get x 2 D x C 1: the same quadratic equation that
the golden ratio satisfies. That is why the ratio of consecutive Fibonacci numbers is the golden
ratio.
Ž
Of course, this table was generated by a small Julia program. Eq. (2.10.1) is a recursive definition, so in this
program we also used that technique. In a program, we define a function and within its definition we use it.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 75

There exists another interesting relation between the golden ratio and Fibonacci numbers; it
is possible to express the powers of the golden ratios in terms of a C b where a; b are certain
Fibonacci numbers. The procedure is as follows:

2 D 1 C 
 3 D  2 D .1 C / D  C  2 D 1 C 2
(2.10.2)
 4 D  3 D .1 C 2/ D  C 2.1 C / D 2 C 3
 5 D 3 C 5

That is, starting with  2 D 1 C , which is just the definition of , we raise the exponent by one
to get  3 , and replace  2 by 1 C . Then, we use  3 to get the the fourth-power, and so on. The
expression for  5 was not obtained by detailed calculations, but by guessing, again we believe
the pattern we are seeing: the coefficients of the power of the golden ratio are the Fibonacci
numbers. In general, we can write:
 n D Fn 2 C Fn 1  (2.10.3)
Notepthat the equation  D 1 C 1= has two solutions, one is  and the other is D
1=2.1 5/ and these two solutions are linked together by  D 1. That is the negative
solution is 1=. If we have Eq. (2.10.3) for , should we also have something similar for
1=–the other golden ratio? Following the same procedure done in Eq. (2.10.2). As 1= is a
solution to  D 1 C 1=, we have
1
D1 

Squaring the both sides of this, using  2 D 1 C  and  D 1 C 1=:
 
1 2 1
D 1 2 C  2 D 2  D 1
 
And from that we get . 1= /3 and so on:
 
1 3 1
D .1 /.2 / D 3 2 D 1 2
 
 
1 4 1
D .1 /.2 / D 3 2 D 2 3
 
 
1 5 1
D .1 /.3 2/ D 3 5
 
In all the final equalities, we have used  D 1 C 1= so that final expressions are written in
terms of 1=. Now, we’re ready to have the following
 
1 n 1
D Fn 2 Fn 1 (2.10.4)
 

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 76

Now comes a nice formula for the Fibonacci sequence, a direct formula not recursive one. If
we combine Eqs. (2.10.3) and (2.10.4) we have
9
 n D Fn 2 C Fn 1  =  n  
 n n 1 1
1 1 H)  D C Fn
D Fn 2 Fn 1 ;  
1

 
p
And thus, (because  C 1= D 5)
  n  "  nC1 #
1 1 1 1
Fn 1 D p n ; Fn D p  nC1 (2.10.5)
5  5 

And this equation is now referred to as Binet’s Formula in the honor of the French mathematician,
physicist and astronomer Jacques Philippe Marie Binet (1786 – 1856), although the same result
was known to Abraham de Moivre a century earlier. p
We have one question for you: in Eq. (2.10.5),  D 0:5.1 C 5/ is an irrational number,
and Fn is always a whole number. Is it possible?
The purpose of this section was to present something unexpected in mathematics. Why
on earth the golden ratio (which seems to be related to geometry) is related to a bunch of
numbers coming from the sky like the Fibonacci numbersŽ ? But there are more. Eq. (2.10.1) is
now referred to as a difference equation or recurrence equation. And similar equations appear
again and again in mathematics (and in science); for example in probability as discussed in
Section 5.8.7.

History note 2.1: Fibonacci (1170 – 1240–50)


Fibonacci was an Italian mathematician from the Republic of Pisa,
considered to be "the most talented Western mathematician of the
Middle Ages". Fibonacci popularized the Hindu–Arabic numeral sys-
tem in the Western World primarily through his composition in 1202
of Liber Abaci (Book of Calculation). He also introduced Europe to
the sequence of Fibonacci numbers, which he used as an example in
Liber Abaci.
Although Fibonacci’s Liber Abaci contains the earliest known de-
scription of the sequence outside of India, the sequence had been described by Indian
mathematicians as early as the sixth century.

Ž
That is not entirely true as we can see Fibonacci numbers in the petals of flowers. That is, in nearly all flowers,
the number of petals is one of the numbers of 3, 5, 8, 13, 21, 34, 55, 89. For example, lilies have three petals,
buttercups have five, many delphiniums have eight, marigolds have thirteen, asters have twenty-one, and most
daisies have thirty-four, fifty-five, or eighty-nine.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 77

2.11 Continued fractions


First, continued fractions are fractions of fractions . To see what it means, let’s consider a rational
number 45=16 and we first express it as a whole number–called the integer part–plus another
fraction a=b, which is smaller than 1. Next, we invert this fraction to the form 1=.b=a/, and we
keep doing this process for the fraction b=a. For example, 45=16 D 2:8125, thus 2 is its integer
part, so we can write 45=16 D 2 C 13=16. Now 13=16 < 1, we invert it to get 1=.16=13/, and the
whole process is repeated for 16=13:
45 13 1 1 1
D2C D2C D2C D2C
16 16 16=13 3 1
1C 1C
13 1
4C
3
Ok, so continued fraction is just another way to
p write a number. What’s special? Let’s explore
more. How about an irrational number, like 2?pHow to express the square root of 2 as a
continued fraction? Of course we start by writing 2 D 1 C    :
p p 1 p p
2D1C 2 1D1C p .because . 2 C 1/. 2 1/ D 1/
1C 2
p p
Now, we replace 2 in the fraction i.e., 1=.1 C 2/ by the above equation, and doing so gives
us:
p 1 1 1
2D1C p D1C D1C (2.11.1)
1C 2 1 1
2C p 2C
1C 2 1
2C
2 C 
We got an infinite continued fraction. Note that for 45=16, a rational number, we got a finite
continued fraction. Eq. (2.11.1) can be used to compute square roots (in the time with no
calculator). The term “continued fractions” was first used by John Wallis in 1653 in his book
Arithmetica infinitorum.
Using the same idea, we can write the golden ratio  as an infinite continued fraction
1 1 1
 D1C H)  D 1 C D1C (2.11.2)
 1 1
1C 1C
 1
1C

p
And as  D 0:5.1 C 5/, we get this beautiful equation:
p
1C 5 1
D1C (2.11.3)
2 1
1C
1 C 

http://www.maths.surrey.ac.uk/hosted-sites/R.Knott/Fibonacci/cfINTRO.html.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 78

q p
p p
And that is not the end. You have probably seen this: 0:5.1 C 5/ D 1C 1C 1 C   .
Here is why:
r q
1 p p
 D 1 C H)  2 D 1 C  H)  D 1 C  H)  D 1C 1C 1 C 

Fixed point iterations. Now, we’re going to compute  using its definition:  D 1C1=. We’re
using a method called fixed point iterations. What is a fixed point of a function? The definition
given below answers that question:

Definition 2.11.1
A fixed point x  of a function f .x/ is such a point that x  D f .x  /.

I am not sure about the origin of this method, but in Section 3.7, we shall see how the Persian
astronomer al-Kashi (c. 1380 – 1429) in his book The Treatise on the Chord and Sine, computed
sin 1ı to any accuracy. For that problem, he needed to solve a cubic equation of which solution
was not available at his time. And he presented a fixed point iteration method to compute sin 1ı .
A geometric illustration of a fixed point is shown in Fig. 2.17. Among other things, this
concept can be used to solve equations g.x/ D 0. First, we rewrite the equation in this form
x D f .x/, then starting with x0 , we compute a sequence .xn / D .x1 ; x2 ; : : : ; xn / with (for now,
a sequence is nothing but a list of numbers. In Section 2.22, we talk more about sequences)
xnC1 D f .xn / (2.11.4)
As shown in Fig. 2.18a, the sequence .xn / converges to the solution x  , if x0 was chosen properly.
Starting from x0 , draw a vertical line that touches the curve y D f .x/, then go horizontally until
we get to the diagonal y D x. The x-coordinate of this point is x1 , and we repeat the process.
Fig. 2.18(b,c) are the results of fixed point iterations for the function y D 2:8x.1 x/. What
we are seeing is called a cobweb.
y
yDx

f .x  / D x  y D f .x/

x
x

Figure 2.17: A fixed point of a function f .x/ is the intersection of the two curves: y D f .x/ and y D x.

I demonstrate how this fixed point iteration scheme works for the golden ratio . In Table 2.5,
I present the data obtained with nC1 D 1 C 1=n with two starting points 0 D 1:0 and another
0 D 0:4. Surprisingly, both converge to the samep solution of 1.618. Thus, the second negative
solution of  D 1 C 1=–which is  D 0:5.1 5/–escaped. In Fig. 2.19, we can see this
clearly.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 79

1.0

y
yDx 0.8

f .x1 /
0.6
y D f .x/
f .x0 / 0.4

0.2

x 0.0
x0 x1 x 0.0 0.2 0.4 0.6 0.8 1.0

(a) (b) (c)

Figure 2.18: Fixed point iterations for the function x D 2:8x.1 x/. In (b,c), the red line is y D x. The
code used to generate Figs b and c is in fixed_point_iter.jl.

Table 2.5: Fixed point iterations for  D 1 C 1=: nC1 D 1 C 1=n .

n nC1 n nC1

1 2.0 1 -1.5
2 1.5 2 0.3333333
3 1.666666 3 3.9999999
4 1.6 4 1.25
5 1.625 5 1.8
:: :: :: ::
: : : :
19 1.618034 19 1.618034
20 1.618034 20 1.618034

There are many questions remain to be asked regarding this fixed point method. For example,
for what functions the method works, and can we prove that (to be 100% certain) that the
sequence .xn / converges to the solution? To answer these questions, we need calculus and thus
I postpone the discussion to Section 12.5. In that section you will find the answer to why we
could not get the negative solution
p of  D 1 C 1=: the first derivative of the function 1 C 1=
evaluated at  D 0:5.1 5/ is 2:61 which is larger than one. This also demonstrates the
importance of the derivative of a function. It can tell us so many things about a function!

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 80

 

Figure 2.19: Fixed point iterations for  D 1 C 1=: nC1 D 1 C 1=n with the starting 0 D 0:4. The
positive solution (which is the golden ratio) is marked by a black dot whereas the negative solution by a
red dot.

2.12 Pythagoras’ theorem


Ask any school student to name a famous mathematician, and more often than not they will opt
for Pythagoras, if they can think of one. Pythagoras’s present-day fame rests on the theorem
that bears his name: Pythagoras’s theorem or Pythagorean theorem. We have no idea whether
Pythagoras actually proved the theorem. In fact, we don’t know whether it was his theorem at
all. But he got the credit and his name stuck.
Pythagoras’ theorem is a fundamental relation in Euclidean geometry among the three sides
of a right triangle. It states that the area of the square whose side is the hypotenuse (the side
opposite the right angle) is equal to the sum of the areas of the squares on the other two sides.
This theorem can be written as an equation relating the lengths of the sides a, b and c, often
called the "Pythagorean equation":
a2 C b 2 D c 2 (2.12.1)
where c represents the length of the hypotenuse and a and b the lengths of the triangle’s other
two sides (Fig. 2.20). For example, in a right triangle if two legs are 3 and 4, the hypotenuse is 5
(because 52 D 32 C 42 ). This makes the famous 3 4 5 right triangle.
The theorem has been given numerous proofs – possibly the most for any mathematical
theoremŽ . I present in Fig. 2.21b one proof. And young students are recommended to prove this
theorem as many ways as possible.

Proof. The proof given in Fig. 2.21b is using the same strategy adopted in Fig. 2.12: we start
with one right triangle of sides a; b; c and we add three more, arranged in a special way that
together they make a big square of side a C b. Then, we compute the area of this big square in
Ž
An early 20th century mathematician named Elisha Scott Loomis collected and published 367 proofs in a
book called The Pythagorean Proposition.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 81

c2
a2 a c
b

b2

Figure 2.20: Pythagorean theorem. The sum of the areas of the two squares on the legs (a and b) equals
the area of the square on the hypotenuse (c).

two ways. First way is .a C b/2 , which is a2 C 2ab C b 2 . And the second way is: its area is
equal to the sum of areas of the four triangles and the square of side c (why this gray shaded
shape is a square? Look at the angles ˛). This area is 4.1=2/ab C c 2 . So, we have

a2 C 2ab C b 2 D 2ab C b 2 H) a2 C b 2 D c 2

William Dunham called this proof the "Chinese Proof" as it was embodied in the hsuan-thu
diagram of a square tilted in another square, dated from somewhere between 1000BC - 1 AD
[18]. Note that this diagram was for the specific 3 4 5 right triangle only. 

a b

a
c c
b

c
b
a α c
α
b a
(a) The hsuan-thu dia- (b) The modern proof
gram

Figure 2.21: The Chinese proof of the Pythagorean theorem.

Question 3. Is the converse of the Pythagorean theorem true? That is, in a triangle, if the square
of one side is equal to the sum of the squares of the other two sides, is the angle opposite the
first side a right angle?

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 82

2.12.1 Pythagorean triples


Plimpton 322 is a Babylonian clay tablet, believed to have been written about 1800 BC, has
a table of four columns and 15 rows of numbers in the cuneiform script of the period. This
table lists two of the three numbers in what are now called Pythagorean triples. One example is
.3; 4; 5/.

Definition 2.12.1
Integer triples .a; b; c/ are called Pythagorean triples if they satisfy the equation a2 C b 2 D c 2 .

How to generate Pythagorean triples? There are more than


one way and surprisingly using complex numbers is one of them.
If you need a brief recall on complex numbers, see Section 2.25.
Let’s start with a complex number z D u C vi where u; v are
positive integers and
p i is the number such that i D 1. Its
2

modulus is jzj D u2 C v 2 . The key point is that the modulus


of the square of z is u2 C v 2 , which is an integer. Indeed, let’s
compute z 2 and its modulus:
p
z 2 D .u C vi/2 D u2 v 2 C 2uvi H) jz 2 j D .u2 v 2 /2 C .2uv/2 D u2 C v 2
This result indicates that .u2 v 2 /2 C .2uv/2 D .u2 C v 2 /2 . Thus, the triple .u2 v 2 ; 2uv; u2 C
v 2 / is a Pythagorean triple! We are going to compute some Pythagorean triples using this and
Table 2.6 presents the result.

Table 2.6: Pythagorean triples .u2 v 2 ; 2uv; u2 C v 2 /.

.u; v/ .u2 v 2 ; 2uv; u2 C v 2 /

(2,1) (3,4,5)
(4,2) (12,16,20)
(3,2) (5,12,13)
(4,3) (7,24,25)
(5,4) (9,40,41)

Note that the triples .3; 4; 5/ and .12; 16; 20/ are related; the latter can be obtained by mul-
tiplying the former by 4. The corresponding right triangles are similar. Generally, if we take
a Pythagorean triple .a; b; c/ and multiply it by some other number d , then we obtain a new
Pythagorean triple .da; db; dc/. This leads to the so-called primitive Pythagorean triples in
which a; b; c have no common factors. A common factor of a; b and c is a number d so that
each of a; b and c is a multiple of d . For example, 3 is a common factor of 30; 42, and 105, since
30 D 3  10; 42 D 3  14, and 105 D 3  35, and indeed it is their largest common factor. On
the other hand, the numbers 10; 12, and 15 have no common factor (other than 1).

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 83

If .a; b; c/ is a primitive Pythagorean triple, it can be shown that

 a; b cannot be both even;

 a; b cannot be both odd;

 a is odd, b is even (b D 2k) and c is odd. See Table 2.6 again.

From the definition of .a; b; c/, we write

a2 C b 2 D c 2 H) b 2 D .c a/.c C a/ (2.12.2)

As both a and c are odd, so its sum and difference are even. Thus, we can write c a D 2m,
c C a D 2n. Eq. (2.12.2) becomes

.2k/2 D .2m/.2n/ H) k 2 D mn H) m D p 2 ; n D q2 (2.12.3)

Now, we can solve for a; b; c in terms of p and q, and obtain the same result as before:

c D p2 C q2; a D q2 p2; b D 2k D 2pq

2.12.2 Fermat’s last theorem


In number theory, Fermat’s Last Theorem states that no three positive integers a; b, and c satisfy
the equation an C b n D c n for any integer value of n greater than 2. The cases n D 1 and
n D 2 have been known since antiquity to have infinitely many solutions. Recall that n D 2, the
solutions are Pythagoreas triplets discussed in Section 2.12.
We found Fermat’s Last Theorem in the margin of his copy of Diophantus’ Arithmetica:
“It is impossible to separate a cube into two cubes, or a fourth power into two fourth powers,
or in general, any power higher than the second, into two like powers.” He also wrote that “I
have discovered a truly marvelous proof of this proposition which this margin is too narrow to
contain.” This habit of not revealing his calculations or the proofs of his theorems frustrated his
adversaries: Descartes came to call him a “braggart”, and the Englishman John Wallis referred
to him as “that damn Frenchman”. About this story, there is a story of another mathematician
that goes like this

A famous mathematician was to give a keynote speech at a conference. Asked for an


advance summary, he said he would present a proof of Fermat’s Last Theorem – but
they should keep it under their hats. When he arrived, though, he spoke on a much
more prosaic topic. Afterwards the conference organizers asked why he said he’d
talk about the theorem and then didn’t. He replied this was his standard practice,
just in case he was killed on the way to the conference.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 84

The son of a wealthy leather merchant, Pierre de Fermat (1601 – 1665)


studied civil law at the University of Orleans (France) and progressed to
a comfortable position in the Parliament of Toulouse, which allowed him
to spend his spare time on his great love: mathematics. In the afternoons,
Fermat put aside the law and dedicated to mathematics. He studied the
treatises of the scholars of classical Greece and combined those old ideas
with the new methods of algebra by François Viète.
After 358 years of effort by mathematicians, the first successful proof
was released in 1994 by Andrew Wiles (1953) an English mathematician.
In 1963, a 10-year-old boy named Andrew Wiles read that story of Fermat’s last theorem, was
fascinated, and set out to dedicate his life to proving Fermat’s Last Theorem. Two decades later,
Wiles became a renowned mathematician before deciding to return to his childhood dream. He
began to secretly investigate finding the solution to the problem, a task which would take seven
years of his life.
There are many books written on this famous theorem, for example Fermat’s Last Theorem:
The Book by Simon Singh. I strongly recommend it to young students. About Wiles’ proof,
it is 192 pages long and I do not understand it at all. Note that I am an engineer not a pure
mathematician.

2.12.3 Solving integer equations


Find positive integers that satisfy the following equationŽ
p p p
a C b D 2009 (2.12.4)
How can we solve this? Some hints: (1) a and b are symmetrical so if .a; b/ is a solution, so
is .b; a/; (2) usually squaring is used to get rid of square roots. But we have to first isolate a; b
before squaring:
p p p
a D 2009 b
p
a D 2009 C b 2 2009b .squaring the above/
p p p
Thus, we have 2009b D c, where c is a positive integer. Now we rewrite 2009b as p 7 41b.
Now we know that only the square root of a perfect square is a natural number, thus 41b is a
natural number when b D 41m2 , where m 2 N (this is similar to writing m is a natural number,
but shorter, we will discuss about this notation later). Since a and b are playing the same role,
we also have a D 41n2 , n 2 N. With these findings, Eq. (2.12.4) becomes:
p p p
n 41 C m 41 D 7 41 H) n C m D 7
It is interesting that the scary looking equation Eq. (2.12.4) is equivalent to this easy equation
n C m D 7, which can be solved by kids of 7 years ago and above by a rude method: trial and
error guessing (Table 2.7).
p p p
Ž
The expression a C b D 2009 is not an identity since it does not hold for all a and b. That’s why it’s
called an equation. We have more to say about equations in Section 2.13.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 85

p p p
Table 2.7: Solutions to aC b D 2009.

.n; m/ (0,7) (1,6) (2,5) (3,4)


.a; b/ (0,7) (41,1476) (164,1025) (369,656)

If we’re skillful enough and lucky –if we transform the equations in just the right way– we
can get them to reveal their secrets. And things become simple. Creativity is required, because
it often isn’t clear which manipulations to perform. And that’s why mathematics is exciting. If
all problems can be solved by routine procedures, we all get bored!

History note 2.2: Pythagoras (c. 570 BC – 495 BC)


Pythagoras was an an ancient Ionian Greek philosopher and the epony-
mous founder of Pythagoreanism. In antiquity, Pythagoras was cred-
ited with many mathematical and scientific discoveries, including the
Pythagorean theorem, Pythagorean tuning, the five regular solids, the
Theory of Proportions, the sphericity of the Earth, and the identity of
the morning and evening stars as the planet Venus. Pythagoras was the
first to proclaim his being a philosopher, meaning a “lover of ideas.”
Pythagoras had followers. A whole group of mathematicians signed
up to be his pupils, to learn everything he knew, they were called the
Pythagoreans. Numbers, Pythagoras believed, were the elements behind the entire uni-
verse. The Pythagoreans had sacred numbers. Seven was the number of wisdom, 8 was
the number of justice, and 10 was the most sacred number of all. Every part of math was
holy. When they solved a new mathematical theorem, they would give thanks to the gods
by sacrificing an ox.

2.12.4 From Pythagorean theorem to trigonometry and more

Many of the triangles encountered in real life are not right-angled, so the direct applications of
the Pyathagorean theorem may seem restricted. However, noting that any triangle can be cut into
two right-angled ones, and any polygons can be cut into triangles. So right-angled triangles are
the key: they show that there is a useful relation between the shape of a triangle and the lengths
of its sides. The subject that developed from this insight is trigonometry meaning ‘triangle
measurement’. We have an entire Chapter 3 for this topic.
Later in Section 3.20.1 we shall see that the Pythagorean theorem provides us a way to
compute distance between points defined on a coordinate system.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 86

2.13 Imaginary number


Our next destination in the universe of numbers is i with the property i 2 D 1. To understand
the context in which i was developed or discovered, we need to talk about solving equations of
one single unknown. We will discuss three types of equation as shown in Eq. (2.13.1):

2x D 5 linear equation (2.13.1a)


2
x C 10x D 39 quadratic equation (2.13.1b)
3 2
x C x C 10x D 39 cubic equation (2.13.1c)

From these three examples, we can see that an equation is a formula that expresses the equality
of two expressions (e.g. 2x and 5 in Eq. (2.13.1a)), by connecting them with the equals sign D.
In these three equations, x is called the unknown of the equation. Of course, instead of x, we
cam equally use y or z, but no mathematicians would use a or b. We shall discuss why later.
Our job is to solve the equation: finding x such that it satisfies the equation (make the equation
a true statement). For example, x D 5=2 is the solution to the first equation for 2.5=2/ D 5.
The presentation on solving these linear/quadratic/cubic equations serves only to introduce
the need for mathematicians to consider square root of negative numbers. We shall get back to
the topic of solving equations in depth in Section 3.21.

History note 2.3: Importance of mathematical notations


There are many major mathematical discoveries but only those which can be understood
by others lead to progress. However, the easy use and understanding of mathematical
concepts depends on their notation.
The convention we use (letters near the end of the alphabet representing unknowns e.g.
x; y; z) was introduced by Descartes in 1637. Other conventions have fallen out of favor,
such as that due to Viète who used vowels for the unknowns and consonants for the
knowns.

We start off with linear equations (Section 2.13.1), then we move to quadratic equations
(Section 2.13.2). Then, we discuss cubic equations–the context in which imaginary number i first
appeared in Section 2.13.3. Viète’s solution of cubic equation using trigonometry functions is
presented in Section 2.13.4. And finally, the interesting history of the cubic equation is presented
in Section 2.13.5.

2.13.1 Linear equation

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 87

2400 miles
Let us consider the following word problem: two planes,
which are 2 400 miles apart, fly toward each other. Their t=0
x x − 60
speeds differ by 60 miles per hour. They pass each
other after 5 hours. Find their speeds. As we shall see,
this problem involves a linear equation of the form
Eq. (2.13.1a). Let’s denote by x (miles/hour) the speed t=5
of the left plane. Thus, the speed of the right plane is 5x 5(x − 60)
x 60 (or x C 60). The distance traveled by the left
plane after 5 hours is x1 D 5x and that by the right plane is x2 D 5.x 60/. Then, we get the
following equation (see figure)
5x C 5.x 60/ D 2400
which is the mathematical expression of the fact that the two planes have traveled a total distance
of 2 400 miles. To solve this linear equation, we simply massage it in the way that x is isolated
in one side of the equality symbol: x D : : : So, we do (the algebra is based on the arithmetic
rules stated in Eq. (2.1.2); there is nothing to memorize here! )
5x C 5.x 60/ D 2400 , 10x 300 D 2400 , 10x D 2400 C 300 , x D 2700=10 D 270
Thus, the speed of one plane is 270 miles per hour and the speed of the other plane is 210 miles
per hour. In the above equation, the symbol , means ‘equivalent’ that is the two sides of this
symbol are equivalent: one can go from one side to the other and vice versa; sometimes ” is
used for the same purpose.
Let’s see another problem: adding 6 to a number results in a number that is three times itself;
Find the number. Of course, we use x to label the number, then we have: 6 C x D 3x. Solving
this we get x D 3. Now, comes one key idea: instead of solving all concrete instances of linear
equations (e.g. 2x D 5 or 5x 7 D 0), mathematicians solved the following equation

b
ax C b D 0 ” x D (2.13.2)
a
The equation ax C b D 0 is called a linear equation because x is a linear term. Geometrically,
if we plot the function y D ax C b on the Cartesian plane we get a line. Section 3.20 discusses
Cartesian plane and analytic geometry.
Have you ever wondered why mathematicians did not write Eq. (2.13.2) as ac C b D 0 (i.e.,
using c for x)? Noting that in Eq. (2.13.2), a; b; x are all real numbers, but a; b play one role and
x plays another role, in the sense that x cannot be arbitrary, it depends on a and b in a specific
way. By using the letters near the end of the alphabet e.g. x; y; z to represent the unknown, we
can easily spot out what are unknowns and what are knowns. The expression
b
xD
a

Actually there is another obvious rule: if a D b, then a C c D b C c for any c. I used it in the third step to
remove 300 in 10x 300. Many teachers tell students: move a number from one side to the other side of the D
symbol, change the sign. Yet another rule to memorize!

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 88

is called a formula. It relates x to given a and b. It allows us to compute/evaluate x. For example,


if a D 2 and b D 6, then x D 6=2 D 3. Our world is governed by many formula; probably
the most famous one is Einstein’s E D mc 2 .
Usually solving a linear equation in x is straightforward, but the following equation looks
hard:
x x x
xC C C  C D 4021
1C2 1C2C3 1 C 2 C 3 C    C 4021
The solution is 2021. If you cannot solve it, look at the red term (all the denominator has same
form) and ask yourself what it is.

2.13.2 Quadratic equation


We move now to the quadratic equation. It emerged from a very practical problem: if you have a
rectangular piece of land, of which one side is longer than the other side 10 units and the area of
the land is 39 unit squared, how long are the sides? Already familiar with symbols (we take them
for granted), we denote by x the length of the shorter side of the rectangle; then the length of
the longer side is x C 10. The area of this land is thus x.x C 10/ and it is equal to 39. Therefore,
we obtain this quadratic equation x.x C 10/ D 39 or x 2 C 10x 39 D 0. To see the power of
symbols, this is how the same equation was referred to in the ninth century: “a square and ten
of its roots are equal to thirty-nine”; roots refer to x term, squares to x 2 terms and numbers to
constants.
Let’s now solve this quadratic
p
equation without using the well x+5
know quadratic formula b˙ b 2 4ac
=2a. Instead, we adopt a geo-
metrical approach typically common in the ancient time by Babylo- 5 5x 25

nian mathematicians around 500 BC. By considering all the terms


x+5
in the equation as area of some rectangles (see next figure), we see
that the area of the biggest square of side x C 5 equals the sum of x x2 5x
areas of smaller rectangles. This lead us to write
x 5
2 2 2
.x C 5/ D x C 10x C 25 D 39 C 25 D 64 D 8

where, in the second equality, we have replaced x 2 C 10x by 39. The last equality means
x C 5 D 8, and hence x D 3. The ancient mathematicians stopped here i.e., they did not find out
the other solution–the negative x D 13, because for them numbers simply represent geometric
entities (length, area, ...). Note that 82 D 64 and . 8/2 is also 64.
Algebraically, the above is equivalent to using the identity .a C b/2 D a2 C 2ab C b 2 , we
have added 25 D 52 to complete the square .x C 5/2 , which is the area of the square of side
x C 5. This is also the key to solving cubic equations by a similar completing a cube procedure.
Instead of solving many concrete quadratic equations (e.g. 2x 2 5x C 3 D 0, x 2 C 2x
6 D 0 and so on), it was a big mind shift in mathematics when mathematicians considered
ax 2 C bx C c D 0 with a ¤ 0. This equation with undefined a; b; c, called coefficients, cover
all quadratic equations at once. If we can solve it, we solve all quadratic equations! To solve it,

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 89

first we rewrite it as x 2 C b=ax C c=a D 0ŽŽ , then we complete the square:


  r
2 b b2 c b2 b 2 b2 c b b2 c
x C2 x C 2 C D0” xC D 2 ” xC D˙
2a 4a a 4a2 2a 4a a 2a 4a2 a

And from that we obtain the famous quadratic formula:


p
b b 2 4ac
xD ˙ (2.13.3)
2a 2a

This expression is called a solution by radicals or an algebraic solution for the quadratic. That
is, it involves only the original coefficients in the equation–that is, a; b and c–and the algebraic
operations of addition, subtraction, multiplication, division, and extraction of roots, used only
finitely oftenŽ . That formula is the first formula ever that expresses the roots (i.e., solutions) of
an equation in terms of its coefficients.
The number b 2 4ac plays an important role in the roots: if it is positive, we get two roots,
if it is negative we get no roots (we cannot have square root of negative numbers) and if it is
zero we have two roots of the same value b=2a; this is called a root of multiplicity of two. This
number b 2 4ac is called the discriminant§ , designated by : it discriminates the roots.
As a preparation for the cubic equation, I present another way to solve the quadratic equation.
Let’s consider this quadratic equation x 2 C bx C c D 0 where the coefficient of the x 2 term is
one to simplify the maths. This method is based on the observation p that it is easy to solve this
quadratic equation x 2 d D 0 (d  0): the solutions are x D ˙ d . The question is: can we
convert x 2 C bx C c D 0 to x 2 d D 0?
The answer is yes: using a change of variable x D u b=2 enables us to get rid of the term
bx to obtain this reduced quadratic equation u2 D d , with d D b 2 =4 c. How? Just plugging
x D u b=2 into the equation x 2 C bx C c D 0, do some algebraic p manipulations and we get
u D d . But we can solve
2
p easily this reduced equation: u D ˙ d (assuming d > 0). From u,
we then get x: x D ˙ d b=2. But, how did we know of this change of variable? By trial and
errors I guess: starting with x D u C z, where we do not know what z is. Introducing this x into
x 2 C bx C c D 0 we obtain:

.u C z/2 C b.u C z/ C c D 0 ” u2 C .2z C b/u C bz C z 2 C c D 0

Since we do not like the term .2z C b/u, we choose z such that 2z C b D 0. If you still wonder
why we know that x D u C z, then I do not have a convincing answer. But there is a geometric
meaning behind this change of variable. To answer that we need to know Cartesian coordinates
and plotting functions. I leave this to Section 3.21. With geometry this change of variable is no
more mysterious.

ŽŽ Make sure you understand why we can do this.
Ž Why do we need to mention this? Recall that we have seen in Eq. (2.11.1) an expression that involves an infinite number of terms.
§ The term "discriminant" was coined in 1851 by the British mathematician James Joseph Sylvester.


Remark. What have we just done? For the equation $x^2 + bx + c = 0$, we used a change of
variable to get $u^2 - d = 0$. That is, we removed the term $bx$, which is the term of
order one less than the highest order term $x^2$. Why is this important? Because it is
how mathematicians solved the cubic equation $x^3 + ax^2 + bx + c = 0$. Which term
will they remove then?
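If you want to see the bookkeeping done by a computer algebra system, here is a small sketch (assuming the sympy package is available; this check is mine, not part of the text):

    import sympy as sp

    u, b, c = sp.symbols('u b c')
    x = u - b/2                           # the change of variable x = u - b/2
    print(sp.expand(x**2 + b*x + c))      # no term linear in u survives,
                                          # i.e. u^2 = b^2/4 - c, the reduced equation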

Remark. Irrational numbers arise naturally when solving quadratic equations. The roots of
$x^2 - 6x + 4 = 0$ are $3 \pm \sqrt{5}$–two irrational numbers. Al-Khwarizmi referred to
irrational numbers as "inaudible" numbers, which was later translated to the Latin
surdus, which means "deaf" or "mute." From this we obtained the term surd, which
is still used, typically for the square roots of non-square integers [58]. That is, we
call $\sqrt{5}$ a surd.

Quadratic equations in disguise. Many equations are actually quadratic equations in disguise.
For example, $x^4 - 2x^2 + 1 = 0$ is the quadratic equation $t^2 - 2t + 1 = 0$ with $t = x^2$.
To demonstrate unexpected things in maths, let's consider this equation:

$\sqrt{5 - x} = 5 - x^2$

To remove the square root, we follow the old rule: square both sides of the equation:

$5 - x = 25 - 10x^2 + x^4$

Oops! We've got a quartic equation! Now comes the magic of maths (when I first saw this it felt
like magic): instead of seeing the equation as a quartic in terms of $x$, how about seeing
it as a quadratic in terms of 5? With that in mind, we rewrite the equation as

$5 - x = 5^2 - 5(2x^2) + x^4 \;\Longleftrightarrow\; 5^2 - (2x^2+1)\,5 + x^4 + x = 0$

And we solve for 5 using the quadratic formula (or we can complete the square):

$5 = \dfrac{(2x^2+1) \pm \sqrt{(2x^2+1)^2 - 4(x^4+x)}}{2} = \dfrac{(2x^2+1) \pm |2x-1|}{2}$

Simplifying the above, we get two equations:

$x^2 + x = 5 \quad\text{or}\quad x^2 - x + 1 = 5 \;\Longrightarrow\; x = \dfrac{-1+\sqrt{21}}{2}, \quad x = \dfrac{1-\sqrt{17}}{2}$

(We had to discard two roots that would not make sense for $\sqrt{5-x} = 5 - x^2$: the LHS is a
non-negative number, so the RHS must be too, i.e. $5 - x^2 \ge 0$, which means $-\sqrt{5} \le x \le \sqrt{5}$.)
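A quick numerical sanity check of the two retained roots (just a sketch using plain Python floats; it is not part of the original argument):

    import math

    def residual(x):
        # LHS minus RHS of sqrt(5 - x) = 5 - x^2
        return math.sqrt(5 - x) - (5 - x**2)

    for x in [(-1 + math.sqrt(21)) / 2, (1 - math.sqrt(17)) / 2]:
        print(x, residual(x))    # residuals ~1e-16, i.e. zero up to round-off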
p
Should we memorize the quadratic formula? The formula b˙ b2 4ac=2a is significant as it
is the first formula ever that expresses the roots of an equation in terms of its coefficients. But


young students should not memorize and use it to solve quadratic equations. If we just use the
formula all the time we would forget how it was derived (i.e., the completing the square method).
Remark. Did we completely solve the quadratic equation? If your answer is yes, think again.
We have found the two solutions given in Eq. (2.13.3). But who ensures that these are
the only solutions that the quadratic equation can have? Anyway, we anticipate that
a quadratic equation can have at most two solutions. And this thinking eventually
led to the fundamental theorem of algebra.

2.13.3 Cubic equation


The most general form of a cubic equation is written as

$ax^3 + bx^2 + cx + d = 0, \quad a \ne 0$    (2.13.4)

The condition $a \ne 0$ is needed, otherwise Eq. (2.13.4) becomes a quadratic equation (supposing
that $b \ne 0$). As we can always divide Eq. (2.13.4) by $a$, it suffices to consider the following
cubic equationŽŽ

$x^3 + bx^2 + cx + d = 0$    (2.13.5)

It turned out that solving the full cubic equation Eq. (2.13.5) was not easy. So, in 1545, the Italian
mathematician Gerolamo Cardano (1501–1576) presented a solution to the following depressed
cubic equation (it is always possible to convert a full cubic equation to the depressed cubic by
using the change of variable $x = u - b/3$ to get rid of the quadratic term–the term $bx^2$Ž)

$x^3 + px = q$    (2.13.6)

of which his solution is

$x = \sqrt[3]{\dfrac{q}{2} + \sqrt{\dfrac{q^2}{4} + \dfrac{p^3}{27}}} - \sqrt[3]{-\dfrac{q}{2} + \sqrt{\dfrac{q^2}{4} + \dfrac{p^3}{27}}}$    (2.13.7)

It was actually Scipione del Ferro who first discovered this solution. We have more to say about
this fascinating story in Section 2.13.5. In many books, Cardano's formula is written slightly
differently: $x = \sqrt[3]{q/2 + \sqrt{q^2/4 + p^3/27}} + \sqrt[3]{q/2 - \sqrt{q^2/4 + p^3/27}}$ (the two forms are equal). Let's first see one example.

ŽŽ Note that the $b, c, d$ in Eq. (2.13.5) are different from those in Eq. (2.13.4).
Ž Again, calculus helps to understand why this change of variable works: $x = -b/3$ is the x-coordinate of the inflection
point of the cubic curve $y = x^3 + bx^2 + cx + d$. Note, however, that at the time of Cardano calculus had not yet
been invented. But with the success of reducing a quadratic equation to the form $u^2 - d = 0$, mathematicians were
confident that they should be able to do the same for the cubic equation.


Example 2.1
To check his formula, Cardano solved a concrete example: $x^3 + 6x = 20$. We have $p = 6$
and $q = 20$; plugging them into the formula, Eq. (2.13.7) (note that $p^3/27 = (p/3)^3$, which eases
the calculation), we obtain

$x = \sqrt[3]{10 + \sqrt{108}} - \sqrt[3]{-10 + \sqrt{108}}$

Is this messy expression really the solution? For us it is easy: use a calculator to check. But
Cardano did not have that luxury. What did he do then? He noted that $x = 2$ is one solution to
$x^3 + 6x = 20$: $2^3 + 6(2) = 20$. If he could show that $\sqrt[3]{10+\sqrt{108}} - \sqrt[3]{-10+\sqrt{108}}$ is indeed
nothing but 2, then his formula is correct. Noting that $108 = 36 \times 3$, hence $\sqrt{108} = 6\sqrt{3}$.
With that, we obtain (with some trial and error)

$10 + \sqrt{108} = (1 + \sqrt{3})^3, \qquad -10 + \sqrt{108} = (-1 + \sqrt{3})^3$

Because of this Cardano could get rid of the ugly cube roots and got 2!
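If you do have the luxury Cardano lacked, a few lines of Python confirm the formula for this example (a quick numerical sketch, nothing more):

    p, q = 6.0, 20.0
    D = (q/2)**2 + (p/3)**3                      # q^2/4 + p^3/27 = 100 + 8 = 108
    x = (q/2 + D**0.5)**(1/3) - (-q/2 + D**0.5)**(1/3)
    print(x)                                     # 2.000000..., Cardano's root
    print(x**3 + 6*x)                            # ~20, so x^3 + 6x = 20 indeed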

For the moment, how del Ferro/Cardano came up with Eq. (2.13.7) is not as important as
how Eq. (2.13.7) led to the discovery of the imaginary number, now designated by $i$: $i^2 = -1$.
To see that, just consider the following equation

$x^3 - 15x = 4$    (2.13.8)

Using Eq. (2.13.7) with $p = -15$ and $q = 4$, we get

$x = \sqrt[3]{2 + \sqrt{-121}} - \sqrt[3]{-2 + \sqrt{-121}}$    (2.13.9)

As Eq. (2.13.7) had been successfully used to solve many depressed cubic equations, it was perplexing
that for Eq. (2.13.8) it involves the square root of a negative number, i.e., $\sqrt{-121}$.
So Cardano stopped there, and it took almost 30 years for someone to make progress. It was
Rafael Bombelli (1526–1572)–another Italian–who in 1572 examined Eq. (2.13.9). He knew
that $x = 4$ is a solution to Eq. (2.13.8). Thus, he set out to check the validity of the following
identity

$\sqrt[3]{2 + \sqrt{-121}} - \sqrt[3]{-2 + \sqrt{-121}} \stackrel{?}{=} 4$    (2.13.10)

where the LHS is the solution if the cubic formula is correct and 4 is the true solution. In the
process, he accepted the square root of negative numbers and treated it as an ordinary number.
In his own words, it was a wild thought as he had no idea about $\sqrt{-121}$. He computed the term
$(2 + \sqrt{-1})^3$ as

$(2+\sqrt{-1})^3 = 8 + 3(2)^2\sqrt{-1} + 3(2)(\sqrt{-1})^2 + (\sqrt{-1})^3 = 8 + 12\sqrt{-1} - 6 - \sqrt{-1} = 2 + 11\sqrt{-1} = 2 + \sqrt{-121}$    (2.13.11)


Thus, he knew $\sqrt[3]{2+\sqrt{-121}} = 2 + \sqrt{-1}$. Similarly, he also had $\sqrt[3]{-2+\sqrt{-121}} = -2 + \sqrt{-1}$.
Plugging these into Eq. (2.13.9) indeed gave him four (his intuition was correct):

$x = \sqrt[3]{2+\sqrt{-121}} - \sqrt[3]{-2+\sqrt{-121}} = (2+\sqrt{-1}) - (-2+\sqrt{-1}) = 4$    (2.13.12)
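Bombelli's "wild thought" is effortless with Python's built-in complex numbers, where 1j plays the role of $\sqrt{-1}$ (again, my own check, not part of the original development):

    print((2 + 1j)**3)       # (2+11j), i.e. 2 + 11*sqrt(-1) = 2 + sqrt(-121)
    print((-2 + 1j)**3)      # (-2+11j), i.e. -2 + sqrt(-121)
    u = 2 + 1j               # the cube root of 2 + sqrt(-121)
    v = -2 + 1j              # the cube root of -2 + sqrt(-121)
    print(u - v)             # (4+0j): the imaginary parts cancel and x = 4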

About this, Leibniz–the co-inventor of calculus–expressed his confusion: "I did not understand
how ... a quantity could be real, when imaginary or impossible numbers were used to express it."
[47]. He found this so astonishing that among his unpublished papers we find
several such expressions, as if he had calculated them endlessly.
Knowing one solution $x = 4$, it is straightforward to find the other solutions using the
factorization

$x^3 - 15x - 4 = 0 \;\Longleftrightarrow\; (x-4)(x^2+4x+1) = 0$

Remark. If you're not sure of this factorization, please refer to Section 2.29.2. The other solutions
can be found by solving the quadratic equation $x^2 + 4x + 1 = 0$. This
quadratic equation gives us two solutions. Thus, in total the considered cubic equation
has three solutions. That makes sense; the quadratic equation has two solutions,
so it is reasonable to suppose that a cubic equation has three solutions. This reasoning
eventually leads to the fundamental theorem of algebra. To conclude, if we
can find one solution to the cubic equation, we can find the remaining ones.

del Ferro's method to solve the depressed cubic equation. For an unknown reason, he considered
a solution of the form $x = u + v$. Putting this into the depressed cubic equation, we get:

$(u^3 + v^3) + (3uv + p)(u + v) = q$

He needed another equation (as there are two unknowns), so he imposed $3uv + p = 0$, or
$v = -p/(3u)$. With this, the above equation becomes $u^3 + v^3 = q$, orŽ

$u^3 - \dfrac{p^3}{27u^3} = q$

which is a disguised quadratic equation with $t = u^3$:

$t^2 - qt - \dfrac{p^3}{27} = 0 \;\Longrightarrow\; t_1 = \dfrac{q}{2} + \sqrt{\dfrac{q^2}{4} + \dfrac{p^3}{27}}, \quad t_2 = \dfrac{q}{2} - \sqrt{\dfrac{q^2}{4} + \dfrac{p^3}{27}}$

Selecting the solution $t_1$, we get $u = \sqrt[3]{t_1}$. By symmetry, we can do the
same thing with $v$: we get $v = \sqrt[3]{t_2}$. Finally, the solution to the original depressed cubic
equation is $x = u + v$, which is Cardano's formula given in Eq. (2.13.7). Looking back, we can
see that del Ferro used the change of variable $x = u - p/(3u)$ to convert a cubic equation into a
sixth-order equation (which is actually a simple quadratic equation in $u^3$).
Ž Later on, Lagrange called this equation the "resolvent" of the original cubic equation.


2.13.4 How Viète solved the depressed cubic equation


Viète exploited trigonometry to solve the depressed cubic equation. Note the similarity of
$\cos^3\theta = \frac{3}{4}\cos\theta + \frac{1}{4}\cos(3\theta)$ (a trigonometric identity) and the cubic equation $x^3 = px + q$.
We put them side by side to see this similarity:

$\cos^3\theta = \dfrac{3}{4}\cos\theta + \dfrac{1}{4}\cos(3\theta), \qquad x^3 = px + q$    (2.13.13)

It follows that $x = a\cos\theta$ where $a$ and $\theta$ are functions of $p, q$. Substituting this form of $x$ into
the cubic equation we obtain

$\cos^3\theta = \dfrac{p}{a^2}\cos\theta + \dfrac{q}{a^3}$    (2.13.14)

As any $\theta$ satisfies the above trigonometric identity, we get the following system of equations to
solve for $a$ and $\theta$ in terms of $p$ and $q$:

$\dfrac{p}{a^2} = \dfrac{3}{4}, \quad \dfrac{q}{a^3} = \dfrac{1}{4}\cos(3\theta) \;\Longrightarrow\; a = \dfrac{2\sqrt{3p}}{3}, \quad \theta = \dfrac{1}{3}\cos^{-1}\!\left(\dfrac{3\sqrt{3}\,q}{2p\sqrt{p}}\right)$    (2.13.15)

Thus, the final solution is

$x = \dfrac{2\sqrt{3p}}{3}\cos\!\left[\dfrac{1}{3}\cos^{-1}\!\left(\dfrac{3\sqrt{3}\,q}{2p\sqrt{p}}\right)\right]$    (2.13.16)

Does Viète's solution work for the case $p = 15$ and $q = 4$ (the one that caused trouble with
Cardano's solution)? Using Eq. (2.13.16) with $p = 15$ and $q = 4$, we get

$x = 2\sqrt{5}\cos\!\left[\dfrac{1}{3}\cos^{-1}\!\left(\dfrac{2\sqrt{5}}{25}\right)\right]$    (2.13.17)

which can be evaluated using a computer (or calculator) to give 4 (with an angle of 1.3909428270).
Note that this equation also gives the other two roots $-3.73205$ (the angle is $1.3909428270 + 2\pi$)
and $-0.267949$ (the angle is $1.3909428270 + 4\pi$). And there is no $\sqrt{-1}$ involved! What does this
tell us? The same thing (i.e., the square root of a negative number) can be represented by $i$ and
by cosine/sine functions. Thus, there must be a connection between $i$ and sine/cosine. We shall
see this connection later.
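Here is a small Python check of Viète's formula for $p = 15$, $q = 4$; adding $2\pi/3$ and $4\pi/3$ to the angle produces the other two roots (an illustration of my own, nothing book-specific):

    import math

    p, q = 15.0, 4.0
    a = 2 * math.sqrt(p / 3)                     # amplitude 2*sqrt(p/3) = 2*sqrt(5)
    theta = math.acos(3 * math.sqrt(3) * q / (2 * p * math.sqrt(p))) / 3
    for k in range(3):
        x = a * math.cos(theta + 2 * math.pi * k / 3)
        print(x, x**3 - 15*x - 4)    # roots 4, -3.732..., -0.2679...; residuals ~0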
Seeing how Viète solved the cubic equation, we can unlock del Ferro's solution. del Ferro
used the identity $(u+v)^3 = u^3 + v^3 + 3u^2v + 3uv^2$. We put this identity and the depressed
cubic equation side by side:

$(u+v)^3 = 3uv(u+v) + u^3 + v^3, \qquad x^3 = px + q$


So, with $x = u + v$ we obtain from the depressed cubic equation $(u+v)^3 = p(u+v) + q$.
Comparing this with the identity $(u+v)^3 = 3uv(u+v) + u^3 + v^3$, we then get two equations relating $u, v$ to $p, q$:

$p = 3uv, \qquad q = u^3 + v^3$

And voilà! We now understand del Ferro's solution. Obviously the algebra of his solution
is easy; what is hard is to think of the identity $(u+v)^3 = u^3 + v^3 + 3u^2v + 3uv^2$ in the first
place.

History note 2.4: François Viète (1540 – 1603)

François Viète, Seigneur de la Bigotière (1540 – 23 February 1603)
was a French mathematician whose work on new algebra was an
important step towards modern algebra, due to its innovative use of
letters as parameters in equations. He was a lawyer by trade, and
served as a privy councillor to both Henry III and Henry IV of France.
Viète's most significant contributions were in algebra. While letters
had been used to describe an unknown quantity by earlier writers,
Viète was the first to also use letters for the parameters or constant
coefficients in an equation. Thus, while Cardano solved particular cubic equations such
as

$x^3 - 5x = 6$

Viète could treat the general cubic equation

$A^3 + pA = q$

where $p$ and $q$ are constants (the unknown being $A$). Note that Viète's version of algebra was still cumbersome
and wordy, as he wrote 'D in R - D in E aequabitur A quad' for $DR - DE = A^2$ in our
notation.

2.13.5 History about Cardano’s formula


This section gives a brief historical account of the cubic formula, following the interesting book
Journey through Genius by William Dunham.
Scipione del Ferro (1465 – 1526), a professor at the University of Bologna, was the first to
make significant advances in solving cubic equations. del Ferro could solve the depressed
cubic equations–those of the form $x^3 + cx = d$–but he did not publish his discovery. Instead
he kept it until his death. The reason is that he had to have secret weapons to be deployed in
"mathematical duels"–challenges from other Italian mathematicians. On his deathbed, he passed
his formula over to his student Antonio Fior. Although Fior was not so good a mathematician
as his mentor, in 1535 he levelled a challenge at the noted Brescian scholar Niccolo Fontana
(1499–1557), better known as Tartaglia (meaning "the stammerer", after a teenage facial injury
from a French soldier's sword).


The self-taught Tartaglia (Fig. 2.22a) discovered how to solve a different form of the cubic–one
missing the linear term, that is, $x^3 + mx^2 = n$. This set the stage for a mathematical duel
between Fior and Tartaglia. In 1535, they exchanged 30 problems with a deadline of a month
and a half. Tartaglia sent Fior a variety of problems, whereas the mathematically weaker Fior
sent Tartaglia 30 depressed cubics, employing the “all eggs in one basket” strategy. Just days
before the deadline, Tartaglia discovered how to solve the depressed cubic, and quickly finished
all 30 problems. Meanwhile, Fior solved none of Tartaglia’s problems. News spread throughout
Italy of Tartaglia’s achievement, and Fior, humiliated, faded from view.
But then entered Gerolamo Cardano (Fig. 2.22b). Cardano tried, and failed, to replicate
Tartaglia’s success with the cubic, so he began a pressure campaign to convince Tartaglia to
share his method, even promising a vow of secrecy:

I swear to you by the Sacred Gospel, and on my faith as a gentleman, not only never
to publish your discoveries, if you tell them to me, but I also promise and pledge my
faith as a true Christian to put them down in cipher so that after my death no one
shall be able to understand them.

(a) Tartaglia (b) Cardano

Figure 2.22: Tartaglia and Cardano: two major figures in the solution of cubic equations.

Eventually, in 1539, Tartaglia relented and shared his knowledge of depressed cubic equations
with Cardano, albeit written as a poem. For the clever Cardano, however, just knowing the
result was enough to discover the underlying mathematics. He even discovered how to solve
the full cubic equation $ax^3 + bx^2 + cx + d = 0$. Despite his vow to Tartaglia, Cardano shared
these results with his talented assistant Ludovico Ferrari. Cardano recognized the importance of
these accomplishments and desperately wanted to publish the results. But he could not do so
because of his oath.
Then, on a trip to Bologna in 1543, Cardano saw in del Ferro's notebooks the solution of the
depressed cubic, obtained before Tartaglia's. In Cardano's mind, this discovery freed him of his obligation
to Tartaglia. Two years later, Cardano published Ars Magna, which contained his and Ferrari’s
work on cubic and quartic equations.


To describe his formula, given again below,

$x = \sqrt[3]{\dfrac{q}{2} + \sqrt{\dfrac{q^2}{4} + \dfrac{p^3}{27}}} - \sqrt[3]{-\dfrac{q}{2} + \sqrt{\dfrac{q^2}{4} + \dfrac{p^3}{27}}}$

Cardano wrote

Cube one-third the coefficient of x; add to it the square of one-half the constant of
the equation; and take the square root of the whole. You will duplicate [repeat] this,
and to one of the two you add one-half the number you have already squared and
from the other you subtract one-half the same. Then, subtracting the cube root of
the first from the cube root of the second, the remainder which is left is the value of
x.

2.14 Mathematical notation


Up to this point we have used what is called mathematical notation to express mathematical
ideas. It is time to discuss these notations. Mathematical notation consists of using symbols for
representing operations, unspecified numbers, relations and any other mathematical objects, and
assembling them into expressions and formulas. Besides mathematics, mathematical notation
is widely used in science and engineering for representing complex concepts and properties in
a concise, unambiguous and accurate way. For instance, Einstein's equation $E = mc^2$ is the
representation in mathematical notation of the mass–energy equivalence. Mathematical notation
was first introduced by François Viète, and largely expanded during the 17th and 18th centuries
by René Descartes, Isaac Newton, Gottfried Leibniz, and above all Leonhard Euler.

2.14.1 Symbols
The use of many symbols is the basis of mathematical notation. They play a similar role to
words in natural languages. They may play different roles in mathematical notation, similarly to
the different roles that verbs, adjectives and nouns play in a sentence.

Letters as symbols. Letters are typically used for naming mathematical objects. Typically the
Latin and Greek alphabets are used, but some letters of the Hebrew alphabet are sometimes used.
We have seen $a, b, \alpha, \beta$ and so on. Obviously these alphabets are not sufficient: to have more
symbols, and to allow related mathematical objects to be represented by related symbols,
diacritics (e.g. $f'$), subscripts (e.g. $x_2$) and superscripts (e.g. $z^3$) are often used. For a quadratic
equation, we can use $x$ and $y$ to denote its two roots. But it is sometimes better to use $x_1$ and
$x_2$ (both are $x$'s and we can see which is the first and which is the second root). What is more, when
we want to talk about the $n$ roots of an $n$th-order polynomial, we have to use $x_1, x_2, \ldots, x_n$. Why?
Because we do not even know what $n$ is.


Other symbols. Symbols are not only used for naming mathematical objects. They can be used
for operations ($+, -, \times, \div, \sqrt{\ }, \ldots$), for relations ($=, >, <, \ldots$), for logical connectives ($\Longrightarrow, \Longleftrightarrow, \lor, \ldots$), for quantifiers ($\forall, \exists$) and for other purposes.
What we need to know is that a notation is a personal choice of the particular mathematician
who used it for the first time. If interested, you can read A history of mathematical notations by
the Swiss-American historian of mathematics Florian Cajori (1859–1930) [13].

2.15 Factorization
I have discussed factorization a bit when presenting the identity $a^2 - b^2 = (a-b)(a+b)$.
Herein, we delve into this topic in more depth. Recall that factorization or factoring consists
of writing a number or another mathematical object as a product of several factors, usually
smaller or simpler objects of the same kind. Factorization was first considered by ancient Greek
mathematicians in the case of integers. They proved the fundamental theorem of arithmetic,
which asserts that every positive integer may be factored into a product of prime numbers, which
cannot be further factored into integers greater than one. For example,

$48 = 16 \times 3 = 2 \times 2 \times 2 \times 2 \times 3$

Then came the systematic use of algebraic manipulations for simplifying expressions (more
specifically equations), dating to the 9th century, with al-Khwarizmi's book The Compendious Book
on Calculation by Completion and Balancing.
The following identities are useful for factorization, for two real numbers $a$ and $b$:

(a) difference of squares: $a^2 - b^2 = (a-b)(a+b)$
(b) difference of cubes: $a^3 - b^3 = (a-b)(a^2+ab+b^2)$    (2.15.1)
(c) sum of cubes: $a^3 + b^3 = (a+b)(a^2-ab+b^2)$

In using these identities, we sometimes need to see 1 as $1^2$ or $1^3$; then the identity appears. For example,
$a^3 - 1$ is $a^3 - 1^3 = (a-1)(a^2+a+1)$. This is similar to how, in trigonometry, we see 1 as
$\sin^2 x + \cos^2 x$.
The first method for factorization is finding a common factor and using the distributive law
$a(b+c) = ab + ac$. For example,

$6x^3y^2 + 8x^4y^3 - 10x^5y^3 = 2x^3y^2(3 + 4xy - 5x^2y)$

Another technique is grouping:

$4x^2 + 20x + 3xy + 15y$

Then, factorizing each group, a common factor for the entire expression will show up:

$4x(x+5) + 3y(x+5) = (x+5)(4x+3y)$


In many cases, we have to look at the expressions carefully so that the identities in Section 4.13.2
will appear. For example, let's simplify the following fraction

$\dfrac{x^6 + a^2x^3y}{x^6 - a^4y^2}$

We can process the numerator as $x^3(x^3 + a^2y)$. As for the denominator, we should see it as
$(x^3)^2 - (a^2y)^2$; then things become easy as the denominator becomes $(x^3+a^2y)(x^3-a^2y)$.
And the fraction is simplified to $x^3/(x^3 - a^2y)$Ž.

Example 2.2
The next exercise about factorization is the following expression:

$A = \dfrac{a^3 + b^3 + c^3 - 3abc}{(a-b)^2 + (b-c)^2 + (c-a)^2}$

Now we make some observations. First, the numerator is of degree three ($a^3$) and the denominator
is of degree two. Second, the three variables $a, b, c$ are symmetric. Thus, if the
expression can be simplified to a polynomial, it must be of degree one (similar to
$a^3/a^2 = a = a^1$) and of the form

$A = pa + qb + rc \;\Longrightarrow\; A = p(a + b + c)$

The fact that $p = q = r$ stems from the symmetry of $a, b, c$. To find $p$, just use $b = c = 0$
in the original expression; we find that $p = 0.5$. Thus, one answer might be:

$A = \dfrac{a + b + c}{2}$

And now we just need to check whether

$a^3 + b^3 + c^3 - 3abc = \left[(a-b)^2 + (b-c)^2 + (c-a)^2\right]\dfrac{a+b+c}{2}$

And it is indeed the case. Thus, the answer is $0.5(a+b+c)$.
The above method is not the usual one often presented in textbooks. Here is the textbook
method:

$(a^3 + b^3) + c^3 - 3abc = (a+b)^3 - 3ab(a+b) + c^3 - 3abc$
$\phantom{(a^3 + b^3) + c^3 - 3abc} = \left[(a+b)^3 + c^3\right] - 3ab(a+b+c)$
$\phantom{(a^3 + b^3) + c^3 - 3abc} = \left[(a+b) + c\right]\left[(a+b)^2 - (a+b)c + c^2\right] - 3ab(a+b+c)$
$\phantom{(a^3 + b^3) + c^3 - 3abc} = (a+b+c)(\ldots)$

where in the third equality we have used the identity $x^3 + y^3 = (x+y)(x^2 - xy + y^2)$. Now
you see why in the expression of $A$ we must have the term $3abc$, not $4abc$ or anything else.
It must be $3abc$, otherwise there is nothing to simplify!
Ž Nothing is mysterious here. We all know that $\frac{2\times 3}{5\times 3} = \frac{2}{5}$: the number 3 in both the numerator and denominator is gone.
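If you have sympy at hand, both the factorization and the simplification of $A$ can be checked in a few lines (a sketch, assuming the sympy package; not part of the example itself):

    import sympy as sp

    a, b, c = sp.symbols('a b c')
    print(sp.factor(a**3 + b**3 + c**3 - 3*a*b*c))
    # expect (a + b + c)*(a**2 - a*b - a*c + b**2 - b*c + c**2)

    A = (a**3 + b**3 + c**3 - 3*a*b*c) / ((a - b)**2 + (b - c)**2 + (c - a)**2)
    print(sp.cancel(A))      # expect (a + b + c)/2, the answer found above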


Another powerful method to do factorization is to use the identity difference of squares i.e.,
.X /2 .Y /2 D .X Y /.X C Y /. The thing is we have to make appear the form .X/2 .Y /2
called difference of squares. One way is to complete the square by adding zero to an expression.
For example, suppose that we need to factorize the following expression:

A D x4 C 4

We add zero to it so that a square appears:

 
A D .x 2 /2 C 22 C 4x 2 4x 2
D .x 2 C 2/2 .2x/2
D .x 2 C 2 C 2x/.x 2 C 2 2x/

Let's solve one challenging problem in which we will meet a female mathematician and an
identity attached to her name. The problem is: compute the following without a calculator:

$A = \dfrac{(10^4 + 324)(22^4 + 324)\cdots(58^4 + 324)}{(4^4 + 324)(16^4 + 324)\cdots(52^4 + 324)}$

Observe first that $324 = 4 \times 81 = 4 \times 3^4$. Then all terms in $A$ have this form: $a^4 + 4b^4$ with
$b = 3$. So, let's factorize $a^4 + 4b^4$:

$(a^2)^2 + (2b^2)^2 = (a^2)^2 + 4a^2b^2 + (2b^2)^2 - 4a^2b^2 = (a^2 + 2b^2)^2 - 4a^2b^2 = (a^2 + 2b^2 + 2ab)(a^2 + 2b^2 - 2ab)$    (2.15.2)

This identity is known as the Sophie Germain identity, named after the French mathematician,
physicist, and philosopher Marie-Sophie Germain (1776 – 1831). Despite initial opposition from
her parents and difficulties presented by society, she gained an education from books in her father's
library and from correspondence with famous mathematicians such as Lagrange, Legendre, and
Gauss (under the pseudonym 'Monsieur LeBlanc'). Because of prejudice against her sex, she
was unable to make a career out of mathematics, but she worked independently throughout her
life. Before her death, Gauss had recommended that she be awarded an honorary degree, but
that never occurred!


Sophie Germain was born in an era of revolution. In the year of her
birth, the American Revolution began. Thirteen years later the French
Revolution began in her own country. In many ways Sophie embodied
the spirit of revolution into which she was born. Sophie’s interest in
mathematics began during the French Revolution when she was 13
years old and confined to her home due to the danger caused by revolts
in Paris. She spent a great deal of time in her father’s library, and one
day she ran across a book in which the legend of Archimedes’ death
was recounted. Legend has it that "during the invasion of his city by
the Romans Archimedes was so engrossed in the study of a geometric
figure in the sand that he failed to respond to the questioning of a
Roman soldier. As a result he was speared to death". This sparked Sophie’s interest. If someone
could be so engrossed in a problem as to ignore a soldier and then die for it, the subject must be
interesting! Thus she began her study of mathematics.
Using Eq. (2.15.2) with $b = 3$, we have:

$(a^2)^2 + 324 = (a^2 + 18 + 6a)(a^2 + 18 - 6a) = \left[a(a+6) + 18\right]\left[a(a-6) + 18\right]$    (2.15.3)

Now $A$ is making sense: in the above identity we have $a - 6$ and $a + 6$, and note that the numbers
in the numerator and denominator of $A$ differ by 6: 10 and 4, 22 and 16, etc. This means that
there are many terms that can be cancelled. Indeed, with Eq. (2.15.3), we have:

$\dfrac{10^4 + 324}{4^4 + 324} = \dfrac{(10\cdot 16 + 18)(10\cdot 4 + 18)}{(4\cdot 10 + 18)(4\cdot(-2) + 18)}, \quad \ldots, \quad \dfrac{58^4 + 324}{52^4 + 324} = \dfrac{(58\cdot 64 + 18)(58\cdot 52 + 18)}{(52\cdot 58 + 18)(52\cdot 46 + 18)}$

Almost all terms cancel each other and we get $A = \dfrac{58\cdot 64 + 18}{4\cdot(-2) + 18} = \dfrac{3730}{10} = 373$.
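Too good to be true? Here is a brute-force check using Python's exact integer arithmetic (my own verification, not part of the original argument):

    num, den = 1, 1
    for a in (10, 22, 34, 46, 58):
        num *= a**4 + 324
    for a in (4, 16, 28, 40, 52):
        den *= a**4 + 324
    print(num // den, num % den)    # prints: 373 0, so the quotient is exactly 373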


To master factorization we need practice and patience. We need to develop a feeling for common
algebraic expressions. And one way to achieve that is to play with algebra so that it becomes
your friend.
We leave this fraction for you to simplifyŽ:

$\dfrac{\sqrt{2} + \sqrt{3} + \sqrt{4}}{\sqrt{2} + \sqrt{3} + \sqrt{6} + \sqrt{8} + \sqrt{16}}$

Ž The answer is $\sqrt{2} - 1$.

Why factorization? Because factored expressions are usually more useful than the corresponding
un-factored expressions. For example, we use factorization to simplify fractions. We use
factorization to solve equations. It is hard to know what the solutions of $x^3 - 6x^2 + 11x - 6 = 0$ are,
but it is easy with $(x-1)(x-2)(x-3) = 0$. Factors can also be helpful for checking expressions.
For instance, consider a triangle of sides $a, b, c$ with area denoted by $A$; then we have two
equivalent expressions for $16A^2$:

$16A^2 = 2b^2c^2 + 2c^2a^2 + 2a^2b^2 - a^4 - b^4 - c^4 = (a+b+c)(a+b-c)(b+c-a)(c+a-b)$


As we know, the triangle area will be zero if $a + b = c$, and the factored expression for
$16A^2$ reveals this clearly while the un-factored expression does not. By the way, the factored
expression above is known as Heron's formula, see Eq. (3.1.1).

Manipulation of algebraic expressions is a useful skill which can be learned. Herein we discuss
some manipulation techniques. Consider this problem: given that the sum of a number and its
reciprocal (i.e., its inverse) is one, find the sum of the cube of that number and the cube of its
reciprocal.
We could proceed as follows. Let's denote the number by $x$; we then have $x + 1/x = 1$.
Solving this quadratic equation we get $x = (1 \pm i\sqrt{3})/2$. Now, to get $x^3 + 1/x^3$ we need to
compute $x^3$, which is $(1 \pm i\sqrt{3})^3/8$, but that would be difficultŽŽ. There should be a better way.
This is what we need

$S = x^3 + \dfrac{1}{x^3}$

and we have $x + 1/x = 1$. Let's cube both sides of $x + 1/x = 1$ and $S$ will show up:

$\left(x + \dfrac{1}{x}\right)^3 = x^3 + \dfrac{1}{x^3} + 3x^2\,\dfrac{1}{x} + 3x\,\dfrac{1}{x^2}$
$1^3 = S + 3\left(x + \dfrac{1}{x}\right)$
$1 = S + 3\cdot 1 \;\Longrightarrow\; S = -2$

We found $S$ without even solving for $x$. With that success, how about this problem: find
$S = x^{2021} + 1/x^{2021}$ given that $x + 1/x = \sqrt{2}$? Obviously the solution just presented would
not work, as no one would dare to expand $(x + 1/x)^{2021}$. To handle such a crazy exponent (of
2021), we need complex numbers and de Moivre's formula, discussed later in Section 2.25.2.
The solution is $S = -\sqrt{2}$. Through this exercise, we see again unexpected connections in maths:
we use complex numbers (depending on $i$, which we call imaginary) to compute something real.
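You can preview that claim with Python's complex numbers (a rough numerical sketch; the proper explanation really does need de Moivre's formula):

    import cmath

    x = (1 + 1j * 3**0.5) / 2             # one of the two numbers with x + 1/x = 1
    print(x + 1/x)                         # ~(1+0j)
    print(x**3 + 1/x**3)                   # ~(-2+0j): the value S = -2 found above

    x = cmath.exp(1j * cmath.pi / 4)       # a solution of x + 1/x = sqrt(2)
    print(x + 1/x)                         # ~1.4142...
    print((x**2021 + 1/x**2021).real)      # ~-1.4142..., i.e. S = -sqrt(2)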
Let's consider another problem: given two real numbers $x \ne y$ that satisfy

$x^2 = 17x + y, \qquad y^2 = 17y + x$

what is the value of $S = \sqrt{x^2 + y^2 + 1}$?
The problem is obviously symmetric, so we will perform symmetric operations: we sum
the two given equations, and we subtract the second from the first one:

$x^2 + y^2 = 18(x + y), \qquad x^2 - y^2 = 16(x - y)$    (2.15.4)

Then, we multiply the resulting equations, and we can compute $x^2 + y^2$: $x^2 + y^2 = (16)(18)$:


ŽŽ Not really, but for maths–as an art form–we aim for beautiful solutions, not ugly ones.


$(x^2 + y^2)(x^2 - y^2) = (16)(18)(x^2 - y^2) \;\Longrightarrow\; x^2 + y^2 = (16)(18)$

Thus, $S = \sqrt{(16)(18) + 1} = \sqrt{289} = 17$. Another way (a bit slower) is to solve for $x + y$ from the second
equation of Eq. (2.15.4), and then put it into the first to solve for $x^2 + y^2$.

2.16 Word problems and system of linear equations


Consider this word problem ’To complete a job, it takes: Alice and Bob 2 hours, Alice and
Charlie 3 hours and Bob and Charlie 4 hours. How long will the job take if all three work
together?’
For future scientists and engineers working with these word problems is more useful than
solving a given system of equations (for instance Eq. (2.15.4)). This is because for engineers
and scientists setting up the equations is an important task. If they cannot solve the equations,
they can ask a computer or a friend from the maths department to do it. We see it quite often;
the English theoretical physicist Stephen Hawking (1942 – 2018) had collaborated with the
British mathematician Roger Penrose. Another example is the collaboration between Albert
Einstein and Marcel Grossmann. Grossmann (April 9, 1878 – September 7, 1936) was a Swiss
mathematician and a friend and classmate of Albert Einstein. Einstein told Grossmann: “You
must help me, or else I’ll go crazy.”
Now, let’s get back to Alice, Bob and Charlie. There are three sentences which can be
translated into three equations and solving them give the three unknowns. This is how word
problems work. The question is how to get a correct translation. Many US college students gave
an answer of 4.5 hours for this problem. Do you see why that must be wrong? If not, you should
develop a habit of guessing a plausible solution without solving it. Paul Dirac (1902 – 1984), an
English theoretical physicist who is regarded as one of the most significant physicists of the 20th
century, once said ’I consider that I understand an equation when I can predict the properties of
its solutions, without actually solving it’.

Figure 2.23: Alice, Bob and Charlie pouring concrete into a container. Why 100? The idea is not to use
small numbers such as 1, 2, ..., to avoid working with fractions. If you like, choosing 60 works fine.
But I think 100 is a nice number.


There are many ways to translate the words into equations. But it is probably easiest if we
think of a specific job of, let's say, pouring concrete into a container of 100 m³ (see Fig. 2.23). Let's
denote by $A$, $B$ and $C$ the volume of concrete (in m³) that Alice, Bob and Charlie can
pour into the container within 1 hour§. With this, it is straightforward to translate the sentence
'to complete a job, it takes Alice and Bob 2 hours' to $2A + 2B = 100$. So, we have this system
of equationsŽ

$2A + 2B = 100, \qquad 3A + 3C = 100, \qquad 4B + 4C = 100$    (2.16.1)

We have a system of three linear equations; that is why we call it a system of linear equations.
The solution of this system is the three numbers $A, B, C$ that, when substituted into the system,
give true statements. How are we going to solve it? We know how to solve $ax + b = 0$, so
the plan is to remove/eliminate two unknowns so that we're left with one unknown. To remove two
unknowns, we first remove one unknown. To do that we can use any equation, e.g. $B + C = 25$, and
write the to-be-removed unknown in terms of the other: for instance $C = 25 - B$. Now $C$ is
gone.
We can start by removing any unknown; I start with $C$: from the third equation, we get
$C = 25 - B$; put it into the second equation of Eq. (2.16.1) to get $3A - 3B = 25$. This and the
first equation form the new system (with only two unknowns $A, B$) that we need to solve. We do the
same thing again: from $2A + 2B = 100$ we get $B = 50 - A$ (i.e., we're removing $B$); put that
into $3A - 3B = 25$: $A = 175/6$. Now we go backward to solve for $B$ and for $C$. Altogether, the
solution is $A = 175/6$, $B = 125/6$ and $C = 25/6$ŽŽ. Then, if $t$ is the time required for all three
people working together to fill the container, the amount of concrete poured will be $(A + B + C)t$.
Thus, we have

$(A + B + C)t = 100 \;\Longrightarrow\; t = \dfrac{100}{A + B + C} = \dfrac{24}{13} \text{ hours}$    (2.16.2)

This solution is plausible because it is smaller than the two hours that Alice and Bob take;
Charlie should be useful even though he is a bit slower than the other two kids.
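For those who prefer to let the computer do the elimination, here is the same system solved with numpy (a sketch assuming numpy is installed; the point of the hand calculation above of course remains):

    import numpy as np

    # Eq. (2.16.1) in matrix form: M [A, B, C]^T = [100, 100, 100]^T
    M = np.array([[2.0, 2.0, 0.0],
                  [3.0, 0.0, 3.0],
                  [0.0, 4.0, 4.0]])
    rhs = np.array([100.0, 100.0, 100.0])
    A, B, C = np.linalg.solve(M, rhs)
    print(A, B, C)              # 29.1666..., 20.8333..., 4.1666...  (= 175/6, 125/6, 25/6)
    print(100 / (A + B + C))    # 1.8461... hours = 24/13, as in Eq. (2.16.2)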
Let’s consider another word problem taken from The joy of x by the American mathematician
Steven Strogatz (born 1959). The problem goes like this: if the cold faucet can fill a bathtub
in half an hour and the hot faucet fills it in one hour, then how long does it take if both faucets
are filling the bathtub together? At the age of 10 or 11, Strogatz's answer was 45 minutes when
given this problem by his uncle. What’s your solution?
§ You can name them $x, y, z$ or whatever.
Ž We have made an assumption that the efficiency of Alice, Bob and Charlie is constant, even though it
is possible that when working with Bob, Charlie might work harder than with Alice. Thus, solving word (or real)
problems demands that we ignore unimportant things to make the problem manageable.
ŽŽ Did we solve the system? Even though we spent some time and found $A, B, C$ satisfying the system, to be
honest with you, we have just found one solution. Of course, if we can prove that this system has only one solution,
then our $A, B, C$ are the solution. Can you explain why this system has a unique solution, and when such a system
does not have a solution? And can it have more than one solution?


Here is his uncle’s solution. In one minute, the cold faucet fills 1=30 of the bathtub and
the hot faucet fills 1=60 of the bathtub. So, together they can fill 1=30 C 1=60 D 1=20 of the
bathtub in one minute. Thus, it takes them 20 minutes. That’s the answer. What if we do not
know fractions?
Is it possible to get the same answer without using fractions? Yes, using hours instead of
minutes! So, in one hour the cold faucet can fill two bathtubs, and the hot faucet fills one bathtub.
Together, in one hour they can fill 3 bathtubs. So, it takes them 1/3 of an hour to fill one bathtub.
This is the solution of the older Strogatz. It does not involve fractions but it involves 3 bathtubs.
We could not think of this solution if our mind is fixed with the image of a real bathtub: one
bathtub with two faucets. Don’t forget maths is a world existing independently with our world.
You can do anything you like in this mathematical world!
Let's stretch further: can we solve this problem without doing any maths? Still remember
Paul Dirac's quote mentioned above? This is the way to gain deep understanding. Setting up the
equations and solving them without this kind of understanding is working like a robot.
Let's try. OK, we know that the cold faucet fills the tub in 30 minutes, so regardless of the rate
of the hot faucet, together they have to fill the tub in less than 30 mins. On the other hand, if
the hot faucet's rate were the same as the cold one's, then together they would do the job in 15 mins.
So, without doing any maths, we know the answer $t$ satisfies $15 < t < 30$. What we have just done
is, according to Polya in How to Solve It, considering special cases of the problem that we're
trying to solve. We might not be able to solve the original problem, but we can solve at least
some simpler problems.

Systems of linear equations in chemistry. Back in high school I did not know how to
balance chemical equations like this one: $\mathrm{C_3H_8} + 5\,\mathrm{O_2} \rightarrow 3\,\mathrm{CO_2} + 4\,\mathrm{H_2O}$. The
problem is to find whole numbers $x_1, x_2, x_3, x_4$ such that

$x_1\,\mathrm{C_3H_8} + x_2\,\mathrm{O_2} \rightarrow x_3\,\mathrm{CO_2} + x_4\,\mathrm{H_2O}$

That is, to balance the total numbers of carbon (C), hydrogen (H) and oxygen (O) atoms on the
left and on the right of the chemical reactionŽŽ. Now C, H and O play a role similar to Alice, Bob
and Charlie. There are three kinds of atoms, and conservation of each atom gives one equation:

$3x_1 = x_3$ (balancing the total number of carbon atoms)
$8x_1 = 2x_4$ (balancing the total number of hydrogen atoms)    (2.16.3)
$2x_2 = 2x_3 + x_4$ (balancing the total number of oxygen atoms)

Again, we see a system of linear equations! Solving it is easy with the elimination technique. There
is one catch: we have four unknowns but only three equations. Let $x_4 = n$; then we can
solve for $x_1, x_2, x_3$ in terms of $n$: $x_1 = n/4$, $x_3 = 3n/4$, $x_2 = 5n/4$. Take $n = 4$, and we get
$x_1 = 1$, $x_2 = 5$, $x_3 = 3$. If you take $n = 8$ you get another solution, with every coefficient doubled. Thus, we have
an infinite number of solutions (which makes sense: scaling all the coefficients of a balanced reaction keeps it balanced).

ŽŽ Because atoms are neither destroyed nor created in the reaction.
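The same balancing can be done by asking sympy for the nullspace of the coefficient matrix (a sketch assuming sympy; the signs encode "left side minus right side"):

    from sympy import Matrix

    # Conservation of C, H, O for x1 C3H8 + x2 O2 -> x3 CO2 + x4 H2O
    M = Matrix([[3, 0, -1,  0],     # carbon:   3*x1 - x3 = 0
                [8, 0,  0, -2],     # hydrogen: 8*x1 - 2*x4 = 0
                [0, 2, -2, -1]])    # oxygen:   2*x2 - 2*x3 - x4 = 0
    sol = M.nullspace()[0]          # the one-parameter family of solutions
    print(sol.T)                    # expect [1/4, 5/4, 3/4, 1]
    print((4 * sol).T)              # expect [1, 5, 3, 4]: C3H8 + 5 O2 -> 3 CO2 + 4 H2O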


Systems of linear equations. Eq. (2.16.1) is one example of a system of linear equations. In
these systems, there are $n$ equations for $n$ unknowns $x_1, x_2, \ldots, x_n$, and all equations are linear
in terms of the $x_i$ ($i = 1, 2, \ldots$), i.e., we will not see nonlinear terms like $x_i x_j$. In what follows, we
give examples for $n = 2, 3, 4$:

$\begin{aligned} 2x_1 + x_2 &= 1\\ 3x_1 + 2x_2 &= 2 \end{aligned}$ ; $\quad \begin{aligned} 2x_1 + 2x_2 &= 10\\ 3x_1 + 3x_2 &= 2\\ 4x_2 + 4x_3 &= 4 \end{aligned}$ ; $\quad \begin{aligned} 1x_1 + 2x_2 + 4x_3 + 1x_4 &= 1\\ 2x_1 + 1x_2 + 1x_3 + 7x_4 &= 3\\ 5x_1 + 1x_2 + 3x_3 + 4x_4 &= 5\\ 6x_1 + 7x_2 + 2x_3 + 3x_4 &= 2 \end{aligned}$

If we focus on how to solve these equations, we would come up with the so-called Gaussian
elimination method (when we're pressed to solve a system with many unknowns, say $n \ge 6$).
On the other hand, if we are interested in the questions of when such a system has a solution, when
it does not have a solution and so on, we could come up with matrices and determinants. For
example, we realize that we can put all the coefficients of a system of linear equations in an array
like

$A = \begin{bmatrix} 1 & 2 & 4 & 1\\ 2 & 1 & 1 & 7\\ 5 & 1 & 3 & 4\\ 6 & 7 & 2 & 3 \end{bmatrix}$    (2.16.4)

and we can play with this array similarly to the way we do with numbers. We can add them,
multiply them, subtract them. And we give it a name: $A$ is a matrix. Matrices, determinants
and how to efficiently solve large systems of linear equations ($n$ in the range of thousands and
millions) belong to a field of mathematics named linear algebra; see Chapter 11.
We’re not sure about the original source of systems of linear equations, but systems of linear
equations arose in Europe with the introduction in 1637 by René Descartes of coordinates in
geometry. In fact, in this new geometry, now called analytical geometry, lines and planes are
represented by linear equations, and computing their intersections amounts to solving systems
of linear equations.
But if systems of linear equations only came from analytical geometry we would only have
systems of 3 equations (a plane in 3D is of the form $ax + by + cz = 0$), and life would be
boring. Systems of linear equations appear again and again in many fields (e.g. physics, biology,
economics and in mathematics itself). For example, in structural engineering–a sub-discipline of
civil engineering which deals with the design of structural elements (beams, columns, trusses),
we see systems of linear equations; actually systems of many linear equations. For example,
consider a bridge shown in Fig. 2.24a which is idealized as a system of trusses of which a part
is shown in Fig. 2.24b. Applying the force equilibrium to Fig. 2.24b we will get a system of 9
linear equations for the 9 unknown forces in the nine trusses.


(a) (b)

Figure 2.24: Systems of linear equations in structural engineering.

Some word problems.

1. Two dogs, each traveling 10 ft/sec, run towards each other from 500 feet apart. As
they run, a flea flies from the nose of one dog to the nose of the other at 25 ft/sec.
The flea flies between the dogs in this manner until it is crushed when the dogs
collide. How far did the flea fly?

2. Alok has three daughters. His friend Shyam wants to know the ages of his daughters.
Alok gives him a first hint: the product of their ages is 72. Shyam says this is not
enough information. Alok gives him a second hint: the sum of their ages is equal
to the number of my house. Shyam goes out, looks at the house number and
says "I still do not have enough information to determine the ages". Alok admits
that Shyam cannot guess and gives him the third hint: "my oldest daughter likes
strawberry ice-cream." With this information, Shyam was able to determine all
three of their ages. How old is each daughter?

Regarding the daughter-age problem, we have three unknowns and three hints, so it seems
to be a good problem. But did you try to set up the equations? There is only one equation, namely
$xyz = 72$ if $x, y, z$ are the ages of the daughters. What if the product of their ages were a smaller
number, let's say, 12? Ah, we can list out the ages as there are only a few cases. If that method
works for 12, of course it will work for 72; just a bit of extra work. If you still cannot find the
solution, check this website out. What if the product of their ages were a big number?
This is a good exercise to show that we should be flexible. Setting up equations is a good
method to solve word problems, but it does not solve all problems. There always seem to be problems
that defy all existing mathematics. And that is a good thing, as it is these problems that keep
mathematicians working late at night.


Algebra is a language of symbols. Now, if we think again about the word problems, we see that
algebra is actually a language–a language of symbols (such as $a$, or $A$). What is the advantage
of this language? It is comprehensible: it can translate a lengthy, verbose problem into a compact
form that the eyes can see quickly and the mind can retain what is going on. Compare this

To complete a job, it takes: Alice and Bob 2 hours, Alice and Charlie 3 hours and
Bob and Charlie 4 hours. How long will the job take if all three work together?

with

$2A + 2B = 100, \qquad 3A + 3C = 100, \qquad 4B + 4C = 100$

Anyone can undoubtedly recognize the power of algebra.

2.17 System of nonlinear equations


Contrary to systems of linear equations where we have a systematic method (e.g. Gauss elim-
ination method) to find the solutions, systems of nonlinear equations are harder to solve. But
they are less important than systems of linear equations. That’s why we have an entire course on
linear algebra just to handle systems of linear equations, whereas there is no course on systems
of nonlinear equations. I bet you’re correctly guessing that one good way to tackle a system of
nonlinear equations is to somehow transform it to a system of linear equations.
Let's consider the following two equations:

$x^3 + 9x^2y = 10, \qquad y^3 + xy^2 = 2$    (2.17.1)

Can we eliminate one variable? It might be possible, but we do not dare to follow that path. Try
it and you'll see why. There must be a better way. Why? Because this is a maths exercise! High
school students should be aware of this fact: nearly all questions in a test/exam have solutions
and are usually not hard or time consuming (as the test duration is finite!). Furthermore, if
there is a hard question, its mark is often low. Thus, you do not need to spend all of your time
studying to get A grades. Use that time to explore the world.
We present the first solution, which considers $(x + 3y)^3$. Why this term? Because upon
expansion, we will have the terms appearing in the two equations:

$(x+3y)^3 = x^3 + 9x^2y + 27xy^2 + 27y^3 = x^3 + 9x^2y + 27(xy^2 + y^3) = 10 + 27\times 2 = 64$

Now we have $x + 3y = 4$, or $x = 4 - 3y$. Of course, we substitute this $x$ into the second equation of
Eq. (2.17.1) to get an equation in terms of $y$:

$y^3 + (4-3y)y^2 = 2 \;\Longrightarrow\; y^3 - 2y^2 + 1 = 0$    (2.17.2)


Recognizing $y = 1$ as one solution of the above equation, we can factor its LHS and write

$(y-1)(y^2 - y - 1) = 0 \;\Longrightarrow\; y = 1 \ (x = 1), \quad y = \dfrac{1 \pm \sqrt{5}}{2} \ \left(x = \dfrac{5 \mp 3\sqrt{5}}{2}\right)$

Is this solution a good one? Yes, but it is not general, as it cannot be used when the second
equation is slightly different, e.g. $y^3 + 5xy^2 = 2$. We need another solution which works for
any coefficients.

 This exercise was not about solving cubic equations, so this cubic equation must be easy. That's why guessing one solution is the best technique here.
What is special about Eq. (2.17.1)? We see $x^3$, $x^2y$, $y^3$ and $xy^2$; these terms are all of
cubic order! So if we substitute $y = kx$ (or $x = ky$), these terms become $x^3$, $kx^3$, $k^3x^3$
and $k^2x^3$; thus we can factor out $x^3$, cancel it, and obtain an equation for $k$.
That's the trick:

$x^3 + 9x^2y = 10 \;\Longrightarrow\; x^3(1 + 9k) = 10, \qquad y^3 + xy^2 = 2 \;\Longrightarrow\; x^3(k^3 + k^2) = 2$

Dividing the first equation by the second one, we get the following cubic equation for $k$:

$1 + 9k = 5(k^3 + k^2) \;\Longrightarrow\; 5k^3 + 5k^2 - 9k - 1 = 0 \;\Longrightarrow\; (k-1)(5k^2 + 10k + 1) = 0$

with solutions $k = 1$ (which results in $x = y = 1$) and $k = -1 \pm 2\sqrt{5}/5$. Having $k$, we can
solve for $x$ as

$x^3 = \dfrac{10}{1+9k} = \dfrac{25}{-20 \pm 9\sqrt{5}} \;\Longrightarrow\; x^3 = 5(9\sqrt{5} + 20), \quad x^3 = -5(9\sqrt{5} - 20)$    (2.17.3)

But we do not know how to compute $x$ from the above expressions for $x^3$ (if we did not already know
the solutions $x = \frac{5 \mp 3\sqrt{5}}{2}$). It turns out that using $x = ky$ makes our life easier: we can get $y$ from
$y^3$. Try it. I am not sure why this is better, probably because $y$ is simpler than $x$ (Eq. (2.17.2)).
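A numerical check of the three solution pairs found above (a plain-Python sketch; it only verifies, it does not rediscover, the y = kx trick):

    import math

    def residuals(x, y):
        return (x**3 + 9*x**2*y - 10, y**3 + x*y**2 - 2)

    s5 = math.sqrt(5)
    pairs = [(1.0, 1.0),
             ((5 - 3*s5) / 2, (1 + s5) / 2),
             ((5 + 3*s5) / 2, (1 - s5) / 2)]
    for x, y in pairs:
        print(residuals(x, y))    # all residuals ~0 (up to round-off)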
Let's solve another system of radical equations:

$\sqrt{x} + \sqrt{y} = 3, \qquad \sqrt{x+5} + \sqrt{y+3} = 5$    (2.17.4)

We can isolate the terms involving $y$ and square to get two equations for $x$:

$\sqrt{x} = 3 - \sqrt{y}, \quad \sqrt{x+5} = 5 - \sqrt{y+3} \;\Longrightarrow\; x = 9 + y - 6\sqrt{y}, \quad x + 5 = 25 + y + 3 - 10\sqrt{y+3}$

which leads to the following equation for $y$

$7 - 5\sqrt{y+3} = -3\sqrt{y}$


which can be solved for $y$ by squaring both sides (two times). Not bad, but not elegant. What if
we could have a change of variable so that

$x = (\cdot)^2, \qquad x + 5 = (\cdot)^2$

Then we could get rid of the square roots of $x$ and $x+5$ easily. Such a change of variable
does exist! And it is related to the familiar identity $(p \pm q)^2 = p^2 \pm 2pq + q^2$:

$(p+q)^2 = p^2 + 2pq + q^2, \quad (p-q)^2 = p^2 - 2pq + q^2 \;\Longrightarrow\; (p+q)^2 - (p-q)^2 = 4pq$

Dividing both sides by 4, and introducing a new variable $r$ such that $pq = r$, we get:

$\dfrac{1}{4}(p+q)^2 = \dfrac{1}{4}(p-q)^2 + pq \;\Longrightarrow\; \dfrac{1}{4}\left(p + \dfrac{r}{p}\right)^2 = \dfrac{1}{4}\left(p - \dfrac{r}{p}\right)^2 + r$    (2.17.5)

And that's what we need: with $r = 5$, the term $\frac{1}{4}\left(p - \frac{5}{p}\right)^2$ can play the role of $x = (\cdot)^2$, and then
$x + 5 = \frac{1}{4}\left(p + \frac{5}{p}\right)^2 = (\cdot)^2$. So, using the second (boxed) form in Eq. (2.17.5), we introduce these
changes of variables:

$x = \dfrac{1}{4}\left(a - \dfrac{5}{a}\right)^2, \quad y = \dfrac{1}{4}\left(b - \dfrac{3}{b}\right)^2 \;\Longrightarrow\; x + 5 = \dfrac{1}{4}\left(a + \dfrac{5}{a}\right)^2, \quad y + 3 = \dfrac{1}{4}\left(b + \dfrac{3}{b}\right)^2$

The original system of equations (2.17.4) then becomes simply:

$\dfrac{1}{2}\left(a - \dfrac{5}{a}\right) + \dfrac{1}{2}\left(b - \dfrac{3}{b}\right) = 3, \quad \dfrac{1}{2}\left(a + \dfrac{5}{a}\right) + \dfrac{1}{2}\left(b + \dfrac{3}{b}\right) = 5 \;\Longrightarrow\; a + b = 8, \quad \dfrac{5}{a} + \dfrac{3}{b} = 2$

which can be solved easily. A correct change of variable goes a long way!
which can be solved easily. A correct change of variable goes a long way!
Sometimes we can solve a hard equation by converting it to a system of equations which is
easier to deal with. As one typical example, let’s solve the following equation:
q q
3 p 3 p
14C x C 14 xD4
If we look at the terms under p
the cube roots, we seep
something special: their sum is constant i.e.,
p p
without x. So, if we do u D 14 C x and v D 14
3 3
x, we have u3 C v 3 D 28. And of
course, we also have u C v D 4 from the original equation. Thus, we have
(
uCv D4
u3 C v 3 D 28
which can be solvedpto have u D p 1; v D 3, and from that we get x D 169. If the equation is
p p
slightly changed to 14 C x C 3 14 a x D 4, a is any number, then our trick would not
3

work. Don’t worry you will not see that in standardized tests. In real life, probably. But then we
can just use a numerical method (e.g. Newton’s method, discussed in Section 4.5.4, or a graphic
method) to find an approximate solution.
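And a one-line check that $x = 169$ really works (a small Python sketch of my own):

    x = 169
    u = (14 + x**0.5) ** (1/3)     # cube root of 14 + 13 = 27, i.e. 3
    v = (14 - x**0.5) ** (1/3)     # cube root of 14 - 13 = 1,  i.e. 1
    print(u + v)                   # 4.0 (up to round-off), as required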


2.18 Algebraic and transcendental equations


The linear, quadratic and cubic equations that we have just seen belong to the general class of
algebraic equations. Other equations, which contain polynomials, trigonometric functions, logarithmic
functions, exponential functions etc., are called transcendental equations. For example,
$x = \cos x$ is a transcendental equation.

Definition 2.18.1
A polynomial equation of the form $f(x) = a_nx^n + a_{n-1}x^{n-1} + a_{n-2}x^{n-2} + \cdots + a_1x + a_0 = 0$
is called an algebraic equation. An equation which contains polynomials, trigonometric
functions, logarithmic functions, exponential functions etc. is called a transcendental equation.

In Section 2.13 we solved linear/quadratic/cubic equations directly. That is, the solutions
of these equations can be expressed in terms of radicals of the coefficients in the equations, e.g.
$x = \frac{-b \pm \sqrt{b^2-4ac}}{2a}$ in the case of quadratic equations. It is also possible to do the same thing for
fourth-order algebraic equations (the formula is too lengthy to be presented here). But, as the
French mathematician and political activist Évariste Galois (1811 – 1832) showed us, polynomials
of fifth order and beyond have no closed-form solutions using radicals. Why are fifth-order
equations so hard? To answer this question, we need to delve into so-called abstract algebra–
a field about symmetries and groups. I do not know much about this branch of mathematics, so
I do not discuss it here. I strongly recommend Ian Stewart's book Why Beauty Is Truth: The
History of Symmetry [62].
For transcendental equations, we need to use numerical methods, i.e., methods that
give approximate solutions, not exact ones expressed in terms of the coefficients in the equations.
For example, a numerical method would give the solution $x = 0.73908513$ to the equation
$\cos x - x = 0$. We refer to Section 4.5.4 for a discussion on this topic.
Associated with algebraic equations and transcendental equations we have algebraic and
transcendental numbers, respectively. An algebraic number is any complex number (including
real numbers) that is a root of a non-zero polynomial in one variable with rational coefficients
(or equivalently, by clearing denominators, with integer coefficients). All integers and rational
numbers are algebraic, as are all roots of integers. Real and complex numbers that are not
algebraic, such as $\pi$ and $e$, are called transcendental numbers. If you're fascinated by numbers,
check out [54].

2.19 Rules of powers (exponentiation)


If a square has a side of length $x$, its area is $x^2$, and the volume of a cube of side $x$ is $x^3$. In
general, if $n$ is a positive integer and $x$ is any real number, then $x^n$ corresponds to the repeated
multiplication $x \times x \times \cdots \times x$, where $x$ appears $n$ times in the product. We call this $x$ raised to
the power of $n$, or $x$ to the power of $n$. Here, $x$ is the base and $n$ is the exponent or the power.
This section is about the arithmetic of such powers $x^n$. That is, how we can do addition/subtraction
and multiplication/division of powers. For example, what is $(2^3)\times(2^5)$? We start humbly


with the case $x = 2$ in Section 2.19.1. Section 2.19.2 then considers powers with an irrational exponent.

2.19.1 Powers of 2
We know that $2^2 = 2 \times 2$ and $2^3$ is nothing but $2 \times 2 \times 2$. The product of a number (here 2) by
itself a number of times is called a power. Herein, we discuss this concept. Two to the power
four is two multiplied by itself four times, which is expressed as

$2^4 := \underbrace{2 \times 2 \times 2 \times 2}_{4 \text{ times}}$    (2.19.1)

Thus, $2^4$ is nothing but a shorthand for $2 \times 2 \times 2 \times 2$. So, for a positive integer exponent, a
power is just a repeated multiplicationŽŽ.
We can deduce rules for common operations with powers. I first summarize these rules below,
for reals $a, b$ and positive integers $m, n$:

(a) Product rule 1: $a^m \times a^n = a^{m+n}$
(b) Quotient rule 1: $a^m/a^n = a^{m-n}$
(c) Power of a power: $(a^m)^n = a^{mn}$
(d) Zero exponent: $a^0 = 1$
(e) Negative exponent: $a^{-n} = 1/a^n$
(f) Rational exponent: $a^{m/n} = \sqrt[n]{a^m}$
(g) Product rule 2: $(ab)^n = a^n b^n$
(h) Quotient rule 2: $(a/b)^n = a^n/b^n$
    (2.19.2)
And our task is to prove them in what follows. We use the specific number 2, but shall generalize
the result to any real number.
For example, multiplication of two powers of two is given by ($m, n$ are positive integers)

$2^m \times 2^n := \underbrace{(2 \times 2 \times \cdots \times 2)}_{m \text{ times}} \times \underbrace{(2 \times 2 \times \cdots \times 2)}_{n \text{ times}} = 2^{m+n}$    (2.19.3)

which basically says that to multiply two powers with the same base (2 here), you keep the
base and add the exponents. And this is the product rule $a^m \times a^n = a^{m+n}$ for $m, n \in \mathbb{N}$Ž.
The next thing is certainly division of two powers. Division of two powers of two is written
as

$\dfrac{2^m}{2^n} = 2^{m-n}$    (2.19.4)

If that was not clear, we can always check a concrete case. For example,

$\dfrac{2^5}{2^3} = \dfrac{2 \times 2 \times 2 \times 2 \times 2}{2 \times 2 \times 2} = 2 \times 2 = 2^2 = 2^{5-3}$

ŽŽ We played the same game before: multiplication (of two integers) is a repeated addition. Now, we define a new mathematical object based on repeated multiplication. Why? Because it saves time.
Ž Refer to Section 2.25.8 for what $\mathbb{N}$ is. Briefly, it is the set (collection) of all natural numbers. Instead of writing the lengthy "$n$ is a natural number", mathematicians write $n \in \mathbb{N}$.


How about raising a power to a power, i.e., a power of a power such as $(2^3)^2$? It's $8^2 = 64 = 2^6$. And we
generalize this to $(2^m)^n$:

$(2^m)^n := \underbrace{\underbrace{(2 \cdots 2)}_{m \text{ times}} \times \underbrace{(2 \cdots 2)}_{m \text{ times}} \times \cdots \times \underbrace{(2 \cdots 2)}_{m \text{ times}}}_{n \text{ times}} = 2^{mn}$    (2.19.5)

And we also have the result $(2^m)^n = (2^n)^m$, as both are equal to $2^{mn}$.
So far so good: we have rules for powers with positive integer exponents. How about zero and
negative exponents, e.g. $2^0$ and $2^{-1}$? To answer these questions, again we follow the rule applied to
$(-1)\times(-1) = 1$: the new rule should be consistent with the old rules. Thus, we compute $2^3, 2^2, 2^1$
to see the pattern and deduce what $2^0$ and $2^{-1}$ should be. From the data in Table 2.8: $2^0 = 1$ and
$2^{-1} = 1/2$: in this table, going down from the top row, the value in any row of the third
column is obtained by dividing the value of the previous row by two. Thus, $2^{-2} = 1/4 = 1/2^2$.
So, we have $2^{-n} = 1/2^n$.

Table 2.8: Powers of 2 with positive, zero and negative exponents.

n     2^n               Value
3     2 x 2 x 2         8
2     2 x 2             4
1     2                 2
0     2^0               1
-1    2^{-1}            1/2

The next natural question is how to define powers with a rational exponent, e.g. $2^{1/2}$. It turns out that
it is nothing but $\sqrt{2}$. Why? Suppose that we write $\sqrt{2} = 2^a$ and we want to find $a$. We first
write

$2 = \sqrt{2} \times \sqrt{2} = 2^a \times 2^a$

Now, we want the product rule to still be valid even though $a$ is not an integer (we do not yet know what it
is). So, we have

$2 = 2^{a+a} = 2^{2a} \;\Longrightarrow\; 2a = 1 \;\Longrightarrow\; a = \dfrac{1}{2}$

So,

$2^{1/2} = \sqrt{2}$    (2.19.6)

which reads '2 to the power of 1/2 is the square root of 2'; nothing new comes up here. This
also indicates that the power-of-a-power rule is assumed to hold for rational exponents. We have
$(2^{1/2})^2 = 2^{(1/2)\cdot 2} = 2^1 = 2$, which is true because $2^{1/2}$ is the square root of 2.
In the same manner, $2^{1/3}$ is computed as

$(2^{1/3})^3 = 2^{(1/3)\cdot 3} = 2 \;\Longrightarrow\; 2^{1/3} = \sqrt[3]{2}$


And $2^{5/3}$ is computed as

$(2^{5/3})^3 = 2^{(5/3)\cdot 3} = 2^5 \;\Longrightarrow\; 2^{5/3} = \sqrt[3]{2^5}$

We can now generalize these results to base $a$, a real number. This is obtained
by replacing 2 by $a$–a real number–as in the previous development there is nothing special about 2;
what we have done for 2 works exactly the same for any real number. That's why we have Eq. (2.19.2).
Now that we have defined powers with a rational exponent $a^{m/n}$, do all the rules (e.g. the
product rule) still apply for such powers? That is, do we still have $a^{m/n}a^{p/q} = a^{m/n + p/q}$? To
gain insight, we can try a few examples. For instance, $3^{1/2}\times 3^{1/2}$ equals 3 (from the square), but it is
also equal to 3 from $3^{1/2+1/2} = 3^1$. Now we need a proof, once and for all!

Proof. We write $a^{m/n}$ as $a^{qm/qn}$ and $a^{p/q}$ as $a^{pn/qn}$; then it follows that

$a^{\frac{qm}{qn}}\, a^{\frac{pn}{qn}} = \sqrt[qn]{a^{qm}}\,\sqrt[qn]{a^{pn}} = \sqrt[qn]{a^{qm}a^{pn}} = \sqrt[qn]{a^{qm+pn}} = a^{\frac{qm+pn}{qn}} = a^{m/n + p/q}$

A bit of history

A bit of history about the notation for exponents is in order. The notation we use today to
denote an exponent was first used by the Scottish mathematician James Hume in 1636. However,
he used Roman numerals for the exponents. Using Roman numerals as exponents
became problematic since many of the exponents became very large, so Hume's notation
didn't last long. A year later, in 1637, René Descartes became the first mathematician
to use the Hindu-Arabic numerals of today as exponents. It was Newton who first used
powers with negative and rational exponents. Before him, Wallis wrote $1/a^2$ instead of $a^{-2}$.

2.19.2 Power with an irrational index


For a number raised to a fractional exponent, i.e., $a^{p/q}$, the result is the denominator-th root of the number raised to the numerator, i.e., $\sqrt[q]{a^p}$. Again, we should ask ourselves this question: what happens when you raise a number to an irrational number? Obviously it is not so simple to break it down like what we have done in e.g. Eq. (2.19.6).
What is $2^{\sqrt{2}}$? It cannot be 2 multiplied by itself $\sqrt{2}$ times! So, the definition in Eq. (2.19.1) no longer works. In other words, the starting point that a power is just a repeated multiplication is no longer valid. This situation is similar to the fact that multiplication as a repeated addition ($2\times 3 = 2+2+2$) does not apply to $2\times 3.4$.
To see what $2^{\sqrt{2}}$ might be, we can proceed as follows, without a calculator of course. Otherwise we would not learn anything interesting but a meaningless number. We approximate $\sqrt{2}$ successively by 1.4, 1.41, 1.414, etc. and we compute the corresponding powers (e.g. $2^{1.4} = 2^{14/10} = 2^{7/5} = \sqrt[5]{2^7}$). The results given in Table 2.9 show that as a more accurate approximation of the square root of 2 is used, the powers converge to a value. Note that I have used a calculator to compute each approximation of $2^{\sqrt{2}}$, e.g. $2^{14/10} = \sqrt[10]{2^{14}}$. This is not cheating as the main point here is to get the value of these approximations.

Table 2.9: Calculation of $2^{\sqrt{2}}$.

    exponent              as a root                                  value
    $2^{1.4}$             $2^{14/10} = \sqrt[10]{2^{14}}$            2.6390158
    $2^{1.41}$            $2^{141/100} = \sqrt[100]{2^{141}}$        2.6573716
    $2^{1.414}$           $2^{1414/1000} = \sqrt[1000]{2^{1414}}$    2.6647496
    $2^{1.4142}$                                                     2.6651190
    $2^{1.41421}$                                                    2.6651190
    $2^{1.41421356}$                                                 2.6651441383063186
    $2^{1.414213562}$                                                2.665144142000993
    $2^{1.4142135623}$                                               2.665144142555194
    $2^{1.41421356237}$                                              2.6651441426845075
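If you would rather let a computer produce Table 2.9, a small Julia sketch such as the one below does the job (the number of decimal places looped over is an arbitrary choice here):

```julia
# Powers of 2 with successively better rational approximations of sqrt(2) as exponent.
for d in 1:10
    r = round(sqrt(2), digits=d)      # 1.4, 1.41, 1.414, ...
    println("2^", r, " = ", 2.0^r)
end
println("2^sqrt(2) = ", 2.0^sqrt(2))  # the value the approximations converge to
```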

p
But, how can we be sure that $2^{\sqrt{2}}$ is a number? This can be guaranteed by looking at the function $2^x$ as shown in Fig. 2.25. There is no hole in this curve, that is, the function is continuous, so $2^{\sqrt{2}}$ must exist.
Do the rules of powers still apply for an irrational index? Do we still have $a^x a^y = a^{x+y}$ with $x, y$ being irrational numbers? If so, we say that the power rules work for real numbers, and we're nearly done (if we did not have complex numbers). How to prove this? One easy but not rigorous way is to say that we can always replace $a^x$ by $a^r$ with $r$ a rational number close to $x$, and $a^y$ by $a^t$. Thus $a^x a^y \approx a^r a^t = a^{r+t}$. To conclude, Eq. (2.19.2) is valid for real numbers $a, b$ and for real exponents (e.g. $m$ can be $\sqrt{2}$).
We have calculated $2^{\sqrt{2}}$ by approximating the square root of 2 with a rational number, e.g. $2^{1414/1000} = \sqrt[1000]{2^{1414}}$. However, calculating the 1000th root is not an easy task. There must be a better way to compute $2^x$ for any real number $x$ directly and efficiently. For this, we need calculus (Chapter 4). That is, algebra can only help us so far; to go further we need new mathematics.

Adding up powers of two. Let's consider the summation of powers of two starting from $2^0$ up to $2^{n-1}$:
$$S(n) = 1 + 2 + 4 + 8 + \cdots + 2^{n-1} = \sum_{i=0}^{n-1} 2^i = ? \qquad (2.19.7)$$
I have added the shorthand notation using the sigma $\sum$ just so that people not familiar with it can practice using it; it is not needed for our purpose here. To find the expression for $S(n)$, we need to get our hands dirty by computing $S(n)$ for a number of values of $n$. The results for $n = 1, 2, 3, 4$ are tabulated in Table 2.10. From this data we can find a pattern (see columns 3


Figure 2.25: Plot of the function $2^x$.

and 4 of this table). And this brings us to the following conjecture:


$$S(n) = 1 + 2 + 4 + 8 + \cdots + 2^{n-1} = 2^n - 1 \qquad (2.19.8)$$
And if we can prove that this conjecture is correct then we have discovered a theorem.

Table 2.10: $S(n) = 1 + 2 + 4 + 8 + \cdots + 2^{n-1}$.

    n    S(n)    S(n)      S(n)
    1    1       2 - 1     2^1 - 1
    2    3       4 - 1     2^2 - 1
    3    7       8 - 1     2^3 - 1
    4    15      16 - 1    2^4 - 1

Proof. It is easy to see that $S(1)$ is correct ($1 = 2^1 - 1$). Now, assume that $S(k)$ is correct, i.e.
$$1 + 2 + 4 + 8 + \cdots + 2^{k-1} = 2^k - 1$$
Multiplying this equation by two results in
$$2 + 4 + 8 + \cdots + 2^k = 2\cdot 2^k - 2 = 2^{k+1} - 2$$
and adding 1 to both sides gives
$$1 + 2 + 4 + 8 + \cdots + 2^k = 2^{k+1} - 1$$
So, $S(k+1)$ is correct. $\blacksquare$
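A quick numerical check of the conjecture is a few lines in Julia; the range of $n$ below is arbitrary:

```julia
# Compare the brute-force sum 1 + 2 + ... + 2^(n-1) with the closed form 2^n - 1.
for n in 1:10
    println(n, "  ", sum(2^i for i in 0:n-1), "  ", 2^n - 1)
end
```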

Why powers?

I think that the concept of power emerged from practical geometry problems. If you have
a square of length 2, what is the area? It is 2  2 or two squared. If you have a cube of
length 2, the volume is 222 or two cubed. The notation 23 is just a convenient shortcut
for 2  2  2. Then, mathematicians generalize to an for any n.


Becoming mathematician like.

What is $\bigl(\sqrt{2}^{\sqrt{2}}\bigr)^{\sqrt{2}}$? It is 2, an integer! You can check this using a calculator and then prove it using the rules of powers that you're now familiar with. Let's go crazy: how about $\pi^{\pi}$?

The second power of x or x squared?

We know that x; x 2 ; x 3 are called the first, second and third powers of x. But we also
know that x 2 is written/read x squared and x 3 as x cubed. Why? This is because ancient
Greek mathematicians see x 2 as the area of a square of side x.

Scientific notation.

When working with very large numbers such as 3 trillion we do not write it as 3 000 000 000 000 as there are too many zeros. Instead, we write it as $3\times 10^{12}$ (there are 12 zeros explicitly written). Any number can be written as the product of a number between 1 and 10 and a power of ten. For example, we can write 257 as $2.57\times 10^2$ and 0.00257 as $2.57\times 10^{-3}$. This system is called scientific notation.
Doing arithmetic with this notation is easier due to the properties of exponents. For example, when we multiply numbers, we multiply the coefficients and add the exponents:
$$(3\times 10^6)\times(4\times 10^8) = (3\times 4)\times 10^{14} = 12\times 10^{14} = 1.2\times 10^{15}$$
The scientific notation immediately reveals how big a number is. We use the order of magnitude to measure a number. Generally, the order of magnitude of a number is the smallest power of 10 used to represent that number. For example, $257 = 2.57\times 10^2$, so it has an order of magnitude of 2.

2.20 Inequalities
In mathematics, an inequality is a relation which makes a non-equal comparison between two
numbers or mathematical expressions. It is used most often to compare two numbers on the
number line by their size. There are several different notations used to represent different kinds
of inequalities:

• The notation $a < b$ means that $a$ is less than $b$.

• The notation $a > b$ means that $a$ is greater than $b$.

• The notation $a \le b$ means that $a$ is less than or equal to $b$.


Inequalities are governed by the following properties:

    (a)  transitivity:    if $a \le b$ and $b \le c$ then $a \le c$
    (b)  addition:        if $x \le y$ and $a \le b$ then $x + a \le y + b$
    (c1) multiplication:  if $x \le y$ and $a \ge 0$ then $ax \le ay$          (2.20.1)
    (c2) multiplication:  if $x \le y$ and $a \le 0$ then $ax \ge ay$
    (d)  reciprocals:     if $x \le y$ and $xy > 0$ then $1/x \ge 1/y$

I skip the proof of these simple properties herein. But if you find one which is not obvious you
should convince yourself by proving it.
Section 2.20.1 presents some simple inequality problems. Section 2.20.2 is about inequalities
involving the arithmetic and geometric means. The Cauchy-Schwarz inequality is introduced
in Section 2.20.3. Next, inequalities concerning absolute values are treated in Section 2.20.4.
Solving inequalities, e.g. finding $x$ such that $|x - 5| \le 3$, is presented in Section 2.20.5. And
finally, how inequality can be used to solve equations is given in Section 2.20.6.

2.20.1 Simple proofs


Let’s solve the following inequality problems. Of course we’re forbidden to use a computer/cal-
culator. I repeat that we are not interested in which term is larger; instead we’re interested in the
mathematical techniques used in solving these inequality problems. Our task is to compare the
left hand side and right hand side terms (alternatively replacing the question mark by either > or
< symbol):
1. $\sqrt{19} + \sqrt{99} \;?\; \sqrt{20} + \sqrt{98}$

2. $\dfrac{1998}{1999} \;?\; \dfrac{1999}{2000}$

3. $\dfrac{10^{1999}+1}{10^{2000}+1} \;?\; \dfrac{10^{1998}+1}{10^{1999}+1}$

4. $1999^{1999} \;?\; 2000^{1998}$

One simple technique is to transform the given inequalities into easier ones. For the first problem, we square both sides:
$$19 + 99 + 2\sqrt{19\cdot 99} \;?\; 20 + 98 + 2\sqrt{20\cdot 98}$$
$$\sqrt{19\cdot 99} \;?\; \sqrt{20\cdot 98}$$
$$19\cdot 99 \;?\; 20\cdot 98 = (19+1)\cdot 98$$
$$19\cdot 99 \;?\; 19\cdot 98 + 98$$
$$19 \;?\; 98$$
Now we know "?" should be $<$, thus $\sqrt{19} + \sqrt{99} < \sqrt{20} + \sqrt{98}$.


For the second problem, let's first remove the fractions:
$$\frac{1998}{1999} \;?\; \frac{1999}{2000} \iff 1998\cdot 2000 \;?\; 1999^2$$
Now comes the trick; we replace 1999 by $0.5(1998 + 2000)$, and the solution follows immediately:
$$1998\cdot 2000 \;?\; \left(\frac{1998+2000}{2}\right)^2, \qquad 4\cdot 1998\cdot 2000 < (1998+2000)^2$$
because $(x+y)^2 \ge 4xy$ for all $x, y$ due to $(x-y)^2 \ge 0$. Hence $1998/1999 < 1999/2000$.

Let's solve this inequality another way. Looking at $1998/1999$ and $1999/2000$, they are both of the form $x/(1+x)$. So, if we consider the function $f(x) = x/(1+x)$, then our problem becomes comparing $f(1998)$ and $f(1999)$. If $f(x)$ is either a monotonically increasing or decreasing function, then we know the answer to the question. To reveal the nature of $f(x)$ we need to massage it a bit: $f(x) = 1/(1 + 1/x)$. And with this form, $f(x)$ is a monotonically increasing function. Thus, $f(1998) < f(1999)$.
For the third problem, denoting $10^{1998}$ by $x$ and writing the other numbers in terms of $x$, we will have a simple inequality.
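None of these require a computer, but if you want reassurance, Julia's big-number types can check all four comparisons directly. This is only a sanity check, not a proof; each line prints whether the left term is the smaller one:

```julia
# Sanity checks with BigFloat, exact rationals (//) and BigInt; not proofs.
println("1: ", sqrt(big(19)) + sqrt(big(99)) < sqrt(big(20)) + sqrt(big(98)))
println("2: ", 1998//1999 < 1999//2000)
println("3: ", (big(10)^1999 + 1)//(big(10)^2000 + 1) < (big(10)^1998 + 1)//(big(10)^1999 + 1))
println("4: ", big(1999)^1999 < big(2000)^1998)
```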

2.20.2 Inequality of arithmetic and geometric means


Starting with $(x-y)^2 \ge 0$, one can get $(x+y)^2 \ge 4xy$. Now consider only non-negative $x, y$; then by taking the square root of both sides of $(x+y)^2 \ge 4xy$, we have
$$\frac{x+y}{2} \ge \sqrt{xy} \qquad (2.20.2)$$
As the left hand side is the arithmetic mean (AM) of $x, y$ and the RHS is the geometric mean (GM), this inequality is known as the AM-GM inequality. A geometric explanation of this inequality is given in the next figure, where $AM = \frac{a+b}{2}$ and $GM = \sqrt{ab}$. Consider a circle with diameter $BC$ and center $O$. Select a point $A$ on the circle; the triangle $ABC$ is then a right triangle (if this is not clear, check Fig. 3.17). Draw $AH$ perpendicular to $BC$, with $BH = a$ and $HC = b$. Then $(a+b)/2 = OM$, the radius of the circle. It can be shown that $AH = \sqrt{ab}$ (see the footnote). It is obvious that when $A$ travels on the circle we always have $AH \le OM$.

Footnote: There are different ways to see this. Using trigonometry, from the two right triangles $ABH$ and $ACH$ we have $AH = a\tan\alpha$ and $AH = b\tan\beta$. Then $AH^2 = ab\tan\alpha\tan\beta$. But $\tan\alpha\tan\beta = 1$. In fact, this result is the geometric mean theorem in Euclidean geometry, see Section 3.1.10.
geometric mean theorem in Euclidean geometry, see Section 3.1.10.


Now we should ask this question: does this AM-GM inequality hold for more than 2 numbers? For example, do we also have
$$\frac{a+b+c}{3} \ge \sqrt[3]{abc}, \qquad \frac{a+b+c+d}{4} \ge \sqrt[4]{abcd}$$
for $a, b, c, d \ge 0$?
Let's first check the case of 4 numbers as it is easier. Indeed, using the AM-GM for the case of two numbers, we can write
$$\frac{a+b}{2} \ge \sqrt{ab}, \quad \frac{c+d}{2} \ge \sqrt{cd} \implies \frac{a+b+c+d}{2} \ge \sqrt{ab} + \sqrt{cd}$$
Using again the AM-GM for the two numbers $\sqrt{ab}$ and $\sqrt{cd}$, we get what we wanted to verify (see footnote §):
$$\frac{a+b+c+d}{2} \ge 2\sqrt{\sqrt{ab}\,\sqrt{cd}}, \quad\text{or}\quad \frac{a+b+c+d}{4} \ge \sqrt[4]{abcd}$$
Now, we show that using the AM-GM for 4 numbers, we can get the AM-GM for 3 numbers. The idea is of course to remove $d$ so that only the three numbers $a, b, c$ are left. Using $d = (a+b+c)/3$ (this is exactly the term we need to appear), and the AM-GM inequality for 4 numbers, we have
$$\frac{a+b+c+\frac{a+b+c}{3}}{4} \ge \sqrt[4]{abc\,\frac{a+b+c}{3}} \qquad (2.20.3)$$
which is equivalent to
$$\frac{a+b+c}{3} \ge \sqrt[4]{abc\,\frac{a+b+c}{3}}$$
and raising both sides to the 4th power, then cancelling one factor of $\frac{a+b+c}{3}$, gives the final result:
$$\frac{a+b+c}{3} \ge \sqrt[3]{abc}$$
Good! Should we aim higher? Of course. We have the AM-GM for $n = 2, 4$, and certainly we have similar inequalities for $n = 2^k$ for $k \in \mathbb{N}$. And from $n = 4$ we obtained the AM-GM for $n = 3$. We can start from $n = 32$ and get $n = 31$, and so on. It seems that we have the general AM-GM inequality given by

Footnote (§): If it is not clear to you that $S = \sqrt{\sqrt{xy}} = \sqrt[4]{xy}$, here are the details: $S = ((xy)^{1/2})^{1/2} = (xy)^{1/4}$.


$$\frac{a_1 + a_2 + \cdots + a_n}{n} \ge \sqrt[n]{a_1 a_2 \cdots a_n} \qquad (2.20.4)$$
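Before proving it, it is easy (and reassuring) to poke at Eq. (2.20.4) numerically. The Julia sketch below draws random non-negative numbers and compares the two means; the sample size and seed are arbitrary choices:

```julia
# Random spot-check of the AM-GM inequality for n = 7 numbers.
using Random
Random.seed!(1)
for trial in 1:5
    a  = 10 .* rand(7)               # 7 non-negative numbers
    am = sum(a) / length(a)          # arithmetic mean
    gm = prod(a)^(1 / length(a))     # geometric mean
    println(am >= gm)                # expected to print true every time
end
```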

I present a proof of this inequality carried out by the French mathematician, civil engineer, and physicist Augustin-Louis Cauchy (1789–1857), presented in his Cours d'analyse. This book is frequently noted as being the first place that inequalities, and $\delta$-$\epsilon$ arguments, were introduced into Calculus. Judith Grabiner wrote that Cauchy was "the man who taught rigorous analysis to all of Europe". The AM-GM inequality is a special case of the Jensen inequality discussed in Section 4.5.2.
Cauchy used a forward-backward induction. In the forward step, he proved the AM-GM inequality for $n = 2^k$ for any counting number $k$. This is a generalization of what we did for the $n = 4$ case. In the backward step, assuming that the inequality holds for $n = k$, he proved that it holds for $n = k - 1$ too.
Proof. Cauchy's forward-backward induction of the AM-GM inequality. Forward step. Assume the inequality holds for $n = k$; we prove that it holds for $n = 2k$. As the inequality is true for $k$ numbers, we can write
$$\frac{a_1 + a_2 + \cdots + a_k}{k} \ge \sqrt[k]{a_1 a_2 \cdots a_k}, \qquad \frac{a_{k+1} + a_{k+2} + \cdots + a_{2k}}{k} \ge \sqrt[k]{a_{k+1} a_{k+2} \cdots a_{2k}}$$
Adding the above inequalities, we get
$$\frac{a_1 + a_2 + \cdots + a_{2k}}{k} \ge \sqrt[k]{a_1 a_2 \cdots a_k} + \sqrt[k]{a_{k+1} a_{k+2} \cdots a_{2k}}$$
Applying the AM-GM for the two numbers on the RHS of the above, we obtain
$$\frac{a_1 + a_2 + \cdots + a_{2k}}{k} \ge 2\sqrt{\sqrt[k]{a_1 a_2 \cdots a_k}\,\sqrt[k]{a_{k+1} a_{k+2} \cdots a_{2k}}} = 2\sqrt[2k]{a_1 a_2 \cdots a_{2k}}$$
Dividing both sides by 2 gives the AM-GM inequality for $2k$ numbers. $\blacksquare$

Proof. Cauchy's forward-backward induction of the AM-GM inequality. Backward step. Assume the inequality holds for $n = k$; we prove that it holds for $n = k - 1$. As the inequality is true for $k$ numbers, we can write
$$\frac{a_1 + a_2 + \cdots + a_k}{k} \ge \sqrt[k]{a_1 a_2 \cdots a_k}$$
To get rid of $a_k$, we replace it by $\frac{a_1 + a_2 + \cdots + a_{k-1}}{k-1}$, and the above inequality becomes
$$\frac{a_1 + a_2 + \cdots + a_{k-1} + \frac{a_1 + a_2 + \cdots + a_{k-1}}{k-1}}{k} \ge \sqrt[k]{a_1 a_2 \cdots a_{k-1}\,\frac{a_1 + a_2 + \cdots + a_{k-1}}{k-1}}$$
A bit of rearrangement of the LHS gives us
$$\frac{a_1 + a_2 + \cdots + a_{k-1}}{k-1} \ge \sqrt[k]{a_1 a_2 \cdots a_{k-1}\,\frac{a_1 + a_2 + \cdots + a_{k-1}}{k-1}}$$
Raising both sides of the above inequality to the $k$th power:
$$\left(\frac{a_1 + a_2 + \cdots + a_{k-1}}{k-1}\right)^k \ge a_1 a_2 \cdots a_{k-1}\,\frac{a_1 + a_2 + \cdots + a_{k-1}}{k-1}$$
and, after cancelling one factor of $\frac{a_1 + a_2 + \cdots + a_{k-1}}{k-1}$, we get what we needed:
$$\frac{a_1 + a_2 + \cdots + a_{k-1}}{k-1} \ge \sqrt[k-1]{a_1 a_2 \cdots a_{k-1}} \qquad\blacksquare$$

Isoperimetric problems. If $x + y = P$, where $P$ is a given positive number, then Eq. (2.20.2) gives us $xy \le P^2/4$. And the maximum of $xy$ is attained when $x = y$. In other words, among all rectangles (of sides $x$ and $y$) with a given perimeter $P$, the square has the maximum area. Actually we can also discover this fact using only arithmetic, see Table 2.11. This is a special case of the so-called isoperimetric problems. An isoperimetric problem is to determine a plane figure of the largest possible area whose boundary has a specified length.
The Roman poet Publius Vergilius Maro (70–19 B.C.) tells in his epic Aeneid the story of
queen Dido, the daughter of the Phoenician king of the 9th century B.C. After the assassination
of her husband by her brother she fled to a haven near Tunis. There she asked the local leader,
Yarb, for as much land as could be enclosed by the hide of a bull. Since the deal seemed very
modest, he agreed. Dido cut the hide into narrow strips, tied them together and encircled a
large tract of land which became the city of Carthage (Fig. 2.26). Dido knew the isoperimetric
problem!

Figure 2.26: Queen Dido’s isoperimetric problem.

Another isoperimetric problem is ‘Among all planar shapes with the same perimeter the
circle has the largest area.’ How can we prove this? We present a simple ‘proof’:

1. Among triangles of the same perimeter, an equilateral triangle has the maximum area;

2. Among quadrilaterals of the same perimeter, a square has the maximum area;


3. Among pentagon of the same perimeter, a regular pentagon has the maximum area;

4. Given the same perimeter, a square has a larger area than an equilateral triangle;

5. Given the same perimeter, a regular pentagon has a larger area than a square

We can verify these results. And we can see where this reasoning leads us to: given a perimeter,
a regular polygon with infinite sides has the largest area, and that special polygon is nothing but
our circle!

Table 2.11: Given two whole numbers such that n C m D 10 what is the maximum of nm.

n m nm

1 9 9
2 8 16
3 7 21
4 6 24
5 5 25

Now, let's solve the following problem: assume that $a, b, c, d$ are positive integers with $a + b + c + d = 63$; find the maximum of $ab + bc + cd$. This is clearly an isoperimetric problem. The term $A = ab + bc + cd$ is not nice to $a$ and $d$ in the sense that $a$ and $d$ appear only once. So, let's bring justice to them (or make the term symmetrical): $A = ab + bc + cd + da - da$. A bit of algebra leads to $A = (a+c)(b+d) - da$.
Now we visualize $A$ as in Fig. 2.27. The problem becomes: maximize the area of the big rectangle and minimize the small area $ad$. The small area is 1 when $a = d = 1$. Now the problem becomes easy.

Figure 2.27
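Since the numbers involved are small, the reasoning above can also be confirmed by an exhaustive search; the Julia sketch below (the function name is mine) simply tries every admissible $(a, b, c, d)$:

```julia
# Brute-force search for the maximum of ab + bc + cd subject to a+b+c+d = 63, a,b,c,d >= 1.
function max_abcd(total = 63)
    best, arg = 0, (0, 0, 0, 0)
    for a in 1:total-3, b in 1:total-3, c in 1:total-3
        d = total - a - b - c
        d < 1 && continue
        v = a*b + b*c + c*d
        if v > best
            best, arg = v, (a, b, c, d)
        end
    end
    return best, arg
end
println(max_abcd())
```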


2.20.3 Cauchy–Schwarz inequality


For $a, b, c, x, y, z$ real numbers, the Cauchy–Schwarz inequalities read:
$$(ax + by)^2 \le (a^2 + b^2)(x^2 + y^2), \qquad (ax + by + cz)^2 \le (a^2 + b^2 + c^2)(x^2 + y^2 + z^2) \qquad (2.20.5)$$
The proof of these inequalities is straightforward. Just expand all the terms, and we will end up with $(ay - bx)^2 \ge 0$ for the first inequality and $(ay - bx)^2 + (az - cx)^2 + (bz - cy)^2 \ge 0$ for the second inequality, which are certainly true. Can we have a geometric interpretation of $(ax + by)^2 \le (a^2 + b^2)(x^2 + y^2)$? Yes, see Fig. 2.28; the area of the parallelogram $EFGH$ is the area of the big rectangle $ABCD$ minus the areas of all the triangles:
$$A = (a+c)(b+d) - (ab + dc) = ad + bc$$
But this area is at most equal to $\sqrt{a^2+b^2}\,\sqrt{c^2+d^2}$, stemming from the fact that the area of a parallelogram is maximum when it is a rectangle (proof in the right figure of Fig. 2.28: $xy\sin\alpha$ attains the maximum value of $xy$ when $\sin\alpha = 1$, i.e. $\alpha = \pi/2$). Thus, we have
$$ad + bc \le \sqrt{a^2+b^2}\,\sqrt{c^2+d^2} \quad\text{or}\quad (ad + bc)^2 \le (a^2+b^2)(c^2+d^2)$$

Figure 2.28: Geometric meaning of $(ax + by)^2 \le (a^2+b^2)(x^2+y^2)$. (Left: the parallelogram $EFGH$, with sides $\sqrt{a^2+b^2}$ and $\sqrt{c^2+d^2}$, inside the rectangle $ABCD$; right: a parallelogram with sides $x, y$ and angle $\alpha$ has area $A = xy\sin\alpha$.)

Now you might have guessed correctly what we are going to do. We generalize Eq. (2.20.5) to
$$(a_1 b_1 + a_2 b_2 + \cdots + a_n b_n)^2 \le (a_1^2 + a_2^2 + \cdots + a_n^2)(b_1^2 + b_2^2 + \cdots + b_n^2), \quad\text{i.e.}\quad \left(\sum_{i=1}^n a_i b_i\right)^2 \le \left(\sum_{i=1}^n a_i^2\right)\left(\sum_{i=1}^n b_i^2\right) \qquad (2.20.6)$$

And this is the Cauchy–Schwarz inequality, also known as the Cauchy–Bunyakovsky–Schwarz


inequality. The inequality for sums, Eq. (2.20.6), was published by Augustin-Louis Cauchy
(1821), while the corresponding inequality for integrals was first proved by Viktor Bunyakovsky


(1859). The modern proof of the integral version was given by Hermann Schwarz (1888). The
Cauchy–Schwarz inequality is a useful inequality in many mathematical fields, such as vector
algebra, linear algebra, analysis, probability theory etc. It is considered to be one of the most
important inequalities in all of mathematics.
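As with the AM-GM inequality, a quick random test of Eq. (2.20.6) in Julia costs nothing; the vector length and the uniform random numbers are arbitrary choices here:

```julia
# Random spot-check of the Cauchy-Schwarz inequality for two vectors of length 10.
a, b = rand(10), rand(10)
lhs = sum(a .* b)^2
rhs = sum(a .^ 2) * sum(b .^ 2)
println(lhs, " <= ", rhs, " : ", lhs <= rhs)   # expected: true
```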
We need to prove Eq. (2.20.6), but let’s first use Eq. (2.20.5) to prove some interesting
inequalities given below.
1. Example 1. For $a, b, c > 0$, prove that $(a^2b + b^2c + c^2a)(ab^2 + bc^2 + ca^2) \ge 9a^2b^2c^2$.

2. Example 2. For $a, b, c \ge 0$, prove that $\sqrt{3(a+b+c)} \ge \sqrt{a} + \sqrt{b} + \sqrt{c}$.

3. Example 3. For $a, b, c, d > 0$, prove that $1/a + 1/b + 4/c + 16/d \ge 64/(a+b+c+d)$.

4. Example 4. Let $a, b, c > 0$ and $abc = 1$; prove that:
$$\frac{1}{a^3(b+c)} + \frac{1}{b^3(c+a)} + \frac{1}{c^3(a+b)} \ge \frac{3}{2}$$
This is one question from IMO 1995. The International Mathematical Olympiad (IMO) is a mathematical Olympiad for pre-university students, and is the oldest of the International Science Olympiads.
Example 1: use Eq. (2.20.5) for $(a^2b + b^2c + c^2a)(ab^2 + bc^2 + ca^2)$ to get $(a^2b + b^2c + c^2a)(ab^2 + bc^2 + ca^2) \ge (\ldots)^2$, and then use the 3-variable AM-GM inequality for the $\ldots$. Example 2: a direct application of Eq. (2.20.5) after writing $3(a+b+c)$ as $(1^2 + 1^2 + 1^2)\bigl((\sqrt{a})^2 + (\sqrt{b})^2 + (\sqrt{c})^2\bigr)$.
About Example 4, even though we know we have to use the AM-GM inequality and the Cauchy–Schwarz inequality, it is very hard to find the way to apply them. Then I thought: why don't I reverse engineer this problem, i.e., generate it from a fundamental fact? Let's do it and see what happens.
Let $x, y, z > 0$ and $xyz = 1$; using the AM-GM inequality we then immediately have
$$x + y + z \ge 3\sqrt[3]{xyz} = 3$$
I call this the fundamental inequality (for $x, y, z > 0$ and $xyz = 1$). And what we want to do is to apply some algebraic manipulations to this fundamental inequality and hope that the IMO inequality will show up. That's the plan. Looking at the IMO problem, it is of the form $S \ge 3$, where $S$ is something that we seek to find out. To get that, we can start from
$$S(x+y+z) \ge (x+y+z)^2 \quad\bigl(\implies S \ge x+y+z \ge 3\bigr)$$
Re-writing the above, we see the Cauchy–Schwarz inequality appear:
$$(1\cdot x + 1\cdot y + 1\cdot z)^2 \le S(x+y+z) \qquad (2.20.7)$$
This is because the LHS is of the form $(ax + by + cz)^2$. Of course we rewrite the 1s as something else, e.g. $1 = \sqrt{z+y}/\sqrt{z+y}$; then the LHS of the above becomes (I label it by $A$)
$$A := \left(\sqrt{y+z}\,\frac{x}{\sqrt{y+z}} + \sqrt{z+x}\,\frac{y}{\sqrt{z+x}} + \sqrt{x+y}\,\frac{z}{\sqrt{x+y}}\right)^2$$


Applying the Cauchy–Schwarz inequality to $A$, we get:
$$A \le 2(x+y+z)\left(\frac{x^2}{y+z} + \frac{y^2}{x+z} + \frac{z^2}{x+y}\right) \qquad (2.20.8)$$
Now, comparing Eqs. (2.20.7) and (2.20.8), we have found our $S$:
$$S = 2\left(\frac{x^2}{y+z} + \frac{y^2}{x+z} + \frac{z^2}{x+y}\right)$$
And since $S \ge 3$, we have a new inequality:
$$\text{Let } x, y, z > 0 \text{ and } xyz = 1; \text{ then } \frac{x^2}{y+z} + \frac{y^2}{x+z} + \frac{z^2}{x+y} \ge \frac{3}{2}$$
And this inequality can be a good exercise for a test, but not for the IMO, as it is too obvious with the square terms (which make us think of the Cauchy–Schwarz inequality). Now a bit of transformation will give us another inequality. Note that $xyz = 1$, so $x^2 = x^2\cdot 1 = x^2(xyz)$, and the same for $y^2$ and $z^2$, and voilà, cubic terms appear:
$$\frac{x^2(xyz)}{y+z} + \frac{y^2(xyz)}{x+z} + \frac{z^2(xyz)}{x+y} \ge \frac{3}{2}$$
$$\frac{x^3}{1/y + 1/z} + \frac{y^3}{1/x + 1/z} + \frac{z^3}{1/x + 1/y} \ge \frac{3}{2}$$
With $a = 1/x$, $b = 1/y$, $c = 1/z$, we get the IMO inequality:
$$\frac{1}{a^3(b+c)} + \frac{1}{b^3(c+a)} + \frac{1}{c^3(a+b)} \ge \frac{3}{2}$$
The cubic terms made this problem hard to prove. And of course, the solution is often presented
in a reverse order by starting with a D 1=x, b D 1=y, c D 1=z, but, sadly without explaining
where that idea came from.

Some special cases. For $b_1 = b_2 = \cdots = b_n = 1$, we have
$$(a_1 + a_2 + \cdots + a_n)^2 \le n(a_1^2 + a_2^2 + \cdots + a_n^2)$$
or, re-writing so that the AM appears, we obtain the so-called root mean square–arithmetic mean inequality:
$$\frac{a_1 + a_2 + \cdots + a_n}{n} \le \sqrt{\frac{a_1^2 + a_2^2 + \cdots + a_n^2}{n}} \qquad (2.20.9)$$
This is because the RHS is the root mean square (RMS), which is the square root of the mean square (the arithmetic mean of the squares):
$$\text{RMS} = Q = \sqrt{\frac{a_1^2 + a_2^2 + \cdots + a_n^2}{n}} \qquad (2.20.10)$$


Proof. Now is the time to prove Eq. (2.20.6). Let's start with the simplest case:
$$(a_1 b_1 + a_2 b_2)^2 \le (a_1^2 + a_2^2)(b_1^2 + b_2^2)$$
We consider the following function (see the footnote), which is always non-negative:
$$f(x) = (a_1 x + b_1)^2 + (a_2 x + b_2)^2 \ge 0 \quad\text{for all } x$$
We expand this function to write it as a quadratic:
$$f(x) = (a_1^2 + a_2^2)x^2 + 2(a_1 b_1 + a_2 b_2)x + (b_1^2 + b_2^2)$$
Now we compute the discriminant $\Delta$ of this quadratic:
$$\Delta = 4\left[(a_1 b_1 + a_2 b_2)^2 - (a_1^2 + a_2^2)(b_1^2 + b_2^2)\right]$$
As $f(x) = 0$ has no roots or at most one root, we have $\Delta \le 0$. And that concludes the proof. For the general case, Eq. (2.20.6), just consider the function $f(x) = (a_1 x + b_1)^2 + (a_2 x + b_2)^2 + \cdots + (a_n x + b_n)^2$. $\blacksquare$

What happened to IMO winners? One important point is that the IMO, like almost all other
mathematical olympiad contests, is a timed exam concerning carefully-designed problems with
solutions. Real mathematical research is almost never dependent on whether you can find the
right idea within the next three hours. In real maths research it might not even be known which
questions are the right ones to ask, let alone how to answer them. Producing original mathematics
requires creativity, imagination and perseverance, not the mere regurgitation of knowledge and
techniques learned by rote memorization.
We should be aware of the phenomenon of ’burn-out’, which causes a lot of promising
young mathematicians–those who might be privately tutored and entered for the IMO by pushy,
ambitious parents–to become disenchanted in mathematics and drop it as an interest before they
even reach university. It is best to let the kids follow their interests.
If you enjoy participating in IMO contests, it is absolutely fine. Just be aware of the above
comments and you can listen to Terence Tao–who won the IMO when he was only 13 years
old–to see how he grew up happily.

2.20.4 Inequalities involving the absolute values


In many maths problems we need to measure the distance between two points, to know how
close or far they are from each other. Note that numbers can be seen as points living on the
number line. On this number line, there lives a special number: zero. And we want to quantify
the distance from any point $x$ to zero. Thus, mathematicians defined $|x|$, the absolute value of $x$, as the distance of $x$ from zero. For instance, both $-2$ and $+2$ are two units from zero, thus $|2| = |-2| = 2$.

Footnote: How did mathematicians know to consider this particular function? No one knows.


For any real number $x$, the absolute value or modulus of $x$ is defined as
$$|x| = \begin{cases} x, & \text{if } x \ge 0\\ -x, & \text{if } x < 0 \end{cases} \qquad (2.20.11)$$
For example, the absolute value of 3 is 3, and the absolute value of $-3$ is also 3. The notation $|x|$, with a vertical bar on each side, was introduced by the German mathematician Karl Weierstrass (1815–1897) in 1841. He is often cited as the "father of modern analysis" and we will have more to say about him in Chapter 4.
With that, let's solve the first absolute value inequality:
$$|x| < 3$$
which means finding all values of $x$ so that the distance of $x$ from zero is less than 3. With a simple picture (Fig. 2.29a), we can see that the solutions are:
$$-3 < x < 3 \quad\text{or}\quad x \in (-3, 3)$$
I have also presented the solutions using set notation, $x \in (-3, 3)$. The notation $(a, b)$ indicates all numbers $x$ such that $a < x < b$. It is called an open bracket as the two ends (i.e., $-3$ and $3$) are not included. The symbol $\in$ means 'belongs to'. We will have more to say about sets in Section 2.31.

Figure 2.29: Geometry of $|x|$ as the distance from $x$ to zero.

The next problem is:
$$|2x + 3| \le 6$$
By seeing $2x + 3$ as $X$, the above becomes $|X| \le 6$, whose solutions are $-6 \le X \le 6$. Now, replacing $X$ by $2x + 3$ again, we have (using the rules in Eq. (2.20.1))
$$-6 \le 2x + 3 \le 6 \iff -6 - 3 \le 2x \le 6 - 3 \iff -9/2 \le x \le 3/2$$
Or, using the set notation, we can also express the solution as
$$x \in [-9/2,\; 3/2]$$
Here, $[a, b]$ indicates a closed bracket.


Let’s move to the problem of finding all values of x so that the distance of x from zero is
bigger than something:
jxj  3

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 129

Again with a simple picture (Fig. 2.29b), we can see that the solutions are:
x3 or x  3
Or using the set notation, we can also express the solution as
x 2 . 1; 3 [ Œ3; 1
where the symbol [ in A [ B means a union of both sets A and B. Noting that it is not
necessary to write solutions of inequality problems using set notations. It is clear that we do
not gain anything by writing x 2 Œ 9=2; 3=2 instead of 9=2  x  3=2. Since set theory is
the foundation of [modern] mathematics, thus some people though that an early exposure to it
might be useful. That’s why/how set theory entered in high school curriculum.

Triangle inequality. Now comes probably the most important inequality involving absolute values:
$$|a + b| \le |a| + |b| \qquad (2.20.12)$$
This inequality is used extensively in proving results regarding limits, see Section 4.10. (We actually used it already in Section 2.22.) Why are triangles involved here? It comes from the fact that for a triangle the length of one side is smaller than the sum of the lengths of the other two sides. Using the language of vectors, see Section 11.1, this is expressed as
$$\|a + b\| \le \|a\| + \|b\|$$
Note the similarity of Eq. (2.20.12) with the above inequality. That explains the name.

2.20.5 Solving inequalities


Solving an inequality is to find (real) $x$ such that the inequality holds. For example, find $x$ such that
$$\frac{4x^2}{(1 - \sqrt{1+2x})^2} < 2x + 9 \qquad (2.20.13)$$
First, we determine $x$ such that the inequality makes sense (see the footnote):
$$x \ne 0, \quad 1 + 2x \ge 0, \quad 2x + 9 \ge 0 \iff x \ge -\frac{1}{2},\; x \ne 0$$
Then, we simplify the LHS of Eq. (2.20.13); because we see it is of the form $(\cdot)^2$, we can remove the square root in the denominator:
$$\frac{4x^2}{(1 - \sqrt{1+2x})^2} = \left(\frac{2x}{1 - \sqrt{1+2x}}\right)^2 = \left(\frac{2x(1 + \sqrt{1+2x})}{-2x}\right)^2 = 2 + 2x + 2\sqrt{1+2x}$$
Thus, Eq. (2.20.13) is simplified to
$$2 + 2x + 2\sqrt{1+2x} < 2x + 9 \iff x < \frac{45}{8}$$
Combined with the condition on $x$ for the inequality to make sense, we have the final solution: $-0.5 \le x < 45/8$ and $x \ne 0$. Alternatively, using the set notation, we can write the solution as (drawing a figure like Fig. 2.29 would help):
$$x \in \left[-\frac{1}{2},\, 0\right) \cup \left(0,\, \frac{45}{8}\right)$$

Footnote: That is, terms under square roots must be non-negative; in this problem the LHS is non-negative and smaller than the RHS, thus the RHS must be non-negative. We're not interested in the case $x = 0$, because the problem would become $0 < 9$: nothing to do there!

2.20.6 Using inequalities to solve equations


Let's first solve the following equation:
$$x^2 + 2x + 3 = \sqrt{4 - x^2}$$
The first approach is to square both sides to get rid of the square root. Doing so results in a fourth-order polynomial equation, which is something we should avoid. Let's see if there is an easier way. Note that the RHS is always smaller than or equal to 2. How about the LHS? It is equal to $(x+1)^2 + 2$, which is always bigger than or equal to 2. So, we have an equation in which the LHS $\ge 2$ and the RHS $\le 2$. The only case where they are equal is when both of them are equal to two:
$$(x+1)^2 + 2 = 2, \quad \sqrt{4 - x^2} = 2 \iff x = -1,\; x = 0$$
These two conditions cannot hold simultaneously, so there are no real solutions! If you prefer a visual solution: the LHS is a parabola facing up with vertex at $(-1, 2)$ while the RHS is a semi-circle centered at $(0, 0)$ with radius 2 above the x-axis. These two curves do not intersect! Of course this 'faster' method would not work if the number 3 in the LHS were replaced by another number so that the two curves intersect (see the footnote).

2.21 Infinity
This section presents a few things about infinity, the concept of something that is unlimited,
endless, without bound. The common symbol for infinity, $\infty$, was invented by the English mathematician John Wallis in 1655. Mathematical infinities occur, for instance, as the number of points on a continuous line or as the size of the endless sequence of counting numbers: 1, 2, 3, etc.
The symbol $\infty$ essentially means arbitrarily large, or bigger than any positive number. Likewise, the symbol $-\infty$ means less than any negative number.
This section mostly concerns infinite sums, e.g. what is the sum of all positive integers. Such sums are called series. In Section 2.21.1 I present arithmetic series (e.g. $2 + 4 + 6 + \cdots$), in Section 2.21.2 geometric series (e.g. $1 + 2 + 4 + \cdots$), and in Section 2.21.3 the harmonic series $1 + 1/2 + 1/3 + \cdots$. In Section 2.21.4, the famous Basel problem is presented. Section 2.21.5 is about the first infinite product known in mathematics, and the first example of an explicit formula for the exact value of $\pi$.

Footnote: What would we do then? Either solve a fourth-order equation or use Newton's method.
Why do we have to bother with infinite sums? One reason is that many functions can be expressed as infinite sums. For example,
$$f(x) = a_0 + \sum_{n=1}^{\infty}(a_n\cos nx + b_n\sin nx)$$
$$(1 - x^2)^{1/2} = 1 - \frac{1}{2}x^2 - \frac{1}{8}x^4 - \frac{1}{16}x^6 - \cdots$$

2.21.1 Arithmetic series


A child’s mother gives him 10 cents one day. Everyday thereafter his mom gives him
3 more cents than the previous day. After 10 days, how much does he have?
This simple problem exhibits what is called an arithmetic series. After day 1, he has 10 cents.
On the second day he gets 13 cents, on the third day 16 cents, and so on. The list of amounts he
gets each day
10, 13, 16, 19, 22, ...,
is called a sequence. When we add up the terms in this sequence to get the total amount he has
at some point,
10 + 13 + 16 + 19 + 22 + 25 + 28 + 31 + 34 + 37,
the result is a series or precisely a finite series, because the number of terms is finite. Shortly,
we shall discuss infinite series in which the number of terms is infinite. In this particular case,
where each term is separated by a fixed amount from the previous one, both series and sequence
are called arithmetic.
The amount that the boy has is simply obtained as a sum of ten terms, it is 235. But we need
a smarter way to solve this problem, just in case we face this problem: what is the amount after
a year? Doing the sum for 365 terms is certainly a boring task.
What we want here is a formula that gives us directly the arithmetic series. And mathemati-
cians solve this specific problem by considering a general problem (as it turns out it is easier
to handle the general problem with symbols). Let’s first define a general arithmetic sequence
with a being the first term and d being the difference between successive terms. The arithmetic
sequence is then
$$a,\; a+d,\; a+2d,\; \ldots,\; a+(n-1)d,\; \ldots \qquad (2.21.1)$$
where the $n$th term is $a + (n-1)d$. Now, the sum of the first $n$ terms of this sequence is $a + (a+d) + (a+2d) + \cdots + (a + (n-1)d)$. To compute this sum, we follow Gauss, by writing the sum $S$ in the usual order and in reverse order (for 4 terms only, which is enough to see the point):
$$\begin{aligned} S &= a + (a+d) + (a+2d) + (a+3d)\\ S &= (a+3d) + (a+2d) + (a+d) + a\\ 2S &= (2a+3d) + (2a+3d) + (2a+3d) + (2a+3d) \end{aligned}$$
We can see that $2S = 4(2a+3d)$, or $S = \frac{4}{2}(2a+3d) = \frac{4}{2}\left[a + (a+3d)\right]$. Now we see the pattern, and thus the general arithmetic series is given by
$$a + (a+d) + \cdots + (a+(n-1)d) = \frac{\text{number of terms}}{2}\,(\text{1st term} + \text{final term}) \qquad (2.21.2)$$
Thus, with observation, we have developed a formula that just requires us to do one addition and
one multiplication, regardless of the number of terms involved! That’s the power of mathematics.
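To connect back to the opening puzzle: the formula answers both the 10-day and the 365-day questions instantly, which a short Julia sketch can confirm against a brute-force sum (the 365-day figure is just an illustration, and the function name is mine):

```julia
# Arithmetic series via Eq. (2.21.2) versus term-by-term summation.
arith_series(a, d, n) = n * (a + (a + (n - 1) * d)) / 2
println(arith_series(10, 3, 10))                  # 10 days: 235 cents
println(arith_series(10, 3, 365))                 # a whole year
println(sum(10 + 3 * (k - 1) for k in 1:365))     # brute force, same answer
```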

2.21.2 Geometric series


Suppose that the door is two meters away. To get to it, you must travel half of the distance (one meter), then half of what is left (half a meter), then half of what is left (a quarter of a meter), and so on. In total you must travel a distance of $1 + S$, where $S$ is the following infinite sum:
$$S = \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots = \sum_{i=1}^{\infty}\frac{1}{2^i} \qquad (2.21.3)$$

where the ellipsis ‘. . . ’ means ‘and so on forever’. This sum is called a geometric series, that is
a series with a constant ratio (1=2 for this particular case) between successive terms. Geometric
series are among the simplest examples of infinite series with finite sums, although not all of
them have this property. Why ‘geometric’? I shall explain it shortly.
Hey! What kind of human that walking to a door like that? The story is like this, you might
guess correctly that it came from a philosopher. In the fifth century BC the Greek philosopher
Zeno of Elea posed four problems, and the above is one of them passed on to us by Aristotle.
Zeno was wondering about the continuity of space and time.
To have an idea what $S$ might be, you can compute it for some concrete values of $n$ to see what the sum might be. I did that for $n$ up to 20 (of course using a small Julia code, Listing A.2) and the result given in Table 2.12 indicates that $S = 1$. Even though the sum involves infinitely many terms, it converges to the finite value of one! And a geometric representation of this sum, shown in Fig. 2.30, confirms this. Note that, in the past, Zeno argued that you would never be able to get to the door! This is because the Greeks had no notion that an infinite number of terms could have a finite sum.
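Listing A.2 lives in the appendix; a minimal stand-in for it, assuming nothing more than a partial-sum loop, could look like this:

```julia
# Partial sums of 1/2 + 1/4 + ... + 1/2^n, as tabulated in Table 2.12.
geo_partial(n) = sum(1 / 2^i for i in 1:n)
for n in (1, 2, 3, 10, 20)
    println(n, "  ", geo_partial(n))
end
```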
Although we have numerical and geometric evidence that the sum is one, we still need a mathematical proof. We need to do some algebra tricks here. The idea is: we do not go to infinity (where is it?); instead we consider only $n$ terms of the sum, then we see what happens to this sum when we let $n$ go to infinity (the danger is for $n$, not for us, and this works). That's why mathematicians introduce the partial sum $S_n = \sum_{i=1}^{n} 1/2^i$. With this symbol, they start doing some algebraic manipulations to it and it reveals its secret to them. First they multiply $S_n$ by $1/2$ and put $S_n$ and $(1/2)S_n$ together to see the connection:
$$S_n = \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \cdots + \frac{1}{2^{n-1}} + \frac{1}{2^n}$$
$$\frac{1}{2}S_n = \frac{1}{4} + \frac{1}{8} + \cdots + \frac{1}{2^n} + \frac{1}{2^{n+1}}$$


Table 2.12: $S_n = \sum_{i=1}^{n} 1/2^i$.

    Terms    S
    1        0.5
    2        0.75
    3        0.875
    ...      ...
    10       0.9990234375
    20       0.9999990463

Figure 2.30: Geometric visualization of $S = \sum_{i=1}^{\infty} 1/2^i$.

What next then? Many terms are identical in $S_n$ and half of it, so it is natural to subtract them from each other to cancel out the common terms:
$$S_n - \frac{1}{2}S_n = \frac{1}{2} - \frac{1}{2^{n+1}} \implies S_n = 1 - \frac{1}{2^n}$$
Because the series involves infinitely many terms, we should now consider the case when $n$ is very large, i.e., $n \to \infty$. For such $n$, the term $1/2^n$, which is the inverse of a giant number, is very very small, and thus $S_n$ is approaching one, which means that $S$ approaches one too:
$$S = 1 \quad\text{when } n \to \infty$$
There is nothing special about $1/2, 1/4, \ldots$ in the series. Thus, we now generalize the above discussion to come up with the following geometric series, with first term $a$ and ratio $r$:
$$S = a + ar + ar^2 + ar^3 + \cdots \qquad (2.21.4)$$
Then, we introduce the partial sum $S_n$ ($n$ is the number of terms) and multiply it by $r$, giving $rS_n$, as follows
$$S_n = a + ar + ar^2 + ar^3 + \cdots + ar^{n-1}$$
$$rS_n = \phantom{a +\ } ar + ar^2 + ar^3 + \cdots + ar^{n-1} + ar^n$$
It follows then that
$$(1 - r)S_n = a - ar^n \implies S_n = \frac{a}{1-r}(1 - r^n)$$
Or,
$$a + ar + ar^2 + ar^3 + \cdots + ar^{n-1} = \frac{a}{1-r}(1 - r^n) \qquad (2.21.5)$$
For the particular case of $a = 1$, we have this result
$$\sum_{i=0}^{n-1} r^i = 1 + r + r^2 + r^3 + \cdots + r^{n-1} = \frac{1 - r^n}{1 - r} \qquad (2.21.6)$$


The Rice And Chessboard Story. There’s a famous legend about the origin of chess that goes
like this. When the inventor of the game showed it to the emperor of India, the emperor was so
impressed by the new game, that he said to the man "Name your reward!". The man responded,
"Oh emperor, my wishes are simple. I only wish for this. Give me one grain of rice for the first
square of the chessboard, two grains for the next square, four for the next, eight for the next
and so on for all 64 squares, with each square having double the number of grains as the square
before."
Let's see how many grains would be needed. It can be seen that the total number of grains is a geometric series with $a = 1$ and $r = 2$. Using Eq. (2.21.6), we can compute it:
$$S = 1 + 2 + 4 + \cdots + 2^{63} = \frac{1 - 2^{64}}{1 - 2} = 18\,446\,744\,073\,709\,551\,615 \qquad (2.21.7)$$
The total number of grains equals 18,446,744,073,709,551,615 (eighteen quintillion four hundred forty-six quadrillion, seven hundred forty-four trillion, seventy-three billion, seven hundred nine million, five hundred fifty-one thousand, six hundred and fifteen)! This huge number is the 64th Mersenne number; a Mersenne number is a number that is one less than a power of two ($2^n - 1$). (Note that this particular Mersenne number is not prime: it ends in 5, so it is divisible by 5.) These numbers are named after Marin Mersenne, a French Minim friar, who studied them in the early 17th century.
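Numbers of this size overflow 64-bit integers, so any check needs arbitrary-precision arithmetic; in Julia that is one `big()` away (the divisibility-by-5 line just backs up the remark above):

```julia
# The chessboard total 2^64 - 1 with exact big-integer arithmetic.
grains = big(2)^64 - 1
println(grains)             # 18446744073709551615
println(grains % 5 == 0)    # true: it ends in 5, hence it is not prime
```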
So we have seen two geometric series, one in Eq. (2.21.3) with r D 1=2 < 1 and one in the
chessboard legend with r D 2 > 1. While the first series converges, or is convergent (i.e., as the
number of terms get bigger and bigger the sum does not explode, it settles to a finite value), the
second series diverges (or is divergent); the more terms result in a bigger sum. The question now
is to study when the geometric series converges. Before delving into that question, noting that r
can be negative; actually mathematicians want it to be. Because they always aim for a general
result.
To see why geometric series with $|r| < 1$ converge, let's look at Eq. (2.21.5). We have the term $1 - r^n$, which depends on $n$. But we also know that if $-1 < r < 1$ (or compactly $|r| < 1$ using the absolute value notation), then $r^n$ approaches zero as $n$ gets bigger and bigger. You can try the numbers $0.5^{10}$, $0.5^{11}$, $0.5^{12}$ and you will see that they become smaller and smaller, approaching zero (on a hand calculator, start with 0.5 and press the $x^2$ button successively; you will get zero). Not a mathematical proof, but for now it is more than enough. For a proof, we need the concept of limit. (Actually we have seen the idea of limit right in Table 2.12.)
So, for $|r| < 1$, $1 - r^n$ goes to one when $n$ goes to infinity. From Eq. (2.21.5) the geometric series thus becomes
$$a + ar + ar^2 + ar^3 + \cdots = \frac{a}{1-r}, \quad\text{for } |r| < 1 \qquad (2.21.8)$$
Note that this formula holds only for $|r| < 1$. If we use it for $|r| > 1$, we would get absurd results. For example, with $r = 2$, this formula gives us
$$1 + 2 + 4 + 8 + \cdots = -1$$

which is absurd. Weird things can happen if we use ordinary algebra to a divergent series! Now


we can understand why Niels Henrik AbelŽŽ said “Divergent series are the devil, and it is a
shame to base on them any demonstration whatsoever”.

Using the geometric series formula to express repeating decimals. We can use geometric
series to prove that a repeating decimal is a rational number. For example,

$$0.22222222\ldots = 0.2 + 0.02 + 0.002 + \cdots = \frac{2}{10} + \frac{2}{100} + \frac{2}{1000} + \cdots \quad (a = 2/10,\; r = 1/10)$$
$$= \frac{2}{10}\Big/\frac{9}{10} = \frac{2}{9} \quad\text{using Eq. (2.21.8)}$$
And in the same manner, we have
$$0.99999\ldots = 0.9 + 0.09 + 0.009 + \cdots = \frac{9}{10}\Big/\frac{9}{10} = 1 \qquad (2.21.9)$$
Can you name a number that is larger than 0.999... and smaller than 1? If not, these two numbers
are the same!

2.21.3 Harmonic series


The harmonic series is the divergent infinite series:

$$S = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots = \sum_{k=1}^{\infty}\frac{1}{k} \qquad (2.21.10)$$
Why is this series called the harmonic series? We can find the following answer everywhere: it is so called because each term of the series, except the first, is the harmonic mean of its two nearest neighbors. And the explanation stops there! This response certainly raises more questions than it answers: what is the harmonic mean? To have a complete understanding, we
have to trace to the origin.

The harmonic mean. We know the arithmetic mean of two numbers $a$ and $b$ is $A = 0.5(a+b)$. The geometric mean is $G = \sqrt{ab}$. The harmonic mean is $H = 2ab/(a+b)$, or equivalently $1/H = 0.5(1/a + 1/b)$; $H$ is the reciprocal of the average of the reciprocals of $a$ and $b$. So,
ŽŽ
Niels Henrik Abel (1802 – 1829) was a Norwegian mathematician. His most famous single result is the first
complete proof demonstrating the impossibility of solving the general quintic equation in radicals. He was also an
innovator in the field of elliptic functions, discoverer of Abelian functions. He made his discoveries while living in
poverty and died at the age of 26 from tuberculosis. Most of his work was done in six or seven years of his working
life. Regarding Abel, the French mathematician Charles Hermite said: "Abel has left mathematicians enough to
keep them busy for five hundred years." Another French mathematician, Adrien-Marie Legendre, said: "what a head
the young Norwegian has!"). The Abel Prize in mathematics, originally proposed in 1899 to complement the Nobel
Prizes, is named in his honor.
Ž
Reciprocal is like inverse. Mathematicians love doing this. For example, instead of saying “perperndicular”,
they say “orthogonal”.


$1/n$ is the harmonic mean of $1/(n-1)$ and $1/(n+1)$ for $n > 1$. Now, we are going to unfold the meaning of these means.
It is a simple matter to find the average of two numbers. For example, the average of 6 and 10 is 8. When we do this, we are really finding a number $x$ such that $6, x, 10$ forms an arithmetic sequence: 6, 8, 10. In general, if the numbers $a, x, b$ form an arithmetic sequence, then
$$x - a = b - x \implies x = \frac{a+b}{2} \qquad (2.21.11)$$
Similarly, we can define the geometric mean (GM) of two positive numbers $a$ and $b$ to be the positive number $x$ such that $a, x, b$ forms a geometric sequence. One example is $2, 4, 8$, and this helps us to find the formula for the GM:
$$\frac{x}{a} = \frac{b}{x} \implies x = \sqrt{ab}$$
Now getting back to the harmonic series: what is the value of $S$? I do not know, so I programmed a small function and let the computer compute this sum. And for $n = 10^{10}$ (more than a billion terms), we got 25.91. In fact this sum is infinite, and the series is thus called a divergent series.
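The 'small function' is nothing fancy; a sketch along the following lines (with much smaller $n$ so it runs quickly) shows how slowly the partial sums grow:

```julia
# Partial sums of the harmonic series 1 + 1/2 + ... + 1/n.
harmonic(n) = sum(1 / k for k in 1:n)
for n in (10, 10^3, 10^6)
    println(n, "  ", harmonic(n))
end
```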
How can we prove that? The divergence of the harmonic series was first proven in the 14th
century by the French philosopher of the later Middle Ages Nicole Oresme (1320–1382). Here
is what he did:
$$\begin{aligned}
S &= 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots\\
S &> 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} + \cdots && (\text{replace } 1/3 \text{ by } 1/4)\\
S &> 1 + \frac{1}{2} + \frac{1}{2} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} + \cdots && (1/4 + 1/4 = 1/2)\\
S &> 1 + \frac{1}{2} + \frac{1}{2} + \underbrace{\frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8}}_{1/2} + \cdots && (\text{replace } 1/5, 1/6, 1/7 \text{ by } 1/8)
\end{aligned} \qquad (2.21.12)$$

So Oresme compared the harmonic series with another one which is divergent and smaller
than the harmonic series. Thus, the harmonic series must diverge. This proof, which used a
comparison test, is considered by many in the mathematical community to be a high point
of medieval mathematics. It is still a standard proof taught in mathematics classes today. Are
there other proofs? How about considering the function y D 1=x and the area under the curve
y D 1=x? See Fig. 2.31. The area under this curve is infinite and yet it is smaller than the area
of those rectangles in this figure. This area of the rectangles is exactly our sum S .
It is interesting to show that one can get the harmonic series from a static mechanics problem
of hanging blocks (Fig. 2.32). Let’s say that we have two identical blocks and want to position
them one on top of the other so that the top one has the largest overhang, but doesn’t topple
over. The way to do that, using statics, is to place the top block (labeled by 1 in the referred
figure) precisely halfway across the one underneath (i.e., block ). 2 In this way, the center of


Figure 2.31: Calculus-based proof of the divergence of the harmonic series: rectangles of heights $1, 1/2, 1/3, \ldots$ sit above the curve $y = 1/x$, and $\int_1^{\infty}\frac{dx}{x} = \ln(\infty) = \infty$. The harmonic series and the area under the curve $y = 1/x$ lead to a famous constant in mathematics. Can you find it?

mass of the top block falls on the left edge of the bottom block. So, with two blocks, we can have a maximum overhang of $1/2$.
With three blocks, we first have to find the center of mass of the two blocks 1 and 2. As shown in Fig. 2.32b, this center's $x$ coordinate is $x_{12} = 3/4$ (check Section 7.8.7 for a refresher on how to determine the center of mass of an object). Now we place block 3 such that its left edge is exactly beneath that center. From that we can deduce that the overhang for the case of three blocks is $1/2 + 1/4$. Continuing this way, it can be shown that the overhang is given by
$$\frac{1}{2} + \frac{1}{4} + \frac{1}{6} + \cdots = \frac{1}{2}\left(1 + \frac{1}{2} + \frac{1}{3} + \cdots\right)$$
which is half of the harmonic series. Because the harmonic series diverges, it is possible to have an infinite overhang!
To understand why similar-looking series possess different properties, we put the geometric and the harmonic series together below:
$$S_{\text{geo}} = 0 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \frac{1}{32} + \frac{1}{64} + \cdots$$
$$S_{\text{har}} = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} + \cdots$$
Now we can observe that the terms in the geometric series shrink much faster than the terms in the harmonic series: e.g. the term $1/64 = 0.015625$ in the former, while the corresponding term in the latter is just $1/7 = 0.142857\ldots$

2.21.4 Basel problem


The Basel problem was first posed by the Italian mathematician and clergyman Pietro Mengoli (1626–1686) in 1650 and solved by Leonhard Euler in 1734. As the problem had withstood the attacks of the leading mathematicians of the day (Leibniz, the Bernoulli brothers, Wallis; see the footnote), Euler's

Footnote: Jakob Bernoulli expressed his eventual frustration at its elusive nature in the comment "If someone should succeed in finding what till now withstood our efforts and communicate it to us, we shall be much obliged to him".

Figure 2.32: Stacking identical blocks, each of mass $m$, with maximum overhang, and its relation to the harmonic series. Without loss of generality, the length of each block is one unit. The figure also shows the center-of-mass calculations $x_{12} = \frac{m_1 x_1 + m_2 x_2}{m_1 + m_2} = \frac{x_1 + x_2}{2} = \frac{3}{4}$ and $x_{123} = \frac{2m\,x_{12} + m\,x_3}{3m} = \frac{11}{12}$; note that $x_{123}$ is the $x$ coordinate of the center of mass of the three blocks 1, 2 and 3.

solution brought him immediate fame when he was twenty-eight. The problem is named after
Basel, the hometown of Euler as well as of the Bernoulli family who unsuccessfully attacked
the problem.
The Basel problem asks for the precise summation of the reciprocals of the squares of the
natural numbers, i.e., the precise sum of the infinite series:

$$S = 1 + \frac{1}{4} + \frac{1}{9} + \frac{1}{16} + \cdots + \frac{1}{k^2} + \cdots = \;? \qquad (2.21.13)$$
Before computing this series, let's see whether it converges. The series does converge, which can be verified by writing a small code. For a proof, we follow Oresme's idea of using a comparison test. The idea is to compare this series with a larger series that converges. We compare the following series:
$$S_1 = 1 + \frac{1}{2\cdot 2} + \frac{1}{3\cdot 3} + \frac{1}{4\cdot 4} + \cdots \quad (S_1 \text{ is nothing but } S), \qquad S_2 = 1 + \frac{1}{1\cdot 2} + \frac{1}{2\cdot 3} + \frac{1}{3\cdot 4} + \cdots \qquad (2.21.14)$$

And if $S_2$ converges to a finite value, then $S_1$ should be convergent to some smaller value, as $S_1 < S_2$. Indeed, we can re-write the partial sum of the second series as a telescoping sum (without the leading 1; see the footnote):
$$\begin{aligned}
S_2(n) - 1 &= \frac{1}{1\cdot 2} + \frac{1}{2\cdot 3} + \frac{1}{3\cdot 4} + \cdots + \frac{1}{n(n+1)}\\
&= \left(1 - \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{3}\right) + \left(\frac{1}{3} - \frac{1}{4}\right) + \cdots + \left(\frac{1}{n} - \frac{1}{n+1}\right)\\
&= 1 + \left(-\frac{1}{2} + \frac{1}{2}\right) + \left(-\frac{1}{3} + \frac{1}{3}\right) + \cdots - \frac{1}{n+1}\\
&= 1 - \frac{1}{n+1} \implies S_2(n) = 2 - \frac{1}{n+1}
\end{aligned} \qquad (2.21.15)$$
When $n$ approaches infinity, the denominator in $\frac{1}{n+1}$ approaches infinity and thus this fraction approaches zero. So, $S_2$ converges to two. Therefore, $S_1$ should converge to something
smaller than two. Indeed, Euler computed this sum, first by considering the first, say, 100 terms, and found the sum was about 1.6349 (see footnote §). Then, using ingenious reasoning, he found that
$$S = 1 + \frac{1}{4} + \frac{1}{9} + \frac{1}{16} + \cdots + \frac{1}{k^2} + \cdots = \frac{\pi^2}{6}$$
How did Euler come up with this result? He used the Taylor series expansion of $\sin x$, and the infinite product expansion. See Section 4.15.7 for Euler's proof and Section 3.11 for Cauchy's proof.
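The 'small code' mentioned above is again a one-line partial sum; the sketch below (with arbitrary cut-offs) shows the slow approach to $\pi^2/6$:

```julia
# Partial sums of the Basel series compared with pi^2/6.
basel(n) = sum(1 / k^2 for k in 1:n)
for n in (100, 10_000, 1_000_000)
    println(n, "  ", basel(n), "   target: ", pi^2 / 6)
end
```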
In what follows, I present another proof. This proof is based on the following two lemmas‘ :

• $S = \dfrac{4}{3}\displaystyle\sum_{n=1}^{\infty}\dfrac{1}{(2n-1)^2}$;

• $\displaystyle\int_1^0 x^m \ln x\, dx = \dfrac{1}{(m+1)^2}$

Footnote: See Section 2.21.6 to see why mathematicians think of this way to compute $S_2$.
Footnote (§): How Euler did this calculation without a calculator is another story. Note that the series converges very slowly, i.e., we need about one billion terms to get an answer with 8 correct decimals. Euler could not do that. But he is a genius; he had a better way. Check Section 4.18 for details.
Footnote: We have to keep in mind that at that time Euler knew, see Section 4.15.4, that another related series has a sum related to $\pi$:
$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots$$
Footnote: In mathematics, a lemma (plural lemmas or lemmata) is a generally minor, proven proposition which is used as a stepping stone to a larger result. For that reason, it is also known as a "helping theorem" or an "auxiliary theorem".


which can be proved straightforwardly. Then, the sum in the Basel problem can be written as
$$S = \frac{4}{3}\sum_{n=1}^{\infty}\frac{1}{(2n-1)^2} = \frac{4}{3}\sum_{n=0}^{\infty}\frac{1}{(2n+1)^2} = \frac{4}{3}\sum_{n=0}^{\infty}\int_1^0 x^{2n}\ln x\,dx$$
$$= \frac{4}{3}\int_1^0 \ln x\left(\sum_{n=0}^{\infty} x^{2n}\right)dx \;\;(\text{the sum is a geometric series}) = \frac{4}{3}\int_1^0 \frac{\ln x}{1 - x^2}\,dx$$
where in the first equality we simply changed the dummy variable $2n - 1$ to $2n + 1$, as both represent odd numbers. In the second equality, we used the second lemma with $2n$ playing the role of $m$. In the third equality, we changed the order of summation and integration, and finally we computed the sum, which is a geometric series $1 + x^2 + x^4 + \cdots$ (see the footnote). Why does a geometric series appear here in the Basel problem? I do not know, but that is mathematics: once we have discovered some maths, it appears again and again, not only in maths but also in physics!
Remark. And why calculus (i.e., the integral $\int f(x)\,dx$) in a chapter on algebra? Why not? We divide mathematics into different territories (e.g. algebra, number theory, calculus, geometry, etc.). But that division is our invention; maths does not care! Most of the time all mathematical objects are somehow related to each other. You can see algebra in geometry and vice versa. That's why I presented this proof here in the chapter about algebra.

2.21.5 Viète’s infinite product


Viète's formula is the following infinite product of nested radicals (a nested radical is a radical expression, one containing a square root sign, cube root sign, etc., that contains or nests another radical expression) representing the mathematical constant $\pi$:
$$\frac{2}{\pi} = \frac{\sqrt{2}}{2}\cdot\frac{\sqrt{2+\sqrt{2}}}{2}\cdot\frac{\sqrt{2+\sqrt{2+\sqrt{2}}}}{2}\cdots \qquad (2.21.16)$$
Viète formulated the first instance of an infinite product known in mathematics, and the first example of an explicit formula for the exact value of $\pi$. Note that this formula does not have any practical application (except that it allows mathematicians to compute $\pi$ to any accuracy they want; they're obsessed with this task, see the footnote). But in this formula we see the connection of geometry ($\pi$), trigonometry and algebra.
Viète's formula may be obtained as a special case of a formula given more than a century later by Leonhard Euler, who discovered that, for $n$ large (a proof is given shortly), we have
$$\frac{\sin x}{x} = \cos\frac{x}{2^1}\cos\frac{x}{2^2}\cdots\cos\frac{x}{2^n} \qquad (2.21.17)$$
Footnote: If not clear: this is a geometric series with $a = 1$ and $r = x^2$.
Footnote: This problem itself does not have any practical application!


Evaluating this at $x = \pi/2$:
$$\frac{2}{\pi} = \cos\frac{\pi}{4}\,\cos\frac{\pi}{8}\,\cos\frac{\pi}{16}\cdots \qquad (2.21.18)$$
Starting with $\cos\frac{\pi}{4} = \frac{\sqrt{2}}{2}$ and using the half-angle formula $\cos\frac{\alpha}{2} = \sqrt{\frac{1+\cos\alpha}{2}}$ (see Section 2.25.5 for a proof), the above expression can be computed as
$$\frac{2}{\pi} = \frac{\sqrt{2}}{2}\cdot\frac{\sqrt{2+\sqrt{2}}}{2}\cdot\frac{\sqrt{2+\sqrt{2+\sqrt{2}}}}{2}\cdots$$
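Viète's product is also pleasant to evaluate numerically; the sketch below builds the nested radicals iteratively (the cutoff of 10 factors and the function name are my choices) and recovers $\pi$ to many digits:

```julia
# Approximate pi from Viete's product: 2/pi = (sqrt(2)/2)(sqrt(2+sqrt(2))/2)...
function viete_pi(nfactors)
    p, t = 1.0, 0.0
    for _ in 1:nfactors
        t = sqrt(2 + t)     # sqrt(2), sqrt(2+sqrt(2)), ...
        p *= t / 2
    end
    return 2 / p
end
println(viete_pi(10), "  vs  ", pi)
```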
Proof. Here is the proof of Eq. (2.21.17). The starting point is the double-angle formula $\sin x = 2\sin\frac{x}{2}\cos\frac{x}{2}$; we repeatedly apply it to $\sin\frac{x}{2}$, then to $\sin\frac{x}{4}$ and so on:
$$\sin x = 2\sin\frac{x}{2}\cos\frac{x}{2} = 2\left(2\sin\frac{x}{4}\cos\frac{x}{4}\right)\cos\frac{x}{2} = 2\cdot 2\left(2\sin\frac{x}{8}\cos\frac{x}{8}\right)\cos\frac{x}{4}\cos\frac{x}{2} = 2^3\sin\frac{x}{2^3}\cos\frac{x}{2^1}\cos\frac{x}{2^2}\cos\frac{x}{2^3} \qquad (2.21.19)$$
Thus, after $n$ applications of the double-angle formula for $\sin x$, we get
$$\sin x = 2^n\sin\frac{x}{2^n}\,\cos\frac{x}{2^1}\cos\frac{x}{2^2}\cdots\cos\frac{x}{2^n} = 2^n\sin\frac{x}{2^n}\prod_{i=1}^{n}\cos\frac{x}{2^i} \qquad (2.21.20)$$
where in the last equality I used the shorthand Pi symbol $\prod$ (it is not needed for the proof here, I just wanted to introduce this notation). It is used in mathematics to represent the product of a bunch of terms (think of the starting sound of the word "product"; see the footnote).
Dividing both sides by $x$ gives us
$$\frac{\sin x}{x} = \frac{\sin\frac{x}{2^n}}{x/2^n}\,\cos\frac{x}{2^1}\cos\frac{x}{2^2}\cdots\cos\frac{x}{2^n}$$
As the first factor $\frac{\sin(x/2^n)}{x/2^n}$ approaches 1 when $n$ is very large (this is the well-known trigonometric limit $\lim_{h\to 0}\frac{\sin h}{h} = 1$, or simply $\sin h \approx h$ when $h$ is small), Euler's formula follows. $\blacksquare$
Viète had a geometry proof, which is now presented. When $\pi$ is present, there is a circle hidden somewhere. As this formula should be applicable to any circle, let's consider a circle of unit radius. The idea is to compare the area of this circle (which is $\pi$) with that of regular polygons inscribed in the circle: starting with a square, then an octagon, then a hexadecagon and so on, see Fig. 2.33. If you do not know about trigonometry, then read Chapter 3 and come back. For an octagon, its area is eight times the area of the triangle $OAB$, with $OH = \cos\frac{\pi}{8}$ and $AB = 2\sin\frac{\pi}{8}$:
$$A_8 = 8\cdot\frac{1}{2}\left(2\sin\frac{\pi}{8}\right)\cos\frac{\pi}{8} = 4\sin\frac{\pi}{4} = 4\frac{\sqrt{2}}{2}$$
Footnote: To practice, mathematicians write $\prod_{i=1}^{n} i$ to mean the product of the first $n$ integers, i.e. $(1)(2)\cdots(n)$.


where $\sin 2x = 2\sin x\cos x$ was used for $x = \pi/8$. And thus, equating this area to the circle area, we get the following equation
$$\pi = 4\frac{\sqrt{2}}{2} \implies \frac{2}{\pi} = \frac{\sqrt{2}}{2}$$
which is only a rough approximation for $\pi$. We need to use a polygon with more sides. For a hexadecagon, similarly, we have
$$A_{16} = 16\cdot\frac{1}{2}\left(2\sin\frac{\pi}{16}\right)\cos\frac{\pi}{16} = 8\sin\frac{\pi}{8} = 8\sqrt{\frac{1 - \sqrt{2}/2}{2}} = \frac{4\sqrt{2}}{\sqrt{2+\sqrt{2}}}$$
And again, equating this area to the circle area gives us
$$\pi = \frac{4\sqrt{2}}{\sqrt{2+\sqrt{2}}} \implies \frac{2}{\pi} = \frac{\sqrt{2}}{2}\cdot\frac{\sqrt{2+\sqrt{2}}}{2}$$
Continuing this process, we shall get Eq. (2.21.16).

Figure 2.33: Geometry proof of Viète’s formula.

Viète was a typical child of the Renaissance in the sense that he freely mixed the methods of classical Greek geometry with the new algebra and trigonometry. However, Viète did not know the concept of convergence and thus did not worry whether his infinite sequence of operations would blow up or not; that is, whether one gets a definite value for π as more and more terms are included (we discuss this issue in Section 2.22). As engineers or scientists, for whom sloppy engineering mathematics is enough, we just need to write a code to check. But as far as mathematicians are concerned, they need a proof for the convergence/divergence of Viète's formula. And the German-Swiss mathematician Ferdinand Rudio (1856–1929) proved the convergence in 1891.
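Here is one way such a check might look in Python (a minimal sketch, purely illustrative and not part of any proof; the function name viete_pi is made up for this example): multiply more and more factors of Eq. (2.21.16) and watch the estimate of π settle down.

import math

def viete_pi(n_terms):
    # Approximate pi using the first n_terms factors of Viete's product.
    product = 1.0
    nested = 0.0                      # the nested radical: 0, sqrt(2), sqrt(2+sqrt(2)), ...
    for _ in range(n_terms):
        nested = math.sqrt(2.0 + nested)
        product *= nested / 2.0       # each factor is sqrt(2 + ...)/2
    return 2.0 / product              # Eq. (2.21.16): 2/pi equals the product

for n in (5, 10, 20, 30):
    print(n, viete_pi(n))             # 3.14157..., then more and more digits of pi

Already with about 30 factors the printed value agrees with π to machine precision, which is at least consistent with convergence (Rudio's proof settles it for good).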

2.21.6 Sum of differences


We have seen that a sum of differences (e.g. \sum_{k=1}^{n}[k^2 - (k-1)^2]) depends only on the first and the last term. And this is actually related to something we see every day: a staircase. Assume that someone is climbing a very long and irregular staircase (see figure). He wants to calculate the total vertical rise from the bottom of the staircase to the top. Of course this is equal to the sum of the heights of all the stairs. And the height of each stair is the difference between the altitude of its top and that of its bottom. So, we have a sum of differences. And this sum should be the same as the difference between the altitude of the top (y_4 in our illustration) and that of the bottom (y_0).
Let's use sums of differences to find the sum of some things. We start with the numbers 0, 1, 2, 3, 4, ..., then we consider the squares of these numbers, i.e., 0, 1, 4, 9, 16, ... Now we consider the differences of these squares: 1, 3, 5, 7, 9, ... (see Table 2.13). Now we can immediately have this result: the sum of the first n odd numbers is n^2:

\[
1 + 3 + 5 + 7 + \cdots + (2n-1) = n^2
\]

Now, we have discovered another fact about natural numbers: the sum of the first n odd numbers is a perfect square. Using dots, can you visualize this result and obtain this fact geometrically?
Yes, facts about mathematical objects are hidden in their world waiting to be discovered. And as
it turns out many such discoveries have applications in our real world.

Table 2.13: Sum of the first n odd numbers.

i                0   1   2   3    4    5
i^2              0   1   4   9   16   25
i^2 - (i-1)^2        1   3   5    7    9
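If you prefer to see the pattern of Table 2.13 verified by brute force, a two-line Python check (an illustration, not a proof) might look like this:

for n in range(1, 11):
    assert sum(2*k - 1 for k in range(1, n + 1)) == n**2   # 1 + 3 + ... + (2n-1) = n^2
print("sum of the first n odd numbers equals n^2 for n = 1..10")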

In Section 2.6.1, we considered the sum S = 1 + 2 + 3 + ... + n and computed it in three different ways. Now is the fourth way, using the fact that the sum of the first n odd numbers is n^2. This sum S consists of the evens and the odds, but we only know the sum of the odds. So, we can transform the evens into the odds: 2 = 1 + 1, 4 = 3 + 1 and so on. Without loss of generality, assume n = 8; then we can write S as follows

\[
\begin{aligned}
S &= 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8\\
  &= 1 + (1+1) + 3 + (3+1) + 5 + (5+1) + 7 + (7+1)\\
  &= 2(1+3+5+7) + (1+1+1+1) = 2\left(\frac{8}{2}\right)^2 + \frac{8}{2} = \frac{8\cdot 9}{2}
\end{aligned}
\]

I just wanted to show that the motivation for the trick of considering k^2 - (k-1)^2 = 2k - 1 presented in Section 2.6.1 comes from Table 2.13.


Let's now consider the sum of an infinite series that Huygens asked Leibniz to solve in 1670Ž: the sum of the reciprocals of the triangular numbersŽŽ:

\[
S = \frac{1}{1} + \frac{1}{3} + \frac{1}{6} + \frac{1}{10} + \frac{1}{15} + \cdots
\]

Leibniz solved it by first constructing a table similar to Table 2.14. The first row is just the reciprocals of the natural numbers. The second row is, of course, the differences of consecutive terms of the first row. Thus, the sum of the second row is

\[
\frac{1}{2} + \frac{1}{6} + \frac{1}{12} + \cdots = \frac{1}{2}\left(\frac{1}{1} + \frac{1}{3} + \frac{1}{6} + \cdots\right) = \frac{S}{2}
\]

Since this sum is a sum of differences, it is equal to the difference between the first number of the first row, which is one, and the last number (which is zero). But S is twice the sum of the second row, thus S = 2.

Table 2.14: Leibniz's table of the reciprocals of natural numbers.

1/i               1/1   1/2   1/3    1/4    1/5    1/6   ...
1/(i-1) - 1/i           1/2   1/6    1/12   1/20   1/30
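A quick numerical sanity check of Leibniz's answer (a sketch only): summing the reciprocals of the first ten thousand triangular numbers T_k = k(k+1)/2 gives a value very close to 2.

S = 0.0
for k in range(1, 10001):
    S += 1.0 / (k * (k + 1) / 2)   # reciprocal of the k-th triangular number
print(S)                           # 1.9998..., creeping toward 2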

And who was this Gottfried Wilhelm Leibniz? He would later become the co-inventor of calculus (the other was Sir Isaac Newton). And in the calculus that he developed, we have this fact: \int_a^b f(x)\,dx = F(b) - F(a). What is this? On the LHS we have a sum (∫ is the twin brother of Σ) and on the RHS we have a difference! And this fact was discovered by Leibniz–the guy who played with sums of differences. What a nice coincidence!

2.22 Sequences, convergence and limit


Let's start off this section with a discussion of just what a sequence is. A sequence is nothing more than a list of numbers written in a specific order. The list may or may not have an infinite number of terms, although we will be dealing exclusively with infinite sequences (which are fun to play with). One way to write a sequence is

\[
(a_1, a_2, \ldots, a_n, a_{n+1}, \ldots) \tag{2.22.1}
\]

where a_1 is the first term, a_2 is the second term, or generally a_n is the nth term. So, we use a_n to denote the nth term of the sequence and (a_n) to denote the whole sequence (instead of writing
Ž When Leibniz went to Paris, he met the Dutch physicist and mathematician Christiaan Huygens. Once he realized that his own knowledge of mathematics and physics was patchy, he began a program of self-study, with Huygens as his mentor, that soon pushed him to making major contributions to both subjects, including discovering his version of the differential and integral calculus.
ŽŽ Refer to Fig. 2.6 for an explanation of triangular numbers.


the longer expression a_1, a_2, a_3, ...). You will also see the notation {a_1, a_2, ..., a_n, a_{n+1}, ...} in other books.
Let's study the sequence of the partial sums S_n = \sum_{i=1}^{n} 1/2^i of the geometric series in Eq. (2.21.3). That is the infinite list of numbers S_1, S_2, S_3, ... I compute S_n for n = 1, 2, ..., 15 and present the data in Table 2.15, and I plot S_n versus n in Fig. 2.34. What we observe is that as n gets larger and larger (in this particular example this is when n > 14), S_n gets closer and closer to one. We say that the sequence (S_n) converges to one and its limit is one. In symbols, it is written as

\[
\lim_{n\to\infty} S_n = 1 \tag{2.22.2}
\]

As the limit is finite (in other words the limit exists) the sequence is called convergent. A
sequence that does not converge is said to be divergent.

Table 2.15: The sequence S_n = \sum_{i=1}^{n} 1/2^i.

n      S_n
1      0.5
2      0.75
3      0.875
4      0.9375
...    ...
14     0.999939
15     0.999969

[Figure 2.34: Plot of S_n versus n.]
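The numbers in Table 2.15 can be reproduced with a few lines of Python (shown only to encourage you to experiment):

for n in (1, 2, 3, 4, 14, 15):
    S_n = sum(1.0 / 2**i for i in range(1, n + 1))
    print(n, S_n)        # 0.5, 0.75, 0.875, 0.9375, ..., 0.999969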

In the previous discussion, our language was not precise, as we wrote "when n is larger and larger" (how large?) and "S_n gets closer to one" (how close?). Mathematicians love rigorŽŽ, so they reword what we have written as follows, to say that the limit of the sequence (a_n) is a:

\[
\forall\, \epsilon > 0 \;\; \exists\, N \in \mathbb{N} \;\text{ such that }\; \forall\, n > N:\; |a_n - a| < \epsilon \tag{2.22.3}
\]

(read: however small ε is, there is a point N in the sequence such that beyond that point all the terms are within ε of a). So, the small positive number ε was introduced to precisely quantify how close a_n is to the limit a. The number N was used to precisely state when n is large enough. The symbol ∀ means "for all" or "for any". The symbol ∃ means "there exists".
Now, we can understand why 1 = 0.9999... Let

\[
S_1 = 0.9,\quad S_2 = 0.99,\quad S_3 = 0.999
\]


ŽŽ To see what rigor means to them, this joke says it best: A mathematician, a physicist, and an engineer were traveling through Scotland when they saw a black sheep through the window of the train. "Aha," says the engineer, "I see that Scottish sheep are black." "Hmm," says the physicist, "You mean that some Scottish sheep are black." "No," says the mathematician, "All we know is that there is at least one sheep in Scotland, and that at least one side of that one sheep is black!"


and so on: S_n will stand for the decimal that has the digit 9 occurring n times after the decimal point. Now, in the sequence S_1, S_2, S_3, ..., each number is nearer to 1 than the previous one, and by going far enough along, we can make the difference as small as we like. To see this, consider

\[
\begin{aligned}
10S_1 &= 9 = 10 - 1\\
10S_2 &= 9.9 = 10 - 0.1\\
10S_3 &= 9.99 = 10 - 0.01
\end{aligned}
\]

And thus,

\[
\lim_{n\to\infty} 10S_n = \lim_{n\to\infty}\left(10 - \frac{1}{10^{n-1}}\right) = 10
\;\Longrightarrow\; \lim_{n\to\infty} S_n = 1 \;\text{ or }\; 0.9999\ldots = 1
\]

After this brief introduction, we begin with some examples about limits in Section 2.22.1. We shall soon realize that there is not much we can do with just the definition (of a limit), so we shall derive some rules that govern limits (Section 2.22.2). Finally, we discuss some properties of sequences in Section 2.22.3.

2.22.1 Some examples


Now that we know what the limit of a sequence is, the next thing is to play with some limits to get used to them. Here are a few limit exercises:

1. Prove that \lim_{n\to\infty} \dfrac{1}{n} = 0.

2. Prove that \lim_{n\to\infty} \dfrac{1}{n^2} = 0.

3. Prove that \lim_{n\to\infty} \dfrac{1}{n(n-1)} = 0.

4. Prove that \lim_{n\to\infty} \dfrac{n^2}{n^2+1} = 1.

First, we must ensure that we feel comfortable with the fact that all these limits (except the last one) are zero. We do not have to make tables and graphs as we did before (as we can do that in our heads now). The first sequence is 1, 1/2, 1/3, ..., 1/1 000 000, ... and obviously the sequence converges to 0. For engineers and scientists that is enough, but mathematicians need a proof, which is given here to introduce the style of limit proofs.
The proof is based on the definition of a limit, Eq. (2.22.3), of course. So, what are ε and N? We can pick any value for the former, say ε = 0.0001. To choose N we use |a_n| < 0.0001, or 1/n < 0.0001. This occurs for n > 10 000. So we have:

with ε = 0.0001, for all n > N = 10 000, |a_n| < ε

Done! Not really: ε = 0.0001 is just one particular case, and mathematicians are not satisfied with this proof; they want a proof that covers all cases. If we choose ε = 0.00012, then


1/ε = 8333.33..., not an integer. In this case, we just need N = 8334. That is when the ceiling function comes in handy: ⌈x⌉ is the least integer greater than or equal to x. If there is a ceiling, then there should be a floor; the floor function is ⌊x⌋, which gives the greatest integer smaller than or equal to x.
Here is the complete proof. Let ε be anyŽŽ small positive number and select N as the least integer greater than or equal to 1/ε, i.e., N = ⌈1/ε⌉ using the new ceiling function. Then, for all n > N, we have 1/n < 1/N ≤ ε.
And we can prove the second limit in the same way. But, we will find it hard to do the same
for the third and fourth limits. In this case, we need to find the rules or the behavior of general
limits (using the definition) first, then we apply them to particular cases. Often it works this way.
And it makes sense: if we know more about something we can have better ways to understand
it. In calculus, we do the same thing: we do not find the derivative of y D tan x directly but via
the derivative of sin x and cos x and the quotient rule.

2.22.2 Rules of limits


We have sequences in our hands; what are we going to do with them? We combine them: we add them, we multiply them, we divide themŽ. And when we do that we discover some laws, similarly to the way people discovered that a + b = b + a when a, b are real numbers.
What is lim_{n→∞}(1 + 1/n)? It is one, and it equals 1 plus lim_{n→∞} 1/n (which is zero). And if we see 1 as a sequence (1, 1, 1, ...), then we have just discovered the rule stating that the limit of the sum of two convergent sequences equals the sum of the two limits. Let's write this formally. Consider two sequences (a_n) and (b_n) that are convergent with limits a and b, respectively; then we guess that

\[
\lim_{n\to\infty}(a_n + b_n) = \lim_{n\to\infty} a_n + \lim_{n\to\infty} b_n = a + b
\]

Proof. Let ε be any small positive number. As (a_n) converges to a, there exists N_1 such that for all n > N_1, |a_n - a| < ε/2 (why 0.5ε?). Similarly, as (b_n) converges to b, there exists N_2 such that for all n > N_2, |b_n - b| < ε/2. Now, let's choose N = max(N_1, N_2) (so that after N terms, both sequences are within ε/2 of their corresponding limits); then for all n > N, we have

\[
\left.
\begin{aligned}
|a_n - a| &< \frac{\epsilon}{2}\\
|b_n - b| &< \frac{\epsilon}{2}
\end{aligned}
\right\}
\;\Longrightarrow\;
|(a_n - a) + (b_n - b)| \le |a_n - a| + |b_n - b| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon
\]

where the first inequality is the triangle inequality. So we have, for all n > N, |(a_n + b_n) - (a + b)| < ε. Thus, lim_{n→∞}(a_n + b_n) = lim_{n→∞} a_n + lim_{n→∞} b_n. If we now reverse the proof, particularly the last chain of inequalities, we understand why we selected ε/2 in |a_n - a| < ε/2. See also Fig. 2.35 for a better understanding of this proof. As can be seen, in limit proofs we use the triangle inequality extensively. □

This is what mathematicians want.


Ž This is exactly what Diego Maradona–the great Argentine footballer–did. He kicked soccer balls. What did he do when he saw a tennis ball? He kicked it! Watch this youtube video.



Figure 2.35: Visual proof of the summation rule of limits.

We are now confident to state the other rules of limits below. The proofs are similar in nature to the proof of the summation rule, but some tricks are required. I refer to textbooks for those proofs.

\[
\begin{aligned}
&\text{(a)}\quad \lim_{n\to\infty}(a_n \pm b_n) = \lim_{n\to\infty} a_n \pm \lim_{n\to\infty} b_n\\
&\text{(b)}\quad \lim_{n\to\infty}(c\, a_n) = c\lim_{n\to\infty} a_n\\
&\text{(c)}\quad \lim_{n\to\infty}(a_n b_n) = \Big(\lim_{n\to\infty} a_n\Big)\Big(\lim_{n\to\infty} b_n\Big)\\
&\text{(d)}\quad \lim_{n\to\infty}(a_n / b_n) = \Big(\lim_{n\to\infty} a_n\Big)\Big/\Big(\lim_{n\to\infty} b_n\Big) \quad\text{(provided } \lim_{n\to\infty} b_n \neq 0\text{)}
\end{aligned} \tag{2.22.4}
\]
Now, equipped with more tools, we can solve other complex limit problems. For example,

\[
\begin{aligned}
\lim_{n\to\infty}\frac{n^2}{n^2+1} &= \lim_{n\to\infty}\frac{1}{1 + 1/n^2} &&\text{(algebra)}\\
&= \frac{\lim_{n\to\infty} 1}{\lim_{n\to\infty}(1 + 1/n^2)} &&\text{(quotient rule)}\\
&= \frac{\lim_{n\to\infty} 1}{\lim_{n\to\infty} 1 + \lim_{n\to\infty} 1/n^2} &&\text{(summation rule for the denominator)}\\
&= \frac{1}{1+0} = 1
\end{aligned}
\]

We just needed to compute one limit: lim_{n→∞} 1/n^2. The key step is the first algebraic manipulation. The limit of n^2/(n^2+1) is one, and what does it tell us? It tells us that when n is large, n^2 and n^2 + 1 are essentially the same. In fact, if we replace 1 by any fixed number a, the limit is still one.

2.22.3 Properties of sequences

2.23 Inverse operations


It is always a good habit once in a while to stop doing what we're doing and ponder the big picture. In this section, we look carefully at the operations we have discussed so far: addition,


subtraction, multiplication, division, power and root.


We start with three numbers a, b, c in which a, c are known. We want to find b such that a + b = c; this leads to b = c - a. Similarly, finding b such that ab = c gives b = c/a. And for b^a = c, we get b = \sqrt[a]{c}. An example would help for the root operation: which number raised to the power 3 gives 8 (that is, x^3 = 8)? Of course x = \sqrt[3]{8}.
We summarize these operations below:
\[
\begin{array}{llll}
\text{(a) addition} & a + b = c & \text{(a') subtraction} & b = c - a\\
\text{(b) multiplication} & ab = c & \text{(b') division} & b = c/a\\
\text{(c) power} & b^a = c & \text{(c') root} & b = \sqrt[a]{c}
\end{array} \tag{2.23.1}
\]
where the right column contains the inverses of the operations in the left column. An inverse operation undoes the operation. Starting with the number 2, pressing the x^2 button on a calculator gives you 4, and pressing the √x button (on 4) gives you back 2.
This is a powerful way to see subtraction, division and taking roots. For example, we do not have to worry about subtraction as a totally new operation; in fact subtraction is merely the inverse of addition. Later on, when you learn linear spaces, you will see that only addition is defined for linear spaces. This is because 5 - 3 is simply 5 + (-3). Actually we do inverse operations daily; for example when we put shoes on and take them off.

2.24 Logarithm
The question "which number squared gives 4?" (i.e., x^2 = 4) gave us the square root. And a similar question, "to which index must 2 be raised to get 4?" (that is, find x such that 2^x = 4), gave us the logarithm. We summarize these two questions and the associated operations now:

\[
\begin{aligned}
x^2 = 4 &\;\Longrightarrow\; x = \sqrt{4}\\
2^x = 4 &\;\Longrightarrow\; x = \log_2 4
\end{aligned} \tag{2.24.1}
\]
Looking at this, we can see that the logarithm is not a big deal; it is just the inverse of 2^x in the same manner as the square root is the inverse of x^2. The notation log_2 4 is read "logarithm base 2 of 4".
You can understand these two equations by using a calculator. Starting with the number 2, pressing the x^2 button gives you 4, and pressing the √x button (on 4) gives you back 2–that's why it is an inverse. Similarly, starting with 2, pressing the button 2^x yields 4 and pressing the button log_2 x returns 2. Historically, the logarithm was discovered in an attempt to replace multiplication by summation, as the latter is much easier than the former; see Section 2.24.3. It was invented by the Scottish mathematician, physicist, and astronomer John Napier (1550–1617) in the early 17th centuryŽŽ.

If musicians can unbreak one’s heart, mathematicians can too.
ŽŽ The story is very interesting, see [30] for details. In 1590, James VI of Scotland sailed to Denmark to meet Anne of Denmark–his prospective wife–and was accompanied by his physician, Dr John Craig. Bad weather forced the party to land on Hven, near Tycho Brahe's observatory. Quite naturally, Brahe demonstrated to the party the process of using trigonometric identities to replace multiplication by summation. Dr Craig happened to have a particular friend whose name was John Napier. With that, Napier set out on the task of his life: developing a method to ease multiplication. Twenty years later he had succeeded. And we have the logarithm.

2.24.1 Rules of logarithm


After this new operation log_a b was discovered, we need to find its rules. If you play with logarithms for a while, you will discover the rules yourself. One way is to start with the second equation in Eq. (2.24.1): 2^x = 4 ⟹ x = log_2 4. For x = {0, 1, 2, 3, 4, 5, 6, 7}, 2^x yields a geometric progression (GP): 1, 2, 4, 8, 16, 32, 64, 128 (with r = 2); the corresponding logarithms (base 2) form an arithmetic progression (AP): 0, 1, 2, 3, 4, 5, 6, 7 (see Table 2.16).

Table 2.16: Logarithm of a geometric progression is an arithmetic progression.

x         1   2   4   8   16   32   64   128
log_2 x   0   1   2   3    4    5    6     7

From this table, we see that log_2 4 = 2, log_2 8 = 3 and log_2 32 = 5. There is a relation between these numbers: log_2 32 = log_2 4 + log_2 8. And 32 = 4 × 8, thus log_2(4 × 8) = log_2 4 + log_2 8.
In the same manner, we observe that log_2(64/2) = log_2 64 - log_2 2. By playing with them long enough, people (and you can too if you're given a chance) discovered the following rules for logarithms (p, m, n are positive integers):

\[
\begin{array}{llll}
\text{(a) Definition} & \log_a a^b = b & \text{(b) Inverse rule} & \log_a \dfrac{1}{b} = -\log_a b\\[2mm]
\text{(c) Product rule} & \log_a bc = \log_a b + \log_a c & \text{(d) Quotient rule} & \log_a \dfrac{b}{c} = \log_a b - \log_a c\\[2mm]
\text{(e) Power rule 1} & \log_a b^p = p\log_a b & \text{(f) Power rule 2} & \log_a b^{m/n} = \dfrac{m}{n}\log_a b
\end{array} \tag{2.24.2}
\]
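Before proving these rules, you can gain confidence in them numerically; a minimal Python check (with arbitrarily chosen numbers) might be:

import math

a, b, c, p = 2.0, 4.0, 8.0, 5
log_a = lambda t: math.log(t, a)
print(abs(log_a(b*c) - (log_a(b) + log_a(c))))   # ~0, product rule
print(abs(log_a(b/c) - (log_a(b) - log_a(c))))   # ~0, quotient rule
print(abs(log_a(b**p) - p*log_a(b)))             # ~0, power rule 1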
We are going to prove these rules. The first one, log_a a^b = b, comes from the definition of the logarithm: a^x = b ⟹ x = log_a b. To prove the product rule, we first show a proof for a particular case, log_2(4 × 8), to get confident that the rule is correct, and then use this particular proof as a guide for the general proof.
It is obvious that log_2(4 × 8) = 5 because 2^5 = 32. We can also proceed as followsŽ:

\[
\log_2(4 \times 8) = \log_2(2^2 \times 2^3) = \log_2(2^5) = 5 = 2 + 3 = \log_2 2^2 + \log_2 2^3
\]

And thus we have proved the product rule for the concrete case of a = 2, b = 4 and c = 8. The key step in this proof was to rewrite 4 = 2^2 and 8 = 2^3, i.e., expressing 4 and 8 in terms of powers of 2. That is used in the following proof of the product rule log_a bc = log_a b + log_a c:

Ž Logarithm is the inverse of powers, so make sure you understand powers.


Proof of the product rule. Denote b = a^x (thus x = log_a b) and c = a^y (hence y = log_a c); then we can write

\[
\log_a bc = \log_a(a^x a^y) = \log_a a^{x+y} = x + y = \log_a b + \log_a c
\]

The product rule is for a product of two numbers, but we can generalize it to a product of any number of terms. For example, log_a xyz = log_a(xy)z = log_a(xy) + log_a z = log_a x + log_a y + log_a z. From this, it is easy to see the power rule, when x = y = z.

Proof of the power rule 1. The proof of the power rule 1 uses the product rule (first consider the case where p is a positive integer):

\[
\log_a b^p = \log_a(\underbrace{b \times b \times \cdots \times b}_{p \text{ times}})
 = \underbrace{\log_a b + \log_a b + \cdots + \log_a b}_{p \text{ times}} = p\log_a b
\]

Interestingly, this rule also works when p is a negative integer, i.e., p = -q where q is a counting number. To see that we need to observe that log_a(1/b) = -log_a b. Why? See Table 2.17. This table is obtained from Table 2.16 by extrapolating what is true to the cases we're not sure of (and want to investigate). (We did this because we believe, again, in patterns.) In this table, if we go from right to left, log_2 x is reduced by one starting from 4; therefore after 0 we should get -1 and -2. So, we guess that log_2(1/4) = -log_2 4. Is it true? Yes, because log_2(1/4) = -2 (as 2^{-2} = 1/4). To prove log_a(1/b) = -log_a b, observe that 0 = log_a 1 = log_a(b)(1/b) = log_a b + log_a(1/b).

Table 2.17: Logarithm of a geometric progression is an arithmetic progression.

x         1/4   1/2   1   2   4   8   16
log_2 x    -2    -1   0   1   2   3    4

The proof of the quotient rule uses the product rule and the power rule 1: log_a(b/c) = log_a(b c^{-1}) = log_a b - log_a c.

Proof of the power rule 2 (with rational index). Setting u = b^{m/n}, then u^n = b^m. Thus,

\[
\begin{aligned}
\log_a u^n &= n\log_a u\\
\log_a b^m &= n\log_a b^{m/n} \quad (\text{use } u^n = b^m,\; u = b^{m/n})\\
m\log_a b &= n\log_a b^{m/n}
\end{aligned}
\]

which gives log_a b^{m/n} = (m/n) log_a b. □


Table 2.18: Logarithms base 2 and base 3 of 3, 5, 6, 7 and their ratios (last row).

(log_2 3)/(log_3 3)   (log_2 5)/(log_3 5)   (log_2 6)/(log_3 6)   (log_2 7)/(log_3 7)
1.58496               1.58496               1.58496               1.58496

It is often the case that we need to change the base of a logarithm. Let's find the formula for that. The idea, as always, is to play with the numbers and find a pattern. So, we compute the logarithms with two bases (2 and 3) of some positive integers and put the results in Table 2.18. But hey, we do not know how to compute, let's say, log_3 5! I was cheating here, I've used a calculator. We shall come back to this question shortly.
From this table, we can see that log_2 x / log_3 x = α, where α is a constant. We aim to find this constant. Let's denote log_2 x = y, thus x = 2^y; then we can compute log_3 x in terms of y as

\[
\log_3 x = \log_3 2^y = y\log_3 2 \;\Longrightarrow\; \log_3 x = \log_3 2 \cdot \log_2 x
\]

We are cheating a bit here, as we have used the power rule for logarithms, log_a b^p = p log_a b, even when p is not a whole number (y is real here). Luckily for us, this rule is valid for real p; but to show that we need calculus (see Chapter 4, Section 4.4.14). There is nothing special about the bases 2 and 3 here, so we can generalize the above result to arbitrary bases a and b:

\[
\log_a x = \log_a b \cdot \log_b x, \quad\text{or}\quad \log_a b = \frac{\log_a x}{\log_b x}, \quad\text{or}\quad \log_a b \cdot \log_b x = \log_a x \tag{2.24.3}
\]
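Table 2.18 itself is easy to reproduce (this is how I "cheated" with a calculator); in Python, for instance:

import math

for x in (3, 5, 6, 7):
    print(x, math.log(x, 2) / math.log(x, 3))   # always 1.58496..., i.e. log_2(3)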

2.24.2 Some exercises on logarithms


Here I present some exercises on logarithms, to become fluent in this new operation. Find x such that log_8 2 = x. We just need to use the definition of the logarithm and the rules of powers:

\[
\log_8 2 = x \;\Longrightarrow\; 2 = 8^x = (2^3)^x = 2^{3x}
\]

Now, we have 2 = 2^{3x}, leading to 3x = 1; the solution is x = 1/3. Let's solve another logarithm equation:

\[
\log_{\sqrt{3}} \sqrt[3]{9} = x
\]

Using the definition of the logarithm we then have (\sqrt{3})^x = \sqrt[3]{9}. Using the rules of powers, we rewrite this in the form 3^{(\ldots)} = 3^{(\ldots)}, then we're done:

\[
3^{x/2} = 3^{2/3} \;\Longrightarrow\; x = \frac{4}{3}
\]

The next exercise is to compute log_3(1/243). We need the inverse rule and note that 243 = 3^5:

\[
\log_3\frac{1}{243} = -\log_3 243 = -\log_3 3^5 = -5
\]

Finally, let's compute the following product:

\[
P = (\log_2 3)(\log_3 4)(\log_4 5)(\log_5 6)\cdots(\log_{31} 32)
\]

It looks scary at first, but if you recall the boxed equation in Eq. (2.24.3), where the bases cancel, then P is nothing but log_2 32, which is simply 5.

2.24.3 Why logarithms are useful


In the early 17th century, due to colonization by the Europeans, world trade was really taking
off. There was intense interest in astronomy, since this increased the chances of a ship coming
back with its bounty. Clockmakers were also in great demand. All of this required more and
more sophisticated calculation.
Activities like banking and trade resulted in huge volumes of calculation, and accuracy was
essential. For example, compound interest and the distances of moons, planets and stars involved
a large number of multiplications and divisions. But this was very tedious and time-consuming,
as well as being prone to error.
Surely there had to be a better way?
Logarithms were developed in the early 17th century by the Scotsman John Napier and the
Englishman Henry Briggs (who later suggested base 10 rather than Napier’s strange choice).
Their ideas were refined later by Newton, Euler, John Wallis and Johann Bernoulli towards the
end of the 17th century.
When the idea of logarithm hit the scene in the early seventeenth century, its impact was
substantial and immediate. Modern historians of mathematics, John Fauvel and Jan van Maanen,
illustrate this vividly:
When the English mathematician Henry Briggs learned in 1616 of the invention
of logarithms by John Napier, he determined to travel the four hundred miles north
to Edinburgh to meet the discoverer and talk to him in person.
A common argument for the use of technology is that it frees students from doing boring,
tedious calculations, and they can focus attention on more interesting and stimulating conceptual
matters. This is wrong. Mastering “tedious” calculations frequently goes hand-in-hand with a
deep connection with important mathematical ideas. And that is what mathematics is all about,
is it not? We use calculators in a physics class or chemistry class, but not in a math class.
To show the usefulness of logarithms, assume we have to compute the product 18 793.26 × 54 778.18 (without a calculator, of course). Using logarithms turns this multiplication problem into a summation one (product rule):

\[
\log_{10}(18\,793.26 \times 54\,778.18) = \log_{10} 18\,793.26 + \log_{10} 54\,778.18
\]

Assume that we know the logs of 18 793.26 and 54 778.18 (we will come to how to compute them in a minute; Briggs provided tables for such values, which became obsolete with the birth of modern calculators), then sum them to get A. Finally, the product we are looking for is simply 10^A (there were/are tables for this too, and thus we obtain the product just by summing two numbers).


2.24.4 How Henry Briggs calculated logarithms in 1617


In 1617, Briggs published a table of logarithms (base 10), followed in 1624 by the more complete Arithmetica Logarithmica. Briggs is viewed by Goldstine as one of the great figures of numerical analysis.
Here is what Briggs did, without a calculator§. He calculated the successive square roots of 10, i.e., √10 = 10^{1/2}, then √(√10) = 10^{1/4}, and so on. We denote this by 10^s with s = 1/2^n (n = 0, 1, 2, ...) and put the results in the third column of Table 2.19. Briggs might have used the algorithm described in Eq. (2.9.1) for this task. We of course used a calculator (as we're not interested in the square root itself here).
Table 2.19: Successive square roots of 10: 10^s = \sqrt[2^n]{10} with s = 1/2^n.

n     s = 1/2^n        10^s = 1 + ε     ε = 10^s - 1    s/ε            ε/s
0     1                10
1     0.5              3.16227766
2     0.25             1.77827941
3     0.125            1.33352143
4     0.0625           1.15478198
5     0.03125          1.07460783                                      2.38745051
6     0.015625         1.03663293                                      2.34450742
7     0.0078125        1.01815172                                      2.32342038
8     0.00390625       1.00903504                                      2.31297148
9     0.00195313       1.00450736                                      2.30777050
10    0.00097656       1.00225115       0.00225115      0.43380638     2.30517585
11    0.00048828       1.00112494       0.00112494      0.43405039     2.30387999
12    0.00024414       1.00056231       0.00056231      0.43417242     2.30323242
13    0.00012207       1.00028112       0.00028112      0.43423345     2.30296
...
20    9.53674316e-7    1.00000219       0.00000219      0.434294005    2.30258762

From the third column in Table 2.19, we can see that successive square roots of 10 will be

Herman Heine Goldstine (1913–2004) was a mathematician and computer scientist, who worked as the director of the IAS machine at Princeton University's Institute for Advanced Study, and helped to develop ENIAC, the first of the modern electronic digital computers. He subsequently worked for many years at IBM as an IBM Fellow, the company's most prestigious technical position.
§ I learned of this in Feynman's Lectures on Physics, Vol I.


of the form 1 + ε, where ε is a very small positive number. We put ε in the fourth column of Table 2.19. Another observation is that the logarithm (base 10) of the number in the third column (which is the second column, s) is proportional to ε: looking at the second column, rows 10 and 11, s is halved, and the corresponding values of ε in the fourth column are halved as well. We can find this proportion by calculating s/ε (fifth column).
These two observations allow us to write, for a positive number x (not just 10),

\[
x^{1/2^n} = 1 + \epsilon, \qquad \log(1+\epsilon) \approx \alpha\epsilon \tag{2.24.4}
\]

where α = 0.43429448190325... (later on we will know that α = log_10 e, with e = 2.718281... the famous number discussed in Section 2.28). And thus, by taking the logarithm of both sides of the first equation in Eq. (2.24.4) and using the second equation, we get

\[
\frac{1}{2^n}\log x = \log(1+\epsilon) \approx \alpha\epsilon \;\Longrightarrow\; \log x \approx 2^n\alpha\left(x^{1/2^n} - 1\right) \tag{2.24.5}
\]

In summary, Briggs's algorithm for the logarithm of x is to calculate successive square roots of x, subtract 1, and multiply by α and 2^n. Quite simple, but one question arises: what should the value of n be? It should be large enough so that the approximation log(1+ε) ≈ αε holds, but small enough to reduce the numerical error in the calculation of x^{1/2^n} - 1.
With this algorithm, Briggs computed the logarithms of all prime numbers smaller than 100; there are only 25 such primes. From this, the logarithms of composite numbers are simply the sums of the logarithms of their prime factors. For example, log 21 = log(3 × 7) = log 3 + log 7.
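Briggs's recipe in Eq. (2.24.5) is short enough to code directly; the sketch below (the function name and the choice n = 20 are mine, not Briggs's) reproduces log_10 2 and log_10 7 to many digits.

import math

ALPHA = 0.43429448190325            # the constant of Eq. (2.24.4), i.e. log10(e)

def briggs_log10(x, n=20):
    # Take n successive square roots of x, subtract 1, scale by alpha and 2^n.
    root = x
    for _ in range(n):
        root = math.sqrt(root)
    return 2**n * ALPHA * (root - 1.0)

print(briggs_log10(2.0), math.log10(2.0))   # 0.3010299..., 0.3010299...
print(briggs_log10(7.0), math.log10(7.0))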
Another observation from Briggs's calculations is that, for a small x, we have

\[
10^x \approx 1 + kx \tag{2.24.6}
\]

which can be seen from the sixth column of Table 2.19, and we have k = 1/α. With calculus, we will know that k = ln 10 (ln x is the logarithm base e).

History note 2.5: Henry Briggs (1561 – 1630)


Henry Briggs was an English mathematician notable for changing
the original logarithms invented by John Napier into common (base
10) logarithms, which are sometimes known as Briggsian logarithms
in his honor. In 1624 his Arithmetica Logarithmica was published,
in folio, a work containing the logarithms of thirty thousand natural
numbers to fourteen decimal places (1-20,000 and 90,001 to 100,000).
Briggs’s early research focused primarily on astronomy and its appli-
cations to navigation, and he was among the first to disseminate the
ideas of the astronomer Johannes Kepler (1571–1630) in England.

2.24.5 Solving exponential equations


Let’s consider the following ‘basic’ exponential equations:


1. Solve the following equation:

\[
4^x - 3\cdot 2^x + 2 = 0
\]

2. Solve the following equation:

\[
4^x + 6^x = 9^x
\]

The first one is simply a quadratic equation in disguise (t^2 - 3t + 2 = 0 with t = 2^x). Here, we use the rule of raising a power to a power, which says that (a^m)^n = a^{mn} = (a^n)^m. For the second one, we divide the equation by 4^x:

\[
1 + \left(\frac{3}{2}\right)^x = \left(\frac{3}{2}\right)^{2x}
\]

Then it is a disguised quadratic equation. Of course, our solution method will not work if, for example, the number 4 were replaced by 5 in this equation! Can you find the rule that high school teachers (probably) used to generate this kind of equation? It is (ab)^x + (b^2)^x = (a^2)^x, for a, b positive integers.
OK, let's solve non-standard exponential equations; they are more fun. One such equation is the following:

\[
16^{\frac{x-1}{x}} \cdot 5^x = 100
\]

We know that the logarithm of a product is the sum of logarithms, and a sum is easier to deal with. So, we take the logarithm of both sides of the equation (we do not know which base is best, so we use a for now):

\[
\frac{x-1}{x}\log_a 16 + x\log_a 5 = \log_a 100
\;\Longleftrightarrow\;
4\frac{x-1}{x}\log_a 2 + x\log_a 5 = 2\log_a 10
\]

Looking at the numbers 2, 5 and 10, you see that they are related: 5 = 10/2. If we pick a = 10, we get nice numbers:

\[
\begin{aligned}
& 4\frac{x-1}{x}\log_{10} 2 + x\log_{10}\frac{10}{2} = 2\\
\Longleftrightarrow\;& 4\frac{x-1}{x}\log_{10} 2 + x(1 - \log_{10} 2) = 2\\
\Longleftrightarrow\;& (1-\log_{10} 2)x^2 + (4\log_{10} 2 - 2)x - 4\log_{10} 2 = 0
\end{aligned}
\]

Finally, we get a quadratic equation in x, even though the coefficients are a bit scary. Don't worry, this is an exercise, so the answers are usually of a compact form. Using the quadratic formula, we have:

\[
x = \frac{2 - 4\log_{10} 2 \pm 2}{2(1-\log_{10} 2)} =
\begin{cases}
2\\[2mm]
\dfrac{-4\log_{10} 2}{2 - 2\log_{10} 2} = -\dfrac{\log_{10} 4}{\log_{10} 5}
\end{cases}
\]


That's it! We used the fundamental property of the logarithm to get a quadratic equation. If the numbers 16, 5, 100 are replaced by others, we still get a quadratic equation.
Can we have another, easier solution? Yes, if we divide the original equation by 100, factor 100 = 4 × 5^2ŽŽ, and after that take the logarithm base 10:

\[
\begin{aligned}
\frac{16^{\frac{x-1}{x}} \cdot 5^x}{100} = 1
&\;\Longleftrightarrow\; \frac{16^{\frac{x-1}{x}} \cdot 5^x}{4 \times 5^2} = 1\\
&\;\Longleftrightarrow\; 4^{\frac{x-2}{x}} \cdot 5^{x-2} = 1\\
&\;\Longleftrightarrow\; \frac{x-2}{x}\log_{10} 4 + (x-2)\log_{10} 5 = 0\\
&\;\Longleftrightarrow\; (x-2)(\log_{10} 4 + x\log_{10} 5) = 0
\end{aligned}
\]

No need to use the quadratic formula.

How about this equation?

\[
2^x + 2^{1/x} = 4
\]

Well, this is non-standard, and using the AM–GM inequality is the key, as the LHS is always greater than or equal to 4 (for x > 0), with equality only at x = 1. If the RHS were 5 instead of 4, then we would have to use the graphical method (plot the function on the LHS and see where it intersects the horizontal line y = 5) or Newton's method.

ŽŽ That's the key point, as 4 and 5 appear in 16^{(x-1)/x} · 5^x. Don't forget that 16 = 4^2.


Some exercises on non-standard exponential equations.

1. Solve the following equation:

\[
3^{\frac{x+2}{3x-4}} - 7 = 2\cdot 3^{\frac{5x-10}{3x-4}}
\]

2. Solve the following equation:

\[
2^x = 3^{x/2} + 1
\]

3. Solve the following equation:

\[
4^{2x} + 2^{-x} + 1 = (129 + 8\sqrt{2})(4^x + 2^{-x} - 2^x)
\]

4. Solve the following equation:

\[
4^x + 9^x + 25^x = 6^x + 10^x + 15^x
\]

The solution of the first two equations is x = 2. In the first equation, pay attention to the exponents; they're related! In the second one, it is easy to see that x = 2 is one solution. You need to prove it's the only solution. For the fourth equation, pay attention to the numbers: on the LHS we have 4 = 2^2, 9 = 3^2 and 25 = 5^2. And on the RHS we have 6 = 2·3, 10 = 2·5 and 15 = 3·5. Thus, all we have are the numbers 2, 3, 5: squares of them and products of them. This leads to a^2 + b^2 + c^2 = ab + bc + ca. The answer is x = 0.

2.25 Complex numbers


Bombelli's insight into the nature of the Cardano formula broke the mental logjam concerning √-1. His work made clear that manipulating √-1 using ordinary arithmetic results in perfectly correct results (i.e., real numbers come out of expressions involving √-1). Despite the success of Bombelli, there still lacked a physical interpretation of √-1. Mathematicians of the sixteenth century were tied to the Greek tradition of geometry, and they felt uncomfortable with concepts to which they could not give a geometric meaning (so that they could see it).
To read this section you need some knowledge of trigonometry. Chapter 3 provides a discussion of this topic. You might need to check it before continuing.
We start with a definition of complex numbers and how their arithmetic is done in Section 2.25.1. We then discuss the powers of complex numbers with the well known de Moivre formula (Section 2.25.2). After that, roots of complex numbers are treated, because a root is simply a power with a fractional exponent (Section 2.25.3). It was the square root of -1 that led us to i and to complex numbers; we now want to check what the square root of i brings to the table (Section 2.25.4). Complex numbers provide an easy way to derive many trigonometric identities, and Section 2.25.5 is about this. What is the power of a real number with a complex exponent? This question gives us the most beautiful equation in mathematics, Euler's identity e^{iα} = cos α + i sin α (Section 2.25.6). What is i^i? Is it a real number or a complex number? These questions are studied in Section 2.25.7. Finally, a summary of our system of numbers, from integers, rationals, and reals to complex numbers, is given in Section 2.25.8.

2.25.1 Definition and arithmetic of complex numbers

The idea of a complex number as a point in the complex plane


was first described by Caspar Wessel in 1799, although it had 3i
Im

been anticipated as early as 1685 in Wallis’s A Treatise of Alge- P (1 + 2i)


bra. Wessel’s memoir appeared in the Proceedings of the Copen- 2i

hagen Academy but went largely unnoticed. In 1806 Jean-Robert i

Argand independently issued a pamphlet on complex numbers


ŽŽ
3 Re
and provided a rigorous proof of the fundamental theorem of al- −3 −2 −1 1 2
−i
gebra. Carl Friedrich Gauss had earlier published an essentially Q(−2 − i)

topological proof of the theorem in 1797 but expressed his doubts −2i

at the time about "the true metaphysics of the square root of 1". −3i
It was not until 1831 that he overcame these doubts and published
his treatise on complex numbers as points in the plane, largely Figure 2.36: Complex plane.
establishing modern notation and terminology .
In the complex plane, the horizontal axis represents the real (Re) part and the vertical axis represents the imaginary (Im) part. Thus, the number 1 + 2i is a point obtained by walking one step along the real axis and then two steps upward along the imaginary axis.

Definition 2.25.1
A complex number z is one given by z = a + bi, where a and b are real numbers and i = √-1 is the imaginary unit; a is called the real part, and b is called the imaginary part. Geometrically, a complex number is a point in the complex plane, shown in Fig. 2.36.

The adjective complex in complex numbers indicates that a complex number has more than one part; it does not mean complicated.
As a new number, we need to define arithmetic rules for complex numbers. We first list the

ŽŽ Jean-Robert Argand (1768–1822) was an amateur mathematician. In 1806, while managing a bookstore in Paris, he published the idea of the geometrical interpretation of complex numbers known as the Argand diagram and is known for the first rigorous proof of the Fundamental Theorem of Algebra.
 Gauss was a star, thus people would be more willing to accept his theory than that proposed by Wessel and Argand, who were relatively unknown.


rules for addition/subtraction and multiplication as follows:

\[
\begin{aligned}
(a_1 + b_1 i) + (a_2 + b_2 i) &= (a_1 + a_2) + (b_1 + b_2)i\\
(a_1 + b_1 i) - (a_2 + b_2 i) &= (a_1 - a_2) + (b_1 - b_2)i\\
(a_1 + b_1 i)(a_2 + b_2 i) &= a_1 a_2 - b_1 b_2 + (a_1 b_2 + a_2 b_1)i
\end{aligned} \tag{2.25.1}
\]

How were these rules defined? It depends. In the first way, we can assume that the rules of arithmetic for ordinary numbers also apply to complex numbers; then there is no mystery behind Eq. (2.25.1): we treat i as an ordinary number and whenever we see i^2 we replace it by -1 (hence i^3 = i^2 i = -i). In the second way, one first defines the addition and multiplication of two vectors. The rule for addition follows the rule of vector addition (known since antiquity from physics), see Fig. 2.37a. It was Wessel's genius to discover/define the multiplication of two vectors: the resulting vector has a length equal to the product of the lengths of the two vectors and a direction equal to the sum of the directions of the two vectors (with respect to a horizontal line), see Fig. 2.37b. How did he get this multiplication rule? As I am not good at geometry I do not want to study his solution. But do not worry, with a new way to represent points on a plane, his rule reveals its mystery to us!

[Figure 2.37: Addition and multiplication of complex numbers: (a) addition (e.g. a = 7 + 3i, b = 3 + 6i, a + b = 10 + 9i); (b) multiplication.]

For a point on a plane, there are many ways to define its location. We have used Cartesian coordinates so far, but we can also use polar coordinates (r, θ); refer to Section 4.13.1 if you do not know what polar coordinates are. Using polar coordinates leads to the so-called polar form of complex numbers. This polar form is easy to obtain by just relating the Cartesian coordinates (a, b) to (r, θ) using the trigonometric functions: a = r cos θ and b = r sin θ. The polar form of a complex number z = a + bi is given by

\[
z = r(\cos\theta + i\sin\theta) \tag{2.25.2}
\]

where r = √(a^2 + b^2) is called the modulus of z and tan θ = b/a; θ is the argument of the complex number, see Fig. 2.38. More compactly, people also write z = r∠θ.

[Figure 2.38: A point P(a + bi) = P(r cos θ + i r sin θ) in the complex plane.]


Using the polar form, the multiplication of two complex numbers z1 D r1 .cos ˛ C i sin ˛/
and z2 D r2 .cos  C i sin / is written as

z1 z2 D r1 .cos ˛ C i sin ˛/  r2 .cos  C i sin /


D r1 r2 Œ.cos ˛ cos  sin ˛ sin / C i.sin  cos ˛ C sin ˛ cos / (2.25.3)
D r1 r2 Œcos.˛ C / C i sin.˛ C /

From which the geometry meaning of multiplication of two complex numbers is obtained, ef-
fortlessly and without any geometric genius insight! With Euler’s identity e i D cos  C i sin 
(see Section 2.25.6)ŽŽ , it is even easier to see the geometric meaning of complex number multi-
plication:
)
z1 D r1 .cos ˛ C i sin ˛/ D r1 e i ˛

H) z1 z2 D r1 r2 e i.˛Cˇ /
z2 D r2 .cos ˇ C i sin ˇ/ D r2 e

Now we can understand why complex numbers live in the complex plane given in Fig. 2.36. The question was always: where does √-1 live? Let's represent √-1 as r∠θ with unknown length and unknown angle. What did Wessel know? He had defined the multiplication of two vectors, so he used it:

\[
(\sqrt{-1})(\sqrt{-1}) = r^2\angle 2\theta \;\Longleftrightarrow\; -1 = r^2\angle 2\theta
\]

But we know where -1 stays: to the left of the origin at a distance of one. In other words, -1 = 1∠180°, thus:

\[
1\angle 180° = r^2\angle 2\theta \;\Longrightarrow\;
\begin{cases}
r = 1\\
\theta = 90°
\end{cases}
\]

And thus √-1 is on an axis perpendicular to the horizontal axis and at a unit distance from the origin; here stays √-1, which is now designated by the iconic symbol i (standing for imaginary):

\[
i := \sqrt{-1} = 1\angle 90°
\]

But that is just one i; if we go once around (or any number of rounds) starting from i we get back to it. So,

\[
i = i\sin\left(\frac{\pi}{2} + 2k\pi\right), \quad k \in \mathbb{N} \tag{2.25.4}
\]

ŽŽ We have anticipated that there must be a link between i and sine/cosine, but we could not expect that e is involved. To reveal this secret we need the genius of Euler. Refer to Section 2.28 for what e is.

A nice problem about complex numbers. For j = e^{i 2π/3}, let's compute the following product:

\[
P = (1 + j)(1 + j^2)(1 + j^3)\cdots(1 + j^{2023})
\]

The first thing to do is to notice that j is a point on the complex plane; it is on the unit circle, with an argument of 120°. Then (j, j^4, ...), (j^2, j^5, ...) and (j^3, j^6, ...) are the three vertices of an equilateral triangle inscribed in the unit circle: the powers of j just cycle through these three points. Thus, we have

\[
1 + j = 1 + j^4 = \cdots = e^{i\frac{\pi}{3}}, \qquad 1 + j^2 = 1 + j^5 = \cdots = e^{-i\frac{\pi}{3}}
\]

And,

\[
1 + j^3 = 1 + j^6 = \cdots = 2 \;\Longleftrightarrow\; 1 + j^{3k} = 2, \quad k = 1, 2, \ldots
\]

So, 1 + j and 1 + j^2 are complex conjugates, and their product is a real number; actually it is much nicer: (1 + j)(1 + j^2) = 1. There are 674 exponents in {1, ..., 2023} that are multiples of 3, giving 674 factors equal to 2. The remaining factors pair up as (1 + j)(1 + j^2) = 1, except that the very last factor 1 + j^{2023} = 1 + j (note 2023 = 3·674 + 1) is left without a partner. Here are the details:

\[
P = \underbrace{(1 + j)(1 + j^2)}_{1}\,\underbrace{(1 + j^3)}_{2}\,\underbrace{(1 + j^4)(1 + j^5)}_{1}\,\underbrace{(1 + j^6)}_{2}\cdots(1 + j^{2023}) = 2^{674}\, e^{i\frac{\pi}{3}}
\]

Question 4. If i rotates a vector in the complex plane, then what will rotate a vector in a 3D space? This was the question that led the Irish mathematician William Hamilton (1805–1865) to the development of quaternions, to be discussed in Section 11.1.6.

The absolute value of a complex number. In Section 2.20.4 we met the absolute value of a real number x, denoted by |x|, which is the distance from x to zero. We now extend this concept to complex numbers. The absolute value of a complex number z = a + bi, denoted by |z|, is the distance from z to the origin (i.e., the point of coordinates (0, 0)). Obviously |z| = r = √(a^2 + b^2) (Fig. 2.38). Usually we're interested in the distance between two complex numbers z_1 = a_1 + b_1 i and z_2 = a_2 + b_2 i. The Pythagorean theorem (for the right triangle with legs a_2 - a_1 and b_2 - b_1) gives us that distance: it is simply √((a_2 - a_1)^2 + (b_2 - b_1)^2). And it is also |z_1 - z_2| = |z_2 - z_1|; the second expression is to show that, as the distance between two points is the same whether we measure from one point or the other, the distance formula must be symmetric with respect to z_1, z_2.

Complex conjugate. If we multiply two complex numbers we get a complex number. But that is not the whole story: (x + yi)(x - yi) = x^2 + y^2, which is a real number. For any complex number z = x + yi, there is a related complex number of the form x - yi with the special property that their product is a real number. We now define the concept of the complex conjugate. The complex conjugate of a complex number is the number with an equal real part and an imaginary part equal in magnitude but opposite in sign. That is, if x and y are real, then the complex conjugate of x + yi is x - yi. The complex conjugate of z is often denoted as \bar{z} (read as "z bar"). In polar form, the conjugate of re^{iθ} is re^{-iθ}, which can be shown using Euler's formula. The product of a complex number and its conjugate is a real number: (x + yi)(x - yi) = x^2 + y^2. In other words, z\bar{z} = |z|^2.
Below is a summary of some of the properties of conjugates. The proofs just follow from the definition of the conjugate.

(a) The complex conjugate of the complex conjugate of z is z: \bar{\bar{z}} = z

(b) The complex conjugate of a sum is the sum of the conjugates: \overline{z + w} = \bar{z} + \bar{w}

(c) The complex conjugate of a product is the product of the conjugates: \overline{zw} = \bar{z}\,\bar{w}

(d) z is a real number if and only if \bar{z} = z.

Since \bar{\bar{z}} = z, conjugation is called an involution. An involution, involutory function, or self-inverse function is a function f that is its own inverse. We will see that rules (b) and (c) apply to many mathematical objects.
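Python's built-in complex type lets you test these properties instantly (a toy check with two arbitrarily chosen numbers):

z, w = 3 + 4j, -2 + 1j
print(z * z.conjugate(), abs(z)**2)                            # (25+0j) and 25.0
print((z + w).conjugate() == z.conjugate() + w.conjugate())    # True, rule (b)
print((z * w).conjugate() == z.conjugate() * w.conjugate())    # True, rule (c)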

2.25.2 de Moivre’s formula


Knowing how to multiply complex numbers, we can now compute powers of a complex number z. Why bother with this? Just curiosity. We have been playing with the powers of natural numbers and real numbers. And now that we have a new toy (complex numbers), it is logical to try the old rules with this new kid. More often than not, interesting things come out (in this case, many useful trigonometric identities can be derived). For example, the second and third powers of z = r(cos α + i sin α) are

\[
\begin{aligned}
z^2 &= zz = r^2[\cos(2\alpha) + i\sin(2\alpha)]\\
z^3 &= z^2 z = r^3[\cos(3\alpha) + i\sin(3\alpha)]
\end{aligned} \tag{2.25.5}
\]

which can be generalized to z^n = r^n[cos(nα) + i sin(nα)], where n is any positive integer. When r = 1 this formula simplifies to:

\[
(\cos\alpha + i\sin\alpha)^n = \cos(n\alpha) + i\sin(n\alpha) \tag{2.25.6}
\]


which is a highly useful formula, known as de Moivre's formula (also known as de Moivre's theorem and de Moivre's identity), named after the French mathematician Abraham de Moivre (1667–1754). Refer to Section 2.25.6 to see how it leads to the famous Euler identity e^{iπ} + 1 = 0.
It is obvious that the next thing to do is to consider negative powers, e.g. z^{-2}. To do so, let's start simple with z^{-1}, which can be computed straightforwardly. We have z = a + bi = r(cos θ + i sin θ). We can compute z^{-1} using algebra asŽ:

\[
z^{-1} = \frac{1}{z} = \frac{1}{a + bi} = \frac{a - bi}{a^2 + b^2} = \frac{1}{r}(\cos\theta - i\sin\theta)
\]

Thus, we get

\[
[r(\cos\theta + i\sin\theta)]^{-1} = \frac{1}{r}(\cos\theta - i\sin\theta)
\]

which shows that de Moivre's formula still works for n = -1.
Alright, we're ready to compute any negative power of a complex number. For example, z^{-2} is given by

\[
z^{-2} = (z^{-1})^2 = \frac{1}{r^2}(\cos\theta - i\sin\theta)^2 = \frac{1}{r^2}(\cos 2\theta - i\sin 2\theta) \tag{2.25.7}
\]

Now, we're confident that de Moivre's formula holds for any integer. If you want to prove it you can use proof by induction.

2.25.3 Roots of complex numbers


Having computed powers of a complex number, it is natural to consider its roots, i.e., powers with fractional exponents. The idea is again to use Eq. (2.25.6), but we consider a complex number written as z = cos(α/m) + i sin(α/m), for m a positive integer; then Eq. (2.25.6) gives

\[
\left(\cos\frac{\alpha}{m} + i\sin\frac{\alpha}{m}\right)^m = \cos\alpha + i\sin\alpha \tag{2.25.8}
\]

which immediately gives us the formula to compute the mth root of any complex number:

\[
\sqrt[m]{r(\cos\alpha + i\sin\alpha)} = \sqrt[m]{r}\left(\cos\frac{\alpha}{m} + i\sin\frac{\alpha}{m}\right) \tag{2.25.9}
\]

This is sometimes also referred to as de Moivre's formula.


As the first application of this new formula, we use Eq. (2.25.9) to prove that \sqrt[3]{2 + \sqrt{-121}} = 2 + i. Why this particular number? It is famous! Still remember Cardano, Bombelli and the cubic equation? This number appeared in Eq. (2.13.9), the one that led mathematicians into the investigation of the imaginary i and ultimately resulted in complex numbers.
Ž If the last step is not clear, see Fig. 2.38, and note that a^2 + b^2 = r^2, and so on.


Proof. First, we write the number under the cube root in the polar form of a complex number, then we use Eq. (2.25.9) to get the answerŽŽ:

\[
\begin{aligned}
z &= 2 + \sqrt{-121} = 2 + 11i = 11.18034\,(\cos 1.39094283 + i\sin 1.39094283)\\
z^{1/3} &= \sqrt[3]{11.18034}\,(\cos 0.46364761 + i\sin 0.46364761) = 2 + i
\end{aligned}
\]

Doing the same thing, we can also see that \sqrt[3]{2 - \sqrt{-121}} = 2 - i. But 2 - i is the conjugate of 2 + i = \sqrt[3]{2 + \sqrt{-121}}. That is, the conjugate of the kth root is the kth root of the conjugate.
Fifth roots of unity. As another application of Eq. (2.25.9), we are going to compute the fifth roots of unity (one). There are five of them, thanks to the fundamental theorem of algebra. We shall also find them using algebra, and demonstrate that the two approaches yield identical results. First, we write 1 = cos 2kπ, k = 0, 1, 2, ... Then, Eq. (2.25.9) provides the fifth roots:

\[
\sqrt[5]{1} = \sqrt[5]{\cos 2k\pi} = \cos\frac{2k\pi}{5} + i\sin\frac{2k\pi}{5}
\]

Thus, the four complex fifth roots of 1 are (note that k = 0 gives the obvious answer of 1):

\[
\begin{aligned}
k = 1:&\quad \cos\frac{2\pi}{5} + i\sin\frac{2\pi}{5} = 0.309017 + 0.9510565i\\
k = 2:&\quad \cos\frac{4\pi}{5} + i\sin\frac{4\pi}{5} = -0.809017 + 0.5877853i\\
k = 3:&\quad \cos\frac{6\pi}{5} + i\sin\frac{6\pi}{5} = -0.809017 - 0.5877853i\\
k = 4:&\quad \cos\frac{8\pi}{5} + i\sin\frac{8\pi}{5} = 0.309017 - 0.9510565i
\end{aligned} \tag{2.25.10}
\]
As can be seen, these four roots are complex numbers, and together with the real root (i.e., 1) they are the vertices of a pentagon inscribed in the unit circleŽ, see Fig. 2.39a. And this is true for any nth roots of unity (see Fig. 2.39b for \sqrt[4]{1}). What else can we say about them? Among these four complex roots, two are in the upper half of the circle, and the other two are in the bottom half: they are the conjugates of the ones in the upper half. In Section 2.29.2 a proof is provided.

Question 5. Eq. (2.25.10) gives the fifth roots of unity expressed in a transcendental form (because of the trigonometric functions). Can you express the fifth roots of unity algebraically? That is, write them using only the four fundamental arithmetic operations +, -, ×, ÷ and the extraction of roots. For example, writing cos(2π/5) = (-1 + √5)/4 is writing it algebraically. Why bother? Note that mathematicians are not engineers or scientists; they do not care that cos(2π/5) = 0.30901... For them, cos(2π/5) is (-1 + √5)/4.
ŽŽ Note that -121 = 11^2 i^2, thus √-121 = 11i.
Ž Why? The unit circle is x^2 + y^2 = 1, and the point cos(2π/5) + i sin(2π/5) has x = cos(2π/5) and y = sin(2π/5), satisfying x^2 + y^2 = 1.


[Figure 2.39: The fifth roots of one–solutions of z^5 = 1–are the vertices of a pentagon inscribed in the unit circle (a); the fourth roots of one (b). The four complex fifth roots come in complex-conjugate pairs.]

We can also find these roots using algebra. To do so, we solve the equation z^5 - 1 = 0, which is equivalent to solving (z - 1)(z^4 + z^3 + z^2 + z + 1) = 0, thanks to the geometric series Eq. (2.21.6)ŽŽ. So, we have

\[
z^5 - 1 = 0 \;\Longleftrightarrow\; (z-1)(z^4 + z^3 + z^2 + z + 1) = 0 \;\Longrightarrow\; z^4 + z^3 + z^2 + z + 1 = 0
\]

The last step is possible because we assume z ≠ 1, i.e., we're looking for roots other than the boring 1. For the above quartic equation, we use Lagrange's clever trick of dividing the equation by z^2 to getŽ

\[
z^2 + z + 1 + \frac{1}{z} + \frac{1}{z^2} = 0 \;\Longleftrightarrow\; \left(z^2 + \frac{1}{z^2}\right) + \left(z + \frac{1}{z}\right) + 1 = 0
\]

Due to symmetry, we make a change of variable u = z + 1/z; then z^2 + 1/z^2 = u^2 - 2, and the above equation becomes

\[
u^2 + u - 1 = 0 \tag{2.25.11}
\]

of which the solutions are

\[
u_1 = \frac{-1 + \sqrt{5}}{2}, \qquad u_2 = \frac{-1 - \sqrt{5}}{2}
\]

Having obtained u, we can solve for z from z^2 - uz + 1 = 0 (which is a quadratic equation again): z = \frac{u \pm \sqrt{u^2 - 4}}{2}. Finally, the four solutions are

\[
\begin{aligned}
z_1 &= \frac{\sqrt{5}-1}{4} + \frac{i}{2}\sqrt{\frac{5+\sqrt{5}}{2}}, &\quad
z_2 &= \frac{\sqrt{5}-1}{4} - \frac{i}{2}\sqrt{\frac{5+\sqrt{5}}{2}},\\
z_3 &= \frac{-\sqrt{5}-1}{4} + \frac{i}{2}\sqrt{\frac{5-\sqrt{5}}{2}}, &\quad
z_4 &= \frac{-\sqrt{5}-1}{4} - \frac{i}{2}\sqrt{\frac{5-\sqrt{5}}{2}}
\end{aligned} \tag{2.25.12}
\]
ŽŽ If not clear, Eq. (2.21.6) gives us: 1 + z + z^2 + z^3 + z^4 = (z^5 - 1)/(z - 1).
Ž Actually, z^4 + z^3 + z^2 + z + 1 = 0 is called a reciprocal polynomial equation, for if we put 1/z into the equation we get an equation of the same form in which 1/z replaces z. A polynomial equation a_n x^n + a_{n-1}x^{n-1} + ... + a_2 x^2 + a_1 x + a_0 = 0 is a reciprocal equation iff a_n = a_0, a_{n-1} = a_1, a_{n-2} = a_2, ...


which are identical to the solutions given in Eq. (2.25.10), if you do the calculation to get the
decimals.
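If you do not feel like grinding out the decimals by hand, the following sketch checks numerically that the algebraic expressions of Eq. (2.25.12) really are fifth roots of unity:

import math

s = math.sqrt(5.0)
roots = [complex(( s - 1)/4,  0.5*math.sqrt((5 + s)/2)),
         complex(( s - 1)/4, -0.5*math.sqrt((5 + s)/2)),
         complex((-s - 1)/4,  0.5*math.sqrt((5 - s)/2)),
         complex((-s - 1)/4, -0.5*math.sqrt((5 - s)/2))]
for z in roots:
    print(z, abs(z**5 - 1))   # each error is of the order 1e-15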
Remark. Why did u = z + 1/z give us a quadratic equation in terms of u? Why not a cubic or a quartic equation? This is so because the expression z + 1/z can take only two values as z runs over the four roots. Thus, u can have only two values, and therefore u must be a solution of a quadratic equation. But why can z + 1/z take only two values? Using Eq. (2.25.12), compute z_1 + 1/z_1, z_2 + 1/z_2, ...: you'll see that even though there are four z's there are only two values of z + 1/z. But why?

Remark. The takeaway point from solving z^4 + z^3 + z^2 + z + 1 = 0 is that to find the roots of the 4th-degree equation, we were able to reduce the problem to finding the roots of three quadratic equations: one with integer coefficients, Eq. (2.25.11), and two whose coefficients included the roots of the first (i.e., z^2 - uz + 1 = 0). And in the end, the solutions consisted of nested square roots of rational numbers.

Some exercises on z^n - 1 = 0.

1. Solve the following equation algebraically:

\[
z^6 - 1 = 0
\]

2. Solve the following equation algebraically:

\[
z^7 - 1 = 0
\]

If you like these, try to solve z^n - 1 = 0 algebraically for n up to 11. Great mathematicians of the past, including Euler, Lagrange and Vandermonde, solved them and paved the way for Gauss–at the age of 19–to crack z^{17} - 1 = 0, which is intimately related to the geometric problem of constructing a polygon of 17 sides using only straightedge and compass. It was this achievement that made Gauss decide to choose mathematics (instead of philosophy) as a career.

2.25.4 Square root of i


It was √-1 that led to the development of complex numbers. A natural question, then, is whether a square root of i exists. If so, then we do not need to invent new kinds of numbers, as with complex numbers we can do all the arithmetic operations: addition, multiplication, division, subtraction, powers, and roots of any number.
We can use de Moivre's formula to compute this root as follows. First, we adopt Eq. (2.25.4) to express i in the polar form i = i sin((1+4k)π/2) = cos((1+4k)π/2) + i sin((1+4k)π/2), noting that


cos((1+4k)π/2) = 0; then we use Eq. (2.25.9):

\[
\sqrt{i} = \sqrt{\cos\frac{(1+4k)\pi}{2} + i\sin\frac{(1+4k)\pi}{2}} = \cos\frac{(1+4k)\pi}{4} + i\sin\frac{(1+4k)\pi}{4}
\]

So, there exist two square roots of i:

\[
\begin{aligned}
k = 0:&\quad \cos\frac{\pi}{4} + i\sin\frac{\pi}{4} = \frac{\sqrt{2}}{2} + \frac{\sqrt{2}}{2}i\\
k = 1:&\quad \cos\frac{5\pi}{4} + i\sin\frac{5\pi}{4} = -\frac{\sqrt{2}}{2} - \frac{\sqrt{2}}{2}i
\end{aligned} \tag{2.25.13}
\]
We can also use ordinary algebra to get this result. Let's assume that the square root of i is a complex number a + bi:

\[
\sqrt{i} = a + bi \;\Longrightarrow\; i = 0 + 1i = a^2 - b^2 + 2abi
\]

Thus, a and b satisfy the following system of equations, obtained by comparing the real parts and imaginary parts of the two complex numbersŽŽ:

\[
a^2 - b^2 = 0, \qquad 2ab = 1
\]

of which the solutions are a = b = ±√2/2. And we get the same result. We have used the method of undetermined coefficients.
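Python's cmath module agrees with both derivations (the principal square root corresponds to k = 0):

import cmath, math

print(cmath.sqrt(1j))                      # (0.70710...+0.70710...j)
print(math.sqrt(2)/2)                      # 0.70710..., the value of a = b
print((math.sqrt(2)/2 * (1 + 1j))**2)      # essentially 1j again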

2.25.5 Trigonometry identities


de Moivre's formula, Eq. (2.25.6), can be used to derive various trigonometric identities. For example, with n = 2, we can write

\[
\begin{aligned}
(\cos\alpha + i\sin\alpha)^2 &= \cos(2\alpha) + i\sin(2\alpha)\\
\cos^2\alpha - \sin^2\alpha + 2i\cos\alpha\sin\alpha &= \cos(2\alpha) + i\sin(2\alpha)\\
\Longrightarrow\; \cos(2\alpha) &= \cos^2\alpha - \sin^2\alpha\\
\Longrightarrow\; \sin(2\alpha) &= 2\sin\alpha\cos\alpha
\end{aligned} \tag{2.25.14}
\]

where we have used the fact that if two complex numbers are equal, the corresponding real and imaginary parts must be equal.
And with n = 3, we have

\[
\begin{aligned}
(\cos\alpha + i\sin\alpha)^3 &= \cos(3\alpha) + i\sin(3\alpha)\\
\cos^3\alpha - 3\cos\alpha\sin^2\alpha + 3i\cos^2\alpha\sin\alpha - i\sin^3\alpha &= \cos(3\alpha) + i\sin(3\alpha)\\
\Longrightarrow\; \cos(3\alpha) &= \cos^3\alpha - 3\cos\alpha\sin^2\alpha\\
\Longrightarrow\; \sin(3\alpha) &= 3\cos^2\alpha\sin\alpha - \sin^3\alpha
\end{aligned} \tag{2.25.15}
\]
ŽŽ
We imply that two complex numbers are equal if they have the same real and imaginary parts, which is
reasonable.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 169

The last two equations can be modified a little bit to get this

cos.3˛/ D cos3 ˛ 3 cos ˛.1 cos2 ˛/ D 4 cos3 ˛ 3 cos ˛


(2.25.16)
sin.3˛/ D 3.1 sin2 ˛/ sin ˛ sin3 ˛ D 3 sin ˛ 4 sin3 ˛

Can you do the same thing for cos.5˛/ in terms of cos.˛/? A knowledge of the binomial theorem
(Section 2.27) might be useful.
In the same manner, Eq. (2.25.9) allows us to write
p  
˛ ˛
m
cos.˛/ C i sin.˛/ D cos C i sin
m m
which, for m D 2 gives
p  
˛ ˛
cos.˛/ C i sin.˛/ D cos C i sin
2 2
or, after squaring both sides
˛ ˛ ˛ ˛
cos.˛/ C i sin.˛/ D cos2 sin2 C 2i cos sin
2 2 2 2
which results in the familiar trigonometry identities
˛ ˛ ˛ ˛
cos.˛/ D cos2 sin2 ; sin.˛/ D 2 cos sin (2.25.17)
2 2 2 2
which yields (from the first of Eq. (2.25.17)) the equivalent half-angle identities
r r
˛ 1 C cos.˛/ ˛ 1 cos.˛/
cos D ; sin D
2 2 2 2

2.25.6 Power of real number with a complex exponent


The question is: what 23C2i is? To answer this question, recall de Moivre’s formula that reads

.cos ˛ C i sin ˛/n D cos.n˛/ C i sin.n˛/

And if, we denote f .˛/ D cos ˛ C i sin ˛, then we observe that (thanks to the above equation)

f .˛/ D cos ˛ C i sin ˛ H) Œf .˛/n D f .n˛/

Which function has this property? An exponential! For example,

f .x/ D 2x H) Œf .x/n D .2x /n D 2nx D f .nx/

With that, it is reasonable to appreciate the following equation (see below for a popular proof)

e i ˛ D cos ˛ C i sin ˛ (2.25.18)

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 170

which, when evaluated at ˛ D  yields one of the most celebrated mathematical formula, the
Euler’s theorem:

ei  C 1 D 0 (2.25.19)

which connects the five mathematical constants: 0; 1; ; e; i. You have met numbers 0,1 and i .
We will meet the number e in Section 2.28. And of course,  the ratio of a circle’s circumference
to its diameter. This identity is influential in complex analysis. Complex analysis is the branch
of mathematical analysis that investigates functions of complex numbers. It is useful in many
branches of mathematics, including algebraic geometry, number theory, analytic combinatorics,
applied mathematics; as well as in physics, including the branches of hydrodynamics, thermo-
dynamics, and particularly quantum mechanics. Refer to Section 7.12 for an introduction to this
fascinating field.
So, it is officially voted by mathematicians that e i  C 1 D 0 is the most beautiful equation||
in mathematics! As one limerick (a literary form particularly beloved by mathematicians) puts it

e raised to the pi times i ,


And plus 1 leaves you nought but a sigh.
This fact amazed Euler
That genius toiler,
And still gives us pause, bye the bye.

It is possible to express the sine/cosine functions in terms of the complex exponential:

i˛ ei ˛ C e i˛
e D cos ˛ C i sin ˛ H) cos ˛ D (2.25.20)
2
i˛ ei ˛ e i˛
e D cos ˛ i sin ˛ H) sin ˛ D (2.25.21)
2i

Proof. Here is one proof of e i D cos  C i sin  if we know the series of e x , sin x and cos x.
We refer to Sections 4.15.5 and 4.15.6 for a discussion on the series of these functions.
Start with the series of e x where x is a real number:

x x2 x3 x4 x5
ex D 1 C C C C C C 
1Š 2Š 3Š 4Š 5Š
|| i
e C 1 D 0 is actually not an equation. An equation (in a single variable) is a mathematical expression of
the form f .x/ D 0, for example, x 2 C x 5 D 0, which is true only for certain values of the variable, that is, for
the solutions of the equation. There is no x, however, to solve for in e i  C 1 D 0 . So, it isn’t an equation. It isn’t
an identity, either, like Euler’s identity e i ˛ D cos ˛ C i sin ˛, where ˛ is any angle, not just  radians. That’s what
an identity (in a single variable) is, of course, a statement that is identically true for any value of the variable. There
isn’t any variable at all, anywhere, in e i  C 1 D 0: just five constants.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 171

Replacing x by i , which is a complex number (why can we do this?, see Section 7.12):

i i 2 2 .i/3 .i/4 .i/5


e i D 1 C C C C C C 
 1Š 2Š 3Š
  4Š 5Š 
2 4  3 5
D 1 C    Ci C 
2Š 4Š 1Š 3Š 5Š
„ ƒ‚ … „ ƒ‚ …
cos  sin 
D cos  C i sin 


With Euler’s identity, it is possible to derive the trigonometry identity for angle summation
without resorting to geometry; refer to Section 3.8 for such geometry-based derivations. Let’s
denote two complex numbers on a unit circle as z1 D cos ˛ C i sin ˛ D e i ˛ , z2 D cos ˇ C
i sin ˇ D e iˇ , we then can write the product z1 z2 in two ways

z1 z2 D e i.˛Cˇ / D cos.˛ C ˇ/ C i sin.˛ C ˇ/


z1 z2 D .cos ˛ C i sin ˛/.cos ˇ C i sin ˇ/ D cos ˛ cos ˇ sin ˛ sin ˇ C i.sin ˛ cos ˇ C cos ˛ sin ˇ/

Equating the real and imaginary parts of z1 z2 given by both expressions, we can deduce the
summation sine/cosine identities, simultaneously!

History note 2.6: Caspar Wessel (1745-1818)


Wessel was born in Norway and was one of thirteen children in a
family. In 1763, having completed secondary school at Oslo Cathe-
dral School, he went to Denmark for further studies. He attended
the University of Copenhagen to study law, but due to financial pres-
sures, could do so for only a year. To survive, he became an assistant
land surveyor to his brother and they worked on the Royal Danish
Academy of Sciences and Letters’ topographical survey of Denmark.
It was the mathematical aspect of surveying that led him to exploring
the geometrical significance of complex numbers. His fundamental
paper, Om directionens analytiske betegning, was presented in 1797 to the Royal Danish
Academy of Sciences and Letters. Since it was in Danish and published in a journal rarely
read outside of Denmark, it went unnoticed for nearly a century. The same results were
independently rediscovered by Argand in 1806 and Gauss in 1831. In 1815, Wessel was
made a knight of the Order of the Dannebrog for his contributions to surveying.

Now we can answer the question asked in the beginning of this section: what is z D 23C2i ?

z D 23C2i D 23  22i D 8  4i
i
D 8  e ln 4 D 8  .cos.ln 4/ C i sin.ln 4//

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 172

And finally, it is possible to compute a logarithm of a negative number. For example, start with
e i  D 1, take the logarithm of both sides:
ei  D 1 H) ln. 1/ D i
Thus, the logarithm of a negative number is an imaginary number. That’s why when we first
learned calculus, logarithm of negative numbers was forbidden. This should not be the case since
we accept the square root of negative numbers! To know more about complex logarithm, check
out Section 7.12.
In the story of complex numbers, we have not only Wessel but also Jean-Robert Argand
(1768 – 1822), another amateur mathematician. In 1806, while managing a bookstore in Paris,
he published the idea of geometrical interpretation of complex numbers known as the Argand
diagram and is known for the first rigorous proof of the Fundamental Theorem of Algebra. We
recommend the interesting book An imaginary
p tale: The story of square root of -1 by Paul Nahin
[47] on more interesting accounts on i D 1.

One exercise on complex numbers.

Assume that f .z/ D zC1=z 1, compute f 1991 .2 C i/, where f 3 .z/ D f .f .f .z///. Don’t
be scared by 1991! Note that this is an exercise to be solved within a certain amount
of time after all. Let’s compute f 1 .2 C i/, f 2 .2 C i/, and a pattern would appear for a
generalization to whatever year that the test is on:
3Ci
f .2 C i/ D D2 i
1Ci
f 2 .2 C i/ D f .f .2 C i// D f .2 i/ D 2 C i
f 3 .2 C i/ D f .f .f .2 C i/// D f .2 C i/ D 2 i

So, you see the pattern. 1991 is an odd number so f 1991 .2 C i/ D f .2 C i/ D 2 i.

Some complex numbers problems.

1. Find the imaginary part of z 6 with z D cos 12ı C i sin 12ı C cos 48ı C i sin 48ı .

2. If  is a constant st 0 <  <  and x C 1=x D 2 cos , then find x n C 1=x n in


terms of n and ; n is any positive integer.

3. Evaluate
X
1
cos.n /
nD0
2n
where cos  D 1=5.
The answers are 0, cos n and 6=7, respectively. If it is not clear about the third
problem, see below for a similar problem.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 173

We are now going to solve a problem in which we see the interplay between real numbers and
imaginary numbers. That’s simply amazing. The problem is: Given the complex number 2 C i,
and let’s denote an and bn the real and imaginary parts of .2 C i/n , where n is a non-negative
integer. The problem is to compute the following sum

X
1
an bn
SD
nD0
7n

Let’s find an and bn first. That seems a reasonable thing to do. Power of an imaginary number?
We can use de Moirve’s formula. To this end, we need to convert our number 2 C i to the polar
form:
p 2 1
2 C i D 5.cos  C i sin /; cos  D p ; sin  D p
5 5
Then, its power can be determined and from that an , bn will appear to us:
p p p
.2 C i/n D . 5/n .cos n C i sin n/ H) an D . 5/n cos n; bn D . 5/n sin n

Now the sum S is explicitly given by:

X1 p n p 1  
. 5/ cos n. 5/n sin n 1X 5 n
SD n
D sin 2n
nD0
7 2 nD0
7

We did some massage to S to simplify it. Now comes the good part: we leave the real world and
move to the imaginary one, by replacing sin 2n by the imaginary part of e i 2n :
1  
1X 5 n
SD Im e i 2n (2.25.22)
2 nD0 7

As the sum of the imaginary parts is equal to the imaginary of the sumŽŽ , we write S as:
1  
1 X 5 n i 2 n
S D Im .e /
2 nD0 7

What is the red term? It is a geometric series!, of the form 1; a; a2 ; : : : with a D .5=7/e i 2 , and
we know its sum 1=.1 a/ :
1 1
S D Im
2 1 75 e i 2
ŽŽ
If not clear, one example is of great help: .a1 C b1 i / C .a2 C b2 i / D .a1 C a2 / C i.b1 C b2 /. Thus sum of
imaginary parts (b1 C b2 ) equals the imaginary of the sum.

Herein we accept that the results on geometric series also apply to complex numbers. Note that a has a
modulus of 5/7 which is smaller than 1.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 174

We know e i , thus we know its square e i 2 , thus the above expression is simply 7=16. Details
are as follow. First, we find the imaginary part of 1 51ei2 by:
7

1 7
5 i 2
D .e i ˛ D cos ˛ C i sin ˛/
1 7
e 7 5.cos 2 C i sin 2/
7Œ7 5 cos 2 C i5 sin 2/
D (remove i in the denominator)
.7 5 cos 2/2 C .5 sin 2/2

Then, the imaginary part is given by

1 35 sin 2
Im 5 i 2
D
1 7
e 74 70 cos 2

Thus, S is simplified to
1 35 sin 2 7
SD D ::: D
2 74 70 cos 2 16
We have skipped some simple calculations in : : :
Is there a shorter solution? Yes, note that S involves an bn as a product, so we do not really
need to know an and bn , separately. From the fact that .2 C i/n D an C ibn , what we do to get
an bn ? Yes, we square the equation: .2 C i/2n D an2 bn2 C 2ian bn . Thus, an bn is half of the
imaginary part of .2 C i/2n . Plugging this into S and we fly off to the result in no time.

2.25.7 Power of an imaginary number with a complex exponent


This section is devoted to i i . Is it an imaginary or real number? Why bother? Just out of curiosity!
We use Euler’s identity to write i D e i =2 (i has r D 1 and  D =2), then we raise it to an
exponent of i :
  
z D a C bi D re i H) i D e i 2 ) i i D .e i 2 /i D e 2

So, i i is a real number! Actually i i has many values, we have just found one of themŽŽ :
h ii
i. 
2 C2n / i i. 
2 C2n /

2n
i De H) i D e De 2

Long before Euler wrote e i D cos  C i sin , the Swiss mathematician Johann Bernoulli
(1667 – 1748)–one of the many prominent mathematicians in the Bernoulli family and Euler‘s
teacher–already computed i i using a clever technique. It is presented here so that we can enjoy
it all (assume you know a bit of calculus here). He considered the area of 1/4 of a unit circle:
Z 1 p

D 1 x 2 dx
4 0

ŽŽ
Check Section 7.12 for detail.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 175

Now comes the clever idea, he used the following ’imaginary’ substitution using i (note that if
we proceed with the standard substitution x D sin , we will get =4 D =4, which is useless;
that’s why Bernoulli had to turn to i to have something new coming up):

xD iu H) dx D idu; 1 x 2 D 1 C u2

Then, the above integral becomes


Z ip

D i 1 C u2 du
4 0

And the red integral can be computed (check Section 4.7 if you’re not clear):

 i 
D Œsec  tan  C ln.sec  C tan /0
4 2

with tan   D i . Thus, we have


 i
D ln.i /
4 2

And from that the result i i D e =2 follows. As we have seen, once accepted i, mathematicians
of the 17th century played with them with joy and obtained interesting results. And of course
other mathematicians did similar things; for example, the Italian Giulio Carlo dei Toschi Fagano
(1682-1766) played with a circle but with its circumference, and got the same result as Bernoulli
[47]. It is similar to we–ordinary human–soon introduce many new tricks with a new FIFA play
station game.
Now comes a surprise. What is 1 ? We have learned that 1x D 1, so you might be guessing
1 D 1. But then you get only one correct answer. To see why just see 1 as a complex number


1 D 1 C 0i D e i.2n/ with n being an integer, thus

2/  
1 D .e i.2n/ / D e i.2n D cos 2n 2 C i sin 2n 2

where in the last equality we have used Euler’s identity e i D cos  C i sin . From this we see
that only with n D 0 we get 1 D 1, which is real. Other than that we have complex numbers!
This is because sin 2n 2 is always different from zero for all integers not 0. Why that? Because
 is irrational, a result by the Swiss polymath Johann Heinrich Lambert (1728–1777). To see
why, let’s solve sin 2n 2 D 0, of which solutions are

 m
sin 2n 2 D 0 ” 2n 2 D m ” 2 D
n

which cannot happen as  cannot be expressed by m=n because it is an irrational number.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 176

2.25.8 A summary of different kinds of numbers


We have come a long way starting with counting numbers. We have added more numbers to the
list during the journey. Let’s summarize what numbers we have:

Natural numbers: N D f0; 1; 2; 3; 4; 5; 6; : : :g


Integer numbers: Z D f: : : ; 3; 2; 1; 0; 1; 2; 3; : : :g
 
5 22
Rational numbers: Q D ; ; 1:5; : : :
3 7
p
Real numbers: R D f 1; 0; 1; 2; ; e; : : :g
Complex numbers: C D f2 C 3i; 1; 0; i; 3 C 4i; : : :g

Note that we have introduced different symbols to represent different collections of numbers.
Instead of writing ‘a is a non-negative integer number’, mathematicians write a 2 N. When
they do so, they mean that a is a member of the set (collection) of non-negative integers; this
set is symbolically denoted by N. The notation Z comes from the German word Zahlen, which
means numbers. The notation Q is for quotients.
In mathematics, the notion of a number has been extended
over the centuries to include 0, negative numbers, rational num-
bers such as one third (1=3), real numbers such as the square root
of 5 and , and complex numbers which extend the real numbers
with a square root of 1. Calculations with numbers are done with
arithmetical operations, the most familiar being addition, subtrac-
tion, multiplication, division, and exponentiation. Besides their
practical uses, numbers have cultural significance throughout the
world. For example, in Western society, the number 13 is often regarded as unlucky.
The German mathematician Leopold Kronecker (1823 – 1891) once said, "Die ganzen Zahlen
hat der liebe Gott gemacht, alles andere ist Menschenwerk" ("God made the integers, all else is
the work of man").
But is that all? Not at all. Complex numbers are cool but after all they are just points on a bor-
ing flat plane. Mathematicians wanted to have points in space! And they created other numbers,
one of them is quartenions of the form a C bi C cj C d k briefly discussed in Section 11.1.6.

2.26 Combinatorics: The Art of Counting


Suppose you’re asked to solve the following problems

 At one party each man shook hands with everyone except his spouse, and no handshakes
took place between women. If 13 married couples attended, how many handshakes were
there among these 26 people?

 How many ordered, nonnegative integer triples .x; y; z/ satisfy the equation x C y C z D
11?

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 177

 A circular table has exactly 60 chairs around it. There are N people seated around this
table in such a way that the next person to be seated must sit next to someone. What is
smallest possible value of N ?
What would you do? While solving them you will see that it involves counting, but it is tedious
sometimes to keep track of all the possibilities. There is a need to develop some smart ways of
counting. This section presents such counting methods. Later in Section 5.2, you will see that to
correctly compute probabilities we need to know how to count correctly and efficiently.
We start our discussion with the product rule–which is the basic rule of counting–and tree dia-
gram (Section 2.26.1). We then move to factorial (Section 2.26.2), permutations (Section 2.26.3)
and combinations (Section 2.26.4). In Section 2.26.5, permutations with repetition is treated.
Finally, the pigeonhole principle is discussed (Section 2.26.6).

2.26.1 Product rule


One basic principle of counting is the product rule. Suppose we want to
count the number of ways to pick a shirt and pair of pants to wear. If we
have 3 shirts and 2 pairs of paints, the total number of ways to choose an
outfit is 6. Why? This can be seen by drawing a tree diagram. At the first
branch we choose a pair of pants, denoted by P1 , and at the second branch
we choose the other pair of pants (P2 ). Now, in the P1 branch, we have three
sub-branches: one for each shirt. In total, the number of outfits is the number
of leaves at the end of the tree. In general if we have n ways to choose the
first and m ways to choose the second, independent of the first choice, there
are nm ways–that’s why we have the name ’product rule’–to choose a pair. And of course no
one can stop us to use this rule for more than three things. Imagine that in addition to two pairs
of pants and 3 shirts, we have 2 hats to choose. It is quite easy to extend the tree diagram, and
see that there are 2  6 D 2  3  2 ways.

2.26.2 Factorial
Assume that we have to arrange three books on a shelve. The titles of the three books are A, B
and C . The question is there are how many ways to do the arrangement? If we put A on the left
most there are two possibilities for B and C : ABC and ACB. If we put B on the left most, then
there are also two possibilities: BAC and BCA. Finally, if C is put in the left most, then we
have CAB and CBA. In summary, we have six ways of arrangement of three books:
ABC ACB BAC BCA CAB CBA
How about arranging four books A; B; C; D? Again, let’s put A on the left most position, there
are then six ways of arranging the remaining three books (we have just solved that problem!).
Similarly, if B is put on the left most position, there are six ways of arranging the other three
books. Going along this reasoning, we can see that there are
4  number of ways to arrange 3 books D 4  6 D 24

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 178

ways, and they are

ABCD ABDC ACBD ACDB ADBC ADCB


BACD BADC BCAD BCDA BDAC BDCA
CABD CADB CBAD CBDA CDAB CDBA
DABC DACB DBAC DBCA DCAB DCBA

What if we have to arrange five books? We can see that the number of arrangements is five times
the number of arrangements for 4 books. Thus, there are 5  24 D 120 ways.
There is a pattern here. To see it clearly, let’s denote by An the number of arrangements for
n books (n 2 N). We then have A5 D 5A4 || , but A4 D 4A3 , we continue this way until A1 –the
number of arrangements of only one book, which is one:

A5 D 5A4
D 5  .4A3 /
(2.26.1)
D 5  4  3A2
D 5  4  3  2  A1 D 5  4  3  2  1

with A1 being one as there is only one way to arrange one book. We are now able to give the
definition of factorial.

Definition 2.26.1
For a positive integer n  1, the factorial of n, denoted by nŠ, is defined as

Y
n
nŠ D n  .n 1/  .n 2/      3  2  1 D i
i D1

which is simply a product of the first n natural numbers.

From this definition, it follows that nŠ D n.n 1/Š. Using this for n D 1, we get 1Š D 1  0Š,
so 0Š D 1. This is similar to a negative multiplied a negative is a positive. The notation nŠ was
introduced by the French
Q mathematician Christian Kramp (1760 – 1826) in 1808. We recall the
shorthand notation i (called the pi product notation) that was introduced in Eq. (2.21.20).
To understand the notation nŠ, let’s compute some factorials: 5Š D 120, 6Š D 720, not so
large, but 10Š D 3 628 800! How about 50Š? It’s a number with 65 digits:

50Š D 30 414 093 201 713 378 043 612 608 166 064 768 844 377 641 568 960 512 000 000 000 000

No surprise that Kramp used the exclamation mark for the factorial. Note that I have used Julia
to compute these large factorials. I could not find out the explanation of the name factorial,
however.

||
Just the translation of "the number of arrangements for 5 books is five times the number of arrangements for 4
books".

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 179

Question 6. For a given n 2 N, is nŠ 2 N? It is always a good practice, when we have a new


function/operation, we should check the result it gives us. Does it give us something not in the
number system we’re familiar with?

Factorions. A factorion is a number which is equal to the sum of the factorials of its digits. For
example, 145 is a factorion, because

145 D 1Š C 4Š C 5Š

Can you write a program to find other factorions? The answer is 40 585 and see Listing A.4 for
the program.

One problem involving factorial. Let’s consider a problem involving factorial: which one of
these numbers 5099 and 99Š is larger? The first attempt is to naturally consider the ratio of these
numbers and write out them explicitly (and see if the ratio is smaller than one or not):

5099 50  50      50
D
99Š 99  98  97      2  1
Now, instead of working directly with 99 terms in the numerator and 99 terms in the denominator,
we divide the 99 terms in the numerator into two groups and we’re left with one number 50.
Similarly, we divide the product in the denominator into two groups and left with 50:
49 terms 49 terms
‚ …„ ƒ ‚ …„ ƒ
99
50 .50  50      50/   .50  50      50/
50
D
99Š .99  98      51/    .49  48      2  1/
50

We can cancel the single 50s, and then combine one term in one group with another term
in the other group in the way that 99 is paired with 1, 98 with 2 (why doing that? because
99 C 1 D 100 D 50  2|| ), and so on:
    
5099 502 502 502
D 
99Š 99  1 98  2 51  49

Now, it is becoming clearer that we just need to compare each term with 1, and it is quite easy
to see that all terms are larger than 1 e.g. 502=991 > 1. This is so because we haveŽŽ
 2
2 2 aCb
.a b/ > 0 H) .a C b/ > 4ab H) > ab
2

So, 5099 is larger than 99Š


||
Also because pairing numbers is a good technique that we learned from the 10 year old Gauss.
ŽŽ
Another way is to write 99  1 D .50 C 49/.50 49/ D 502 492 < 502 . In other words, the rectangle 99  1
has an area smaller than that of the square of side 50.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 180

Another way is to use the AM-GM inequality for n D 99 numbers 1; 2; : : : ; 99. And that
proof also gives us a general result that for n 2 N and n  2,
 
nC1 n
> nŠ
2

Factorial equation. Let’s solve one factorial equation: find n 2 N such that

nŠ D n3 n

Without any clue, we proceed by massage this equation a bit as we see some common thing in
the two sides:

n.n 1/.n 2/Š D n.n 1/.n C 1/ H) .n 2/Š D n C 1

because n and n 1 cannot be zero (as n D f0; 1g do not satisfy the equation). At least, now we
have another equation, which seems to be less scary (e.g. n3 gone). What’s next then? The next
step is to replace .n 2/Š by .n 2/.n 3/Š:
nC1 n 2C3 3
.n 2/.n 3/Š D n C 1 H) .n 3/Š D D D1C
n 2 n 2 n 2
Doing so gives us a direction to go forward: a factorial of a counting number is always a counting
number, thus 1 C 3=n 2 must be a counting number, and that means that n 2j3, or

n 2 D f1; 3g H) n D f3; 5g

It is obvious that n D 3 is not a solution, thus the only solution is n D 5.


Another solution is to look at the boxed equation .n 2/Š D n C 1 and think about the LHS
and the RHS. It is a fact that nŠ is a very large number, much larger than n C 1 for n larger than
a certain integer. Thus, the two sides are equal only when n is a small integer. Now, we write
.n 2/Š D .n 2/.n 3/ : : : .3/.2/.1/, which is larger than or equal to 2.n 2/. Now we have,

.n 2/Š  2.n 2/; and .n 2/Š D n C 1

Therefore, we can deduce that

n C 1  2.n 2/ H) n  5 H) n D f5; 4g

Stirling’s approximation is an approximation for factorials. It is named after the Scottish


mathematician James Stirling (1692-1770). The factorial of n can be well approximated by:
p p  n n
nŠ  2nnC1=2 e n D 2 n ; for n D 1; 2; : : : (2.26.2)
e
The need to develop this formula is that it is hard to compute the factorial of a large counting
number, especially in Stirling’s time. We shall see this shortly.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 181

Proof of Stirling’s approximation. From Section 4.20.2 on the Gamma function, we have the
following representation of nŠ: Z 1
nŠ D x n e x dx
0

Using the change of variable x D ny, and y n D e n ln y , the above becomes


Z 1
nC1
nŠ D n e n.ln y y/ dy
0

What
R 1 isx 2the blue p integral? If I tell you it is related to the well known Gaussian integral
1e dx D , do you believe me? If not, plot e n.ln y y/
for n D 5 and y 2 Œ0; 5 you will
see that the plot resembles the bell curve. Thus, we need to convert ln y y to y 2 . And what
allows us to do that? Taylor comes to the rescue. Now, we look at the function ln y y and plot
it, we see that it has a maximum of 1 at y D 1 (plot it and you’ll see that), thus using Taylor’s
series we can write ln y y  1 .y 1/2 =2, thus
Z 1
n nC1 2
nŠ D e n e n.y 1/ =2 dy
0

p p
Thus, another change of variable t D nx= 2, and the red integral becomes
Z 1
p Z 1 p
n.y 1/2 =2 2 t2 2
e dy D p e dx D p
0 n 0 n
R1 2 p
Why the lower integration bound is zero not 1 and we still can use 1 e x dx D ? This
is because the function e n.ln y y/ quickly decays to zero (plot and you see it), thus we can extend
the integration from Œ0; 1 to . 1; 1/. Actually the method just described to compute the blue
integral is called the Laplace method. 

What is the lesson from Stirling’s approximation for nŠ? We have a single object which is
nŠ. We have a definition of it: nŠ D .1/.2/    .n/. But this definition is useless when n is large.
By having another representation of nŠ via the Gamma function, we are able to have a way to
compute nŠ for large n’s.

Some exercises involving factorials.

1. Compute the following sum


3 4 2001
SD C C  C
1Š C 2Š C 3Š 2Š C 3Š C 4Š 1999Š C 2000Š C 2001Š

Semifactorial or double factorial. Now we know that 1  2  3  4  5  6  7  8  9 is 9Š But, in many


cases, we encounter this product 1  3  5  7  9, which is 9Š without the even factors 2; 4; 6; 8. In

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 182

1902, the physicist Arthur Schuster introduced the notation nŠŠ for such products, called it the
"double factorial". So,
9ŠŠ D 9  7  5  3  1; 8ŠŠ D 8  6  4  2
So, the double factorial or semifactorial of a number n, denoted by ŠŠ, is the product of all the
integers from 1 up to n that have the same parity (odd or even) as n:
8̂ nC1
ˆ
ˆ Y2
ˆ
ˆ .2k 1/; if n is odd
<n.n 2/.n 4/    .3/.1/ D
nŠŠ D kD1 (2.26.3)
ˆ
ˆ Y
n

ˆ
2
ˆn.n 2/.n 4/    .2/.1/ D if n is even
:̂ .2k/;
kD1

It is obvious that nŠŠ ¤ .nŠ/Š; e.g. 4ŠŠ D .4/.2/ D 8 but .4Š/Š D .24/Š.
Double factorials can also be defined recursively. Just as we can define the ordinary factorial
by nŠ D n.n 1/Š for n  1 with 0Š D 1, we can define the double factorial by

nŠŠ D n.n 2/ŠŠ (2.26.4)

for n  2 with initial values 0ŠŠ D 1ŠŠ D 1.

2.26.3 Permutations
Now we know that there are nŠ ways to arrange n distinct books. Why just books? Generally
there are nŠ permutations of the elements of a setŽ having n elements. A permutation of a set
of n objects is any rearrangement of the n objects. For example, considering this set f1; 2; 3g,
we have these arrangements (permutations): f1; 2; 3g; f1; 3; 2g; f2; 1; 3g; f2; 3; 1g; f3; 1; 2g and
f3; 2; 1g.
We have used the simplest way to count the number of permutations of a set with n elements:
we listed all the possibilities. But we can do another way. Imagine that we have n distinct books
to be placed into n boxes. For the first box, there are n choices, then for each of these n choices
there are n 1 choices for the second box, for the third box there are n 2 choices and so on.
In total there will be n.n 1/.n 2/    .3/.2/.1/ ways. When we multiply all the choices we
are actually using the so-called basic rule of counting: the product rule.
There are 5Š ways to arrange 5 persons in 5 seats. But, there are how many ways to place
five people into two seats? There are only 5  4 D 20 ways because for the first seat we have
5 choices and for the second seat we have 4 choices. Assuming that the five people are named
A; B; C; D; E, then the 20 ways are:
AB BC CD DE AC AD AE BD BE CE
BA CB DC ED CA DA EA DB EB EC
Now, what we need to do is to find how the result of 20 is related to 5 people and 2 seats. For 5
people and 5 seats, the answer is 5Š. So, we expect that 20 should be related to the factorials of 5
Ž
Don’t worry about this word ‘set’; it is what mathematicians use when they mean a collection of things.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 183

and 2–the only information of the problem. Indeed, it can be seen that we can write 20 D 5  4
in terms of factorials of 5 and 2:
54321 5Š 5Š
54D D D
321 3Š .5 2/Š

We now generalize this. Assume we have a n-set (i.e., a set having n distinct elements) and we
need to choose r elements from it (r  n). There are how many ways to do so if order matters?
In other words, how many r-permutations? For example considering this 3 set fA; B; C g and
we choose 2 elements. We have six ways: fA; Bg, fB; Ag, fA; C g, fC; Ag, fB; C g, fC; Bg.
The number of r-permutations of an n-element set is denoted by P .n; r/ or sometimes by
Pn , which is defined as:
r


P .n; r/ D Pnr D (2.26.5)
.n r/Š
And we can write P .n; r/ explicitly as:

n.n 1/.n 2/    .n r C 1/.n r/Š


P .n; r/ D D n.n 1/.n 2/    .n r C 1/
.n r/Š

This expression is exactly telling us what we have observed. We need to choose r elements;
there are n options for the first element, n 1 options for the second element, ... and n r C 1
options for the last element.

2.26.4 Combinations
In permutations, the order matters: AB is different from BA. Now, we move to combinations in
which the order does not matter. Let’s use the old example of placing five people into two seats.
These are 20 arrangements of five people A; B; C; D; E into two seats (there are 5 options for
the 1st seat and 4 options for the second seat):

AB BC CD DE AC AD AE BD BE CE
BA CB DC ED CA DA EA DB EB EC

And if AB is equal to BA i.e., what matter is who seats next to who not the order, there are only
10 ways. When order does not matter, we are speaking of a combination. My fruit salad is a
combination of apples, grapes and bananas. We do not care the order the fruits are in.
We can observe that:
20 5Š
10 D D
2 .5 2/Š2Š
which leads to the following r-combinations equation:
!
n nŠ Pr
D Cnr D D n (2.26.6)
r .n r/ŠrŠ rŠ

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 184

The last equality shows the relation between permutation and combination; there are less com- 
binations than permutations due to repetitions. And there are rŠ repetitions. The notation nr is
read n choose r.
n
r
is also called the binomial coefficient. This is because the coefficients in the binomial
theorem are given by nr (Section 2.27).

Question 7. The factorial was defined for positive integers. Is it too restrict? If you’re feeling
this way, that’s very good. What p is the value of .1=2/Š? The result is surprising; it is not an
integer, it is a real number: 0:5 .

2.26.5 Generalized permutations and combinations


Permutations with repetition. With 3 a’s and 2 b’s how many 5-letter words can we make? Of
course we do not care about meaningless words. It is clear that we can have these words:

aaabb; aabab; abaab; baaab; aabba; ababa; baaba; abbaa; babaa; bbaaa (2.26.7)

That is ten words. The question now is how to derive a formula, as listing works only when there
are few combinations. First, let’s denote by N the number of 5-letter words that can be made
from 3 a’s and 2 b’s. Second (and this is the key to the solution), we convert this problem to the
problem we’re familiar with: permutations without repetition by using a1 ; a2 ; a3 for 3 a’s and
b1 ; b2 for 2 b’s. Obviously there are 5Š 5-letter words from a1 ; a2 ; a3 ; b1 ; b2 . We can get these
words by starting with Eq. (2.26.7). For each of them, we add subscripts 1; 2; 3 to the a’s (there
are 3Š ways of doing that), and then we add subscripts 1; 2 to the b’s (there are 2Š ways). Thus,
in total there are N 3Š2Š 5-letter words. And of course we have N 3Š2Š D 5Š, thus

N D
3Š2Š
Now we generalize the result to the case of n objects which are divided into k groups in
which the first group has n1 identical objects, the second group has n2 identical objects, ..., the
kth group has nk identical objects. Certainly, we have n1 C n2 C    C nk D n. The number of
permutations of these n such objects are


(2.26.8)
n1 Šn2 Š    nk Š
For the special case that k D 2, we have one group with r identical elements and one group with
n r elements:
aa   …
„ ƒ‚ a bb   …
„ ƒ‚ b
r n r

There are

rŠ.n r/Š

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 185

n

permutations of such set. Coincidentally, it is equal to r
:
!
nŠ n
D (2.26.9)
rŠ.n r/Š r

To remove this confusion between permutations and combinations, we can change how we
look at the problem. For example, the problem of making 5-letter words with 3 a’s and 2 b’s can
be seen like this. There are 5 boxes in which we will place 3 a’s into 3 boxes. The remaining

boxes will be reserved for 2 b’s. How many way to select 3 boxes out of 5 boxes? It is 53 .

Instead of placing the a’s first we can place the b’s first. There 52 ways of doing so. There-
 
fore, 52 D 53 . Thus, we have the following identity
! !
n n
D (2.26.10)
k n k

We can check this identity easily using algebra. But the way we showed it here is interesting in
the sense that we do not need any algebra. This is proof by combinatorial interpretation. The
basic idea is that we count the same thing twice, each time using a different method and then
conclude that the resulting formulas must be equal.

2.26.6 The pigeonhole principle


Let’s solve this problem. If a Martian has an infinite number of red, blue, yellow, and black socks
in a drawer, how many socks must the Martian pull out of the drawer to guarantee that he has a
pair?
It is obvious that if he pulls out two socks, it is not certain he will get a pair; for example he
can get a red and a blue shock. The result is the same if he pulls out three socks or four socks
(for this case he might get red/blue/yellow/black socks). Only when he gets out five socks, he
certainly will get a pair.
Let’s solve another problem. A bag contains 10 red marbles, 10 white marbles, and 10 blue
marbles. What is the minimum number of marbles we have to choose randomly from the bag to
ensure that we get four marbles of the same color?
We can try and see that with 9 marbles we cannot ensure that there are four marbles of
the same color. One example is RRRW W WBBB where R stands for red and so on. And
that example shows that with 10 marbles it is 100% that we get four marbles of same color.
Regardless of the color of the 10th marble, we will get either four red marbles, or four blue or
four white ones.
It is clear that these two problems involve the art of counting. And there is a general principle
that governs this type of problems. And this principle is related to pigeons.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 186

Suppose that a flock of 10 pigeons flies into a set of 9 pigeon-


holes to roost. Because there are 10 pigeons but only 9 pigeon-
holes, at least one of these 9 pigeonholes must have at least two
pigeons in it. This illustrates a general principle called the pigeon-
hole principleŽŽ , which states that if there are more pigeons than
pigeonholes, then there must be at least one pigeonhole with at
least two pigeons in it. We can use it to easily solve the first prob-
lem of picking socks. Here, the pigeons are the socks and the (pigeon-)holes are the sock colors.
Because there are four holes, we need at least five pigeons so that at least one hole contains two
pigeons. As a hole represents a color, when there are two pigeons (socks) in a hole that indicates
that there are two socks of the same color!
The second problem of marbles is a generalization of the first problem. In the first problem,
we just need at least one hole having two pigeons. In the second problem, we need at least one
hole having four marbles. There should be an extended version of the pigeonhole principle.
If we put 11 marbles in 3 holes, then we can either have f5; 3; 3g or f4; 4; 3g; that is there is
at least one hole that holds at least 4 marbles. How 4 is related to 11 and 3? It is d11=3e. Still
remember the ceiling function? If not, check Section 2.22.1. The extended pigeonhole principle
states that if we put p pigeons in h holes, where p > h, then at least one hole must hold at least
dp= he pigeons. We can try other examples and observe that this extended version holds true. We
need a proof, but let’s use it to solve the marble problem. The problem can be cast as finding the
number of pigeons p to be put in 3 holes (as there are 3 colors) so that one hole has dp=3e D 4.
Solving this gives us p D 10.

Proof the generalized pigeonhole principle. Here is the proof of the extended pigeonhole princi-
ple. We use proof by contradiction: first we assume that no hole contains at least dp= he pigeons
and based on this assumption, we’re then led to something absurd. If no hole contains at least
dp= he, then every hole contains a maximum of dp= he 1 pigeons. Thus, p holes contains a
maximum of
.dp= he 1/ h

pigeons. We’re now showing that this number of pigeons is smaller than p:

.dp= he 1/ h < p; .dxe < x C 1/

This is impossible, because we have started with p pigeons.




ŽŽ
This principle is also known as Dirichlet box principle, named after the German mathematician Johann Peter
Dirichlet (1805 – 1859).

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 187

Some problems to practice the pigeonhole principle.

1. Every point on the plane is colored either red or blue. Prove that no matter how the
coloring is done, there must exist two points, exactly a mile apart, that are the same
color.

2. Given a unit square, show that if five points


p
are placed anywhere inside or on this
square, then two of them must be at most =2 units apart.
2

2.26.7 Solutions to questions


Herein we’re going to solve the questions asked at the beginning of the section. The first question
is: At one party each man shook hands with everyone except his spouse, and no handshakes took
place between women. If 13 married couples attended, how many handshakes were there among
these 26 people?Ž .
The number of handshakes is consisted of the number of handshakes between men and the
number of handshakes between men and women. The latter is easy to get: it is 13  12 D 156 for
every men can shake hands with 12 women and there are 13 men. For the former–the handshakes
between men–there are two ways to get it. One way is to use formula and one way is to start
from the scratch. I present the former way first. The first man can shake hands with 12 men, the
second with 11, the third with 10, and so on. In total there are 12 C 11 C    C 2 C 1 D 78
handshakes between men. But, 1 C 2 C    C 11 C 12 D 0:5.12  13/, using the formula for the
sum of the first n positive integer Eq. (2.6.2). However, 0:5.12  13/ D 13 2
. And that’s exactly
the second way to get the number of handshakes between men: we just need to choose two men
among 13 men, and there are 13 2
ways to do just that.
Next, we solve the question "How many ordered, nonnegative integer triples .x; y; z/ satisfy
the equation x C y C z D 11?". A mistake that I made was: there are 12 choices for x, 12
choices for y and 1 choice for z; thus the solution is 144. It is wrong because we cannot use the
product rule here as x; y are not independent! Actually, we can solve this equation quite easily
by reasoning as follows: fix x (x D x),
N then find how many ways we can have for y Cz D 11 x. N
Then, we vary xN from 0 to 11 and count the corresponding ways:
xN D 0 W there are 12 choices for y: y D 0; 1; 2; : : : ; 11
xN D 1 W there are 11 choices for y: y D 0; 1; 2; : : : ; 10
(2.26.11)
::: :::
xN D 11 W there are 1 choices for y: y D 0
Therefore, there are 12C11C  C1 D 78 nonnegative triples .x; y; z/ such that x Cy Cz D 11.
Should we stop here? No! Why 78? How is it related to 11 and x; y; z? If we cannot answer
these questions, when facing the problem x1 C x2 C x3 C x4 C x5 D 150, how can we solve it?
The above brute force approach would be tedious.
Ž
This is Problem 16 from the 1990 AHSME (American High School Mathematics Examination) Problems.
The AHSME was replaced with the AMC 10 and AMC 12 in 2000.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 188


Lucky for us, we got 78 from 12 C 11 C    C 1, which is 13 . The task now is to understand
13
 2
why 2 is here. To this end, it is much easier to look at the problem from a different angle.
Instead of x; y; z being the solutions to the equation x C y C z D 11, we think of 11 objects
to be distributed to three bins. And it is possible to have empty bins. With this view, we can
develop a way to visualize the problem: we use stars to represent the objects and bars to denote
the bins. For example, ˇˇ
.0; 0; 11/ W ˇˇ ? ?ˇ? ? ? ?ˇ ? ? ? ??
.3; 3; 5/ W ? ? ?ˇ ? ? ? ˇ ? ? ? ??
With this representation, the solution appears itself: in the problem there is always 13 places:
 11
13
for 11 stars and 2 for 2 bars, and we just need to select two places of out 13, which is 2 , which

is 11C3
3 1
1
. The last formula enables us to generalize to n objects and k bins.
Finally, we’re going to solve the problem "A circular table has exactly 60 chairs around it.
There are N people seated around this table in such a way that the next person (after N people)
to be seated must sit next to someone. What is smallest possible value of N ?" Try a smaller
problem with 6 chairs, draw a diagram and try to see what arrangement satisfies the condition
of the problem. Then, extend it to 60 chairs. The answer is N D 20.

2.27 Pascal triangle and the binomial theorem


A binomial expansion is one of the form .a C b/n where a; b are real numbers and n is a
positive integer number. With only simple algebra and adequate perseverance one can obtain the
following formulae:
.a C b/0 D 1
.a C b/1 D aCb
.a C b/2 D 1a2 C 2ab C 1b 2 (2.27.1)
.a C b/3 D a3 C 3a2 b C 3ab 2 C b 3
.a C b/4 D a C 4a3 b C 6a2 b 2 C 4ab 3 C b 4
4

We find the first trace of the Binomial Theorem in Proposition 4 of Book II of Euclid’s Elements:
"If a straight line be cut at random, the square on the whole is equal to the squares on the segments
and twice the rectangle of the segments". This is nothing but .a C b/2 D a2 C b 2 C 2ab if
the segments are a and b. In .a C b/2 D 1a2 C 2ab C 1b 2 , the numbers 1; 2; 1 are called
the coefficient of the binomial expansion. It’s a remarkable fact that the coefficients in these
binomial expansions make a triangle, which is usually referred to as Pascal’s triangle. As shown
in Fig. 2.40a, this binomial expansion was known by Chinese mathematician Yang Hui (ca.
1238–1298) long before Blaise Pascal (1623 – 1662).
The triangle can be constructed as follows (Fig. 2.40b): first placing a 1 along the left and
right edges, then the triangle can be filled out from the top by adding together the two numbers
just above to the left and right of each position in the triangle. Can you write a small program to
build the Pascal triangle? This is a good coding exercise.
Section 2.27.1 is about the binomial theorem, which allows us to expand .a C b/n for
n 2 N without resorting to the Pascal triangle. In the next section, a surprising relation between

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 189

1 row 0
1 1 row 1
1 2 1 row 2
1 3 3 1 row 3
1 4 6 4 1 row 4
1 5 10 10 5 1 row 5
1 6 15 20 15 6 1 row 6
(a) (b)

Figure 2.40: Pascal’s triangle for the binomial coefficients.

sum of powers, discussed in Section 2.6, and Pascal triangle is presented, together with the so-
called Bernoulli numbers (Section 2.27.2). Finally, a proof of the binomial theorem is shown in
Section 2.27.3.

2.27.1 Binomial theorem


Knowing how to draw the Pascal triangle it is then possible to know the coefficients of .a C b/10 .
But it is quite slow. Is there a faster way to know the coefficient of a certain term in .a C b/n
without going through the Pascal triangle? To answer that question, let’s consider .a C b/3 . We
expand it as follows

.a C b/3 D .a C b/.a C b/.a C b/


D .aa C ab C ba C bb/.a C b/
D aaa C aab C aba C abb C baa C bab C bba C bbb

Every term in the last expression has three components containing only a and b (e.g. aba). We
also know some of these terms are  going to group together; e.g. aba D baa D baa, as they are
all equal a2 b. Now, there are 32 ways to write a sequence of length three, with only a and b,

that has precisely two a’s in it. Thus, the coefficient of a2 b is 32 D 3. Refer to Section 2.26 for
a discussion on the notation nc .
Generalization allows us to write the following binomial theorem:
! !
Xn
n n nŠ n.n 1/    .n k C 1/
.a C b/n D an k b k ; D D (2.27.2)
k k .n k/ŠkŠ kŠ
kD0

As an example, what is the coefficient of a6 b in .a C b/7 ? We have: n D 7 and k D 1, thus the


coefficient is 71 D 7Š=6Š1Š D 7. You can check the result using the Pascal triangle.

Question 8. What if the exponent n is not a positive integer? How about .a C b/1=2 or
.a C b/ 3=2 ? To these cases, we have to wait for Newton’s discovery of the so-called gener-
alized binomial theorem, see Section 4.15.1.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 190

Question 9. If we have the binomial theorem for .a C b/n , how about .a C b C c/n ? The third
power of the trinomial a C b C c is given by .a C b C c/3 D a3 C b 3 C c 3 C 3a2 b C 3a2 c C
3b 2 a C 3b 2 c C 3c 2 a C 3c 2 b C 6abc. Is it possible to have a formula for the coefficients of the
terms in .a C b C c/3 ? And how about .x1 C x2 C    C xm /n ‹

2.27.2 Sum of powers of integers and Bernoulli numbers


Now I present a surprising result involving the binomial coefficients. Recall in Section 2.6 that
we have computed the sums of powers of integers (e.g. 13 C 23 C    C n3 ). We’ve considered the
sums of powers of one, two and three only. But back in the old days, the German mathematician
Johann Faulhaber (1580- 1635) did that for powers up to 23. Using that result, Jakob Bernoulli
in 1713, and the Japanese mathematician Seki Takakazu (1642-1708), in 1712 independently
found a pattern and discovered a general formula for the sum. With n; m 2 N and m  1, let

X
n 1
Sm .n/ WD km
kD1

Then, we haveŽ !
1 X
m
k mC1
Sm .n/ D . 1/ Bk nmC1 k
(2.27.3)
mC1 k
kD0

where Bk are now called the Bernoulli numbers. Why not Takakazu numbers, or better Bernoulli-
Takakazu numbers? Because history is not what happened, but merely what has been recorded,
and most of what has been recorded in English has a distinctly Western bent. This is particularly
true in the field of mathematical history. The Bernoulli numbers Bk are

1 1 1
B0 D 1; B1 D ; B2 D ; B3 D 0; B4 D ; B5 D 0; : : :
2 6 30
What are the significance of these mysterious numbers? It turns out that, as is often the case
in mathematics, the Bernoulli-Takakazu numbers appear in various fields in mathematics, see
Section 4.17 for more detail.

2.27.3 Binomial theorem: a proof


n

Knowing the binomial theorem, the Pascal triangle is now written using the k
notation ŽŽ :
Ž
This is how Jacob Bernoulli derived this. He wrote something similar to Eq. (2.6.13) for m D 1; 2; 3; : : : ; 10.
Then, he looked at the coefficients of nmC1 ; nm ; : : : carefully. A pattern emerged, in connection to Pascal’s triangle,
and he guessed correctly Eq. (2.27.3) believing in the pattern. Thus, he did not prove this formula. It was later
proved by Euler.
ŽŽ
This is typeset using the package tikz.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 191

0

0

1
 
1
0 1

2 2
 
2
0 1 2

3 3
 
3

3
0 1 2 3

4
 
4 4
 
4 4

0 1 2 3 4

From it we can observe the following identities


! ! ! ! ! !
2 1 1 3 2 2
D C ; D C ;:::
1 0 1 1 0 1

which are generally written as


! ! !
nC1 n n
D C for 0  k < n (2.27.4)
kC1 kC1 k

This identity–known as Pascal’s rule or Pascal’s identity–can be proved algebraically. But that is
just an exercise about manipulating factorials. We need a combinatorial proof so that we better
understand the meaning of the identity.
The left hand side (the red term) in Pascal’s identity is the number of .k C 1/-element subsets
taken from a set of n C 1 elements. Now what we want to prove is that the left hand side is
also the number of such subsets. Fig. 2.41 shows the proof for the case of n D 3 and k D 1.
I provided only a proof for a special case whereas all textbooks present a general proof. This
results in an impression that mathematicians only do hard things. Not at all. In their unpublished
notes, they usually had proofs for simple cases!
With this identity, Eq. (2.27.4), we can finally prove the binomial theorem; that is the theorem
is correct for any n 2 N. The technique we use (actually Pascal did it first) is proof by induction.
Observe that the theorem is correct for n D 1. Now, we assume that it is correct for n D k, that
is
! ! !
X k
k k k
.a C b/k D ak j b j D ak C ak 1 b C    C ab k 1 C b k (2.27.5)
j D0
j 1 k 1

And our aim is to prove that it is also valid for n D k C 1, that is:
! ! !
X
kC1
k C 1 kC1 k C 1 k C 1
.a C b/kC1 D a j j
b D akC1 C ak b C    C ab k C b kC1
j D0
j 1 k
(2.27.6)

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 192

3
|S1 | = 2

AB AC
A B AB AC AX AX 3
|S2 | =
C X BC BX BC 1
BX
CX
CX
4
S : |S| = 2
|S| = |S1 | + |S2 |

Figure
 2.41: Proof of Pascal’s identity for the case of n D 3 and k D 1. The red term in Eq. (2.27.4)
is 42 , which is the cardinality of S – a set that contains all subsets of two elements taken from the set
fA; B; C; X g. We can  divide S into two subsets: S1 is the one without X and S2 is the one with X.
Obviously jS1 j D 32 .

To this end, we compute .a C b/kC1 as .a C b/k .a C b/ and use Eq. (2.27.5):

.a C b/kC1 D .a C b/k .a C b/
" ! ! #
k k
D ak C ak 1 b C    C ab k 1 C b k .a C b/
1 k 1
! ! !
k k k
D akC1 C ak b C ak b C ak 1 b 2 C    C ab k C ab k C b kC1
1 1 k 1

And the rest is some manipulations,


" ! ! # " ! ! #
k k k k
.a C b/kC1 D akC1 C ak b C ak b C    C ab k C ab k C b kC1
0 1 k 1 k
" ! !# " ! !#
k k k k
D akC1 C C ak b C    C C ab k C b kC1
0 1 k 1 k
! !
k C 1 k C 1
D akC1 C ak b C    C ab k C b kC1 (using Eq. (2.27.4))
1 k

This is exactly Eq. (2.27.6), which is what we wanted to prove.

2.28 Compounding interest


By the eighteenth
p pcentury, the only known irrational numbers were those expressed using nth
roots (e.g. 5 or 3 7). The number e–the subject of this section–was the next number proved to
be irrational. And it appears in a suprising place of compound interest.
No other aspect of life has a more mundane character than the quest for financial security.
And central to any consideration of money is the concept of interest. The practice of charging a

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 193

fee for borrowing money goes back to antiquity. For example, a clay tablet from Mesopotamia
dated to about 100 B.C. poses the following problem: how long will it take for a sum of money
to double if invested at 20% rate compounded annually?
Imagine that you’ve deposited $1 000 in a savings account at a bank that pays an incredibly
generous interest rate of 100 percent, compounded annually. A year later, your account would be
worth $2 000 — the initial deposit of $1 000 plus the 100 percent interest on it, equal to another
$1 000.
But is it the best that we can get? What if we get interest every month? The interest rate is
now of course 1=12. Therefore, our money in the bank after the first and second month is:

 
1 1
1st month: 1000 C  1000 D 1 C  1000
12 12
   
1 1
2nd month: 1C  1C  1000
12 12
And the amount of money after 12 months is:
       
1 1 1 1 12
1C  1C    1 C 1000 D 1 C  1000 D 2613:03529
12 12 12 12
„ ƒ‚ …
12 times

which is $2 613 and better than the annual compounding. Let’s be more greedy and try with daily,
hourly and minutely compounding. It is a good habit to ask questions ‘what if’ and work hard
investigating these questions. It led to new maths in the past! The corresponding calculations
are given in Table 2.20.

Table 2.20: Amounts of money received with yearly, monthly, daily, hourly and minutely compounding.

Formula Result

yearly .1 C 1/  1000 2000


monthly .1 C 1=12/12  1000 2613.035290224676
daily .1 C 1=365/365  1000 2714.567482021973
hourly .1 C 1=365=24/36524  1000 2718.1266916179075
minutely .1 C 1=365=24=60/3652460  1000 2718.2792426663555

From this table we can see that the amount of money increases from $2 000 and settles
at $2 718,279 242 6. What I have presented was done by Jacob Bernoulli in 1683. But he did
not introduce a notation for this number and did not recognize the connection of the number
2:718279 with logarithm. It was Euler in 1731 who introduced the symbol e ŽŽ to represent the
ŽŽ
Was Euler selfish in selecting e for this number? Probably not. Note that it was Euler who adopted  in 1737
and i 2 D 1.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 194

rate of continuous compounding:


 
1 n
e WD lim 1 C (2.28.1)
n!1 n

The fascinating thing about e is that the more often the interest is compounded, the less your
money grows during each period (compare 1 C 1 versus .1 C 1=12/ for example). Yet it still
amounts to something significant after a year, for it is multiplied over so many periods.
In mathematics, there are three most famous irrational numbers and e is one of them. They
are ,  and e. We have met two of them. We will introduce  in Chapter 3 when we talk about
geometry.
We have now a definition of e as a limit, but it is not really helpful. What we need is a
formula and Section 2.28.1 presents exactly such formula. Next, Section 2.28.2 proves that e
is irrational. And finally, a surprising connection between the Pascal triangle and e is given in
Section 2.28.3.

2.28.1 How to compute e?


How we compute e? Looking at its definition, we can think of using the binomial theorem in
Eq. (2.27.2) with a D 1 and b D 1=n. We compute e as follows
   k
1 n X
n
nŠ 1
1C D 1
n kŠ.n k/Š n
kD0
nŠ nŠ nŠ (2.28.2)
D1C C C C 
.n 1/Šn 2Š.n 2/Šn2 3Š.n 3/Šn3
   
1 1 1 3 2
D1C1C 1 C 1 C C 
2Š n 3Š n n2
Thus, taking the limit (i.e., when n is very large), we get
 
1 n 1 1 1
e WD lim 1 C D 1 C C C C    D 2:718281828459045 (2.28.3)
n!1 n 1Š 2Š 3Š
because when n ! 1 all the red terms approach one for the terms involving n approach zero.
See also for a calculus-based discussion on the fascinating number e in Section 4.15.5.

How many terms needed to get e? Yes, we have a formula, Eq. (2.28.3), to compute e. One
question remains: how many terms should we use? Let’s assume that we use only four terms, so
1 1 1
e 1C C C D 2:6666666666666665
1Š 2Š 3Š
The error of this approximation is of course the terms we have ignored:
1 1 1
error D C C C 
4Š 5Š 6Š
Phu Nguyen, Monash University © Draft version
Chapter 2. Algebra 195

The task now is to understand this error, because if the error is small then our approximation is
good. Surprisingly a bit of massage to it is useful:
 
1 1 1 1
error D C C C 
3Š 4 5  4 6  5  4
 
1 1 1 1
< C C C 
3Š 2 2  2 2  2  2
 
1 1 1 1 1
< C C C  D
3Š 2 4 8 3Š
(Noting that the sum in the bracket is a geometric series being one.) Thus, if we use n C 1 terms
to compute e, the error is smaller than 1=nŠ As n ! 1, the error is approaching zero, and we
get a very good approximation for e. With this, you can figure out how many terms needed to
get one million digits of e.

2.28.2 Irrationality of e
p
Similar to Euclid’s proof of the irrationality of 2, we use a proof of contraction here. We
assume that e is a rational number and this will lead us to a nonsense conclusion. The plan
seems easy, but carrying it out is different. We start with Eq. (2.28.3):
1 1 1 a
1C C C C  D
1Š 2Š 3Š b
where a; b 2 N.
The trick is to make b appear in the LHS of this equation:
   
1 1 1 1 1 a
1 C C C  C C C C  D (2.28.4)
1Š 2Š bŠ .b C 1/Š .b C 2/Š b
We can simplify the two red and blue terms. For the red term, using the fact that bŠ D b.b
1/.b 2/    2Š, we can show that the red term is of this form c=bŠ where c 2 N.
For the second term, we need to massage it a bit:
1 1 1 1
C C  D C C 
.b C 1/Š .b C 2/Š .b C 1/Š .b C 2/.b C 1/Š
1 1
D C C 
.b C 1/bŠ .b C 2/.b C 1/bŠ
 
1 1 1
D C C 
bŠ .b C 1/ .b C 2/.b C 1/
Denote by x the blue term, we are going to show that 0 < x < 1=b. In other words, x is a real
number. Indeed,
 
1 1 1 1 1 1 1
x< C C  D 1C C C  D
b C 1 .b C 1/ 2 .b C 1/ 3 bC1 b C 1 .b C 1/ 2 b

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 196

where we used the formula for the geometric series in the bracket.
Now Eq. (2.28.4) becomes as simple as:
a c 1
D C x
b bŠ bŠ
Multiplying this equation with bŠ to get rid of it, we have:
a.b 1/Š D c C x
And this is equivalent to saying an integer is equal to the sum of another integer and a real
number, which is nonsense!
Question 10. If  
1 n
e D lim 1 C
n!1 n
Then, what is 

1 n
lim 1 D‹
n!1 n
Try to guess the result, and check it using a computer.

2.28.3 Pascal triangle and e number


Let’s recall the binomial theorem:
! !
X
n
n n nŠ
.a C b/n D an k k
b ; D
k k .n k/ŠkŠ
kD0

For a given n if we compute the product of all the binomial coefficients in that row, denoted by
sn , something interesting will emerge. We define sn asŽ :
!
Y n
n
sn D (2.28.5)
k
kD0

The first few sn are shown in Fig. 2.42. The sequence .sn / grows bigger and bigger. How about
the ratio sn =sn 1 ?
In Table 2.21 we compute sn manually for n D 1; 2; 3; 4; 5; 6 and automatically (using
a Julia script) for n D 89; 90; 91; 899; 900; 901. The ratios rn D sn =sn 1 also grows un-
boundedly, but the ratio of rn converges to a value of 2.71677, which is close to e.
So, we suspect that the following is true
rn sn =sn 1 snC1 =sn
lim D e; lim D e; lim De (2.28.6)
n!1 rn 1 n!1 sn 1 =sn 2 n!1 sn =sn 1

Note that when n is very big, n and n 1 are pretty the same. That is why in the above equation,
we have different expressionsl they are equivalent.
Ž
If, instead of a product we consider the sum of all the coefficients in the nth row we shall get 2n . Check
Fig. 2.42, row 3: 1 C 3 C 3 C 1 D 8 D 23 .

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 197

1 1
1 1 1
1 2 1 2
1 3 3 1 9
1 4 6 4 1 96
1 5 10 10 5 1 2500
1 6 15 20 15 6 1 162000
1 7 21 35 35 21 7 1 26471025
Qn n
Figure 2.42: Pascal triangle and the product of all the terms in a row sn D kD0 k for n D 0; 1; : : : ; 7.
Qn n
Table 2.21: sn D kD0 k , see Listing A.5 for the code.

n sn rn D sn =sn 1 rn =rn 1

1 1 1 1
2 2 2 2
3 9 4.5 2.25
4 96 10.67 2.37
5 2500 26.042 2.44
6 162000 64.8 2.49
:: :: :: ::
: : : :
89 2.46e+1711
90 1.77e+1673 5.13e+37
91 2.46e+1711 1.39e+38 2.70
:: :: :: ::
: : : :
899 2.22e+174201
900 2.17e+174590 9.74e+388
901 5.74e+174979 2.65e+389 2.71677

Proof. Herein we prove that Eq. (2.28.6) is true. First, we compute sn :


!
Y
n
n Y
n
nŠ Y
n
1
nC1
sn D D D .nŠ/ (2.28.7)
k .n k/ŠkŠ .kŠ/2
kD0 kD0 kD0

To see the last equality, one can work out directly for a particular case. For n D 3, we have

Y
3
nŠ 3Š 3Š 3Š 3Š Y 1 3
s3 D D    D .3Š/4
.n k/ŠkŠ 3Š0Š 2Š1Š 1Š2Š 0Š3Š .kŠ/2
kD0 kD0

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 198

We can write snC1 and the ratio snC1=sn as

Y
nC1
1 snC1 ..n C 1/Š/n .n C 1/n
nC2
snC1 D ..n C 1/Š/ ) D D (2.28.8)
.kŠ/2 sn .nŠ/nC1 nŠ
kD0

Therefore, the other ratio sn=sn 1 can be given by

sn .n/n 1
D (2.28.9)
sn 1 .n 1/Š
Finally,  
snC1 =sn .n C 1/n .n 1/Š 1 n
D D 1C (2.28.10)
sn =sn 1 nŠ nn 1 n
 n
Given that limn!1 1 C n1 D e, the result follows.


2.29 Polynomials
A polynomial is an expression consisting of variables (also called indeterminates) and coeffi-
cients, that involves only the operations of addition, subtraction, multiplication, and non-negative
integer exponentiation of variables. An example of a polynomial of a single variable x is
x 2 x C2. An example in three variables is x 2 C2xy 3 z 2 yz C4. The expression 1=x Cx 2 C3
is not a polynomial due to the term 1=x D x 1 (exponent is 1, contrary to the definition).
Polynomials appear in many areas of mathematics and science. For example, they are used
to form polynomial equations, they are used in calculus and numerical analysis to approximate
other functions. For example, we have Taylor series and Lagrange polynomials to be discussed
in Chapters 4 and 12.
A polynomial in a single indeterminate x can always be written in the form

X
n
n n 1 2
Pn .x/ WD an x C an 1 x C    C a2 x C a1 x C a0 D ak x k ; ak 2 R (2.29.1)
kD0

The summation notation enables a compact notation (noting x 0 D 1). Assume that an ¤ 0, then
n is called the degree of the polynomial (which is the largest degree of any term with nonzero
coefficient). Polynomials of small degree have been given specific names. A polynomial of
degree zero is a constant polynomial (or simply a constant); that is, P0 .x/ D a0 . Polynomials of
degree one, two or three are linear polynomials, quadratic polynomials and cubic polynomials,
respectively. For higher degrees, the specific names are not commonly used, although quartic
polynomial (for degree four) and quintic polynomial (for degree five) are sometimes used.
This section is about polynomials in a single indeterminate x. We start with the arithmetic
of polynomials in Section 2.29.1. If we can add numbers, we can add x 2 C 2x and x 3 C 4x 3;
why not? When we divide two polynomials, we care about the remainder, and mathematicians

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 199

dicovered the polynomial reminder theorem while studying these remainders (Section 2.29.2).
This theorem allows us to factor a polynomial if we know one root of it; for example x 2 5x C
6 D .x 2/.x 3/. But, how can we guess the root of a polynomial? Descartes gave us a
theorem–the rational root theorem (Section 2.29.3). In Section 2.25.3 we have observed that the
complex roots of z n 1 D 0 come in conjugate pairs. Now, we can prove that (Section 2.29.4).
Section 2.29.5 presents the fundamental theorem of algebra stating that a nth degree polynomial
has exactly n roots.
We then turn to the problem of evaluating a given polynomial at a specific point in a most
efficient way. For example, given P2 .x/ D x 2 C 5x 6, what is P2 .100/? Section 2.29.6 gives
a method discovered by Horner for this task. The section closes with Vieta formula relating the
sum of the roots and their product with the coefficients of the polnomial (Section 2.29.7) and
Girard-Newton indentities (Section 2.29.8).

2.29.1 Arithmetic of polynomials


We can add (subtract), multiply (divide) two polynomials, in the same manner with numbers.
Let’s start with addition/subtraction of two polynomials:

.x 3 C 2x 2 5x 1/ C .x 2 3x 10/ D .x 3 C 2x 2 5x 1/ C .0x 3 C x 2 3x 10/


3 2
D x C .2 C 1/x .5 C 3/x .1 C 10/

Thus, the sum of two polynomials is obtained by adding together the coefficients of corre-
sponding powers of x. Subtraction of polynomials is the same. And from two polynomials to
Pnpolynomials
m is a breeze thanks to Eq.P (2.29.1). To see the power of compact notation, let
n
a
kD0 k Px k
be the first polynomial and kD0 bk x be the second, then the sum of them is
k

obviously nkD0 .ak C bk /x k . It’s nice, isn’t it?


The next thing is the product of two polynomials:

.x 2 3x 10/.2x 1 C 3/ D 2x 3 C 3x 2 6x 2 9x 20x 30 D 2x 3 3x 2 29x 30

which comes from the usual arithmetic rules. What is interesting is that for two polynomials p
and q, the degree of the product pq is the sum of the degree of p and q:

deg.pq/ D deg.p/ C deg.q/

which is nothing but a consequence of the product rule of powers, discussed in Section 2.19.1.
The division of one polynomial by another is not typically a polynomial. (This is expected
from experience with division of integers.) Instead, such ratios are a more general family of ob-
jects, called rational fractions, rational expressions, or rational functions, depending on context.
This is analogous to the fact that the ratio of two integers is a rational number. For example, the
fraction 2=.1 C x 3 / is not a polynomial; it cannot be written as a finite sum of powers of the
variable x; or in other words in the form of Eq. (2.29.1).

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 200

Let’s divide x 2 3x 10 by x C 2 and 2x 2 5x 1 by x 3 using long division:

x 5; 2x C 1
 2
 2
xC2 x 3x 10 x 3 2x 5x 1
x2 2x 2
2x C 6x
5x 10 x 1
5x C 10 xC3
0 2

Thus, x C 2 evenly divides x 2 3x 10 (similarly to 2 divides 6), but x 3 does not evenly
divide 2x 2 5x 1 for the remainder is non-zero. So, we can write

2x 2 5x 1 2
D 2x C 1 C ” 2x 2 5x 1 D .x 3/.2x C 1/ C 2
x 3 x 3
The blue term is called the dividend, the cyan term is called the divisor, and the purple term is
called the quotient. The red term is called the remainder term. And we want to understand it.

2.29.2 The polynomial remainder theorem


If we ‘play’ with polynomial division we could discover one or two theorems. Let’s do it!

1. Divide x 2 by x 1. Record the remainder.

2. Divide x 2 by x 2. Record the remainder.

3. Divide x 2 by x 3. Record the remainder.

If we do all these little exercises (the answers are 1; 4 and 9), we find out that the remainders
of dividing x 2 by x a is nothing but a2 . Something special is hidden. Let’s do a few more
exercises to unravel the mystery:

1. Divide x 2 C x C 1 by x a. Record the remainder.

2. Divide x 2 C 2x C 3 by x a. Record the remainder.

The answers are a2 C a C 1 and a2 C 2a C 3, respectively. What are they? They are exactly the
values obtained by evaluating the function at x D a. In other words, the remainder of dividing
f .x/ D x 2 C x C 1 by x a is simply f .a/. Now, no one can stop us from stating the following
’theorem’: if P .x/ is a polynomial, then the remainder of dividing P .x/ by x a is P .a/. Of
course mathematicians demand a proof, but I leave it as a small exercise. After a proof has been
provided, this statement became the remainder theorem.
Now we can understand why if the equation x 3 6x 2 C 11x 6 D 0 has x D 1 as one
solution, we can always factor the equation as .x 1/.   / D 0. This is due to the remainder
theorem that f .x/ D x 3 6x 2 C 11x 6 D .x 1/.   / C f .1/. But f .1/ D 0 as 1 is one
solution of f .x/ D 0, so f .x/ D .x 1/.   /. To find other solutions of this equation, we

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 201

need to find out .   /, which we denote by g.x/. This can be, of course, done via long division.
But I really do not like that process. We can do another way using the method of undetermined
coefficients. First, g.x/ must be of the form x 2 C ax C b with a; b being real numbers to be
determined. You should understand why g.x/ is a quadratic polynomial. Now, we factor f .x/:
x3 6x 2 C 11x 6 D .x 1/.x 2 C ax C b/ D x 3 C .a 1/x 2 C .b a/x b
It is quite straightforward to determine a and b: well, we have two expressions for the same
f .x/, thus the coefficients in the twos must be the same:
x3 6x 2 C 11x 6 D x 3 C .a 1/x 2 C .b a/x b”a 1D 6; b a D 11; b D 6
Solving these three equations, we get a D 5 and b D 6. So, we can finally factor f .x/:
(
x 1D0
x3 6x 2 C 11x 6 D 0 ” .x 1/.x 2 5x C 6/ D 0 ” 2
x 5x C 6 D 0

What we have actually achieved with the help of the re- y


mainder theorem? We have transformed a hard cubic equa- x
1 2 3
tion into two easier equations: one linear and one quadratic
equation! Of course this method works only if we can find
out one solution. So, try to find one solution if you have to solve a hard (polynomial) equation.
One way is to plot the graph of the polynomial (see figure). But there is an important theme here:
it is a powerful idea to convert a hard problem into many smaller easier problems. That is the
lesson we need to learn here. If the purpose is to find the solutions to x 3 6x 2 C 11x 6 D 0,
we can just use a tool to do that (see figure).

While x 3 6x 2 C11x 6 D 0 can be factored as .x 1/.x 2 5x C6/, it is impossible


Remark to do so for x 2 C 1 if we confine ourselves to real numbers. What would you do with
this observation? Did you think of the concept of irreducibility of polynomials?

2.29.3 Guessing roots of a polynomial the smart way


Back in the time of Descartes there was no calculator to make graphs. Descartes needed a way
to guess easy roots of polynomial equations. By easy I meant rational roots of the form x D p=q
with p; q 2 Z. And he discovered what we know call the rational root theorem (or test). Suppose
that we have to solve a quadratic equation with integer coefficients (e.g. x 2 5x C 7 D 0):
ax 2 C bx C c D 0; a; b; c 2 Z
And we search for rational roots of the form x D p=q , where of course q ¤ 0 and p; q are
coprimes (i.e., the greatest common divisor of them is one or gcd.p; q/ D 1). Plugging x into
the equation we get a.p=q/2 Cb.p=q/Cc D 0. Descartes then analyzed this equation to deduce
something special about a; b; c and p; q. First, we rewrite this equation as
ap 2 C bpq D cq 2 ” p.ap C bq/ D cq 2

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 202

This equation means that pj cq 2 : p divides cq 2 . From elementary number theory we know
that this leads to pjc. Next, we are doing something similar but for q, we rewrite the above
equation as
ap 2 D cq 2 bpq ” ap 2 D q.cq C bp/

This equation means that qjap 2 : q divides ap 2 . From elementary number theory we know that
this leads to qja. It is straightforward to generalize this to an x n C an 1 x n 1 C    C a0 D 0:
the rational root x D p=q of this equation, if any, must have pja0 and qjan .
The following example demonstrates how to use this test.

Example 2.3
Assume we want to guess the roots of the following quadratic equation with integer coeffi-
cients:
3x 2 4x C 1 D 0
If the root is of the form x D p=q , then pj1 and qj3. We now list all the possibilities:

p D f 1; 1g; q D f 1; 1; 3; 3g

From that the potential roots should be: x D f˙1; ˙1=3g. Checking these potential roots, we
see that 1=3 is the root. Then, using the factor theorem, we can find the second root.

2.29.4 Complex roots of z n 1 D 0 come in conjugate pairs


In Section 2.25.3 we have observed that the complex roots of z n 1 D 0 come in conjugate
pairs. Now, we can prove that. Suppose that

p.x/ D a0 C a1 x C    C an x n ; ai 2 R

is a polynomial of real coefficients. Let ˛ be a complex root (or zero) of p i.e., p.˛/ D 0. We
need to prove that the complex conjugate of ˛ i.e., ˛ is also a root. That is, p.˛/ D 0. The
starting point is, of course, p.˛/ D 0. So we write p.˛/

p.˛/ D a0 C a1 ˛ C    C an ˛ n

From that, we compute p.˛/:

p.˛/ D a0 C a1 ˛ C    C an ˛ n
D a0 C a1 ˛ C    C an ˛ n .a D a if a is real/
D a0 C a1 ˛ C    C an ˛ n N aCb DaCb
(ab D aN b),
D p.˛/ D 0 D 0

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 203

2.29.5 The fundamental theorem of algebra


The fundamental theorem of algebra is about the roots of an nth degree polynomial equation. It
is easy to state, plausible and hard to prove. We can guess this theorem reasoning like this. The
linear equation ax Cb D 0 has one root (i.e., x D b=a), the quadratic equation ax 2 Cbx Cc D
0 has two roots (can be complex), the cubic equation has three roots. Thus, it is reasonable for
Descartes to have stated, in his La Geometrie (1637), that “Every equation can have as many
distinct roots as the number of dimensions of the unknown quantity in the equation. Even
earlier, his fellow countryman Albert Girard (1590–1632) wrote, in his L’Invention Nouvelle en
L’Algebra (1629) that “Every equation of algebra has as many solutions as the exponent of the
highest term indicates".
What the theorem says is that every nth degree polynomial equation of the form
an z n C an 1 z n 1
C    C a1 z C a0 D 0; a1 ; a2 ; : : : 2 C; n2N
always has exactly n roots in the complex number system. An important detail is that if there
are identical roots of multiplicity m then they count as m roots, not just one.
Many famous mathematicians tried to prove this theorem, including d’Alembert, Euler, La-
grange. All failed. The first rigorous proof was published by Argand in 1806. In this book, we
take this theorem on faith.
This theorem is useful in many ways. One example is: we know that a polynomial of real
coefficients, if it has complex roots, they come in pairs. So, the three roots to a cubic polynomial
either must all be real, or there must be one real root and one conjugate pair. There cannot be
three complex roots because then there would have to be one complex root without a conjugate
mate.
Question 11. Given a real number a, how many cube roots of a are there? Write down the
expression of these roots.

2.29.6 Polynomial evaluation and Horner’s method


Given a n-order polynomial Pn .x/ D an x n Can 1 x n 1 C  Ca2 x 2 Ca1 x Ca0 , the polynomial
evaluation is to compute Pn .x0 / for any given x0 . You might be thinking, why we bother with
this. We have the formula for Pn .x/, just plug x0 into it and we’re done. Yes, you’re almost
right, unless that this naive method is slow. We are moving to the land of applied mathematics
where we must be pragmatic and speed is very important. Note that usually we need to compute
a polynomial many many times. For example, to plot a polynomial we need to evaluate it at
many points (because the more points we use the smoother the graph of the function is).
Let’s consider a specific cubic polynomial f .x/ D 2x 3 6x 2 C 2x C 1. The first "naive"
solution for f .x0 / is
f .x0 / D 2  x0  x0  x0 6  x0  x0 C 2  x0 C1
„ ƒ‚ … „ ƒ‚ … „ƒ‚…
3 multiplications 2 multiplications 1 multiplications

which involves 6 multiplications and 3 additions. How about the general Pn .x0 /? To count the
multiplication/addition, we need to write down the algorithm, Algorithm 1 is such one. Roughly,

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 204

an algorithm is a set of steps what we follow to complete a task. There are more to say about algo-
rithm in Section 2.34. Having this algorithm in hand, it is easy to covert it to a computer program.

Algorithm 1 Polynomial evaluation algorithm.


1: Compute P D a0
2: for k D 1 W n do
3: P D P C ak x0k
4: end for

From that algorithm we can count:

addition: n
multiplication: 1 C 2 C    C n D n.n C 1/=2

Can we do better? The British mathematician William George Horner (1786 – 1837) de-
veloped a better method today known as Horner’s method. But he attributed to Joseph-Louis
Lagrange, and the method can be traced back many hundreds of years to Chinese and Persian
mathematicians.
In Horner’s method, we massage f .x0 / a bit as:

f .x0 / D 2x03 6x02 C 2x0 C 1


D x0 Œ2x02 6x0 C 2 C 1
D x0 Œx0 .2x0 6/ C 2 C 1

which requires only 3 multiplications! For Pn .x0 / Horner’s method needs just n multiplications.
To implement Horner’s method, a new sequence of constants is defined recursively as follows:

b3 D a3 b3 D 2
b2 D x0 b3 C a2 b2 D 2x0 6
b 1 D x 0 b 2 C a1 b1 D x0 .2x0 6/ C 2
b 0 D x 0 b 1 C a0 b0 D x0 .x0 .2x0 6/ C 2/ C 1

where the left column is for a general cubic polynomial whereas the right column is for the
specific f .x/ D 2x 3 6x 2 C 2x C 1. Then, f .x0 / D b0 . As to finding the consecutive b-values,
we start with determining bn , which is simply equal to an . We then work our way down to the
other b’s, using the recursive formula: bn 1 D an 1 C bn x0 (this can be obtained by looking
carefully at the left column in the above equation), until we arrive at b0 .
A by-product of Horner’s method is that we can also find the division of f .x/ by x x0 :

f .x/ D .x x0 /Q.x/ C b0 ; Q.x/ D b3 x 2 C b2 x C b1 (2.29.2)

And that allows us to evaluate the derivative of f at x0 , denoted by f 0 .x0 /, as

f 0 .x/ D Q.x/ C .x x0 /Q0 .x/; f 0 .x0 / D Q.x0 /

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 205

One application is to find all solutions of Pn .x/ D 0. We use Horner’s method together with
Newton’s method (Section 4.5.4). A good exercise to practice coding is to code a small program
to solve Pn .x/ D 0. The input is Pn .x/ and press a button we shall get all the solutions, nearly
instantly!
Yes, Horner’s method is faster than the naive method. But mathematicians are still
not satisfied with this. Why? Because there might be another (hidden) method that
Remark can be better than Horner’s. Imagine that if they can prove that Horner’s method is
the best then they can stop searching for better ones. And they proved that it is the
case. Details are beyond my capacity.

2.29.7 Vieta’s formula


We all know the solutions to the quadratic equation ax 2 C bx C c D 0; they are:
p
b ˙ b 2 4ac
x1;2 D
2a
What we get if we multiply these roots and sum them?
p ! p !
b b C b 2 4ac b b2 4ac c
x1 C x2 D ; x1 x2 D D (2.29.3)
a 2a 2a a

That is remarkable given that the expression for the roots is quite messy: their sum and product
are, however, very simple functions of the coefficients of the quadratic equation. And this is
known as Vieta’s formula discovered by Viète. Not many of high school students (including
the author) after knowing the well known quadratic formula asked this question to discover for
themselves this formula.
How useful Eq. (2.29.3) is? It can be used to determine the sign of the roots x1;2 without
solving the equation. If a and c are of different signs, then ac < 0, which results in x1 x2 < 0:
therefore, we have one negative root and one positive root.
Did you notice something special about Eq. (2.29.3)? Note that x1 C x2 and x1 x2
will not change if we switch the roots; i.e., x2 C x1 is exactly x1 C x2 . Is this a
Remark
coincidence? Of course not. The quadratic equation does not care how we label its
roots.
After this, another question should be asked: Do we have the same formula for cubic equa-
tions, or for any polynomial equations? Before answering that question, we need to find a better
way to come up with Vieta’s formula. Because the formula of the roots of a cubic equation is
very messy. And we really do not want to even add them not alone multiply them. As x1 and
x2 are the roots of the quadratic equation, we can write that equation in this form (thanks to the
discussion in Section 2.29.2)

.x x1 /.x x2 / D 0 ” x 2 .x1 C x2 /x C x1 x2 D 0

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 206

And this must be equivalent to x 2 C .b=a/x C c=a D 0, thus we have x1 C x2 D b=a


and x1 x2 D c=a–the same result as in Eq. (2.29.3). This method is nice because we do not
need to know the expressions of the roots. With this success, we can attack the cubic equation
x 3 C .b=a/x 2 C .c=a/x C d=a D 0. Let’s denote by x1 ; x2 ; x3 its roots, then we write that cubic
equation in the following form

.x x1 /.x x2 /.x x3 / D 0 ” x 3 .x1 Cx2 Cx3 /x 2 C.x1 x2 Cx2 x3 Cx3 x1 /x x1 x2 x3 D 0

And thus comesVieta’s formula for the cubic equation:

b c d
x1 C x2 C x3 D ; x1 x2 C x2 x3 C x3 x1 D ; x1 x2 x3 D
a a a
Summarizing these results for quadratic and cubic equations, we write (to see the pattern)
a1 a0
a2 x 2 C a1 x C a0 D 0 W x1 C x2 D ; x 1 x2 DC
a2 a2
a2 a0
a3 x 3 C a2 x 2 C a1 x C a0 D 0 W x1 C x2 C x3 D ; x 1 x2 x3 D
a3 a3

In the above equation, we see something new: .x1 Cx2 ; x1 Cx2 Cx3 /, .x1 x2 ; x1 x2 Cx2 x3 Cx3 x1 /
and .x1 x2 ; x1 x2 x3 /. If we consider a fourth order polynomial we would see x1 C x2 C x3 C x4 ,
x1 x2 C x2 x3 C x3 x4 C x2 x3 C x2 x4 C x3 x4 , x1 x2 x3 C x1 x2 x4 C x1 x3 x4 C x2 x3 x4 and
x1 x2 x3 x4 . As can be seen, these terms are all sums. Moreover, they are symmetric sums (e.g.
the sum x1 C x2 C x3 is equal to x2 C x1 C x3 ). Now, mathematicians want to define these sums–
which they call elementary symmetric sums–precisely. And this is their definition of elementary
symmetric sums of a set of n numbers.

Definition 2.29.1
The k-th elementary symmetric sum (or polynomial) of a set of n numbers is the sum of all
products of k of those numbers (1  k  n). For example, if n D 4, and our set of numbers
is fa; b; c; d g, then:

1st symmetric sum D S1 DaCbCcCd


2nd symmetric sum D S2 D ab C ac C ad C bc C bd C cd
(2.29.4)
3rd symmetric sum D S3 D abc C abd C acd C bcd
4th symmetric sum D S4 D abcd

With this new definition, we can write the general Vieta’s formula. For a nth order polynomial
equation
an x n C an 1 x n 1 C    C a2 x 2 C a1 x C a0 D 0
we have
an j
Sj D . 1/j ; 1j n
an

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 207

where Sj is the j -th elementary symmetric sum of a set of n roots. With a proper tool, we
can have a compact Vieta’s formula that encapsulates all symmetric sums of the roots of any
polynomial equation!
If we do not know Vieta’s formula, then finding the complex roots of the following system
of equations:
xCyCz D2
xy C yz C zx D 4
xyz D 8

would be hard. But it is nothing than this problem: ‘solving this cubic equation t 3 2t 2 C4t 8 D
0’!

Some problems using Vieta’s formula.

1. Let x1 ; x2 be roots of the equation x 2 C 3x C 1 D 0, compute


 2  2
x1 x2
SD C
x2 C 1 x1 C 1

2. If the quartic equation x 4 C 3x 3 C 11x 2 C 9x C A has roots k; l; m and n such that


kl D mn, find A.

For the first problem the idea is to use Vieta’s formula that reads x1 C x2 D 3 and
x1 x2 D 1. To use x1 C x2 and x1 x2 we have to massage S so that these terms show up.
For example, for the term x1=x2 C1, we do (noting that x22 C 3x2 C 1 D 0, thus x22 C x2 D
1 2x2 )
x1 x1 x2 x1 x2 1
D 2 D D
x2 C 1 x2 C x2 1 2x2 1 2x2
Do we need to do the same for the second term? No, we have it immediately once we had
the above:
x2 1
D
x1 C 1 1 2x1
Now, the problem is easier as S can be written in terms of x1 C x2 and x1 x2 :

1 1 .1 C 2x1 /2 C .1 C 2x2 /2
SD C D D    D 18
.1 C 2x1 /2 .1 C 2x2 /2 Œ.1 C 2x1 /.1 C 2x2 /2

2.29.8 Girard-Newton’s identities


For a quadratic equation ax 2 C bx C c D 0 Viète gave us x1 C x2 D b=a and x1 x2 D c=a.
From these two, we can compute other things concerning the roots x1 ; x2 . For example, we can

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 208

compute x12 C x22 as follows§


.x1 C x2 /2 D x12 C x22 C 2x1 x2 H) x12 C x22 D .x1 C x2 /2 2x1 x2 (2.29.5)
What is special about x12 C x22 , why not x12 C x23 ? The term x12 C x22 is a special case of the
so-called symmetric polynomials–those polynomials that are invariant when we swap x1 ; x2 .
For example, x1 C x2 C 2x1 x2 is a symmetric polynomial. But x12 C x23 is not a symmetric
polynomial. Eq. (2.29.5) shows that it is possible to express a symmetric polynomial in terms of
x1 C x2 and x1 x2 –the elementary symmetric polynomials. Now, you understand the adjective
‘elementary’: the elementary symmetric polynomials are the “building blocks” for all symmetric
polynomials.Ž
It’s interesting, but can we do the same for x13 C x23 (what else?)? Let’s do that now. We start
with .x1 C x2 /3 D x13 C x23 C 3x1 x2 .x1 C x2 /, then x13 C x23 is
x13 C x23 D .x1 C x2 /3 3x1 x2 .x1 C x2 /
D .x1 C x2 /Œ.x1 C x2 /2 3x1 x2  (2.29.6)
D .x1 C x2 /Œx12 C x22 x1 x2 
We could have stopped at the second equality where we succeeded in expressing the symmetric
polynomial x13 C x23 in terms of the elementary symmetric sums. However, we went one step
further: we wanted to write x13 C x23 in terms of x12 C x22 .
Now, we put x12 C x22 and x13 C x23 together to see if there exists a pattern or not:
b 1 c
x12 C x22 D .x1 C x2 /.x11 C x21 / 2x1 x2 D .x1 C x21 / 2
a a
b 2 c 1
x13 C x23 D .x1 C x2 /Œx12 C x22 .x1 C x22 /
x1 x2  D .x C x21 /
a a 1
We’re seeing a pattern here (pay attention to the exponents highlighted in red), except the ugly
number 2. But, 2 D 1 C 1 D x10 C x20 . Voilà, we have now a nice formula, if we denote Pi D
x1i C x2i –the sum of the ith powers of the roots (for example, if i D 2, we have P2 D x12 C x22 ):
b c b
Pi D Pi 1 Pi 2; i D 0; 1; 2; : : : ; P0 D 2; P1 D (2.29.7)
a a a
If this relation is correct, then we would have
b c b 3 c 2
P4 D x14 C x24 D P3 P2 D .x1 C x23 / .x C x22 /
a a a a 1
And the above is indeed correct! So, we think that Eq. (2.29.7) is correct and we will have to
prove it. If you use proof by induction, you will see that it is indeed valid. Why is this formula
powerful? Because it allows us to compute the sum of any powers of the roots (without knowing
the roots themselves!). How we do this? We start with P0 D 2 and P1 D b=a, Eq. (2.29.7)
then gives us P2 , use it again and we get P3 , again to get P4 . You see the point. Can you relate
this to something we’ve met earlier in this chapter?ŽŽ
§
I learned this from brilliant.org.
Ž
This statement stems from the Fundamental Theorem on Symmetric Polynomials.
ŽŽ
The answer is Fibonacci sequence.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 209

Some problems using Girard-Newton’s identities.

1. Given x 2 2x C 6 D 0 with two (complex) roots a; b. Compute a10 C b 10 .

2. Given x 3 3x 2 C 6x 9 D 0 with roots a; b; c. Compute a5 C b 5 C c 5 .

3. If the roots of x 3 C 3x 2 C 4x 8 D 0 are a; b; c, compute a2 .1 C a2 / C b 2 .1 C


b 2 / C c 2 .1 C c 2 /. Notea .
a
Ans: 7552, 108 and 126. If you are wondering what is the formula for the cubic equation, the answer
is simple: derive a counterpart of Eq. (2.29.7) for the cubic equation. If possible, generalize it to polynomial
equation of any degree.

2.30 Modular arithmetic


I was writing this section on Friday and suppose I want to know what day of the week it will
be 100 days from today. This is how we can solve this problem. We know that every 7 days, we
will be back to the same day of the week. So if you start on a Friday, after 7 days it will also be
Friday, and after 14 days it will still be Friday, and so on. The smallest multiple of 7 less than
100 is 98, so 98 days from today, it will be a Friday. Therefore, 100 days from today it will be a
Sunday.
What we are really doing here is computing the remainder when 100 is divided by 7. That
remainder is 2 (because 100 D 7  14 C 2). So the day of the week it will be 100 days from
today is the same as the day of the week it will be 2 days from today. This is the basic concept
of modular arithmetic, the art of computing remainders.
The best way to introduce modular arithmetic is to think of the face of a clock. The numbers
go from 1 to 12 and then the clock "wraps around": when we get to "16 o’clock", it actually
becomes 4 o’clock again. So 16 becomes 4, 14 becomes 2, and so on (Fig. 2.43). This can keep
going, so when we get to "28 o’clock”, we are actually back to where 4 o’clock is on the clock
face (and also where 16 o’clock was too).

12 12
11 1 11 1
23 24 13
10 2 10 2
22 14
9 3 9 21 15 3
20 16
8 4 8 4
19 18 17
7 5 7 5
6 6

Figure 2.43: Telling the time using a clock. Imagine that the afternoon times are laid on top of their
respective morning times: 16 is next to 4, so 16 and 4 are the same or congruent (on the clock).

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 210

So in this clock world, we only care where we are in relation to the numbers 1 to 12. In this
world, 1; 13; 25; 37; : : : are all thought of as the same thing, as are 2; 14; 26; 38; : : : and so on.
What we are saying is "13 D 1Csome multiple of 12", and "26 D 2Csome multiple of 12",
or, alternatively, "the remainder when we divide 13 by 12 is 1" and "the remainder when we
divide 26 by 12 is 2”. The way mathematicians express this is:

13  1 .mod 12/; 26  2 .mod 12/

This is read as "13 is congruent to 1 mod (or modulo) 12" and "26 is congruent to 2 mod 12".
But we don’t have to work only in mod 12. For example, we can work with mod 7, or mod
10 instead. Now we can better understand the cardioid introduced in Chapter 1, re-given below
in Fig. 2.44. Herein, we draw a line from number n to 2n .mod N / because on the circle we
only have N points. For example, 7  2 D 14 which is congruent to 4 modulo 10. That’s why
we drew a line from 7 to 4.

3 2 6 5 4
7 3
4 1 8 2
9 1

5 0 10 0

11 19

6 9 12 18
13 17
7 8 14 15 16

(a) N D 10 (b) N D 20 (c) N D 200

Figure 2.44: A cardioid emerges from the times table of 2.

Should we stop with the times table of 2? No, of course. We play with times table of three,
four and so on. Fig. 2.45a shows the result for the case of eight and 51. How about times table
for a non-integer number like 2:5? Why not? See Fig. 2.45c.
So, modular arithmetic is a system of arithmetic for integers, where numbers "wrap around"
when reaching a certain value, called the modulus. The modern approach to modular arithmetic
was developed by Gauss in his book Disquisitiones Arithmeticae, published in 1801.

Definition 2.30.1
Given an integer n > 1, called a modulus, two integers a and b are said to be congruent modulo
n, if n is a divisor of their difference (that is, if there is an integer k such that a b D k n or
a D b C k n). We then write a  b .mod n/.

It is certain that there exist rules that govern modular arithmetic, and we discuss them in
Section 2.30.1. Next, modular arithmetic is used to solve some number theory problems (Sec-
tion 2.30.2). In Section 2.30.3, my very first attempt to solve a hard problem from a math contest

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 211

(a) N D 200, table of 8 (b) N D 200, table of 51 (c) N D 200, 2.5

Figure 2.45: Interesting things emerge from the times table of 8, 51 and 2:5.

is presented. Finally, we come back to the problem of divisibility of integers, but via modular
arithmetic (Section 2.30.4).

2.30.1 Rules of modular arithmetic


Now that we have a new kind of arithmetic, the next thing is to find the rules it obey. Actually,
some of the rules are simple: if a1  b1 .mod m/ and a2  b2 .mod m/, or if a  b .mod m/
then
(a) addition
a1 ˙ a2  b 1 ˙ b 2 .mod m/
(b) multiplication
a1 a2  b 1 b 2 .mod m/
(c) exponentiation
ap  b p .mod m/; p 2 N
(d) scaling
ak  bk .mod m/; k 2 N
(d) transitivity
if a  b .mod m/ and b  c .mod m/ then a  c .mod m/
(2.30.1)
It is possible to look at the clock on Fig. 2.43 to find examples demonstrating these rules. The
proof of these rules is skipped here (only use the definition of a  b .mod m/); noting that the
exponentiation rule simply follows the multiplication rule. Of course, as often the case, we can
extend the rules to the case of more numbers. For example, assuming that a1  b1 .mod m/,
a2  b2 .mod m/ and a3  b3 .mod m/, then we have

a1 ˙ a2 ˙ a3  b1 ˙ b2 ˙ b3 .mod m/; a1 a2 a3  b1 b2 b3 .mod m/

Note that this list is not meant to be exhaustive.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 212

2.30.2 Solving problems using modular arithmetic


Let’s solve some problems using this new mathematics. The first problem is: what is the last
digit (also called the units digit) of the sum

2403 C 791 C 688 C 4339

Of course, we can solve this by computing the sum, which is 8 221, and from that the answer is
1. But, modular arithmetic provides a more elegant solution in which we do not have to add all
these numbers.
First, we reduce these big numbers into smaller ones, by noting thatŽ

2403  3 .mod 10/; 791  1 .mod 10/; 688  8 .mod 10/; 4339  9 .mod 10/

Then, the addition rule in Eq. (2.30.1) leads to

2403 C 791 C 688 C 4339  3 C 1 C 8 C 9 .mod 10/  1

And the units digit of the sum is one. In this method, we had to only add 3; 1; 8 and 9.
The second problem is: Andy has 44 boxes of soda in his truck. The cans of soda in each
box are packed oddly so that there are 113 cans of soda in each box. Andy plans to pack the
sodas into cases of 12 cans to sell. After making as many complete cases as possible, how many
sodas will he have leftover?
This word problem is mathematically translated as: finding the remainder of the product
44  113–which is the number of soda cans Andy has–when divided by 12. We have

44  8 .mod 12/; 113  5 .mod 12/

Thus,
44  113  8  5 .mod 12/  40 .mod 12/  4

So, the number of sodas left over is four.


In the third problem we shall move from addition to exponentiation. The problem is what
are the tens and units digits of 71942 ? Of course, we find the answers without actually computing
71942 .
Let’s consider a much easier problem: what are the two last digits of 1235 using modular
arithmetic. We know that 1235 D 12  100 C 35, thus 1235  35 .mod 100/. So, we can work
with modulo 100 to find the answer. Now, coming back to the original problem with 71942 , of
which the strategy is to do simple things first: computing the powers of 7§ and looking for the

Ž
Why using modulus 10?
§
Do not forget the original problem is about 71942 .

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 213

pattern:
71 D7 W 71  07 .mod 100/
72 D 49 W 72  49 .mod 100/
73 D 343 W 73  43 .mod 100/
74 D 2401 W 74  01 .mod 100/
75 D 16807 W 75  07 .mod 100/ (2.30.2)
76 D 117649 W 76  49 .mod 100/
77 D : : : 43 W 77  43 .mod 100/
78 D ::: W 78  01 .mod 100/
79 D ::: W 79  07 .mod 100/
We’re definitely seeing a pattern here, the last two digits of a power of 7 can only be either of
07; 49; 43; 01. Now, as 1942 is an even number, we just focus on even powers that can be divided
into two groups: 2; 6; 10; : : : and 4; 8; 12; : : :ŽŽ The first group can be generally expressed by
2 C 4k for k D 0; 1; 2; : : : Now, solving 2 C 4k D 1942 gives us k D 485. Therefore, the last
two digits of 71942 are 49. (Note that if you try with the second group, a similar equation does
not have solution, i.e., 1942 belongs to the first group).
Although the answer is correct, there is something fishy in our solution. Note that we only
computed powers of 7 up to 79 . There is nothing to guarantee that the pattern repeats forever
or at least up to exponent of 1942. Of course we can prove that this pattern is true using the
multiplication rule. We can avoid going that way, by computing 71942 directly by noting that
1942 D 5  388 C 2. Why this decomposition of 1942? Because 75  7 .mod 100/. With this,
we can write 
71942 D .75 /388 .72 /  .7388 /.49/ .mod 100/ (2.30.3)
And certainly we play the same game for 7388 ; writing 388 D 5  77 C 3, then 77 D 5  15 C 2,
then 15 D 5  3, we have (do not forget the exponential rule for modular arithmetic)

7388  .777 /.73 /  .715 /.72 /.73 /  .73 /.72 /.73 / .mod 100/ (2.30.4)

Now using Eq. (2.30.2) to have 72  49 .mod 100/ and 73  43 .mod 100/, Eq. (2.30.3) then
becomes 71942  73 72 73 72 :

71942  710  49 .mod 100/

As can be seen the idea is simple: trying to replace the large number (1942) by smaller ones!

2.30.3 A problem from a 2006 Hongkong math contest


Let’s solve another problem, which is harder than previous problems. Consider a function f that
takes a counting number a and returns a counting number obeying this rule

f .a/ D .sum of the digits of a/2


ŽŽ
Why these groups? If not clear, look at Eq. (2.30.2): the period of 49 is 4.

Now, consider this problem: finding the tens and units digits of 49971 ? But wait, isn’t it the same problem
before? Yes, but you will find that working with powers of 49, instead of 7, is easier.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 214

The question is to compute f .2007/ .22006 / i.e., f is composed of itself 2007 timesŽŽ . Why 2006?
Because this problem is one question from a math contest in Hong Kong happening in the year
of 2006.
Before we can proceed, we need to know more about the function f first. Concrete examples
are perfect for this. If a D 321, then

f .321/ D .3 C 2 C 1/2 D 36

If we go blindly, this is what we would do: compute 22006 (assuming that we’re able to get
it), know its digits and sum them and square the sum. Then, applying the same steps for this new
number. And do this 2007 times. No, we simply cannot do all of this w/o a calculator. There
must be another way.
Because I did not know where to start, I wrote a Julia program, shown in Listing 2.1 to
solve it. The answer is then 169.

Listing 2.1: Julia program to solve the HongKong problem.


1 function ff(x)
2 digits_x = digits(x) # get the digits of x and put in array
3 return sum(digits_x)^2
4 end
5 function fff(x,n) # repeatedly applying ff n times
6 for i = 1:n
7 x = ff(x)
8 end
9 return x
10 end
11 x = big(2)^2006 # have to use big integer as 2^2006 is very big
12 println(fff(x,2007))

But without a computer, how can we solve this problem? If we cannot solve this problem, let’s
solve a similar problem but easier, at least we get some points instead of zero! This technique
is known as specialization, and it is a very powerful strategy. How about computing f .5/ .24 /?
That can be done as 24 D 16:

f .1/ .24 / D f .16/ D .1 C 6/2 D 49


f .2/ .24 / D f .49/ D .4 C 9/2 D 169
f .3/ .24 / D f .169/ D .1 C 6 C 9/2 D 256
f .4/ .24 / D f .256/ D .2 C 5 C 6/2 D 169
f .5/ .24 / D f .169/ D .1 C 6 C 9/2 D 256

The calculation was simple because 24 is a small number. What’s important is that we see a
pattern. With this pattern it is easy to compute f .n/ .24 / for whatever value of n, n 2 N.
If you do not know what is a function composition, here it is: f 2 .x/ D f .f .x//. Evaluate f at x get the
ŽŽ

result and put it into the function again.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 215

So far so good. We made progress because we were able to compute 24 , which is 16, then
we can use the definition of the function f to proceed. For 22006 , it is impossible to go this way.
Now, we should ask this question: why the function f is defined this way i.e., it depends on the
sum of the digits of the input? Why not the product of the digits? Let’s investigate the sum of
the digits of a counting number. For example,

123 H) 1 C 2 C 3 D 6; 4231 H) 4 C 2 C 3 C 1 D 10

If we check the relation between 6 and 123 and between 4231 and 10, we find this:

6  123 .mod 9/; 10  4231 .mod 9/

That is: the sum of the digits of a counting number is congruent to the number modulo 9Ž . And
then, according to the exponentiation rule of modular arithmetic, the square of sum of the digits
of a counting number is congruent to the number squared modulo 9. For example, 36  1232
.mod 9/.
With this useful ‘discovery’, we can easily do the calculations w/o having to know the digits
of 24 (in other words w/o calculating this number; note that our actual target is 22006 ):

f .1/ .24 /  .24 /2  4 .mod 9/


f .2/ .24 /  .24 /4  7 .mod 9/
(2.30.5)
f .3/ .24 /  .24 /8  4 .mod 9/
f .4/ .24 /  .24 /8  7 .mod 9/

Now, if we want to compute f .4/ .24 /, we can start with the fact that it is congruent with 7
.mod 9/. But wait, there are infinite numbers that are congruent with 7 modulo 9; they are
f7; 16; 25; : : : ; 169; 178; : : : ; g. We need to do one more thing; if we can find a smallest upper
bound of f .4/ .24 /, let say f .4/ .24 / < M , we then can remove many options and be able to find
f .4/ .24 /.
Now, we can try the original problem. Note that 22006  4 .mod 9/ŽŽ , then by similar
reasoning as in Eq. (2.30.5), we get
(
4 .mod 9/; if n is even
f .n/ .22006 /  (2.30.6)
7 .mod 9/; if n is odd

And we combine this with the following result (to be proved shortly):

f .n/ .22006 / < 529 .D 232 /; 8n  8 (2.30.7)


Ž
Of course, we need to prove this, two examples cannot be a proof. In this case, this is true. The proof is not
hard. Actually we have stumbled on this in Section 2.4 when we’re talking about the divisibility of a counting
number with 9.
ŽŽ
How can we know this? Simple: by computing the powers of 2 similarly to what we have done for 7 shown in
Eq. (2.30.2).

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 216

Now, we substitute n D 2005 in Eq. (2.30.6), we get f .2005/ .22006 /  7 .mod 9/. And because
the sum of the digits of a counting number is congruent to the number modulo 9, we now also
have (transitivity rule)

sum of digits of f .2005/ .22006 /  7 .mod 9/ (2.30.8)

Now, using Eq. (2.30.7) for n D 2006, we have

f .2006/ .22006 / D .sum of digits of f .2005/ .22006 //2 < 232

which leads to
sum of digits of f .2005/ .22006 / < 23 (2.30.9)
Combining the two results on the sum of the digits of f .2005/ .22006 / given in Eqs. (2.30.8)
and (2.30.9), we can see that it can only take one of the following two values:

sum of digits of f .2005/ .22006 / D f7; 16g

which results in 169–the same number as the code returned:

f .2006/ .22006 / D f49; 256g H) f .2007/ .22006 / D 132 D 169

Proof. Now is the proof of Eq. (2.30.7). We start with the fact that 22006 < 22007 D 8669 <
10669 . In words, 22006 is smaller than a number with 670 digits. By the definition of f , we then
have

f .22006 / < f .99 9/ D .9  699/2 < 108


  …
„ ƒ‚
699 terms

This is because 99 : : : 9 with 699 digits is the largest number that is smaller than 10699 and has
a maximum sum of the digits. Next, we do something similar for f .2/ .22006 / starting now with
108 :
f .2/ .22006 / < f .99    …9/ D .9  8/2 < 104
„ ƒ‚
8 terms

Then for f .3/ .22006 /:

f .3/ .22006 / < f .9999/ D .9  4/2 D 1296

And for f .4/ .22006 /:


f .4/ .22006 / < f .1999/ D .28/2 D 784
And for f .5/ .22006 /:
f .5/ .22006 / < f .799/ D .25/2 D 625
Continuing this way and we stop at f .8/ .22006 /:

f .8/ .22006 / < f .599/ D .23/2 D 529

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 217

2.30.4 Divisibility with modular arithmetic


Divisibility by 3 and 9. Even though it is possible to study the divisibility of integers without
using modular arithmetic, I want to use it to get familiar with it. Without loss of generality, let n
be a four digit integer with digits a; b; c and d . So, we can write n D 103 a C 102 b C 10c C d .
Now, we know that 3jn when n D 3k which means that n  0 .mod 3/. Working with mod 3,
we have 10  1 .mod 3/, hence 102  1 .mod 3/, 103  1 .mod 3/. Thus,
n D 103 a C 102 b C 10c C d  1a C 1b C 1c C d .mod 3/
Thus 3jn is when 1a C 1b C 1c C d  0 .mod 3/, or when a C b C c C d is a multiple of
3. Following the same argument, a number is a multiple of 9 when the sum of its digits is a
multiple of 9.

Divisibility by 2 and 5. Working with modulo 2 (or 5), we have 10  0 .mod 2; 5/, thus we
just need to worry about the last digit of the number. An integer number is a multiple of 2 when
its last digit is one of f0; 2; 4; 6; 8g. And an integer number is a multiple of 5 when its last digit
is one of f0; 5g.

Divisibility by 4 and 8. Let’s consider divisibility by 8 first. Working with modulo 8, we have
10  2 .mod 8/, which leads to
102  4 .mod 8/; 103  8 .mod 8/  0 .mod 8/; H) 10n  0 .mod 8/; n  3
Thus, only the last three digits are responsible for the divisibility. And the rule is: an integer is
divisible by 8 if the number represented by its last three digits is a multiple of 8.

Divisibility by 6. The divisibility of a number by 6, which is nothing but 2  3 depends on the


divisibility of that number by 2 and 3. We have this result: a number is divisible by 6 iff it is
divisible by both 2 and 3. For example, 12 is divisible by 6, and 2j12 and 3j12.

History note 2.7: Gauss (30 April 1777–23 February 1855)


Johann Carl Friedrich Gauss was a German mathematician who made
significant contributions to many fields in mathematics and science.
Sometimes referred to as the Prince of Mathematics and "the greatest
mathematician since antiquity", Gauss had an exceptional influence in
many fields of mathematics and science, and is ranked among history’s
most influential mathematicians. Gauss was born in Brunswick to poor,
working-class parents. His mother was illiterate and never recorded
the date of his birth. Gauss was a child prodigy. In his memorial on
Gauss, Wolfgang von Waltershausen wrote that when Gauss was barely three years old he
corrected a math error his father made; and that when he was seven, solved an arithmetic
series problem faster than anyone else in his class of 100 pupils.
While at the University of Göttingen, Gauss independently rediscovered several important

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 218

theorems. His breakthrough occurred in 1796 when he showed that a regular polygon can
be constructed by compass and straightedge if the number of its sides is the product of
distinct Fermat primes and a power of 2. This was a major discovery in an important field
of mathematics; and the discovery ultimately led Gauss to choose mathematics instead of
philology as a career. Gauss was so pleased with this result that he requested that a regular
heptadecagona be inscribed on his tombstone. The stonemason declined, stating that the
difficult construction would essentially look like a circle.
He further advanced modular arithmetic in his textbook The Disquisitiones Arithmeticae
written when Gauss was 21. This book is notable for having an impact on number theory
as it not only made the field truly rigorous and systematic but also paved the path for
modern number theory. On 23 February 1855, Gauss died of a heart attack in Göttingen.
a
A heptadecagon or 17-gon is a seventeen-sided polygon.

2.31 Cantor and infinity


To start a short discussion on infinity, I use the story adapted from Strogatz’s The joy of X :

A boy is very excited about the number 100. He told me it is an even number and
101 is an odd number, and 1 million is an even number. Then the boy asked this
question: “Is infinity even or odd’?’

This is a very interesting question as infinity is something unusual as we have seen in Sec-
tion 2.21. Let’s assume that infinity is an odd number, then two times infinity, which is also
infinity, is even! So, infinity is neither even nor odd!
This section tells the story of the discovery made by a mathematician named Cantor that
there are infinities of different sizes. I recommend the book To Infinity and Beyond: A Cultural
History of the Infinite by Eli Maor [44] for an interesting account on infinity.
We start with Section 2.31.1 that presents a brief introduction to set theory. Next, we discuss
finite and infinite sets (Section 2.31.2); examples of finite sets are f1; 2; 3g and f3; 4; 5; 6g, and
an example of an infinite set is f0; 1; 2; 3; : : :g. Cantor showed us that these different sets are of
the same size, and they are countable. Finally, Section 2.31.3 discusses the set of real numbers,
which is uncountable. Now, we have countably infinite sets (such as N; Z; Q) and uncountably
infinite sets (such as R).

2.31.1 Sets
Each of you is familiar with the word collection. In fact, some of you may have collections–such
as a collection of stamps, a collection of PS4 games. A set is a collection of things. For example,
f1; 2; 5g is a set that contains the numbers 1,2 and 5. These numbers are called the elements
of the set. Because the order of the elements in a set is irrelevant, f2; 1; 5g is the same set as
f1; 2; 5g. Think of your own collection of marbles; you do not care the location of each invidual
marble. And also think of fg as a polythene bag which holds its elements inside in such a way

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 219

that we can see through the bag to see the elements. Furthermore, an element cannot appear
more than once in a set; so f1; 1; 2; 5g is equivalent to f1; 2; 5g.
To say that 2 is a member of the set f1; 2; 5g, mathematicians write 2 2 f1; 2; 5g and to say
that 6 is not a member of this set, they write 6 … f1; 2; 5g.
Of course the next thing mathematicians do with sets is to compare them. Considering two
sets: f1; 2; 3g and f3; 4; 5; 6g, it is clear that the second set has more elements than the first. We
use the notation jAj, called the cardinality, to indicate the number of elements of the set A. The
cardinality of a set is the size of this set or the number of elements in the set.

2.31.2 Finite and infinite sets


It is obvious that we have finite sets whose cardinalities are finite; for instance f1; 2; 3g and
f3; 4; 5; 6g, and infinite sets such as

N D f0; 1; 2; 3; : : :g

Things become interesting when we compare infinite sets. For example, Galileo wrote in his
Two New Sciences about what is now known as Galileo’s paradox:

1. Some counting numbers are squares such as 1; 4; 9 and 16, and some are not squares such
as 2; 5; 7 and so on.
2. The totality of all counting numbers must be greater than the total of squares, because the
totality of all counting numbers includes squares as well as non-squares.
3. Yet, for every counting number, we can have a one-to-one correspondence between num-
bers and squares, for example (a doubled headed arrow $ is used for this one-to-one
correspondence)
1 2 3 4 5 6 
l l l l l l 
1 4 9 16 25 36   
4. So, there are, in fact, as many squares as there are counting numbers. This is a contradiction,
as we have said in point 2 that there are more numbers than squares.

The German mathematician Georg Cantor (1845 – 1918) solved this problem by introducing
a new symbol @0 (pronounced aleph-null), using the first letter of the Hebrew alphabet with
the subscript 0. He said that @0 was the cardinality of the set of natural numbers N. Every set
whose members can be put in a one-to-one correspondence with the natural numbers also has
the cardinality @0 .
With this new technique, we can show that the sets N and Z have the same cardinality. Their
one-to-one correspondence is:
1 2 3 4 5 6 7 
l l l l l l l 
0 1 1 2 2 3 3 

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 220

The next question is how about the set of rational numbers Q? Is this larger or equal the set
of natural numbers? Between 1 and 2, there are only two natural numbers, but there are infinitely
many rational numbers. Thus, it is tempting for us to conclude that jQj > jNj. Again, Cantor
proved that we were wrong; Q D @0 !
For simplicity, we consider only positive rational numbers. A positive rational number is a
number of this form p=q where p; q 2 N and q ¤ 0. First, Cantor arranged all positive rational
numbers into an infinite array:
1 2 3 4 5
1 1 1 1 1

1 2 3 4 5
2 2 2 2 2

1 2 3 4 5
3 3 3 3 3

1 2 3 4 5
4 4 4 4 4

1 2 3 4 5
5 5 5 5 5

:: :: :: :: :: ::
: : : : : :

where the first row contains all rational numbers with denominator of one, the second row
with denominator of two and so on. Note that this array has duplicated members; for instance
1=1; 2=2; 3=3; : : : or 1=2; 3=6; 4=8.
Next, he devised a zigzag way to traverse all the numbers in the above infinite array, once
for each number:

1 2 3 4
1 1 1 1


1 2 3 4
2 2 2 2


1 2 3 4
3 3 3 3


1 2 3 4
4 4 4 4


If we follow this zigzag path all along: one step to the right, then diagonally down, then one step
down, then diagonally up, then again one step to the right, and so on ad infinitum, we will cover
all positive fractions, one by one . In this way we have arranged all positive fractions in a row,
one by one. In other words, we can find a one-to-one correspondence for every positive rational
with the natural numbers. This discovery that the rational numbers are countable-in defiance of
our intuition- left such a deep impression on Cantor that he wrote to Dedekind: "Je le vois, mais
je ne le crois pas!" ("I see it, but I don’t believe it!").
Thus the natural numbers are countable, the integers are countable and the rationals are
countable. It seems as if everything is countable, and therefore all the infinite sets of numbers

Along our path we will encounter fractions that have already been met before under a different name such as
2/2, 3/3, 4/4, and so on; these fractions we simply cross out and then continue our path as before

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 221

you can care to mention - even ones our intuition tells contain more objects than there are natural
numbers - are the same size.
This is not the case.

2.31.3 Uncountably infinite sets


The next thing Cantor showed us is that the set of real numbers R is uncountable; that is, there
is no one-to-one correspondence between N and R. Recall that the real numbers are the
irrationals (those numbers that cannot be written as one integer divided by another: π, √2, e, ...)
and the rationals together.
How did he prove this? His proof consists of two steps:

 There are exactly the same number of points in any interval [a, b] as in the whole number line R.

 Using the above result, he proved that there is no one-to-one correspondence between the
unit interval [0, 1] and the set of natural numbers.

We focus on the second itemŽŽ. You might be guessing, correctly, that Cantor used a proof by
contradiction. And the proof must go like this. First, he assumed that all the decimals in [0, 1] are
countable, i.e., that they can be listed. Second, he would artificially create a number that is not in that list.
The following proof is taken from Bellos's Alex's Adventures in Numberland. It is based on
Hilbert's hotel, a hypothetical hotel named after the German mathematician David Hilbert that
has an infinite number of rooms. One day an infinite number of guests arrive at the
hotel. Each of these guests wears a T-shirt with a never-ending decimal between 0 and 1 (e.g.
0.415783113...). The manager of this hotel is a genius and thus he was able to put all the guests
in the rooms:
room 1: 0.4157831134213468...
room 2: 0.1893952093807820...
room 3: 0.7581723828801250...
room 4: 0.7861108557469021...
room 5: 0.638351688264940...
room 6: 0.780627518029137...
...

Now what Cantor did was to build one real number that was not in the above list. Cantor used
a diagonal method as follows. First, he constructed the number that has the first decimal place
of the number in Room 1, the second decimal place of the number in Room 2, the third decimal
place of the number in Room 3 and so on. In other words, he was choosing the diagonal digits
ŽŽ
You can prove the first item using ...geometry.


that are underlined here:


room 1: 0.4157831134213468...
room 2: 0.1893952093807820...
room 3: 0.7581723828801250...
room 4: 0.7861108557469021...
room 5: 0.638351688264940...
room 6: 0.780627518029137...
...

That number is 0.488157... Second, he altered every digit of this number: he added one to
each digit. The new number is 0.599268... Now comes the best thing: this number is
not in room 1, because its first digit is different from the first digit of the number in room 1. It is
not in room 2 because its second digit is different from the second digit of the number
in room 2, and we can continue this to see that the number cannot be in any room n. Although
Hilbert's hotel is infinitely large, it is not large enough for the set of real numbers.
So, no matter how big Hilbert's hotel is, it cannot accommodate all the real numbers. The
set of real numbers is said to be uncountable. Now we have countably infinite sets (such as
N, Z, Q) and uncountably infinite sets (such as R). With the right mathematics, Cantor proved
that there are infinities of different sizes.
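The diagonal construction is mechanical enough to be run as code. The sketch below (Python, my own illustration) takes any finite list of decimal digit strings, picks the n-th digit of the n-th number, shifts each digit by one (with 9 wrapping to 0, a detail glossed over here as in the text), and produces a number that differs from every entry in the list:

    def diagonal_escape(decimals):
        """Return a number that differs from the n-th listed decimal
        in its n-th digit -- Cantor's diagonal trick."""
        new_digits = []
        for n, d in enumerate(decimals):
            digit = int(d[n])                          # n-th digit of the n-th number
            new_digits.append(str((digit + 1) % 10))   # change it (9 wraps to 0)
        return "0." + "".join(new_digits)

    rooms = ["4157831134213468", "1893952093807820", "7581723828801250",
             "7861108557469021", "638351688264940", "780627518029137"]
    print(diagonal_escape(rooms))   # 0.599268 -- not equal to any room's number

Of course, no finite program settles the infinite case; the code only illustrates why the constructed number escapes every room on the list.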
There are more to say about set theory in Section 5.5.

2.32 Number systems


There is an old joke that goes like this:

There are only 10 types of people in the world: those who understand binary and
those who don’t.

If you got this joke you can skip this section and if you don’t, this section is for you.
Computers use only two digits, 0 and 1, which are
called the binary digits, from which we have the word
"bit". In that binary world, how do we write the number 2? It
is 10. Now you understand the above joke. But
why does 10 mean 2? To answer that question we need to go
back to the decimal system. For whatever reason, we
human beings have settled on this system. In this sys-
tem there are only ten digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.
How do we write ten books then? There is no such digit in our system! Note that we're allowed to
use only 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. The solution is to write ten as two digits: 10. To understand this
more, we continue with eleven (11), twelve (12), up to nineteen (19). How about twenty? We do


the same thing: 20. Thus, any positive integer is a combination of powers of 10. Because of this,
10 is called the base of the decimal system.
For the binary system we do the same thing, but with powers of 2 of course.
For example, 2₁₀ = 10₂; the subscripts signify the number system, so 10₂
denotes the numeral 10 read in the binary system. Refer to the next figure
to see the binary numerals for 1 to 6. With this, it is
straightforward to convert from binary to decimal. For example, 111₂ =
1 × 2² + 1 × 2¹ + 1 × 2⁰ = 7₁₀. How about the conversion from decimal to binary?
We use the fact that any binary numeral is a combination of powers of two. For example,
75₁₀ = 64 + 8 + 2 + 1 = 1 × 2⁶ + 0 × 2⁵ + 0 × 2⁴ + 1 × 2³ + 0 × 2² + 1 × 2¹ + 1 × 2⁰ = 1001011₂.
One disadvantage of the binary system is the long strings of 1's and 0's needed
to represent large numbers. To solve this, the "hexadecimal" (or simply "hex") number system,
which adopts the base 16, was developed. Being a base-16 system, the hexadecimal number system
needs 16 different digits, one for each of the values 0 through 15. However,
there is a potential problem: the decimal numerals 10, 11, 12, 13, 14 and 15 are normally
written using two adjacent symbols. For example, if the string 10 appeared inside a hexadecimal
numeral, would it mean the single digit ten or the two digits one and zero? To get around
this, the hexadecimal digits for the values ten, eleven, ..., fifteen are written with the capital letters A, B, C, D, E and F
respectively.
So, let's convert the hex number E7 to decimal. The old rule applies: a hex
numeral is a combination of powers of 16. Thus E7 = 14 × 16¹ + 7 × 16⁰ = 231.
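These conversions are easy to automate. The snippet below (Python, added as an illustration of my own, not part of the original text) reproduces the worked examples above, both with the built-in functions and with the "repeated division by the base" idea:

    # int(string, base) parses a numeral in a given base; bin()/hex() go the other way.
    print(bin(75))             # 0b1001011  -> 75 is 1001011 in binary
    print(bin(2))              # 0b10       -> the "10" of the joke
    print(int("1001011", 2))   # 75
    print(int("E7", 16))       # 231
    print(hex(231))            # 0xe7

    def to_base(n, base, digits="0123456789ABCDEF"):
        """Repeatedly divide by the base and collect the remainders."""
        out = ""
        while n > 0:
            out = digits[n % base] + out
            n //= base
        return out or "0"

    print(to_base(75, 2), to_base(231, 16))   # 1001011 E7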

2.33 Graph theory


This section is a brief introduction to graph theory. It all started with a mundane but entertaining
exercise about crossing bridges back in 18th-century Europe (Section 2.33.1). I then present
the problem of map coloring and the famous four color theorem (Section 2.33.2). Youngsters
who would like to major in computer science should pay attention to graph theory.

2.33.1 The Seven Bridges of Königsberg


The city of Königsberg in Prussia (now Kaliningrad, Russia) was
set on both sides of the Pregel River, and included two large is-
lands—Kneiphof and Lomse—which were connected to each
other, and to the two mainland portions of the city, by seven
bridges. According to lore, the citizens of Königsberg used to
spend Sunday afternoons walking around their beautiful city.
While walking, the people of the city decided to create a game
for themselves: the game was to devise a way in which they could
walk around the city, crossing each of the seven bridges exactly once. The walk was allowed
to start at one place and end at another. Even though none of the citizens of


Königsberg could invent a route that would allow them to cross each of the bridges only once,
they could not understand why it was impossible. Lucky for them, Königsberg was not too
far from St. Petersburg, home of the famous mathematician Leonhard Euler.
Carl Leonhard Gottlieb Ehler, mayor of Danzig, asked Euler for a solution to the problem in
1736. Euler, seeing no connection between this problem and the mathematics of his time, replied
(from [31]):

Thus you see, most noble Sir, how this type of solution bears little relationship to
mathematics, and I do not understand why you expect a mathematician to produce it,
rather than anyone else, for the solution is based on reason alone, and its discovery
does not depend on any mathematical principle. Because of this, I do not know why
even questions which bear so little relationship to mathematics are solved more
quickly by mathematicians than by others.

Even though Euler found the problem trivial, he was still intrigued by it. In a letter written
the same year to the Italian mathematician and engineer Giovanni Marinoni, Euler said,

This question is so banal, but seemed to me worthy of attention in that [neither]


geometry, nor algebra, nor even the art of counting was sufficient to solve it.

And as is often the case, when Euler paid attention to a problem he solved it. Since neither
geometry nor algebra (in other words, the mathematics of the time) was sufficient to solve this problem,
in the process he developed a new mathematics, which we now call graph theory.
The first thing Euler did was to get rid of things that are irrelevant to the problem: the color
of the bridges, the color of the water, how big the landmasses are. Thus, he
drew a schematic of the problem, shown in the left of Fig. 2.46. He labeled the landmasses as
A, B, C, D and the bridges a, b, c, d, e, f, g. The problem is just the connection between these
entities. Nowadays, we can go further: it is obvious that we do not have to draw the landmasses,
we can represent them as dots, and the bridges as lines (or curves). In the right figure of Fig. 2.46,
we did that and this is called a graph (denoted by G).

Figure 2.46: The schematic of the Seven Bridges of Königsberg and its graph.

What information can we read from a graph? The first things are: number of vertices and
number of edges. Is that all? If so, how can we differentiate one vertex from another? Thus, we


have to look at the number of edges that meet at a vertex. To save words, mathematicians of course
defined a term for that: it is called the degree of a vertex. For example, vertex
C has a degree of five whereas vertices A, B, D each have a degree of three.
Now, we are going to solve some easier graphs and look for a pattern. Then we come back to the
Seven Bridges of Königsberg. We consider five graphs as shown in Fig. 2.47. Try to solve
these graphs, fill in a table similar to Table 2.22, and try to see the pattern for yourself before
continuing. Based on the solution given in Fig. 2.48, we can fill in the table.

Figure 2.47: Easy graphs to solve. The number on top of each graph labels it.

Table 2.22: Results of graphs in Fig. 2.47. An odd vertex is a vertex having an odd degree.

Shape    # of odd vertices    # of even vertices    Solvable?

1        0                    4                     Yes
2        2                    2                     Yes
3        4                    0                     No
4        4                    1                     No
5        2                    3                     Yes

Figure 2.48: Solution to easy graphs in Fig. 2.47. A single arrow indicates the starting vertex and a double
arrow for the finishing vertex.

What do we see from Table 2.22? We can only find a solution whenever the number of
odd vertices is either 0 or 2. The case of 0 is special: we can start at any vertex and we end up
eventually at exactly the same vertex (Fig. 2.48). For the case of two: we start at an odd vertex,
and end up at another odd vertex.
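The pattern in Table 2.22 translates directly into a little program. The sketch below (Python; storing the graph as a list of edges is my own choice of notation, not the book's) counts the odd-degree vertices and reports whether a walk using every bridge exactly once can exist. Strictly speaking the graph must also be connected, which the sketch simply assumes.

    from collections import Counter

    def euler_walk_possible(edges):
        """A walk using every edge exactly once exists iff the number of
        odd-degree vertices is 0 (closed walk) or 2 (walk between the two
        odd vertices). The graph is assumed to be connected."""
        degree = Counter()
        for u, v in edges:
            degree[u] += 1
            degree[v] += 1
        odd = [v for v, d in degree.items() if d % 2 == 1]
        return len(odd) in (0, 2)

    # One encoding of the seven bridges, consistent with the degrees above
    # (C has degree five; A, B, D have degree three each).
    koenigsberg = [("C", "A"), ("C", "A"), ("C", "B"), ("C", "B"),
                   ("C", "D"), ("A", "D"), ("B", "D")]
    print(euler_walk_possible(koenigsberg))   # False: all four vertices are odd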


2.33.2 Map coloring and the four color theorem


Francis GuthrieŽŽ , while trying to color the map of counties of
England, noticed that only four different colors were needed. It
became the four color theorem, or the four color map theorem,
which states that no more than four colors are required to color
the regions of any map so that no two adjacent regions have the
same color.
Is this related to graph theory? Of course it is; it is no
different from the Seven Bridges of Königsberg. Indeed, if we place a vertex in the center of each
region (say in the capital of each state) and then connect two vertices if their regions share a border,
we get a graph. Suddenly, we see that many problems fall under the umbrella of graph theory!
Coloring regions on the map corresponds to coloring the vertices of the graph (Fig. 2.49). Since
neighboring regions cannot be colored the same, our graph cannot have vertices colored the
same when those vertices are adjacent.

Figure 2.49: Coloring a map is equivalent to coloring the vertices of its graph.

In general, given any graph G, a coloring of the vertices is called (not surprisingly) a vertex
coloring. If the vertex coloring has the property that adjacent vertices are colored differently,
then the coloring is called proper. Every graph has a proper vertex coloring; for example, you
can color every vertex with a different color. But that's boring! Don't you agree? To make life
more interesting, we have to limit the number of colors used to a minimum. And we need a term
for that number. The smallest number of colors needed to get a proper vertex coloring is called
the chromatic number of the graph, written χ(G).
We do not try to prove the four color theorem here. No one could do it without using computers!
It was the first major theorem to be proved using a computer (by Kenneth Appel and Wolfgang
Haken in 1976). Instead, we present one mundane application of graph coloring:
exam scheduling. Suppose algebra, physics, chemistry and history are four courses in a college,
and suppose that the following pairs have common students: algebra and chemistry, algebra and
history, chemistry and physics (picture a small graph with these four courses as vertices).
If the algebra and chemistry exams are held on the same day, students taking both
courses have to miss at least one exam; they cannot take both at the same time. How do
we schedule exams in the minimum number of days so that courses having common students are
ŽŽ
Francis Guthrie (1831-1899) was a South African mathematician and botanist who first posed the Four Color
Problem in 1852.


not held on the same day? You can look at the graph, where an edge is drawn between classes
having common students, and see the solutionŽ.
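For small graphs like this one, a proper coloring can be found greedily: visit the vertices in some order and give each one the first color not already used by a neighbour. The sketch below (Python; the course names and clash list are taken from the example above, everything else is my own illustration) finds a 2-colouring, i.e. a two-day schedule. On larger graphs greedy colouring only gives an upper bound on the chromatic number, not always the exact value.

    def greedy_coloring(vertices, edges):
        """Give each vertex the smallest color (0, 1, 2, ...) not used by an
        already-colored neighbour."""
        neighbours = {v: set() for v in vertices}
        for u, v in edges:
            neighbours[u].add(v)
            neighbours[v].add(u)
        color = {}
        for v in vertices:
            used = {color[n] for n in neighbours[v] if n in color}
            c = 0
            while c in used:
                c += 1
            color[v] = c
        return color

    courses = ["algebra", "physics", "chemistry", "history"]
    clashes = [("algebra", "chemistry"), ("algebra", "history"),
               ("chemistry", "physics")]
    print(greedy_coloring(courses, clashes))
    # {'algebra': 0, 'physics': 0, 'chemistry': 1, 'history': 1}  -> two exam days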
That's all about graphs for now. The idea is to inspire young students, especially those who
want to major in computer science in the future. If you're browsing the internet, you are using
a graph. The story goes like this. In 1998, two Stanford computer science PhD students, Larry
Page and Sergey Brin, forever changed the World Wide Web as we know it. They created one of
the most widely used websites in the world: Google.com, now one of the most successful companies
in the world. What was the basis for its success? It was the Google Search Engine that made
Larry Page and Sergey Brin millionaires.
The Google Search Engine is based on one simple algorithm called PageRank. PageRank is an
algorithm built on a simple graph. The PageRank graph is generated by having
all of the World Wide Web pages as vertices and any hyperlinks on the pages as edges. To un-
derstand how it works we need not only graphs, but also linear algebra (Chapter 11), probability
(Chapter 5) and optimization theory. Yes, there is no easy road to prosperity and fame.
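Just to give a taste of what is coming, here is a heavily simplified sketch of the PageRank idea (Python; the tiny four-page "web", the damping factor 0.85 and the code are my own illustrative assumptions, not a description of Google's actual system): start with equal ranks and repeatedly let every page share its rank among the pages it links to.

    def pagerank(links, damping=0.85, iterations=100):
        """links[p] = list of pages that p links to. Repeatedly redistribute
        rank along the hyperlinks (a power iteration); the ranks sum to 1."""
        pages = list(links)
        n = len(pages)
        rank = {p: 1.0 / n for p in pages}
        for _ in range(iterations):
            new = {p: (1.0 - damping) / n for p in pages}
            for p in pages:
                share = damping * rank[p] / len(links[p])
                for q in links[p]:
                    new[q] += share
            rank = new
        return rank

    tiny_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
    print(pagerank(tiny_web))   # page C, the most linked-to, ends up ranked highest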

2.34 Algorithm
To end this chapter I discuss algorithms a bit, for they are ubiquitous in our world. In
Section 2.34.1, I present probably the oldest algorithm: the Euclidean algorithm for finding the greatest
common divisor. Then, I show one application of this algorithm to a puzzle from the movie Die
Hard (Section 2.34.2).

2.34.1 Euclidean algorithm: greatest common divisor


Let's play a game: finding the greatest common divisor/factor (gcd) of two positive integers. The
gcd of two integers is the largest number that divides them both. The manual solution is: (1)
list all the prime factors of the two numbers, (2) take the product of the common factors, and (3)
that product is the gcd, illustrated below for 210 and 84:

210 = 2 × 3 × 5 × 7 = 42 × 5
 84 = 2 × 2 × 3 × 7 = 42 × 2

Thus, the gcd of 210 and 84 is 42: gcd(210, 84) = 42. Obviously, if we need to find the gcd of
two big integers, this solution is terrible. Is there any better way?
If d is a common divisor of both a and b (assuming that a > b ≥ 0), then we can write
a = dm and b = dn where m, n ∈ N. Therefore, a − b = d(m − n). What does this mean?
It means that d | (a − b), i.e., d is also a divisor of a − bŽŽ. Conversely, if d is a common divisor
of both a − b and b, it can be shown that it is a common divisor of both a and b. Therefore, the
set of common divisors of a and b is exactly the set of common divisors of a − b and b. Thus,
Ž
We used two colors, so we need two days.
ŽŽ
One example: 5 | 10 and 5 | 25, and 5 | (25 − 10), i.e., 5 is a divisor of 15.


gcd(a, b) = gcd(a − b, b). This is a big deal because we have replaced a problem with an easier
(or smaller) one, for a − b is smaller than a. So, this is how we proceed: to find gcd(210, 84) we
find gcd(126, 84), and to find gcd(126, 84) we find gcd(42, 84), which is equal to gcd(84, 42),
which is equal to gcd(42, 42), and this is 42. I summarize what we have just done:

gcd(210, 84)
gcd(126, 84)
gcd(42, 84) = gcd(84, 42)
gcd(42, 42) = 42

We did not have to continue this calculation forever because gcd(a, a) = a for any integer a. This algorithm
is better than the manual solution but it is slow: imagine we have to find the gcd of 1000 and 3;
too many subtractions. But if we look at the algorithm we can see many repeated subtractions:
for example 210 − 84 = 126 and 126 − 84 = 210 − 84 − 84 = 42. We can replace these repeated
subtractions by a single division with remainder: 42 = 210 mod 84, or 210 = 2 × 84 + 42. So, this is how we
proceed:

gcd(210, 84)       (210 = 2 × 84 + 42)
gcd(84, 42)        (84 = 2 × 42 + 0)
gcd(42, 0) = 42    (gcd(a, 0) = a)

It's time for generalization. The problem is to find gcd(a, b) for a > b > 0. The steps are
repeated divisions: first divide a by b to get the remainder r1, then divide b by r1 to get the remainder
r2, and so onŽŽ:

gcd(a, b)        (a = q b + r1),      0 ≤ r1 < b
gcd(b, r1)       (b = q1 r1 + r2),    0 ≤ r2 < r1
gcd(r1, r2)      (r1 = q2 r2 + r3),   0 ≤ r3 < r2
...

We have obtained a decreasing sequence of numbers:

b > r1 > r2 > r3 > ⋯ ≥ 0

Since the remainders decrease with every step but can never be negative, eventually we must
meet a zero remainder, at which point the procedure stops. The final nonzero remainder is the
greatest common divisor of a and b.
What we have just seen is the Euclidean algorithm, named after the ancient Greek mathemati-
cian Euclid, who first described it in his Elements (c. 300 BC). It is an example of an algorithm,
a step-by-step procedure for performing a calculation according to well-defined rules, and is one
of the oldest algorithms in common use. About it, Donald Knuth wrote in his classic The Art of
Computer Programming: "The Euclidean algorithm is the granddaddy of all algorithms, because
it is the oldest nontrivial algorithm that has survived to the present day."

ŽŽ
Note that when we do a long division of a by b, we always get a remainder smaller than the divisor b.
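The algorithm fits in a few lines of code. Here is a sketch (Python, my own illustration) of both versions described above: the slow subtraction-only version and the division (remainder) version.

    def gcd_subtraction(a, b):
        """Use gcd(a, b) = gcd(a - b, b): keep subtracting the smaller
        number from the larger until both are equal."""
        while a != b:
            if a > b:
                a = a - b
            else:
                b = b - a
        return a

    def gcd_euclid(a, b):
        """Replace repeated subtraction by the remainder: gcd(a, b) = gcd(b, a mod b)."""
        while b != 0:
            a, b = b, a % b
        return a

    print(gcd_subtraction(210, 84), gcd_euclid(210, 84))   # 42 42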


2.34.2 Puzzle from Die Hard


In the movie Die Hard 3, John McClane had to defuse a bomb by placing exactly 4 gallons of
water on a sensor. The problem is, he only had a 5-gallon jug and a 3-gallon jug on hand. This
problem may seem impossible without a measuring cup. But McClane solved it (and just in
time) and I am sure you can too. What is interesting is how mathematicians solve it.
First, mathematicians consider a general problem, not just the specific one that McClane solved.
This is because they're lazy and do not want to be in McClane's position more than once.
Second, in addition to solving the problem, they also wonder when the problem is solvable.
Knowing how to answer this question will save them time when they have to solve this problem:
"With only a 2-gallon jug and a 4-gallon jug, how do we get one gallon of water?"
It is interesting that the solution to this problem lies in the Euclidean algorithm. Take
for example the problem of finding gcd(34, 19); using the Euclidean algorithm we do:

34 = 19(1) + 15 ;  gcd(19, 15)
19 = 15(1) + 4  ;  gcd(15, 4)
15 = 4(3) + 3   ;  gcd(4, 3)                      (2.34.1)
 4 = 3(1) + 1   ;  gcd(3, 1)
 3 = 1(3) + 0   ;  gcd(1, 0)
Thus, gcd(34, 19) = 1: 34 and 19 are called coprime, or relatively prime. Now we go backwards.
Starting from the second-to-last equation, whose non-zero remainder 1 is the gcd of 34
and 19, we express 1 in terms of 3 (the remainder of the previous step), then we do the same thing
for 3. The steps are

1 = 4 − (1)(3)
  = 4 − (1)[15 − 4(3)] = (4)(4) − (1)(15)            (replaced 3 using the 3rd eq. in Eq. (2.34.1))
  = (4)[19 − (15)(1)] − (1)(15) = 4(19) − (5)(15)    (replaced 4 using the 2nd eq. in Eq. (2.34.1))
  = 4(19) − (5)[34 − 19(1)] = −5(34) + 9(19)         (replaced 15 using the 1st eq. in Eq. (2.34.1))
What did we achieve after all of this boring arithmetic? We have expressed gcd(34, 19), which
is 1, as (−5)(34) + (9)(19). This is known as Bézout's identity: gcd(a, b) = ax + by for some
x, y ∈ Z. In English, the gcd of two integers a, b can be written as an integral linear
combination of a and b. (A linear combination of a and b is just a nice name for a sum of
multiples of a and multiples of b.)
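The back-substitution above can also be mechanized; the result is known as the extended Euclidean algorithm. The sketch below (Python, my own illustration) returns gcd(a, b) together with Bézout coefficients x, y, and reproduces the numbers found by hand for 34 and 19.

    def extended_gcd(a, b):
        """Return (g, x, y) with g = gcd(a, b) and g = a*x + b*y."""
        if b == 0:
            return a, 1, 0
        g, x1, y1 = extended_gcd(b, a % b)
        # g = b*x1 + (a mod b)*y1 = a*y1 + b*(x1 - (a//b)*y1)
        return g, y1, x1 - (a // b) * y1

    g, x, y = extended_gcd(34, 19)
    print(g, x, y, 34 * x + 19 * y)   # 1 -5 9 1, i.e. 1 = (-5)(34) + (9)(19)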
How does this identity help us solve McClane's problem? Let a = 5 (the 5-gallon jug) and
b = 3; then gcd(5, 3) = 1. Bézout's identity tells us that we can always write 1 = 5x + 3y,
which gives us 4 = 5x′ + 3y′ (we need 4 as the problem asked for 4 gallons of water, and
if you're wondering what x′ is, it is 4x). It is easy to see that one solution of the equation
4 = 5x′ + 3y′ is x′ = 2 and y′ = −2: 4 = 5(2) + 3(−2). This indicates that we need to
fill the 5-gallon jug twice and drain out (subtraction!) the 3-gallon jug twice. That's the rule for
solving the puzzleŽŽ.
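To check the recipe, we can simulate the pouring. The short sketch below (Python; the specific sequence of moves is my own reading of the "fill the 5-gallon jug twice, drain the 3-gallon jug twice" rule, not a transcript of the movie) tracks how much water is in each jug:

    def pour(src, dst, cap_dst):
        """Pour from src into dst until dst is full or src is empty."""
        amount = min(src, cap_dst - dst)
        return src - amount, dst + amount

    five, three = 0, 0
    five = 5                              # fill the 5-gallon jug (1st fill)
    five, three = pour(five, three, 3)    # 5-jug: 2, 3-jug: 3
    three = 0                             # empty the 3-gallon jug
    five, three = pour(five, three, 3)    # 5-jug: 0, 3-jug: 2
    five = 5                              # fill the 5-gallon jug (2nd fill)
    five, three = pour(five, three, 3)    # 5-jug: 4, 3-jug: 3
    print(five)                           # 4 -- exactly what the sensor wants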
Now it is time for the problem "With only a 2-gallon jug and a 4-gallon jug, how do we get one
gallon of water?". Here a = 4 and b = 2, so gcd(4, 2) = 2. Bézout's identity tells us
ŽŽ
Details can be seen in the movie or on YouTube.


that 2 = 4x + 2y (one solution is (1, −1)). But the problem asked for one gallon of water, so we
need to find x′ and y′ such that 1 = 4x′ + 2y′. After spending quite some time without success
trying to find x′ and y′, we come to the conjecture that 1 cannot be written as 4x′ + 2y′. And
this is true, because the smallest positive integer that can be so written is gcd(4, 2), which
is 2Ž.

2.35 Review
We have done lots of things in this chapter. It’s time to sit back and think deeply about what we
have done. We shall use a technique from Richard Feynman to review a topic. In his famous
lectures on physics [22], he wrote (emphasis is mine)

If, in some cataclysm, all of scientific knowledge were to be destroyed, and only
one sentence passed on to the next generations of creatures, what statement would
contain the most information in the fewest words? I believe it is the atomic hypoth-
esis (or the atomic fact, or whatever you wish to call it) that all things are made
of atoms—little particles that move around in perpetual motion, attracting each
other when they are a little distance apart, but repelling upon being squeezed into
one another. In that one sentence, you will see, there is an enormous amount of
information about the world, if just a little imagination and thinking are applied.

I emphasize that Feynman's review technique is a very efficient way to review any
topic and gain a good understanding of it (and is thus useful for exam review). Only a few key pieces of
information need to be learned by heart; the rest should follow naturally as consequences. This avoids
rote memorization, which is time consuming and not effective.
I had planned to do a review of algebra starting with just one piece of knowledge, but I soon
realized that this is not easy. So I gave up. Instead, I provide some observations (or reflections) on
what we have done in this chapter (more precisely, on what mathematicians have done on the topics
covered here):

 By observing objects in our physical world and deducing their patterns, mathematicians
develop mathematical objects (e.g. numbers, shapes, functions) which are abstract (we
cannot touch them).

 Even though mathematical objects are defined by humans, their properties are beyond us.
We cannot impose any property on them; what we can do is just discover them.

 Quite often, mathematical objects appear in many forms. For example, consider 1:
it can be written as 1², 1³ or sin²x + cos²x, etc. Using the right form usually opens the way to
something. (Note that we humans have many faces too.)
Ž
Note that d D gcd.a; b/ divides ax C by. If c D ax 0 C by 0 then d jc, or c D d n  d . Thus d is the smallest
positive integer which can be written as ax C by.

Phu Nguyen, Monash University © Draft version


Chapter 2. Algebra 231

 Things usually go in pairs: boys/girls, men/women, right/wrong etc. They are opposite of
each other. In mathematics, we have the same: even/odd numbers, addition/subtraction,
multiplication/division, exponential/logarithm, and you will see differentiation/integration
in calculus.

 Mathematicians love doing generalization. They first have arithmetic for numbers, then
they have arithmetic for functions, for vectors, for matrices. They have two dimensional
and three dimensional vectors (e.g. a force), and then soon they develop n-dimensional vec-
tors where n can be any positive integer! Physicists only consider a 20-dimensional space.
But the boldest generalization we have seen in this chapter was when mathematicians
extended the square root of positive numbers to that of negative numbers.

 From a practical point of view all real numbers are rational. The distinction between
rational and irrational numbers is only of value to mathematics itself. Our measurements
always yield a terminating decimal, e.g. 3.1456789, which is a rational number.

Is this the only kind of algebra? No, no, no. Later on we shall meet vectors, and we
have vector algebra and its generalization, linear algebra. We shall also meet matrices, and we
have matrix algebra. We have tensors, and we have tensor algebra. Still the list goes on: we have
abstract algebra and geometric algebra.



Chapter 3
Geometry and trigonometry

Contents
3.1 Euclidean geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
3.2 Area of curved figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
3.3 Trigonometric functions: right triangles . . . . . . . . . . . . . . . . . . 276
3.4 Trigonometric functions: unit circle . . . . . . . . . . . . . . . . . . . . . 278
3.5 Degree versus radian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
3.6 Some first properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280
3.7 Sine table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282
3.8 Trigonometry identities . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
3.9 Inverse trigonometric functions . . . . . . . . . . . . . . . . . . . . . . . 293
3.10 Inverse trigonometric identities . . . . . . . . . . . . . . . . . . . . . . . 293
3.11 Trigonometry inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . 296
3.12 Trigonometry equations . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
3.13 Generalized Pythagoras theorem . . . . . . . . . . . . . . . . . . . . . . 305
3.14 Graph of trigonometry functions . . . . . . . . . . . . . . . . . . . . . . 306
3.15 Hyperbolic functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
3.16 Applications of trigonometry . . . . . . . . . . . . . . . . . . . . . . . . . 315
3.17 Infinite series for sine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 318
3.18 Unusual trigonometric identities . . . . . . . . . . . . . . . . . . . . . . . 320
3.19 Spherical trigonometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324
3.20 Analytic geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
3.21 Solving polynomial equations algebraically . . . . . . . . . . . . . . . . . 334
3.22 Non-Euclidean geometries . . . . . . . . . . . . . . . . . . . . . . . . . . 338


3.23 Computer algebra systems . . . . . . . . . . . . . . . . . . . . . . . . . . 338


3.24 Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339

Geometry (meaning "earth measurement") is one of the oldest branches of mathematics. It
is concerned with properties of space related to distance, shape, size, and relative
position of figures. A mathematician who works in the field of geometry is called a geometer.
The first known application of geometry was surveying in Egypt, where annual flooding of the
Nile eradicated the demarcations between fields. Geometry provided the survival tools needed
to accurately and efficiently redivide the land into individual holdings after the floods.
Trigonometry (from Greek trigōnon, "triangle" and metron, "measure") is a branch of mathe-
matics that studies relationships between side lengths and angles of triangles. The field emerged
during the 3rd century BC, from applications of geometry to astronomical studies. This is now
known as spherical trigonometry as it deals with the study of curved triangles, those triangles
drawn on the surface of a sphere. Later, another kind of trigonometry was developed to solve
problems in various fields such as surveying, physics, engineering, and architecture. This field
is called plane trigonometry or simply trigonometry. And it is this trigonometry that is the main
subject of this chapter.
In learning trigonometry in high school a student often gets confused by the following
facts. First, trigonometric functions are defined using a right triangle (e.g. sine is the ratio of
the opposite side and the hypotenuse). Second, trigonometric functions are later redefined using a
unit circle. Third, the measure of angles is suddenly switched from degrees to radians without a
clear explanation. Fourth, there are too many trigonometric identities. And fifth, why do we have
to spend time studying these triangles at all? In this chapter we try to make these issues clear.
My presentation of trigonometry does not follow its historical development. I nevertheless
provide some historical perspective on the subject.
We start with Euclidean geometry in Section 3.1. Then, Section 3.3 introduces the
trigonometry functions defined using a right triangle (e.g. sin x). Then, trigonometry func-
tions defined on a unit circle are discussed in Section 3.4. A presentation of degree versus
radian is given in Section 3.5. We discuss how to compute the sine for angles between 0 and
360 degrees in Section 3.7, without using a calculator of course. Trigonometry identities (e.g.
sin2 x C cos2 x D 1 for all x) are then presented in Section 3.8, and Section 3.9 outlines inverse
trigonometric functions e.g. arcsin x. Next, inverse trigonometry identities are treated in Sec-
tion 3.10. We devote Section 3.11 to trigonometry inequalities, a very interesting topic. Then
in Section 3.12 we present trigonometry equations and how to solve them. The generalized
Pythagorean theorem is treated in Section 3.13. Graphs of trigonometric functions are discussed
in Section 3.14. Hyperbolic functions are treated in Section 3.15. Some applications of trigonom-
etry are given in Section 3.16. A power series for the sine function, as discovered by ancient Indian
mathematicians, is presented in Section 3.17. With it, it is possible to compute the sine of any
angle. An interesting trigonometric identity of the form sin α + sin 2α + ⋯ + sin nα is treated
in Section 3.18.
In Section 3.19 I briefly introduce spherical trigonometry as this topic has been removed
from the high school curriculum. Analytic geometry in which coordinates and algebra are used to
describe geometry is treated in Section 3.20 and a brief introduction to non-Euclidean geometries


is given in Section 3.22. Finally, a brief introduction to CAS (computer algebra system) is given
in Section 3.23, so that students can get acquaintance early to this powerful tool.

3.1 Euclidean geometry


Where there is matter, there is geometry. (Johannes Kepler)

I did not enjoy Euclidean geometry in high school. I am not sure why, but it might be due
to the following reasons. First, it requires a compass and straightedge and everything has to be
perfect; in contrast, with just a pencil I can do algebra. Second, a geometry problem can feel
crowded: there are too many things (e.g. triangles, circles, points, ...) inside a
diagram and it's hard to see what is going on.
But this book would be incomplete without mentioning Euclidean geometry, especially Eu-
clid’s The Elements. Why? Because Euclid’s Elements has been referred to as the most success-
ful and influential textbook ever written. It has been estimated to be second only to the Bible
in the number of editions published since the first printing in 1482. Moreover, without a proper
introduction of Euclid’s geometry it would be awkward to talk about trigonometry–a branch of
mathematics which is based on geometry.
Euclid’s geometry, or Euclidean geometry, is a mathematical system attributed to Alexan-
drian Greek mathematician Euclid, which he described in his textbook The Elements. Written
about 300 B.C., it contains the results produced by fine mathematicians such as Thales, Hippias,
the Pythagoreans, Hippocrates, Eudoxus. The Elements begins with plane geometry: lines, cir-
cles, triangles and so on. These shapes are abstracts of the real geometries we observe in nature
(Fig. 3.1). It goes on to the solid geometry of three dimensions. Much of the Elements states
results of what are now called algebra and number theory, explained in geometrical language.

Figure 3.1: Geometry in nature: circle, rectangle and hexagon (from left to right). We also see circles in
the ripples on a pond, in the human eye, and on butterflies’ wings.

Euclidean geometry is an example of synthetic geometry, in that it proceeds logically from


axioms describing basic properties of geometric objects such as points and lines, to propositions
about those objects, all without the use of coordinates to specify those objects. This is in contrast
to analytic geometry–which was created by René Descartes and Pierre de Fermat in the 17th
century–which uses coordinates to translate geometric propositions into algebraic formulas. This
analytic geometry will be briefly discussed in Section 3.20.


This short section cannot present all the detail of Euclidean geometry. Instead only a few
important topics (the choice is somewhat personal) are presented. First, some basic concepts
are introduced (Section 3.1.1). Then, the structure of The Elements is treated in Section 3.1.2.
Next, the influence of Euclid’s The Elements is discussed (Section 3.1.3). The difference between
the algebraic thinking and geometric thinking is given in Section 3.1.4.
I go into detail in the remaining sections. The fifth postulate is studied in Section 3.1.5.
Area of simple geometrical shapes is studied in Section 3.1.6. The topic of congruence and
similarity is given in Section 3.1.7. Section 3.1.8 presents a discussion related to the question 'is
a square a rectangle'; Section 3.1.9 is devoted to interior and exterior angles in convex polygons.
Some important theorems about circles are presented in Section 3.1.10. This is followed by
Section 3.1.11, in which some topics on tangent lines to a circle are given.
Tessellation, the mathematics of tiling, is discussed in Section 3.1.14. Section 3.1.15 is about
regular polyhedra or Platonic solids. And Section 3.1.16 discusses Euler’s polyhedra formula.
This formula is not found in the standard curriculum taught in the schools. Some high school
students may know Euler’s formula, but most students of mathematics do not encounter this
relation until college.
I have read and learned from the following excellent books:

 Paul Lockhart’s Measurement [43].

 The Wonder Book of Geometry: A Mathematical Story by David Acheson [1].

 Geometry by Its History by Alexander Ostermann and Gerhard Wanner[3].

 Tales of Impossibility: The 2000-Year Quest to Solve the Mathematical Problems of Antiq-
uity by David Richeson [58].

 The Four Pillars of Geometry by John Stillwell [65].

If you want to read Euclid’s Elements, check this website outŽ . I still do find solving geometric
problems hard. But technology helps; using a colorful iPad really helps and software such as
geogebra is very useful.

3.1.1 Basic concepts


Euclid’s geometry operates with basic objects such as points, lines, triangles, and circles
(Fig. 3.2). Beauty is often found in regularity, symmetry, and perfection. We are all familiar
with the two-dimensional regular polygons. A polygon is regular if every side has the same
length and every interior angle has the same measureŽŽ . The equilateral triangle is the only regu-
lar three-sided polygon, the square is the only regular four-sided polygon, and so forth (Fig. 3.3).
Ž
Link: http://aleph0.clarku.edu/~djoyce/elements/bookI/bookI.html.
ŽŽ
Note that this definition is not needed for equilateral triangles: we can simply require that either the
three sides are congruent or the three angles are congruent. We cannot do that for n-gons with n ≥ 4. Think of a
square and push two opposite vertices a bit: it becomes a diamond whose four sides have the same length but
whose angles are not all equal. Because of this, triangles are rigid.


There are infinitely many regular polygons: one n-sided polygon for every integer n greater than
two. Triangles, rectangles, circles etc. are 2D shapes as they live on a plane.
(Figure panels: POINT, LINE/SEGMENT, TRIANGLE, CIRCLE)

Figure 3.2: Basic objects in Euclid’s geometry. Naming convention: Points are customarily named using
capital letters of the alphabet (e.g. A; B; C; D; : : :). Other figures, such as lines, triangles, or circles, are
named by listing a sufficient number of points to pick them out unambiguously from the relevant figure,
e.g. triangle ABC is a triangle with vertices at points A; B, and C . Angles can be identified by the three
points that define them. For example, the angle with vertex A formed by the rays AB and AC (that is, the
lines from point A to points B and C) is denoted by ∠BAC (or ∠CAB). The symbols α, β, γ, etc. are
often used to denote the size of angles, e.g. α = 30°. To a segment we assign a positive real number s to
quantify its length, and to a closed figure such as a triangle or circle we assign a positive real number, called
the area, that quantifies the space inside the figure.


Figure 3.3: Basic objects in Euclid's geometry: regular polygons with 3, 4, 5, 6 and 7 sides. For an n-gon
(i.e., an n-sided polygon) the interior angle α in degrees is α = 180°(1 − 2/n). For a proof, check
Section 3.1.9. For a regular polygon there always exists a circle that passes through all the vertices of the
polygon; this circle is called the circumscribed circle or circumcircle (shown for the n = 3 case as the
blue circle).

All polygons in Fig. 3.3 are special: they are called convex polygons. The line segment
between two points of a convex polygon is contained in the union of the interior and the boundary
of the polygon (Fig. 3.4). Compared with non-convex polygons, convex polygons are not only
more beautiful but also much easier to work with.
Euclidean geometry also considers 3D solids such as pyramids, cylinders, cones, spheres
(Fig. 3.5). It then studies the properties of these objects such as the length of a segment (i.e., a
part of a line), the area of a triangle/circle and the volume of a sphere/cone etc. This chapter,
however, does not discuss the volume problem. I postpone this topic to Section 4.3 in Chapter 4
about calculus. This is so because initial ideas of calculus originate from the way ancient math-
ematicians such as Archimedes calculated the area and volume of curved shapes (i.e., circle,
cylinder, sphere).
Similar to numbers (an abstract concept), the objects of geometry (e.g. points, lines) are also abstract.
For example, a point does not have a size. A line does not have a thickness, and a line in geometry


Figure 3.4: Convex polygons (left) versus non-convex polygons (right). The line segment between two
points of a convex polygon is contained in the union of the interior and the boundary of the polygon i.e.,
the shaded region.


Figure 3.5: Basic objects in Euclid’s geometry: (square-based) pyramid, cone and sphere.

is perfectly straight! And certainly mathematicians don’t care if a line is made of steel or wood.
There are no such things in the physical world.
What follows is a discussion of how Euclid organized his writing in the Elements.

3.1.2 The structure of Euclid’s Elements


No man can talk well unless he is able first of all to define to himself what he is
talking about. Euclid, well studied, would free the world of half its calamities, by
banishing half the nonsense which now deludes and curses it.
(Abraham Lincoln)

The structure of Euclid’s Elements is as follows:

1. Twenty three definitions of the basic concepts: point, line, triangle, circle etc. Euclid
defined a point as: A point is that which has no part (definition 1 in Book I). Definition 23
is about parallel lines: lines that never meet, think of straight railway tracks for example.

2. Five common notions and five postulates on which all subsequent reasoning is based. For
example, common notion 1 states that “Things equal to the same thing are equal to each
other” (which we now write as: if a = b and c = b, then a = c).

3. Using the above definitions, common notions and postulates, Euclid proceeded to prove
many theorems, which he called propositions. Many propositions are proved using other
propositions which have been proved earlier.


Thus, the Elements is a self-contained logically consistent system for geometry. Now, we
discuss the five postulates. Euclid’s five postulates are:

 Any two points can be joined by a straight line.

 Any straight line segment can be extended indefinitely in a straight line.

 Given any straight line segment, a circle can be drawn having the segment as radius and
one endpoint as center.

 All right angles are congruent (i.e., the same).

 If a straight line n falling on two straight lines l and m makes the interior angles on the
same side less than two right angles (i.e., α + β < 180°), then the two lines, if produced
indefinitely, meet on that side on which the angles are less than two right angles (Fig. 3.6).

The second postulate tells us that we can always make a line


segment longer. That means that we never run out of space; that
is, space is infinite. The third postulate allows for the existence of
circles of any size and center–say center A and radius AB. Pos-
tulates 1 and 3 set up the "straightedge and compass" framework
that was a standard for geometric constructions. A compass, more
accurately known as a pair of compasses, is a technical drawing
instrument that can be used for drawing circles or arcs. On the other hand, a straightedge is a
tool which can draw a perfectly straight line through any two points but which cannot be used
as a ruler to measure lengths. Why did Euclid need the fourth postulate, which basically says that
all right angles (90°) are equal? It indicates that space is homogeneous and isotropic (i.e., everywhere
in the plane right angles are the same).
It is clear that the fifth postulate is different from the other four. The first four were simple
assertions that few would be inclined to doubt. Far from being instantly self-evident, the fifth
postulate is even hard to read and understand (we need to draw a figure). It did not satisfy Euclid
and he tried to avoid its use as long as possible - in fact the first 28 propositions of The Elements
are proved without using it.

Figure 3.6: The fifth postulate of Euclidean geometry (a transversal n crossing two lines l and m, making interior angles α and β).

Once these postulates are accepted, a lot of facts known as theorems (Euclid called them
propositions) follow. To illustrate one such proposition, I present Proposition 1: On a given finite
straight line, construct an equilateral triangle. The purpose is to
show that Euclidean Geometry is constructive. Postulates 1, 2, 3, and 5 assert the existence and
uniqueness of certain geometric figures, and these assertions are of a constructive nature: that is,
we are not only told that certain things exist, but are also given methods for creating them with
no more than a compass and an unmarked straightedge.


This is how Euclid constructed an equilateral triangle:

 We start with the line segment AB, and we need to construct the triangle ABC such that
all sides are of equal length.

 The point is: where is C? As AC = AB, C is on the circle centered at A with radius AB.
So, we draw this circle (and the reason why we can do this is Postulate 3).

 In the same manner, BC = BA, so C is on the circle centered at B with radius BA. So, we
draw this circle.

 Finally, C is the intersection of these two circles.


It should be, however, mentioned that Euclid states no axiom about the intersection of circles,
so he has not justified the existence of the point C used in his very first proposition! There are
many such situations in Euclid’s Elements, in which Euclid assumes something is true because
it looks true in the diagram.
I now present yet another theorem and the characteristics of a geometry proof. Let’s take the
case of a triangle inside a semicircle. If we play with it long enough, we will see one remarkable
thing: no matter where on the circle we place the tip of the triangle, it always forms a nice
right triangle (Fig. 3.7a). But is it true? We need a proof. In the same figure, I present a proof
commonly given in high school geometry classes. This proof uses the fact that the sum of the angles
in a triangle is 180° or π; see Section 3.1.5 for a proof of this fact. A complete proof would be
more verbose than what I present here. Does a better (that is, more elegant) proof exist? See
Fig. 3.7b: ACBC′ is a rectangle and thus ABC is a right triangle! This fact is known as Thales'
theorem.

In Fig. 3.7 hatch marks are used to denote equal measures of angles, arcs, line segments, or
other elements. For example, angles OAC and ACO have the same hatch marks to indicate that
they are equal.

3.1.3 Influence of The Elements


The Elements is still considered a masterpiece in the application of logic to mathematics. It has
proven enormously influential in many areas of science. Many scientists, the likes of Nicolaus
Copernicus, Johannes Kepler, Galileo Galilei, Albert Einstein and Isaac Newton were all influ-
enced by the Elements. When Newton wrote his masterpiece Philosophiæ Naturalis Principia
Mathematica (Mathematical Principles of Natural Philosophy), he followed Euclid with defini-
tion, axioms and theorems in that order. Albert Einstein recalled a copy of the Elements and
a magnetic compass as two gifts that had a great influence on him as a boy, referring to The
Elements as the "holy little geometry book".
The austere beauty of Euclidean geometry has been seen by many in western culture as a
glimpse of an otherworldly system of perfection and certainty. Abraham Lincoln kept a copy
of Euclid in his saddlebag, and studied it late at night by lamplight; he related that he said to


(a) In triangle ABC: 2(α + β) = π.   (b) ACBC′ is a rectangle.

Figure 3.7: The angle inscribed in a semicircle is always a right angle (90°): two proofs. (a) Angles
with the same marking are equal, and thus we have 2(α + β) = π, or α + β = π/2. The key to the
proof is to draw the line OC, which does not initially exist in the problem. (b) Rotating ABC by 180°
about O we get a new triangle ABC′. Alternatively, draw OC (as in the first proof) but extend
it to meet the circle at C′ (to have symmetry). Now the box ACBC′ is a parallelogram. But it is a special
parallelogram: the two diagonals are equal (both are diameters of the circle). So ACBC′ must be a
rectangle. The second proof is considered better than the first because we can "see" the result.

himself, "You never can make a lawyer if you do not understand what demonstrate means; and
I left my situation in Springfield, went home to my father’s house, and stayed there till I could
give any proposition in the six books of Euclid at sight". Thomas Jefferson, a few years after he
finished his second term as president, wrote to his old friend John Adams on 1 January 1812:
"I have given up newspapers in exchange for Tacitus and Thucydides, for Newton and Euclid;
and I find myself much happier".

3.1.4 Algebraic vs geometric thinking


To emphasize the importance of geometry, we turn to the story of Paul Dirac
(1902 – 1984), an English theoretical physicist who is regarded as one of the
most significant physicists of the 20th century. In the thirteen hundred (or so)
pages of his published work, Dirac had no use at all for diagrams. He never
used them publicly for calculation. His books on general relativity and quantum
mechanics contained not a single figure.
Therefore it seems reasonable to assume that Dirac would consider himself
an algebraist. On the contrary, he wrote in longhand in his archives something
remarkable:

There are basically two kinds of mathematical thinking, algebraic and geometric.
A good mathematician needs to be a master of both. But still he will have a prefer-
ence for one rather or the other. I prefer the geometric method. Not mentioned in
published work because it is not easy to print diagrams. With the algebraic method
one deals with equations between algebraic quantities. Even tho I see the consis-
tency and logical connections of the equations, they do not mean very much to me.


I prefer the relationships which I can visualize in geometric terms. Of course with
complicated equations one may not be able to visualize the relationships e.g. it may
need too many dimensions. But with the simpler relationships one can often get help
in understanding them by geometric pictures.

One remarkable thing in Dirac's life is that he learned projective geometry early (in
secondary school at Bristol). He wrote: "This had a strange beauty and power which
fascinated me". Projective geometry provided Dirac with new insight into Euclidean space and into
special relativity.
Of course Dirac could not know that his early exposure to projective geometry would be vital
to his future career in physics. We simply can't connect the dots looking forward, as Steve Jobs
(February 24, 1955 – October 5, 2011), the Apple co-founder, once said in his famous 2005
commencement speech at Stanford University:

You can’t connect the dots looking forward; you can only connect them looking
backwards. So you have to trust that the dots will somehow connect in your future.
You have to trust in something — your gut, destiny, life, karma, whatever. This
approach has never let me down, and it has made all the difference in my life.

History note 3.1: Euclid (fl. 300 BC)


Euclid (fl. 300 BC) was a Greek mathematician, often referred to as the
"founder of geometry" or the "father of geometry". His Elements is one
of the most influential works in the history of mathematics, serving as
the main textbook for teaching mathematics (especially geometry) from
the time of its publication until the late 19th or early 20th century. In
the Elements, he deduced the theorems of what is now called Euclidean
geometry from a small set of axioms. Euclid also wrote works on perspec-
tive, conic sections, spherical geometry, number theory, and mathematical
rigor.

3.1.5 The fifth postulate and consequences


Figure 3.8a illustrates the situation described by the fifth postulate, which is what happens when
the two lines are not parallel. So, the fifth postulate says: if α + β < 180° then l intersects m on
the right. The negation of this postulate is also true. Therefore, it follows that when α + β = 180°,
l does not meet m; that is, l is parallel to m (we express this symbolically as l ∥ m). So, if
α + β = 180° then l ∥ m. By introducing alternate interior angles, which are angles that occur
on opposite sides of the transversal line n (Fig. 3.8b), we hence also have: if alternate angles are
equal then l ∥ m. This is Proposition I.27 in the Elements.
So, if α + β = 180° then l ∥ m. How about the converse? Is it true that if l ∥ m then
α + β = 180°? This is true and is Proposition I.29. How do we prove it? Proof by contradiction
is the way to go. Assume that α + β < 180°; then by the fifth postulate the lines will meet,


Figure 3.8: For two parallel lines l and m the alternate angles α (or β) are equal. In the left figure the two
angles α are equal and are called vertical angles.

which contradicts our assumption that they are parallel. So, α + β = 180°. Do we need
to consider α + β > 180°? No.

Playfair’s Parallel postulate. Euclid’s fifth postulate is wordy. The Scottish mathematician
John Playfair rephrased it in 1795 as: Given a line and a point not on that line, there exists only
one parallel line to the original line going through that point. It now bears the name the Parallel
Postulate. We need to make sure that Playfair’s Parallel postulate and the fifth postulate are
equivalent. To this end, we have to prove two things: (1) from Playfair's Parallel Postulate we
can get the fifth postulate, and (2) vice versa. I omit these proofs for brevity.

Sum of angles in a triangle is 180°. I now present a beautiful property
of triangles: the angle sum of a triangle is 180°. I emphasize that
this result is a consequence of the parallel postulate. If α, β and γ are
the angles of any triangle, then α + β + γ = 180°. To prove this result,
draw a line l through C which is parallel to AB. Then the angle on the
left beneath l is alternate to the angle α in the triangle, so it is equal
to α. Similarly, the angle on the right beneath l is equal to β. Then,
considering the angles at C, we have α + β + γ = 180°.
3.1.6 Area of simple geometries


Similar to using a ruler to measure the length of one dimensional objects (e.g. a stick), to
determine the area of a shape, we also need something to compare with. And the conventional
choice is the amount of space of a square of unit sides (this square is called a unit square), see
Fig. 3.9. So the measurement of area really boils down to the question, how much room is any
shape taking up compared to a unit square?
Some areas are relatively easy to measure. And obviously the easiest is that of a rectangle,
as it can be tiled by unit squares. For example, suppose we have a 5 by 3 rectangle; we can chop
it into 15 unit squares. Thus, its area is 15 (Fig. 3.9). So, if the sides of a rectangle are nice whole
numbers, the area is then the product of the length of the two sides. But what if the sides are not
whole numbers? Then use a smaller unit square. The area is still the product of the length of the
two sides.



Figure 3.9: Area of 2D shapes: an area unit is the amount of space taken up by a unit square.

Next comes the triangles. Suppose we have a triangle with base b and height h (see
Fig. 3.10a). What is its area? The way to get the triangle’s area is to use the area of a rect-
angle. That is, using the known to determine the unknown. We put the triangle inside a rectangle
with sides b and h. Then, by dropping a line CH perpendicular to AB, we see that the triangle’s
area is half that of the rectangle. Therefore, we get the formula (1/2)bh.


Figure 3.10: The area of a triangle is related to the area of the bounding rectangle.

Nice. But how about a slanting triangle such as the orange triangle in Fig. 3.10b? In this case
point C lies outside the bounding rectangle, so might the above formula not work? Again, we
use a rectangle, whose area is (a + b)h. This area is equal to the sum of the areas of the two
right triangles and the orange triangle. From that, we can see that the area of the triangle is still
(1/2)bh.

Next come polygons. The area of a polygon is the sum of the areas of all the sub-triangles making up the polygon. We can see from this that ancient mathematicians computed the area of new, more complex geometries based on the known areas of older, simpler geometries. See Fig. 3.11 for one example.

Heron's formula. What is the area of a triangle in terms of its sides a, b, c? The formula is credited to Heron (or Hero) of Alexandria, and a proof can be found in his book, Metrica, written c. CE 60. It has been suggested that Archimedes knew the formula over two centuries earlier. I now present a derivation of this formula using the Pythagorean theorem.
First, the area is computed using the familiar formula "half of the base multiplied by the height": A = (1/2)ah. Second, the height is expressed in terms of a, b, c by referring to the figure, in which the altitude h from A divides the base a into two segments x and y.


Figure 3.11: Area of parallelogram and trapezoid. A parallelogram is a quadrilateral with two pairs of parallel sides. The opposite or facing sides of a parallelogram are of equal length. A quadrilateral with at least one pair of parallel sides is called a trapezoid. We find the area of a parallelogram by dividing it into two congruent triangles (shaded in the left diagram); the area of one such triangle is 0.5bh.

There are three equations to determine x, y, h:

x + y = a
x² + h² = c²        ⟹   x = a/2 - (b² - c²)/(2a),   y = a/2 + (b² - c²)/(2a),   h² = c² - x²
y² + h² = b²

As we have h², let's compute the square of the area:

4A² = a²(c² - x²) = a²(c - x)(c + x) = a²·(c - a/2 + (b² - c²)/(2a))·(c + a/2 - (b² - c²)/(2a))

4A² = a²·[(2ac - a² + b² - c²)/(2a)]·[(2ac + a² - b² + c²)/(2a)]

16A² = [b² - (a - c)²]·[(a + c)² - b²] = (b + a - c)(b - a + c)(a + c + b)(a + c - b)

If we introduce s = 0.5(a + b + c), the semi-perimeter of the triangle, Heron's formula is given by

A = √(s(s - a)(s - b)(s - c))        (3.1.1)

The final expression of A is symmetrical with respect to a, b, c and it has the correct dimension (the square root of length to the power 4 is length squared, i.e., an area). Thus, it seems correct (if it were A = √(s(s - 2a)(s - b)(s - c)) or A = √(s(s - a)²(s - b)(s - c)), then it would definitely be wrong). How do we know that it is correct? Check it for a triangle whose area we know for sure. Note that using the generalized Pythagorean theorem gives a shorter/easier proof.
What can we do with Heron's formula? We can use it to compute the area of a triangle given the sides a, b, c, of course. The power of symbolic algebra is that we can deduce new information from Eq. (3.1.1). We can pose this question: among all triangles of the same perimeter, which triangle has the maximum area? Using the AM-GM inequality (Section 2.20), it's straightforward to show that an equilateral triangle (i.e., a triangle with three equal sides a = b = c) has the maximum area.
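As a quick numerical companion to Eq. (3.1.1) (this snippet is my own illustration in Python, not part of Heron's derivation), we can check the formula on a triangle whose area we know for sure, and compare triangles of equal perimeter:

import math

def heron_area(a, b, c):
    """Area of a triangle with sides a, b, c via Heron's formula, Eq. (3.1.1)."""
    s = 0.5 * (a + b + c)
    return math.sqrt(s * (s - a) * (s - b) * (s - c))

# A triangle whose area we know for sure: the 3-4-5 right triangle has area 6.
print(heron_area(3, 4, 5))    # 6.0

# Among triangles with the same perimeter (here 12), the equilateral one wins.
print(heron_area(4, 4, 4))    # about 6.93
print(heron_area(5, 4, 3))    # 6.0
print(heron_area(5, 5, 2))    # about 4.90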


Generalization to quadrilaterals. One of the most beautiful things about Heron's formula is the generalization discovered by the Hindu mathematician Brahmagupta around 620 AD. First, we re-write Heron's formula as below

A = (1/4)·√((a + b + c)(b + c - a)(c + a - b)(a + b - c))

Note that starting from the term a + b + c we can get the next term by cycling through a, b, c as shown in the figure. The formula is, however, not symmetrical: in the term a + b + c there is no minus sign! Now comes Brahmagupta: he added d = 0 (which is nothing) to the above equation in this form:

A = (1/4)·√((a + b + c - d)(b + c + d - a)(c + d + a - b)(d + a + b - c))

This equation is fair (or symmetrical) to all quantities involved, i.e., a, b, c, d. This beautiful formula then must have a meaning, Brahmagupta argued. And indeed, it is the area of a quadrilateral of sides a, b, c, d inscribed in a circle (such a quadrilateral is called a cyclic quadrilateral).
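A small numerical sanity check (my own sketch in Python, assuming the quadrilateral really is cyclic): a square of side 2 is cyclic and has area 4, and setting d = 0 should recover Heron's formula.

import math

def brahmagupta_area(a, b, c, d):
    """Area of a cyclic quadrilateral with sides a, b, c, d."""
    return 0.25 * math.sqrt((a + b + c - d) * (b + c + d - a)
                            * (c + d + a - b) * (d + a + b - c))

print(brahmagupta_area(2, 2, 2, 2))   # 4.0, the area of a square of side 2
print(brahmagupta_area(3, 4, 5, 0))   # 6.0, Heron's value for the 3-4-5 triangle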
The following joke describes well the principle of using the known to determine the unknown:

A physicist and a mathematician are sitting in a faculty lounge. Suddenly, the coffee machine catches on fire. The physicist grabs a bucket, leaps towards the sink, fills the bucket with water and puts out the fire. The next day, the same two sit in the same lounge. Again, the coffee machine catches on fire. This time, the mathematician stands up, gets a bucket, hands the bucket to the physicist, thus reducing the problem to a previously solved one.

3.1.7 Congruence and similarity


Let's make two identical triangles out of paper. Now, separate them and rotate one of them. What we obtain is similar to the figure shown here. Obviously, these two triangles are of equal shape and size. Mathematicians call them congruent triangles. Euclid says that two geometric figures are congruent (he used the word coincide) when one of them can be moved to fit exactly on the other.
The next thing mathematicians have to do is to develop criteria to determine whether two triangles are congruent or not. They do not want to build physical triangles and move them around to check this. That's not mathematics!
To say two triangles ABC and A′B′C′ are congruent, we write △ABC ≅ △A′B′C′. Note that we always write the vertices of the two congruent triangles so that matched vertices and sides can be read off in the natural way. So, △ABC ≅ △A′B′C′ means that

AB = A′B′,  BC = B′C′,  CA = C′A′,  ∠A = ∠A′,  ∠B = ∠B′,  ∠C = ∠C′


Congruence tests. The three congruence tests for triangles are SAS (side-angle-side), ASA (angle-side-angle) and SSS (side-side-side), as shown in Fig. 3.12. Instead of proving them we consider them as axioms. Even though Euclid proved SAS as a theorem, his proof relied on motion, which is not easy to describe precisely in elementary mathematics. That's why we accept these tests on faith.

Figure 3.12: Congruence tests for triangles. To use the SAS, check two sides of the two triangles and the
angles included.

Using congruence we can deduce many interesting facts. For example, in an isosceles
triangle, which is a triangle that has two sides of equal length, the angles opposite to these sides
are equal. Another result: opposite sides of a parallelogram are equal. The proof always goes
like this: find two congruent triangles, for example using the SAS test, then the angles not used
in the test are equal. The difficult step is to make the triangles show up!

The Thales theorem or the Basic Proportionality Theorem (B.P.T.). Consider △ABC and a segment PQ that cuts AB and AC in such a way that PQ ∥ BC. Thales said that PQ cuts AB and AC proportionally: |AP|/|PB| = |AQ|/|QC|. The notation |AB| denotes the length of AB; in high school math textbooks it is simply written as AB, which is not rigorous. For the proof, we need one elementary result about the area of a triangle: the area is one half the product of the base and the height. To prove the Thales theorem, draw PC and BQ. The two triangles PBQ and QPC have the same area (as they have the same base PQ and the same height, which is the distance between PQ and BC). Next, we consider triangles APQ and PBQ; their areas are

area APQ = (1/2)·h·|AP|,    area PBQ = (1/2)·h·|PB|


where h is the distance from Q to AB. From that we get:

area APQ / area PBQ = |AP| / |PB|        (3.1.2)

In the same manner, if we consider triangles APQ and PQC, we obtain

area APQ / area PQC = |AQ| / |QC|        (3.1.3)

From Eqs. (3.1.2) and (3.1.3), and noting that the two triangles PBQ and QPC have the same area, we get what we wanted to prove. A bit of algebra gives us the equivalent result: |AP|/|AB| = |AQ|/|AC|.
The Thales theorem says: a parallel line implies proportional sides. The converse would be: proportional sides imply a parallel line. Is this converse true? Yes, it is. The problem is: given |AP|/|PB| = |AQ|/|QC|, prove that PQ ∥ BC. Introduce a line PE, with E on AC, that is parallel to BC. The aim is to show that E is nothing but Q. Since PE ∥ BC, we have (due to Thales' theorem):

|AP| / |PB| = |AE| / |EC|

But |AP|/|PB| = |AQ|/|QC| (given), thus

|AQ|/|QC| = |AE|/|EC|  ⟺  |AQ|/|QC| + 1 = |AE|/|EC| + 1  ⟺  |AC|/|QC| = |AC|/|EC|  ⟺  |QC| = |EC|

Since Q and E both lie on the segment AC, this means E coincides with Q, and hence PQ = PE ∥ BC.
Similarity of triangles. If we look at the triangles ABC and APQ in the proof of the Thales theorem, we see that they have equal angles (Fig. 3.13a). This is so because PQ ∥ BC. These two triangles are of the same shape but not of the same size. They are said to be similar, denoted by △ABC ∼ △APQ. From the Thales theorem, we have |AP|/|AB| = |AQ|/|AC| = k. So, two similar triangles have proportional sides (at least for two pairs of sides). But all three ratios should be equal, so we guess that |PQ|/|BC| is also k (Fig. 3.13b).


Figure 3.13: Similar triangles have equal angles, or their sides are proportional.

To conclude, two triangles are similar if they have equal angles, or their sides are proportional (i.e., if one triangle has sides a, b and c, the other has sides ka, kb and kc, where k > 0 is called the scale factor). When k = 1, the two triangles are of course congruent. Thus, congruence is a special case of similarity. What about the areas of two similar triangles? The ratio of the areas is k². One easy proofŽŽ is to use Heron's formula A = √(s(s - a)(s - b)(s - c)), with s = 0.5(a + b + c), see Eq. (3.1.1).
Because congruence is a special case of similarity, there are also three similarity tests: SSS, SAS (where the sides are proportional), and AA (as the angle sum is 180°, if two angles are equal then the remaining ones must also be equal).

Revisiting the Pythagorean theorem. I now present a second proof of the famous Pythagorean theorem. This proof uses similar triangles. Let's consider the right triangle ABC with legs a, b and hypotenuse c, the right angle being at C. From C draw CH perpendicular to AB, and let AH = x (thus BH = c - x). This CH divides our triangle into two smaller right triangles, which are similar to △ABC (AA test). Thus, we have

△ABC ∼ △CBH  ⟹  a/c = (c - x)/a  ⟹  a² = c(c - x)
△ABC ∼ △ACH  ⟹  b/c = x/b        ⟹  b² = cx

Adding the two results gives a² + b² = c(c - x) + cx = c².

3.1.8 Is a square a rectangle?


A quadrilateral is a four-sided polygon, having four edges (sides) and four corners (vertices). The word is derived from the Latin words quadri, a variant of four, and latus, meaning "side". A parallelogram is a simple (non-self-intersecting) quadrilateral with two pairs of parallel sides. A rectangle is a quadrilateral with four right angles. A square is a rectangle with two equal-length adjacent sides; thus, a square is a regular quadrilateral, which means that it has four equal sides and four equal angles. So the answer is yes: a square is a rectangle, and the figure shows the nesting quadrilateral ⊃ parallelogram ⊃ rectangle ⊃ square.

3.1.9 Angles in convex polygons


An interior angle of a polygon is formed by two sides of the polygon that share an endpoint. It is a well known fact that the sum of the three interior angles of a triangle is 180°. Referring to Fig. 3.14, this fact is written as a1 + a2 + a3 = 180°. Now we move to rectangles: it is obvious that a1 + a2 + a3 + a4 = 360°, because each angle is a right angle (90°). From this, we guess
ŽŽ Yet another proof is: consider a parallelogram of sides a, b; its area is ab. Now, scale it up by a factor k. The area of the scaled parallelogram is k²ab. Relating the area of a triangle to that of a parallelogram will give us the result.


that for an n-sided convex polygon, the sum of the angles is 180°·(n - 2)Ž. At least it is true for triangles (n = 3, angle sum is (3 - 2)·180° = 180°) and rectangles (n = 4).


Figure 3.14: Sum of interior angles in a convex polygon of n sides is 180°·(n - 2)?

To prove what we have guessed we need to introduce another concept: exterior angles. An exterior angle (also called an external angle or turning angle) is an angle formed by one side of a polygon and a line extended from an adjacent side (Fig. 3.15). If we start at P and travel counter-clockwise around the triangle one lap we will get back to P. In doing so, we make one left turn at C of 180° - a3, one left turn at A of 180° - a1 and one left turn at B of 180° - a2. In total we have made a full turn of (180° - a1) + (180° - a2) + (180° - a3) = 3(180°) - (a1 + a2 + a3). One full turn is 360°; if you are not convinced, consider the special case of an equilateral triangle with a1 = a2 = a3 = 60°: then it is obvious that the total of the exterior angles is 120° + 120° + 120° = 360°.


Figure 3.15: Exterior angles in a polygon.

The same analysis for rectangles yields the same result. So, we guess that

(180° - a1) + (180° - a2) + ⋯ + (180° - an) = 360°        (3.1.4)

If this is correct (and it is, as we shall prove), then we obtain the so-called exterior angle theorem, which states that the sum of the exterior angles of a polygon is 360°. From this theorem, we get the sum of the interior angles:

n·180° - (a1 + a2 + ⋯ + an) = 2·180°  ⟹  a1 + a2 + ⋯ + an = (n - 2)·180°


Ž This is known as the interior angles theorem.

There exists another proof that does not use exterior angles. The idea is to decompose a polygon into triangles and use the fact that the sum of the angles of a triangle is 180°. Assume that we have an n-sided polygon A1A2...An. From vertex A1 we draw the diagonals A1A3, A1A4, ..., A1A(n-1). There are n - 3 such diagonals, which make n - 2 triangles. Draw a figure to see all of this.


For a regular n-sided polygon, we have a1 = a2 = ... = an = α, and the above equation then gives us

interior angle of a regular n-sided polygon:  α = 180°·(1 - 2/n)        (3.1.5)

What can we read from this formula? First, all interior angles in regular polygons are smaller than 180° (because they are 180° multiplied by a factor smaller than one, i.e., 1 - 2/n). Second, the more edges a regular polygon has, the bigger its interior anglesŽ.
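A two-line computation (a Python sketch of my own, just to make Eq. (3.1.5) concrete) confirms both observations:

# Interior angle of a regular n-gon, Eq. (3.1.5): alpha = 180(1 - 2/n) degrees.
for n in range(3, 9):
    print(n, 180 * (1 - 2 / n))
# 3 60.0, 4 90.0, 5 108.0, 6 120.0, 7 128.57..., 8 135.0:
# always below 180 degrees, and increasing with n.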

Proof of the exterior angle theorem by George Polya. At each corner (of the polygon), draw two
line segments pointing outward, one line segment perpendicular to each side (Fig. 3.16). Draw a
sector of a unit circle at each vertex with these segments as sides. Now observe that the angle
made by these two segments is precisely the exterior angle. Because the sides of each pair of
adjacent sectors are parallel, we can reassemble the sectors to form a circle. So, the sum of the
exterior angles is 360°.


Figure 3.16: Exterior angles theorem: Polya’s proof.

3.1.10 Circle theorems


The Central Angle Theorem states that the central angle from two chosen points A and B on a
circle is always twice the inscribed angle from those two points (Fig. 3.17). From that we get
two immediate results shown in the middle (inscribed angles subtended by the same arc are
equal) and right diagram in the referred figure.

Thales's theorem and its converse. The right diagram in Fig. 3.17 shows the Thales theorem: if AB is the diameter of a circle and C is a point (any point) on that circle, then the angle ACB is a right angle. As always in mathematics, we should ask the question: is the converse of the Thales theorem true? That is, given the diameter AB of a circle, and given that the angle ACB is 90°, is C on the circle or not?
It turns out that the converse of the Thales theorem is also true. And Euclid presented a proof by contradiction (Fig. 3.18). Assume that ∠ACB is 90° but C is not on the circle. If we can deduce something absurd from this, then C must be on the circle. That's the gist of a proof
Ž Compare equilateral triangles with angles of 60° and squares with angles of 90°.



Figure 3.17: Central angle theorem (left figure): the central angle from two chosen points A and B on the circle is always twice the inscribed angle from those two points. The inscribed angle can be defined by any point along the outer arc AB and the two points A and B. And when AB is the diameter of the circle, i.e., 2α = 180°, we have ∠C = 90°. How to prove the central angle theorem? Simply draw the line OC, and we have two isosceles triangles OAC and OCB.


Figure 3.18: Proof by contradiction for the converse of the Thales theorem: C is inside the circle case.

by contradiction. There are two cases: C is inside the circle, and C is outside the circle. Let's consider the former case first. Extend AC so that it cuts the circle at C′. The Thales theorem then tells us that ∠AC′B = 90°. Thus we have BC and BC′, both perpendicular to the line AC, so they are parallel. However, these two parallel lines meet at B! A contradiction: so C cannot be inside the circle. I leave the case where C is outside the circle to you as an exercise.

Geometric mean theorem or the right triangle altitude theorem. Given two positive numbers a and b we have met the geometric mean √(ab) in Section 2.21.3. Herein, we see this concept in relation to right triangles. Consider a right triangle ABC with ∠C = 90°. (Any right triangle is associated with a circle due to Thales' theorem, so I intentionally drew a circle, but it is not needed for the theorem.) From C, draw CH ⊥ AB: CH is called the altitude on the hypotenuse AB. The triangle HAC is similar to the triangle HCB (AA test), so we have HC/HB = HA/HC. Thus,

HC² = HA · HB  ⟹  HC = √(HA · HB)        (3.1.6)

It is HC² = HA · HB that allows us to square a rectangle with ruler and compass, that is, to construct a square of equal area to a given rectangle. On the other hand, HC = √(HA · HB) allows us to construct the square root of a > 0 using straightedge and compass: take HA = a and HB = 1 on a line, draw the circle with diameter AB, and erect at H the perpendicular meeting that circle at C; then HC = √a.


Four points on a circle. Consider a circle and four points A, B, C and D on it. We then have a cyclic quadrilateral ABCD. (A cyclicŽ quadrilateral or inscribed quadrilateral is a quadrilateral whose vertices all lie on a circle.) It is interesting that in this quadrilateral opposite angles add up to 180° (Fig. 3.19a)ŽŽ. To prove this, we need to make triangles (we know more about triangles than about quadrilaterals). So, we draw the diagonals AC and BD (Fig. 3.19b). In △ACD we have α1 + α2 + ∠D = 180°. But ∠ABD = α2 (inscribed angles subtended by the same arc are equal) and ∠DBC = α1. Therefore, ∠B + ∠D = 180°.


Figure 3.19: Four points on a circle: opposite angles add up to 180° (a), and proof (b). The converse is also true: if the opposite angles of a quadrilateral add up to 180°, then the quadrilateral is cyclic. Proof by contradiction: remember the converse of the Thales theorem?

Intersecting chords theorem. If two chords AB and CD meet at a point P, then AP · BP = CP · DP (Fig. 3.20a). I could have written this as |AP| · |BP| = |CP| · |DP|, but for the sake of brevity I adopt the simpler notation. We can rearrange this equation to get AP/CP = DP/BP. This gives us the idea to prove this theorem, as this new expression is obviously about the proportionality of sides of the two triangles APC and DPB. So, draw BD and AC: we now have triangles in the problem. The two triangles are similar (AA test), thus their sides are proportional. If the two chords meet outside the circle (Fig. 3.20b), the result still holds.

3.1.11 Tangents to circles

Ž The word cyclic is from the Ancient Greek kuklos, which means "circle" or "wheel".
ŽŽ Why do we know this? Imagine a circle and a rectangle with four vertices on the circle; obviously the opposite angles add up to 180°. Generalize this.



Figure 3.20: Intersecting chords: AP · BP = CP · DP. This result shows that the quantity AP · BP is an invariant (constant).

Given a circle and a line there are three possibilities regarding their relative position: (1) the line does not intersect the circle, (2) the line intersects the circle at two points, and (3) the line touches the circle at exactly one point. We're interested in this last case. A tangent line to a circle is a line that touches the circle at exactly one point, the point of tangency (P in the figure).
How do we draw the tangent line to a circle at P? Draw OP, then draw a line through P perpendicular to OP: that is the tangent line. But why? Simply consider another point Q on this line; from the Pythagorean theorem we know that the hypotenuse is the largest side in a right triangleŽ (i.e., OQ > OP). Thus any point on this line except P is outside the circle.

Tangent-secant theorem. Let P be a point outside the circle, let PT be tangent to the circle at T, and let PB be a line through P that cuts the circle at A and B. Then the tangent-secant theorem says that PT² = PA · PB. Where does it come from? It comes from Fig. 3.20b. Imagine that PA is being rotated counter-clockwise around P. While doing so the points A, B come closer to each other until they meet at one single point T: PT is then a tangent, and we have PA = PB = PT. To prove this theorem, note that we need PT², and that reminds us of the Pythagorean theorem, which involves right triangles. So, we draw the triangle OTP, which is a right triangle (as PT is tangent to the circle). We need another right triangle: draw OH perpendicular to AB; we then have HB = HA (congruent triangles OHB, OHA). Now, the three right triangles OTP, OHP and OHA give us (r is the radius of the circle):

r² + PT² = OP²
PH² + OH² = OP²
OH² + HA² = r²
The rest is algebra: from the first two equations we get r² + PT² = PH² + OH² (OP is gone),
Ž The proof is easy: a² + b² = c², thus c² > a² and c² > b².


then using the third equation for r², we obtain (r is gone too)

PT² = PH² - HA² = (PH - HA)(PH + HA) = PA · (PH + HB) = PA · PB

If we have a circle and a point P outside it, we can have at most two tangents from P to the circle: PT1 and PT2. From the above result, we know that PT1 = PT2, and this can also be proved using the two congruent right triangles OPT1 and OPT2 (SSS test, where we have two pairs of equal sides, and the third sides are also equal due to the Pythagorean theorem). Thus, OP bisects ∠T2PT1: it divides this angle into two equal angles. One says that OP is the bisector of ∠T2PT1. This result will bring us to another interesting property of triangles concerning the so-called inscribed circle or incircle. An incircle is the largest circle which fits inside a triangle, just touching the three sides of the triangle.

Inscribed circle in a triangle. Consider this problem: given a △ABC, draw its inscribed circle. Assume that we have such a circle with center O that touches the triangle at P, Q and M; then AQ, AM are tangents from A and BP, BM are tangents from B. Thus, AO is the bisector of ∠BAC and OB is the bisector of ∠ABC. Therefore, the center of the incircle of △ABC is the intersection of the bisectors of two of its angles. It follows that the remaining bisector (of ∠ACB) also goes through OŽ.
But how do we draw the bisector of a given angle? I am glad you asked. Yes, do not accept anything except the axioms or postulates, that is the motto.

3.1.12 Geometric construction


In geometry, geometric construction is the construction of lengths, angles, and other geometric
figures using only a straightedge and a pair of compasses. It is important to point out that
geometric constructions are not practical problems. Geometry students could just use protractors and rulers (and even software) to construct many geometric shapes with less difficulty. Furthermore, they are not physical problems either. They are purely theoretical problems.
All straightedge-and-compass constructions consist of repeated application of five basic
constructions using the points, lines and circles that have already been constructed. These are:

 Creating the line through two points;

 Creating the circle that contains one point and has a center at another point;

 Creating the point at the intersection of two (non-parallel) lines;


Ž Proof: the two right triangles COQ and COP are congruent (SSS test), hence ∠QCO = ∠OCP.


 Creating the one point or two points in the intersection of a line and a circle (if they
intersect);

 Creating the one point or two points in the intersection of two circles (if they intersect).

Collapsing or locking compass. The rules seem to imply that we can't use a locking compass. That is, we are unable to open the compass to a fixed distance, like a pair of dividers, move it to another location on the paper, and draw a circle with that radius. This is troubling because modern compasses are locking compasses and they are time savers when doing geometry problems. Then, are we cheating? No. Euclid, in his second proposition in Book I, showed that a collapsing compass can copy distances. Specifically, his second proposition states that given a line segment AB and a point C it is possible to construct a point D so that CD = AB.

Figure 3.21: Euclid’s 2nd proposition in Book I.

How to construct D? Let's draw some triangles (otherwise we just have two boring segments in the diagram): the equilateral triangles ABE and BCF, on opposite sides of the line (Fig. 3.21a). (We discussed this problem in Section 3.1.2.) Now, assume we know where D is. For CD to equal AE (= AB = a), with D on the line FC beyond C, we need FD = FC + CD = b + a. But E, B and F are collinear, so FE = FB + BE = b + a as well. This means that D is on the circle centered at F (known) that passes through E (known). And D is also on the line FC. Thus, we know how to construct it (Fig. 3.21b).

Bisecting a segment. Given a segment AB, find the point M on it such that AM = BM. One way to proceed is: assuming that we can construct such a point M, find its properties and its relation to existing figures. As this is geometry, we need geometric figures (triangles, circles etc.). Thus, draw MX ⊥ AB, then AX and BX (Fig. 3.22a). What is special about X? It is: AX = BX. Thus, X is the intersection of two circles: one centered at A with radius AX and the other centered at B with the same radius. But we do not know AX; we do, however, have AB.
So, the construction steps are (Fig. 3.22b):

 Draw a circle centered at A with radius AB;

 Draw a circle centered at B with radius AB;

 These two circles intersect at two points X1 and X2;

 Draw X1X2; it intersects AB at M. That's the point we're after.



Figure 3.22: Bisecting a segment. This construction can be applied to construct a line passing through a
given point and perpendicular to a given line.

More important than the constructions themselves are the proofs that they accomplish what they say they accomplish: we need to prove our construction. Indeed, △X2AX1 ≅ △X2BX1 (SSS test), thus we have the same angles at X1 as indicated, and that leads to △X1AM ≅ △X1BM (SAS), which results in AM = BM.
The generalization of this problem is: to divide a segment into n equal parts. To this end, we need to use the Thales theorem on proportion. I have presented this construction when we discussed rational numbers in Definition 2.8.1.

Bisecting an angle. Given an angle, construct a line bisecting this angle. Referring to Fig. 3.23a, what we need to construct is the line m. But a line requires two points, so we need a point M on this line besides O. Now, to make the lines l1 and l2 involved in the problem, we draw a line through M perpendicular to mŽŽ. This line cuts l1 and l2 at A and B, respectively. It is easy to see that OA = OB and BM = AM. And now, we reverse the process and construct m as follows (Fig. 3.23b):


Figure 3.23: Bisecting an angle.

 Draw a circle centered at O with any radius. This circle intersects l1 at A and l2 at B;

 Bisect AB (using the previous construct): then we have M –the midpoint of AB;
ŽŽ Why perpendicular? Because this choice is special: if the line through M were not perpendicular, there would be infinitely many possible lines.


 Draw OM . That’s the line we’re looking for.

Construct regular polygons of 6·2^n edges inscribed in a unit circle. Let's solve this problem: construct a hexagon inscribed in a unit circle. Again, we do the analysis first: assume that we already have this hexagon (Fig. 3.24a). If we join the center O with the vertices we get six equilateral triangles. For example, △OAB is an equilateral triangle (why?). But we know how to build an equilateral triangle! So, we're able to build a hexagon (Fig. 3.24b).


Figure 3.24: Constructing a hexagon inscribed in a unit circle. Start with a circle with center O, select a point on it, and label it A. Draw the circle (A, 1) centered at A with radius 1. This circle cuts the original circle at B. Make a circle (B, 1); it cuts the original circle at C. Repeat this process and we get a hexagon. From this hexagon and the angle bisecting construction, we can build any polygon having 6·2^n edges for n = 0, 1, 2, 3, .... This successful construction of the hexagon also means that we can trisect a 180° angle. Trisecting an angle β means making three equal angles β/3 from it.

Construct a pentagon inscribed in a unit circle. For me, this problem is hard. I have tried to do the analysis but failed. The reason was that I did not know this pentagon well enough. After a while I realized that if I knew the length of one edge of the pentagon, then that would be a good starting point. To find this length there are many ways, but one way is to use complex numbers, trigonometry and analytic geometryŽ. In Fig. 2.39, we have met the fifth roots of unity, which are the vertices of a pentagon. Now, we know that the coordinates of the vertex with k = 0 are (1, 0) and the coordinates of the vertex with k = 1 are (cos α, sin α) with α = 2π/5. Therefore, the length of this edge, denoted by a5, is

a5 = √((cos α - 1)² + (sin α)²) = √(2(1 - cos α))        (3.1.7)
Ž If you do not yet know trigonometry/analytic geometry, you can return to this after you have mastered these topics.


But cos α = cos(2π/5) = (√5 - 1)/4ŽŽ, thus

a5 = √(2(1 - cos α)) = √(5/2 - √5/2)        (3.1.8)
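Before using Eq. (3.1.8) in a construction, it is reassuring to check it numerically (a Python sketch of my own; the chord length 2·sin(π/5) is used here only as an independent reference value):

import math

a5_formula = math.sqrt(5 / 2 - math.sqrt(5) / 2)   # Eq. (3.1.8)
a5_chord = 2 * math.sin(math.pi / 5)               # chord subtending 2*pi/5 in a unit circle
print(a5_formula, a5_chord)                        # both about 1.17557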


Figure 3.25: Construction of a pentagon inscribed in a circle.

Now, using this knowledge we will do the analysis as in Fig. 3.25a: we have our circle and our pentagon (assumed it has been constructed). To mimic the coordinate axes, we also draw two perpendicular lines going through the center of the circle. Then we have OA1 = 1. And √5/2 can be obtained from a right triangle with sides 1 and 1/2 (for √(1² + (1/2)²) = √5/2). Thus, we need a segment of length 1/2. So, we introduce the point S, the midpoint of MO; this gives us SA1 = √5/2. We need S′ such that SS′ = √5/2. Why? Because then OS′ = √5/2 - 1/2. In the right triangle OA1S′, we have

A1S′ = √(1² + (√5/2 - 1/2)²) = √(5/2 - √5/2)

which is nothing but a5! Therefore, the construction is shown in Fig. 3.25b§.


It is time to stop and think about what we have done. We can construct n-gon for
n D f3; 4; 5; 6g. And by angle bisecting construction, we can construct regular polygons of
8; 10; 12; 20; 24; : : : edges. How about a 7 gon? How about a 9 gon? This turns out to be
an extremely hard problem! For the next two thousand years since Ptolemy, no one had ever
succeeded in constructing a regular 7 gon, a regular 9 gon, or a regular 11 gon. Every effort
seems to end up in failure. The reason is not because of our incapability but it turns out that it is
impossible to construct those polygons!
The first mathematician that made a breakthrough in polygon construction is Gauss. On
30 March 1796, when he was 19 years old, Gauss made a major discovery in mathematics: he
showed that it is possible to construct a regular 17 gon (called heptadecagon). It is believed that
ŽŽ Computed from cos π/5, which is given in Section 3.8.
§ This construction was given by Ptolemy (c. 100–c. 170), the mathematician, astronomer, and geographer of Greek descent, in the 2nd century in the Almagest, one of the most famous scientific texts of all time.


the excitement of this discovery had led Gauss to choose mathematics instead of philosophy as a career. Gauss was so pleased by this result that he requested that a regular 17-gon be inscribed on his tombstone. For some unknown reason, this request was not fulfilled, but on Gauss's memorial stone in Brunswick we can find a 17-point star.
So, we know that it is possible to construct a 3-gon, a 5-gon and a 17-gon. What is special about 3, 5 and 17? They are odd, they are primes. What else? Let's touch them so that they reveal their secret: subtracting one from each we obtain the numbers 2, 4 and 16, which can be written as 2^1, 2^2, 2^4, or also as 2^(2^0), 2^(2^1) and 2^(2^2). So 3, 5 and 17 are Fermat primes! Within five years Gauss had developed a theory that described exactly which polygons were, and which were not, constructible. In 1801, Gauss presented this result at the end of his now classic Disquisitiones Arithmeticae, in which he proves the constructibility of the n-gon for any n that is a Fermat prime (i.e., of the form 2^(2^k) + 1 for k ∈ ℕ).
How about Gauss's proof? I will discuss it later in Section 3.20.3.

3.1.13 The three classical problems of antiquity


The three classical problems of antiquity are the following:

1. doubling the cube: given a cube, construct another cube which is twice as big,

2. squaring the circle: given a circle, construct a square which has the same area as that of
the circle,

3. trisecting the angle: given an angle, divide it into three equal angles,

using only straightedge and compass. Although Euclid solved more than 100 construction prob-
lems in the Elements, we do not see any of these problems in the Elements. The reason is that
these problems were not solved until the 19th century!
If the cube has a side of a, its volume is then a³, and the volume of the cube we want to construct is 2a³. Therefore, doubling the cube boils down to constructing, using straightedge and compass, a cube with a side of ∛2·a. Of course, we can start with a = 1, and the problem is to construct a segment of length ∛2. This is intimately related to the cubic equation x³ - 2 = 0.
If the circle has a unit radius, its area is then π. A square that has an area of π has a side of length √π. Therefore, squaring the circle boils down to constructing, using straightedge and compass, a segment of length √π.
Given an angle 3θ, it is said that we can trisect it if we can construct a segment of length cos θ. Using the trigonometric identity cos 3θ = 4cos³θ - 3cos θ, the problem again seems to become the business of cubic equations!
For millennia people tried again and again to double a cube and trisect an angle, but they always failed. Such was the status of these problems until 1837, when Pierre Wantzel (1814–1848) published a proof that they couldn't be solved (to be discussed in Section 3.20.4). The great Gauss had stated that both of these constructions were impossible but had never provided a proof.


3.1.14 Tessellation: the Mathematics of Tiling


You have probably noticed that floors are usually tiled in squares or sometimes in rectangles.
What is so special about these shapes? What are the disadvantages of using other shapes? In
mathematics, the term used for tiling a plane (floor in our context) with no gaps and no overlaps
is tessellation. Tessellations have many real-world examples and are a physical link between
mathematics and artŽ .

Figure 3.26: Examples of regular polygons that can tile the floor without gaps and overlaps: square,
equilateral triangle and hexagon. This is known as regular tessellation.

In order to have no gaps, the angle around a point (a point far from the boundary of the plane) must be 360°, no more no less. Based on this, we can understand which regular polygons can tessellate a plane. The only regular polygons that can do this are the equilateral triangle, the square and the hexagon (Fig. 3.26). Equilateral triangles form a regular tessellation because their interior angles are 60° (a consequence of Eq. (3.1.5)), and 60 is a divisor of 360: 360 = 6 × 60. Squares form a regular tessellation because their interior angles are 90°, and 90 is a divisor of 360 too. Hexagons also form a regular tessellation because their interior angles are 120°, and 120 is a divisor of 360 too.
In order not to have to examine other possibilities, we resort to algebra for a proof. Let's consider a tessellation with p-gons and q such p-gons at a vertex. At this vertex the angle is 360°, and this gives us the following equation (using Eq. (3.1.5)):

180°·(1 - 2/p)·q = 360°

Dividing both sides of this equation by 180°, we get another equation: (1 - 2/p)·q = 2. And our task is now to solve this equation, that is, to find the tuples (p, q), noting that p, q ∈ ℕ. How to do this? Amazingly, a bit of algebraic massage to the equation is a big help:

(p - 2)·q = 2p  ⟺  (p - 2)·q = 2p - 4 + 4  ⟺  (p - 2)(q - 2) = 4

So, (p - 2, q - 2) ∈ {(1, 4), (4, 1), (2, 2)}. Thus, (p, q) ∈ {(3, 6), (6, 3), (4, 4)}. That's it. The key step was to add -4 + 4, which is nothing, to the RHS of the equation.
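If you prefer to let a computer do the case analysis (a brute-force sketch in Python of my own, not how the argument above proceeds), the equation (p - 2)(q - 2) = 4 can be checked directly:

# Which regular p-gons, q of them around each vertex, tile the plane?
solutions = [(p, q) for p in range(3, 50) for q in range(3, 50)
             if (p - 2) * (q - 2) == 4]
print(solutions)   # [(3, 6), (4, 4), (6, 3)]: triangles, squares, hexagons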
That's all we can do with regular tessellations. It is kind of boring, so let's relax a little bit by allowing more than one type of regular polygon to be used in the tessellation. What we obtain is called a semi-regular tessellation. See Fig. 3.27 for some examples. Pattern I is made

Ž I got to know this topic by reading the interesting book Mathematics Rebooted: A Fresh Approach to Understanding by Lara Alcock [2]. Alcock is a Reader in the Department of Mathematics Education at Loughborough University.


up of squares and regular octagons. Going around each vertex we meet a square (4 edges) and two octagons (8 edges each), so we associate the tuple (4, 8, 8) with it. Pattern II is made up of squares, regular hexagons and regular dodecagons, so we associate the tuple (4, 6, 12) with it. Keep in mind that at every vertex, we must have the same polygons in the same order.

(a) pattern I (4, 8, 8); (b) pattern II (4, 6, 12)

Figure 3.27: Some examples of semi-regular tessellations. The notation (4, 8, 8) indicates the polygons (via their number of edges) surrounding a vertex.

The question that mathematicians are interested in now is: how many semi-regular tessellations are there? It turns out that there are only eight of them, and we're going to prove this. To this end, let the tuple be (n1, n2, ..., nk) where each ni is a positive integer representing the number of edges. Since each polygon has at least three sides, we have ni ≥ 3 for every i. We start by showing that k ≤ 6. That is, there can be no more than six polygons meeting at each vertex. This constraint is due to the fact that at any vertex of a tessellation the angles must add up to 360°. It is obvious that we are going to use Eq. (3.1.5) to obtain:

(180° - 360°/n1) + (180° - 360°/n2) + ⋯ + (180° - 360°/nk) = 360°

A bit of algebraic manipulation (dividing the above equation by 180°) gives us

1/n1 + 1/n2 + ⋯ + 1/nk = k/2 - 1        (3.1.9)

But we also have ni ≥ 3, or 1/ni ≤ 1/3. So,

1/n1 + 1/n2 + ⋯ + 1/nk ≤ k/3        (3.1.10)

From Eqs. (3.1.9) and (3.1.10) we have k/2 - 1 ≤ k/3, which leads to k ≤ 6. Note that we also have k ≥ 3, because at any vertex we need at least three polygons to make up 360°. Therefore, there are only four possibilities: k ∈ {3, 4, 5, 6}.
Now we have to consider each case. I show here the case k = 3 (as it is the easiest). We change the notation as there are only three unknowns: let's rename n1, n2, n3 as a, b, c. And there is no harm in labeling them such that a ≤ b ≤ c. Now, Eq. (3.1.9) becomes 1/a + 1/b + 1/c = 1/2. In conjunction with 3 ≤ a ≤ b ≤ c, we have 1/2 ≤ 3/a. Thus, a ≤ 6. We have found a: a ∈ {3, 4, 5, 6}. The next step is to determine b and c, for each a:

 a = 3: finding two positive integers b and c such that 1/b + 1/c = 1/6. We proceed as follows:

1/b + 1/c = 1/6  ⟺  bc - 6(b + c) = 0  ⟺  (b - 6)(c - 6) = 36

Thus, (b - 6, c - 6) ∈ {(1, 36), (2, 18), (3, 12), (4, 9), (6, 6)}. Hence, (b, c) is one of the following: (7, 42), (8, 24), (9, 18), (10, 15), (12, 12).

 a = 4: in the same manner, (b, c) is one of the following: (5, 20), (6, 12), (8, 8).

 a = 5: (b, c) is (5, 10).

 a = 6: (b, c) is (6, 6).

To conclude, (a, b, c) is one of the following triples: (3, 7, 42), (3, 8, 24), (3, 9, 18), (3, 10, 15), (3, 12, 12), (4, 5, 20), (4, 6, 12), (4, 8, 8), (5, 5, 10), (6, 6, 6). We recognize that (6, 6, 6) is the regular tessellation with hexagons. The other possibilities need to be examined one by one. For example, consider (3, 12, 12), shown in the figure, but imagine that instead of two 12-sided polygons we have an a-sided and a b-sided polygon around the triangle. It is then clear that we must have a = b. Thus, in a triple (a, b, c), if there is an odd number then the other two must be equalŽ. Therefore, (3, 7, 42), (3, 8, 24), (3, 9, 18), (3, 10, 15) and (5, 5, 10) are invalid. The only semi-regular tessellations for k = 3 are: (3, 12, 12), (4, 8, 8), (4, 6, 12).
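The same case analysis can be automated (a small Python sketch of my own, using exact rational arithmetic so that no solution is lost to rounding):

from fractions import Fraction

# k = 3: integer triples 3 <= a <= b <= c with 1/a + 1/b + 1/c = 1/2.
triples = []
for a in range(3, 7):                           # 1/2 <= 3/a forces a <= 6
    for b in range(a, 4 * a // (a - 2) + 1):    # since c >= b, 1/b + 1/c <= 2/b, so b <= 4a/(a - 2)
        rest = Fraction(1, 2) - Fraction(1, a) - Fraction(1, b)   # this must equal 1/c
        if rest > 0 and rest.numerator == 1 and rest.denominator >= b:
            triples.append((a, b, rest.denominator))
print(triples)   # the ten triples listed above, from (3, 7, 42) to (6, 6, 6)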

3.1.15 Platonic solids


The generalization of a polygon in three-dimensional space is called a polyhedron (plural is
polyhedra). This is a solid with flat faces (or sides) each of which is a polygon. Among the more
familiar polyhedra is the square-base pyramid (see Fig. 3.5). And if we generalize the idea of
regular polygons to 3D what we get are called regular polyhedra or Platonic solids. Fig. 3.28
shows the five platonic solids: tetrahedron, cube, octahedron, dodecahedron and icosahedron.
What is a platonic solid? A platonic solid is a convex, regular polyhedron in three dimensional
Euclidean space. That is:

 all faces are regular polygons: e.g. a tetrahedron has four faces each of which is an equi-
lateral triangle;

 all faces are congruent i.e., same size and shape;

 the same number of faces meet at each vertex (and this number is larger than or equal to
three), and

 the solid is convex (e.g. no indentation).

Ž
Keep in mind that at every vertex, we must have the same polygons in the same order.


Tetrahedron Hexahedron (Cube) Octahedron Dodecahedron Icosahedron

Figure 3.28: Platonic solids. The tetrahedron has 4 triangular faces, 6 edges, and 4 vertices. The cube or
hexahedron has 6 square faces, 12 edges, and 8 vertices. The octahedron has 8 triangular faces, 12 edges,
and 6 vertices.

The name Platonic solids of the five regular polyhedra comes from
Plato (423–348 BC), a Greek philosopher born in Athens. Today Plato is
best known as a philosopher, but one of his most important contributions
was the creation of the Academy. Although Plato did not contribute directly
to the body of mathematical knowledge, he was an important figure in the
advancement of the subject. He was a lover of mathematics, and mathemat-
ics formed the backbone of the curriculum at the Academy. This is clearly
illustrated by the inscription found above its entrance: "Let no one ignorant of geometry enter
here." Because many mathematicians were trained and nurtured in the Academy, it is said that
Plato was not a maker of mathematics, but "a maker of mathematicians" [10].
In his text Timaeus, a fictional account of a discussion between Socrates, Hermocrates, Critias,
and Timaeus, Plato associated the tetrahedron with fire, the octahedron with air, the icosahedron
with water and the cube with earth; fire, air, water and earth were the only four elements consid-
ered at that time. The fifth solid, the dodecahedron, was considered the shape that encompasses
the whole universe and was used for arranging the constellations in the heaven. This Greek
atomic model was so influential that it was universally accepted until the Irish scientist Robert
Boyle (1627–1691) published his book The Sceptical Chymist in 1661.
One question arises: is it just that these five Platonic solids are the only ones we've found? Could it be that there are others out there waiting to be discoveredŽŽ? Don't try to find other Platonic solids: Euclid's Elements provides a proof that there exist no more than five of them.
The proof is based on this observation: the sum of the angles of the faces meeting at a vertex of a Platonic solid must be less than 2π (that is, 360°). As illustrated in the adjacent figure, at a vertex of a cube there are three faces whose angles are each 90°: the sum of these angles is 270°, less than 360°. For a tetrahedron, three faces meet at a vertex and each contributes an angle of 60°, so the total angle sum is 180°, again less than 360°. Now we're going to use this result to prove that there are only the five Platonic solids shown in Fig. 3.28.

ŽŽ
I used the word discovery because these platonic solids are commonly found in nature. For instance, the carbon
atoms in a diamond are arranged in a tetrahedral shape. Common salt and fool’s gold (iron sulfide) form cubic
crystals, and calcium fluoride forms octahedral crystals.


Proof. Let's assume that the Platonic solid has n-gons as faces. We know that every n-gon has an interior angle of α = 180°·(1 - 2/n). Furthermore, assume that there are m faces meeting at a vertex, noting that m ≥ 3. Then the total angle at a vertex is mα, or 180°·m·(1 - 2/n). Using the above result we then have the inequality 180°·m·(1 - 2/n) < 360°, or m·(1 - 2/n) < 2. Now it is simple to proceed: we just need to consider n = 3, 4, 5 and n ≥ 6, and see that there are only five Platonic solids. For example, if n = 3, then we have m·(1 - 2/3) < 2, or m < 6. Thus, 3 ≤ m < 6, i.e., m ∈ {3, 4, 5}: we then have the tetrahedron, the octahedron and the icosahedron. For n ≥ 6 it is impossible to satisfy the inequality m·(1 - 2/n) < 2 with m ≥ 3. ∎
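The same case analysis can be delegated to a few lines of Python (my own sketch; exact fractions avoid any rounding issue at the boundary case n = 6, m = 3):

from fractions import Fraction

# Faces are regular n-gons, m of them meeting at each vertex (n, m >= 3).
# The angle condition at a vertex is m * (1 - 2/n) < 2.
candidates = [(n, m) for n in range(3, 10) for m in range(3, 10)
              if m * (1 - Fraction(2, n)) < 2]
print(candidates)   # [(3, 3), (3, 4), (3, 5), (4, 3), (5, 3)]: five solids only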

3.1.16 Euler’s polyhedra formula


In a polygon the number of vertices is equal to the number of edges; thus, there exists a relationship between the number of vertices and the number of edges. The same must hold true for polyhedra, except that the situation is more complicated, as we are dealing with faces, vertices and edges. It is surprising that no mathematician paid attention to this issue until 1639, when Descartes turned his attention away from philosophy to mathematics. He noticed a curious pattern. He probably constructed a table similar to Table 3.1. The way Descartes looked at polyhedra was revolutionary: he was dealing with geometrical objects but he did not calculate their metric properties such as areas, volumes or angles. Instead he counted the numbers of vertices and faces, which seems not a special thing to do. Counting things is, however, a mathematical activity; it is the field of combinatorics.

Table 3.1: Platonic solids: number of faces/vertices/edges. The last two columns serve a later purpose: p denotes the number of edges of a face and q denotes the number of faces meeting at a vertex.

Polyhedron      Vertices (V)   Edges (E)   Faces (F)   F - E + V   p   q
Tetrahedron     4              6           4           2           3   3
Cube            8              12          6           2           4   3
Octahedron      6              12          8           2           3   4
Dodecahedron    20             30          12          2           5   3
Icosahedron     12             30          20          2           3   5

From this table Descartes noticed that for any Platonic solid we have F - E + V = 2Ž, if F denotes the number of faces, V the number of vertices and E the number of edges. For unknown reasons, he did not publish this finding. It was Euler who in 1750 re-discovered this result and in 1751 proved it. On November 14, 1750, in a letter to his friend, the number theorist Christian Goldbach (1690–1764), Euler wrote, "It astonishes me that these general properties of stereometry [solid geometry] have not, as far as I know, been noticed by anyone else." [57].
Ž
Historically, Descartes came up with a slightly different relation: he did not use edges but instead used the
plane angles.


Now I discuss how to build Table 3.1. If you're thinking that counting is the way to do it, and that there is nothing to discuss, then you're wrong: mathematicians are lazy and they prefer smart ways of counting. Let's assume that each face of a Platonic solid is a p-gon, and that there are q faces (or q edges) meeting at a vertex. And we assume that we know the number of faces F (this is easy to count). The problem now is to find E and V as functions of F, p, q.
Each face has p edges and there are F faces, so there are pF edges. But we have counted each edge twice, as two faces share the same edge. So, in total we have E = pF/2. Each edge has two vertices, and we have E edges, thus there are 2E vertices. But we have counted each vertex q times, so V = 2E/q. We need to check our formulas. Take the dodecahedron as an example: F = 12, p = 5 and q = 3. Thus, E = (5)(12)/2 = 30 and V = (2)(30)/3 = 20.
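This counting recipe is easy to automate (a Python sketch of my own), and it also lets us verify the F - E + V = 2 pattern of Table 3.1 in one go:

# For each Platonic solid: F faces, each a p-gon, and q faces meeting at a vertex.
# Then E = pF/2 and V = 2E/q, and F - E + V should always be 2.
solids = {"tetrahedron": (4, 3, 3), "cube": (6, 4, 3), "octahedron": (8, 3, 4),
          "dodecahedron": (12, 5, 3), "icosahedron": (20, 3, 5)}
for name, (F, p, q) in solids.items():
    E = p * F // 2
    V = 2 * E // q
    print(name, V, E, F, F - E + V)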
Getting back to Table 3.1: Euler quickly realized that F - E + V = 2, and he proved that this relation holds for any convex polyhedron (not just the Platonic solids). This equation now bears the name Euler's polyhedra formula, and it ranks as the second most beautiful equation in mathematics. Recall that the most beautiful equation of all maths is Euler's identity e^(iπ) + 1 = 0. Euler is the master of us all, as Laplace put it.

Using Euler's formula to prove there exist only five Platonic solids. Consider a regular polyhedron with p-gons as faces and with q faces (or q edges) meeting at a vertex. We have derived the relations:

F = 2E/p,    V = 2E/q

Now we substitute them into F - E + V = 2, and we get

2E/p - E + 2E/q = 2

From that we get

2E/p - E + 2E/q > 0  ⟺  1/p + 1/q > 1/2

It can be shown that only the solids in Table 3.1 (see the last two columns) satisfy this inequality. How does this proof compare with Euclid's proof? Euclid's proof is geometrical; this proof is called topological. In this proof we just massaged some algebraic expressions; nothing about the regularity of the Platonic solids was used.

3.2 Area of curved figures


In Section 3.2.1, we discuss the area of the first curved shape: the lune of Hippocrates. Next, we treat the area under a parabola segment, a problem solved by Archimedes (Section 3.2.2). The problem of the circumference and area of circles is given in Section 3.2.3. These involve π, and how ancient mathematicians computed π is shown in Section 3.2.4.


3.2.1 Area of the first curved plane: the lune of Hippocrates


In geometry, the lune of Hippocrates, named after Hippocrates of Chios (circa 400 BCE), is a lune bounded by arcs of two circles, the smaller of which has as its diameter a chord spanning a right angle on the larger circle (Fig. 3.29). Equivalently, it is a plane region bounded by one 180-degree circular arc and one 90-degree circular arc. It was the first curved figure to have its exact area calculated mathematically.
[Figure: the lune AEBF built on the chord AB of the circle centered at O, with A = (1/2)π(AB/2)² - ((1/4)π·OA² - area AOB) and AB² = 2·OA².]

Figure 3.29: Lune of Hippocrates. The shaded area AEBF is a moon-like crescent shape; it is called a lune, deriving from the Latin word luna for moon. Geometrically a lune is the area between two circular arcs. Let's denote by A the area of the lune. It can be shown that A equals the area of the triangle AOB, assuming that we know the formula for the area of a circle.

Hippocrates wanted to solve the classic problem of squaring the circle, i.e., constructing a
square by means of straightedge and compass, having the same area as a given circle. He proved
that the lune bounded by the arcs labeled E and F in the figure has the same area as triangle
ABO. Again, ancient mathematicians compared new areas to old ones! This afforded some hope
of solving the circle-squaring problem, since the lune is bounded only by arcs of circles. HeathŽŽ
concludes that, in proving his result, Hippocrates was also the first to prove that the area of a
circle is proportional to the square of its diameter.
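For the record, here is the short computation behind that claim (a sketch following the relations displayed in Fig. 3.29, writing r = OA and using AB² = 2r², which holds because triangle AOB is right and isosceles):

A = (1/2)·π·(AB/2)² - ((1/4)·π·r² - area AOB)
  = (π/8)·(2r²) - (1/4)·π·r² + (1/2)·r²
  = (1/4)·π·r² - (1/4)·π·r² + (1/2)·r² = (1/2)·r²

which is precisely the area of the triangle AOB.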

3.2.2 Area of a parabola segment


Let's see how Archimedes computed the area of a parabola segment without calculus. To simplify the analysis, let's consider the parabola y = x², which is cut by the horizontal line y = 1; see Fig. 3.30. The shaded area is computed as the sum of Δ1 (the area of the largest triangle) and the left-over area. This left-over area is approximated by two triangles (Δ2 and Δ3) and what is left unaccounted for. And the process goes on.
First, the areas of the triangles OQR and QBR are identical and equal to 1/16. Thus Δ2 = 1/8, and therefore Δ2 + Δ3 = 1/4. And note that Δ1 = 1, so Δ2 + Δ3 = (1/4)Δ1. So, we can write

A ≈ Δ1 + (1/4)Δ1

If we continue the process with the unaccounted region, we get A ≈ 1 + 1/4 + 1/16. Seeing
ŽŽ Sir Thomas Little Heath (1861–1940) was a British civil servant, mathematician, classical scholar, and historian of ancient Greek mathematics. Heath translated works of Euclid of Alexandria, Apollonius of Perga, Aristarchus of Samos, and Archimedes of Syracuse into English.



Figure 3.30: Area of a parabola segment. The coordinates of Q are (0.5, 1/4) and those of R are (0.5, 0.5), noting that OB is the line y = x. Thus QR = 1/4. We have Δ1 = 1, and Δ2 is equal to the area of the two triangles OQR and QBR combined.

now the pattern, we write

A = Δ1 + (1/4)Δ1 + (1/16)Δ1 + ⋯ = Δ1·(1 + 1/4 + 1/16 + ⋯) = (4/3)·Δ1 = 4/3

where use was made of the geometric series (Section 2.21.2). It is remarkable that the area of that curved shape is a simple multiple of the area of the largest triangle. From this result, it is simple to deduce that the area below the parabola is 2 - 4/3 = 2/3.
A student in a calculus course would just use integration and immediately obtain the result, as

A = 2·∫_0^1 (1 - x²) dx = 2·[x - x³/3]_0^1 = 4/3        (3.2.1)
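The speed of convergence is easy to see numerically (a Python sketch of my own, just adding up Archimedes' triangle areas):

# Partial sums of 1 + 1/4 + 1/16 + ..., the triangle areas in the exhaustion.
total, term = 0.0, 1.0
for k in range(8):
    total += term
    term /= 4
    print(k, total)
# 1, 1.25, 1.3125, 1.328125, ... approaching 4/3 = 1.3333...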
This technique has a name because it was widely used by Greek mathematicians: it's called the method of exhaustion; as we add more and more triangles they exhaust the area of the parabola segment. There is a lot to learn from Archimedes' solution to this problem. First, he also used the area of a simpler geometry (a triangle). Second, and the most important idea, is that he used infinitely many triangles! Only when the number of triangles approaches infinity does the sum of all the triangle areas approach the area of the parabola segment. This sounds similar to the integral calculus we know of today! But wait: while Eq. (3.2.1) is straightforward, Archimedes' solution requires his genius. For example, how would we have known to use the triangles that he adopted?
Even though Archimedes' solution is less powerful than the integral calculus developed much later in the 17th century, he and the Greek mathematicians were right in going to infinity. The main idea of computing something finite, e.g. the area of a certain (curved) shape, is to chop it into many smaller pieces, handle these pieces, and, when the number of pieces goes to infinity, adding them up will give the answer. This is what Strogatz called the Infinity Principle in his book Infinite Powers. It is remarkable that we see Archimedes' legacy in the modern world; see for instance Fig. 3.31. In computer graphics and in many engineering and science fields, any shape is approximated by a collection of triangles (sometimes quadrilaterals are also used). What is different is that we do not go to infinity with this process, as we're seeking an approximation. Note that Archimedes was trying to get an exact answer.


(a) Archimedes

(b) Today (c) Today

Figure 3.31: Archimedes’ legacy in the modern world: use of triangles and tetrahedra to approximate any
2D and 3D objects.

3.2.3 Circumference and area of circles

Below are the facts about circles that ancient mathematicians knew:

• the circumference of a circle is proportional to its diameter, so $C = 2\pi_1 r$, where the constant of proportionality is called $\pi_1$ (some number) and $r$ is the circle radius;

• the area of a circle is proportional to the square of its radius, so $A = \pi_2 r^2$, with constant of proportionality $\pi_2$;

• the circumference $C$ and the area $A$ (of a circle) are related by $A = \frac{1}{2}Cr$.


The third fact implies that $\pi_1 = \pi_2 = \pi$. A proof of $A = \frac{1}{2}Cr$ is shown in Fig. 3.32: the area of the
circle equals the area of an inscribed regular polygon having infinitely many sides (in the figure, I
illustrate using an octagon, i.e., $n = 8$). The area of this polygon is the sum of the areas of all the
isosceles triangles $OAB$; an isosceles triangle is a triangle that has two sides of equal length.
These triangles have a height ($OH$) equal to the radius of the circle, and the sum of their bases
equals the circle's circumference.

Figure 3.32: an isosceles triangle $OAB$ of the inscribed polygon with central angle $\alpha$: $OH = r\cos(\alpha/2)$ and $AB = 2r\sin(\alpha/2)$.
If the above reasoning was not convincing enough, here is a better one. Let's consider a
regular polygon of $n$ sides inscribed in a circle. Its area is denoted by $A_n$ and its circumference
by $C_n$; from Fig. 3.32, we get
$$A_n = nr^2\sin\frac{\pi}{n}\cos\frac{\pi}{n}, \qquad C_n = 2nr\sin\frac{\pi}{n}$$
Then, we consider the ratio $A_n/C_n$ when $n$ is very large:
$$\frac{A_n}{C_n} = \frac{1}{2}r\cos\frac{\pi}{n} \;\Longrightarrow\; \lim_{n\to\infty}\frac{A_n}{C_n} = \frac{1}{2}r$$
See Table 3.2 for supporting data.

Table 3.2: Proof of $A = 0.5Cr$ with $r = 1$: using regular polygons of 4 to 512 sides.

n     A_n           C_n           A_n/C_n
4     2.00000000    5.65685425    0.35355339
8     2.82842712    6.12293492    0.46193977
16 3.06146746 6.24289030 0.49039264
32 3.12144515 6.27309698 0.49759236
64 3.13654849 6.28066231 0.49939773
128 3.14033116 6.28255450 0.49984941
256 3.14127725 6.28302760 0.49996235
512 3.14151380 6.28314588 0.49999059
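Table 3.2 can be reproduced with a few lines of Python; this is just a numerical check (mine) of the formulas for $A_n$ and $C_n$ above, not part of the original argument:

```python
import math

# Reproduce Table 3.2: A_n, C_n and A_n/C_n for regular n-gons inscribed in a
# circle of radius r = 1, doubling n from 4 to 512.
r = 1.0
n = 4
while n <= 512:
    A_n = n * r**2 * math.sin(math.pi / n) * math.cos(math.pi / n)
    C_n = 2 * n * r * math.sin(math.pi / n)
    print(f"{n:4d}  {A_n:.8f}  {C_n:.8f}  {A_n / C_n:.8f}")
    n *= 2
```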

How did ancient mathematicians come up with the formula $A = \pi r^2$? The idea for calculating
the area of the circle is the same: break the circle into simpler shapes whose areas are
known. This is what ancient mathematicians did, see Fig. 3.33: they chopped a circle into eight
wedge-shaped pieces (like a pizza), and rearranged the wedges. The resulting shape does not look
like any familiar shape. So, they chopped the circle into thinner wedges: this time 16 pieces.


This time, something familiar appears! The wedges together look like a rectangle. Being more
confident now, they decided to go to the extreme: divide the circle into an infinite number of wedges.
What they got is a rectangle of sides $\pi r$ (half of the circle's perimeter) and $r$. Thus, the area of a
circle is $\pi r^2$. What an amazing idea it was.

(a) n = 8   (b) n = 16   (c) n = 32   (d) n = 128

Figure 3.33: Quadrature of a circle of radius $r$: dividing the circle into a number of wedges ($n$). When $n$ is
very large, what we get is a rectangle of sides $r$ and $\pi r$, of which the area is $\pi r^2$. And that is also the area
of the circle.

3.2.4 Calculation of π
Archimedes was the first to give a method of calculating π to any desired accuracy, around 250
BC. It is based on the fact that the perimeter of a regular polygon of $n$ sides inscribed in a
circle is smaller than the circumference of the circle, whereas the perimeter of
an $n$-gon circumscribed about the circle is greater than that circumference
(Fig. 3.34). In other words, the circle circumference $2\pi r$ is squeezed in between these two perimeters,
and we can always compute both of them. By making $n$ sufficiently large, we can determine $\pi$
accurately, to any degree of approximation we want. That's the essence of Archimedes' method.
Now, we carry out his method. We just need to consider a unit circle (i.e., a circle of unit
radius), as $\pi$ is a constant: it does not depend on the size of the circle. What we need to do is
to determine the circumferences of the $n$-gons inscribed in and circumscribed about this unit circle. To this
end, we need to know just one side of these polygons. It turns out that there is a relation between
the polygons in the sense that, if we start with a hexagon inscribed in the unit circle (of which
the side is 1), we can determine the side of a 12-gon inscribed in the circle (Fig. 3.35a), and
from a 12-gon we can get a 24-gon and so on. Furthermore, we can also compute the side of a
circumscribed hexagon (Fig. 3.35b).
We need a notation that allows us to differentiate the different $n$-gons. As we
consider only $6, 12, 24, 48, \ldots$, which can be written as $6\cdot2^0, 6\cdot2^1, 6\cdot2^2, 6\cdot2^3, \ldots$, we


Figure 3.34: Archimedes's method to determine $\pi$. The red hexagon is inscribed in the unit circle ($OB = 1$)
whereas the blue hexagon is circumscribed about it. When we increase $n$ (the number of sides), these
two polygons approach the circle (in (b), I used $n = 12$).

Figure 3.35: The unit circle with the $6\cdot2^n$-gon and the $6\cdot2^{n+1}$-gon inscribed in it (illustrated for
$n = 0$): $AB = m_{6\cdot2^n}$, $AC = m_{6\cdot2^{n+1}}$, $OA = 1$. In (b) is depicted the circumscribed $6\cdot2^n$-gon
(again illustrated for $n = 0$). The diagrams remain valid for the other polygons, simply replacing $\theta$ by $\theta/2$.

can use the notation $6\cdot2^n$-gon to label a polygon in our collection.

The length of a side of the $6\cdot2^n$-gon is denoted by $m_{6\cdot2^n}$. The first task is to relate $m_{6\cdot2^n}$
to $m_{6\cdot2^{n+1}}$. Using the Pythagorean theorem for $\triangle BHC$ and for $\triangle OAH$, we get the following
equation (see Fig. 3.35a, illustrated with $N = 6$; $AB$ is the side of the $6\cdot2^n$-gon and $BC$ is
the side of the $6\cdot2^{n+1}$-gon), noting that $HC = 1 - OH$ and $OH^2 + (m_{6\cdot2^n}/2)^2 = 1^2$:
$$m_{6\cdot2^{n+1}} = \sqrt{\frac{(m_{6\cdot2^n})^2}{4} + \left(1 - \sqrt{1 - \frac{(m_{6\cdot2^n})^2}{4}}\right)^2} = \sqrt{2 - \sqrt{4 - (m_{6\cdot2^n})^2}} \qquad (3.2.2)$$
With $n = 0$ we have $m_{6\cdot2^n} = m_6$, which is the length of the side of an inscribed hexagon, which
is one. This equation gives us $m_{12} = \sqrt{2 - \sqrt{4 - 1}} = \sqrt{2 - \sqrt{3}}$. From it, we can compute
$m_{24}$ and so on.
The second task is to relate the side of an inscribed $6\cdot2^n$-gon with that of a circumscribed


$6\cdot2^n$-gon (denoted by $M_{6\cdot2^n}$). Referring to Fig. 3.35b, illustrated again with $N = 6$, $AB$ is
the side of the inscribed $6\cdot2^n$-gon and $A'B'$ is the side of the circumscribed $6\cdot2^n$-gon.
From similar triangles $OBH$ and $OB'H'$ (or Thales' theorem, since $BH \parallel B'H'$), we have
$H'B'/HB = OH'/OH$. But $HB = 0.5AB = 0.5\,m_{6\cdot2^n}$, $OH' = 1$ and $OH = \sqrt{1 - (m_{6\cdot2^n})^2/4}$.
Thus,
$$M_{6\cdot2^n} = \frac{2AB}{\sqrt{4 - AB^2}} = \frac{2m_{6\cdot2^n}}{\sqrt{4 - (m_{6\cdot2^n})^2}} \qquad (3.2.3)$$
We start with a hexagon (following Archimedes): $m_6 = 1$ and $M_6 = 2/\sqrt{3}$ after Eq. (3.2.3).
Thus, for $\pi$:
$$6\cdot 1 < 2\pi < 6\cdot\frac{2}{\sqrt{3}} \;\Longrightarrow\; 3 < \pi < 2\sqrt{3}$$
Next, we consider a 12-gon with $m_{12} = \sqrt{2-\sqrt{3}}$, and Eq. (3.2.3) provides
$M_{12} = 2\sqrt{2-\sqrt{3}}\big/\sqrt{2+\sqrt{3}}$. Therefore,
$$12\left(\frac{\sqrt{2-\sqrt{3}}}{2}\right) < \pi < 12\left(\frac{\sqrt{2-\sqrt{3}}}{\sqrt{2+\sqrt{3}}}\right)$$
Continuing this way up to 96-gons, Archimedes stopped:
$$96\left(\frac{\sqrt{2-\sqrt{2+\sqrt{2+\sqrt{2+\sqrt{3}}}}}}{2}\right) < \pi < 96\left(\frac{\sqrt{2-\sqrt{2+\sqrt{2+\sqrt{2+\sqrt{3}}}}}}{\sqrt{2+\sqrt{2+\sqrt{2+\sqrt{2+\sqrt{3}}}}}}\right)$$

Now, Archimedes faced a tedious task: computing those square roots without a calculator, without
decimal numbers (without even the number 0). He probably used Eq. (2.9.1) to compute the
roots. I used my computer to evaluate them and got
$$3.14103195089053 < \pi < 3.1427145996453882 \qquad (3.2.4)$$

My computer gives $\pi = 3.1415926535897\ldots$. Therefore, Archimedes got only two correct
decimals. But it was a magnificent achievement! This polygonal algorithm dominated for over
1 000 years until infinite series were discovered. I presented one such infinite series for $\pi$ in
Eq. (2.21.16). And there is Machin's formula in Eq. (3.10.3). And we shall see more in this
book.
But Archimedes could not write Eq. (3.2.4) as he did not have decimals. Instead, he used
fractions and wrote: $3\tfrac{10}{71} < \pi < 3\tfrac{1}{7}$.
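Here is a minimal Python sketch (mine, using modern floating-point arithmetic rather than Archimedes' hand computations) of the doubling scheme of Eqs. (3.2.2) and (3.2.3):

```python
import math

# Archimedes' doubling scheme for a unit circle:
# m = side of the inscribed 6*2^k-gon, M = side of the circumscribed one,
# so that n*m/2 < pi < n*M/2.
n, m = 6, 1.0                                   # start with the inscribed hexagon
for _ in range(5):                              # n = 6, 12, 24, 48, 96
    M = 2 * m / math.sqrt(4 - m * m)            # Eq. (3.2.3)
    print(f"{n:3d}-gon: {n * m / 2:.10f} < pi < {n * M / 2:.10f}")
    m = math.sqrt(2 - math.sqrt(4 - m * m))     # Eq. (3.2.2): side of the 2n-gon
    n *= 2
```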
Pi is a special number; various books have been written about it. There is even a day called Pi day
(March 14), which is coincidentally also the birthday of Albert Einstein (14 March 1879). People
keep calculating more and more digits of this number. Note that no one cares about the decimal
digits of $\sqrt{2}$. I recommend the book A History of Pi by Petr Beckmann.


History note 3.2: Archimedes (c. 287 - c. 212 BC)


Archimedes of Syracuse was a Greek mathematician, physicist, engi-
neer, inventor, and astronomer. Considered to be the greatest math-
ematician of ancient history, and one of the greatest of all time,
Archimedes anticipated modern calculus and analysis by applying
concepts of infinitesimals and the method of exhaustion to derive and
rigorously prove a range of geometrical theorems, including: the area
of a circle; the surface area and volume of a sphere; area of an ellipse;
the area under a parabola; the volume of a segment of a paraboloid of
revolution; the volume of a segment of a hyperboloid of revolution;
and the area of a spiral. He derived an accurate approximation of π.
Archimedes died during the Siege of Syracuse, where he was killed by a Roman sol-
dier despite orders that he should not be harmed. Cicero describes visiting the tomb of
Archimedes, which was surmounted by a sphere and a cylinder, which Archimedes had
requested be placed on his tomb to represent his mathematical discoveries.

Liu Hui’s algorithm. Liu Hui (3rd century CE) was a Chinese mathematician and writer who
lived in the Three Kingdoms period (220–280) of China. In 263, he edited and published a book
with solutions to mathematical problems presented in the famous Chinese book of mathematics
known as The Nine Chapters on the Mathematical Art, in which he was possibly the first mathe-
matician to discover, understand and use negative numbers. Along with Zu Chongzhi (429–500),
Liu Hui was known as one of the greatest mathematicians of ancient China. In this section I
present his method to determine π.
Liu Hui first derived an inequality for $\pi$ based on the areas of inscribed polygons with $N$ and
$2N$ sides. In the diagram of Fig. 3.36a, $ABCD$ is an $N$-gon whereas $AEBFCGDH$ is a
$2N$-gon, both inscribed in the circle. Regarding the areas of these polygons, where $A_N$ is the
area of the $N$-gon, $A_{2N}$ the area of the $2N$-gon and $A_c$ the area of the circle, we have
the following relations:
$$A_N = \text{yellow area} \qquad (3.2.5a)$$
$$A_{2N} = \text{yellow area} + \text{orange area} \qquad (3.2.5b)$$
$$A_{2N} < A_c < A_{2N} + \text{gray area} \qquad (3.2.5c)$$
$$\text{gray area} = \text{orange area} \qquad (3.2.5d)$$
Therefore, we can deduce that
$$A_{2N} < A_c < 2A_{2N} - A_N \;\Longrightarrow\; A_{2N} < \pi < 2A_{2N} - A_N \qquad (3.2.6)$$
where the last inequality holds when considering a circle of unit radius.
Liu Hui then computed the areas of inscribed polygons with $N$ and $2N$ sides. To that end, he
needed a formula relating the side of the $2N$-gon, denoted by $m$, to that of the $N$-gon, denoted
by $M$. Using the Pythagorean theorem for triangle $BHC$ and for triangle $OAH$, he derived the


Figure 3.36: Liu Hui's method for determining $\pi$ based on the areas of inscribed polygons with $N$ and
$2N$ sides. In (b): $AB = M$, $AC = m$, $OA = r$.

following equation (see the figure, illustrated with $N = 6$; $AB$ is the side of the $N$-gon and $BC$ is
the side of the $2N$-gon, both inscribed in the circle of radius $r$), noting that $HC = r - OH$ and
$OH^2 + (M/2)^2 = r^2$:
$$m = \sqrt{\left(\frac{M}{2}\right)^2 + \left(r - \sqrt{r^2 - \left(\frac{M}{2}\right)^2}\right)^2}$$
Putting $r = 1$ (unit), we have:
$$m = \sqrt{\frac{M^2}{4} + \left(1 - \sqrt{1 - \frac{M^2}{4}}\right)^2} \qquad (3.2.7)$$
Now, he calculated the area of the $N$-gon approximately as the sum of the areas of all the triangles
making up the polygon:
$$A_N \approx N\left(\frac{1}{2}Mr\right) = N\left(\frac{1}{2}M\right) \qquad (3.2.8)$$
2 2
Now comes the complete algorithm: we start with $N = 6$ (hexagon), thus $M = 1$ (as $r = 1$).
Then we repeat the following steps (a small numerical sketch is given after the list):

• compute $m$ using Eq. (3.2.7);

• compute $A_N = 0.5NM$ and $A_{2N} = 0.5(2N)m$;

• then Eq. (3.2.6) yields $A_{2N} < \pi < 2A_{2N} - A_N$;

• next iteration: $M \leftarrow m$, $N \leftarrow 2N$.
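The following short Python sketch (my own illustration of the steps just listed) runs the algorithm a few times:

```python
import math

# Liu Hui's doubling algorithm for a unit circle (r = 1), following the steps above.
# M = side of the inscribed N-gon, m = side of the inscribed 2N-gon.
N, M = 6, 1.0
for _ in range(4):                                  # four repetitions: up to N = 48
    m = math.sqrt(M**2 / 4 + (1 - math.sqrt(1 - M**2 / 4))**2)    # Eq. (3.2.7)
    A_N, A_2N = 0.5 * N * M, 0.5 * (2 * N) * m
    print(f"N = {N:2d}: {A_2N:.8f} < pi < {2 * A_2N - A_N:.8f}")  # Eq. (3.2.6)
    M, N = m, 2 * N
```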


If we repeat this algorithm four times, i.e., ending with the 48-gon and 96-gon, we get this approximation
for π: $3.14103195 < \pi < 3.14271370$. The Chinese astronomer and mathematician Zu
Chongzhi (429–500 AD) got $3.141592619365 < \pi < 3.141592722039$ with a 12288-gon, a
record which would not be surpassed for 1200 years. Even by 1600 in Europe, the Dutch mathematician
Adriaan Anthonisz and his son obtained a value of $\pi$ of 3.1415929, accurate to only 7 digits.

Ramanujan’s pi formula. Srinivasa Ramanujan (22 December 1887 – 26 April 1920) was an
Indian mathematician who lived during the British Rule in India.
Albeit without any formal training in pure mathematics, he has made
substantial contributions to mathematical analysis, number theory, infi-
nite series, and continued fractions, including solutions to mathematical
problems then considered unsolvable. Ramanujan initially developed his
own mathematical research in isolation: according to Hans Eysenck, a
German-born British psychologist : "He tried to interest the leading pro-
fessional mathematicians in his work, but failed for the most part. What
he had to show them was too novel, too unfamiliar, and additionally
presented in unusual ways; they could not be bothered". Seeking math-
ematicians who could better understand his work, in 1913 he began a
postal correspondence with the English mathematician Godfrey Hardy at
the University of Cambridge. Recognizing Ramanujan’s work as extraor-
dinary, Hardy arranged for him to travel to Cambridge.
Ramanujan gave us, among many other amazing formulas, the following formula for $1/\pi$:
$$\frac{1}{\pi} = \frac{2\sqrt{2}}{9801}\sum_{k=0}^{\infty}\frac{(4k)!\,(1103 + 26390k)}{(k!)^4\, 396^{4k}} \qquad (3.2.9)$$
With only one term, we get $\pi \approx 3.14159273$, already correct to seven decimal places; each additional
term adds roughly eight more correct digits. I do not know the derivation of this formula. But
it is certain that it did not come from the methods of the ancient mathematicians, which relied on
geometry. Ramanujan had in his hands the power of 20th century mathematics. To know more
about Ramanujan, I recommend the 2015 British biographical drama film 'The Man Who Knew
Infinity'. The movie is based on the 1991 book of the same name by Robert Kanigel.
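A quick Python check of Eq. (3.2.9) (a sketch of mine; only the series itself is Ramanujan's):

```python
import math

# Evaluate partial sums of Ramanujan's series (3.2.9) for 1/pi.
def ramanujan_pi(n_terms):
    s = sum(math.factorial(4 * k) * (1103 + 26390 * k)
            / (math.factorial(k)**4 * 396**(4 * k)) for k in range(n_terms))
    return 9801 / (2 * math.sqrt(2) * s)

print(ramanujan_pi(1))   # about 7 correct digits already
print(ramanujan_pi(2))   # at the limit of double-precision arithmetic
```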

History note 3.3: The first letter of Ramanujan to Hardy.


Dear Sir,
I beg to introduce myself to you as a clerk in the Accounts Department of the Port
Trust Office at Madras on a salary of only £20 per annum. I am now about 23 years
of age. I have had no University education but I have undergone the ordinary school
course. After leaving school I have been employing the spare time at my disposal to
work at Mathematics. I have not trodden through the conventional regular course which is
followed in a University course, but I am striking out a new path for myself. I have made
a special investigation of divergent series in general and the results I get are termed by the
local mathematicians as ‘startling’.
Just as in elementary mathematics you give a meaning to $a^n$ when $n$ is negative and


fractional to conform to the law which holds when $n$ is a positive integer, similarly the
whole of my investigations proceed on giving a meaning to the Eulerian Second Integral
for all values of $n$. My friends who have gone through the regular course of University
education tell me that $\int_0^\infty x^{n-1}e^{-x}\,dx = \Gamma(n)$ is true only when $n$ is positive. They
say that this integral relation is not true when $n$ is negative. Supposing this is true only for
positive values of $n$ and also supposing the definition $n\,\Gamma(n) = \Gamma(n+1)$ to be universally
true, I have given meanings to these integrals and under the conditions I state the integral
is true for all values of $n$, negative and fractional. My whole investigations are based upon
this and I have been developing this to a remarkable extent, so much so that the local
mathematicians are not able to understand me in my higher flights.
Very recently I came across a tract published by you styled Orders of Infinity in page
36 of which I find a statement that no definite expression has been as yet found for
the number of prime numbers less than any given number. I have found an expression
which very nearly approximates to the real result, the error being negligible. I would
request you to go through the enclosed papers. Being poor, if you are convinced that
there is anything of value I would like to have my theorems published. I have not given
the actual investigations nor the expressions that I get but I have indicated the lines on
which I proceed. Being inexperienced I would very highly value any advice you give me.
Requesting to be excused for the trouble I give you.
I remain, Dear Sir, Yours truly, S. Ramanujan

3.3 Trigonometric functions: right triangles


Consider two similar right-angled triangles (or right triangles) as shown in Fig. 3.37. Now is
the time to apply Euclid's geometry: as the two triangles $ABC$ and $A'B'C'$ are similar, we have
$$\frac{AC}{A'C'} = \frac{AB}{A'B'}$$
From that it is simple to deduce
$$\frac{AC}{AB} = \frac{A'C'}{A'B'} = \frac{4}{3}$$
What does this mean? It shows that for all right triangles with an angle $\alpha$, the side ratio
$AC/AB$ is constant.

Now comes the key point. If we can manage to compute the ratio $AC/AB$ for a given angle
$\alpha$, then we can use it to solve any right triangle whose angle at $B$ equals $\alpha$. Thus, we have our
very first trigonometric function, the tangent:
$$\tan\alpha := \frac{AC}{AB}$$
Thus a trigonometric function relates an angle of a right-angled triangle to the ratio of two side
lengths. And if we have a table of the tangent, i.e., for each angle $\alpha$ we can look up its $\tan\alpha$, we


Figure 3.37: Two similar right-angled triangles $ABC$ and $A'B'C'$: $AC/AB = A'C'/A'B'$.

can then solve every right-triangle problem; in Fig. 3.38a we can determine $A_1C_1 = A_1B\tan\alpha$.
The first trigonometric table was apparently compiled by Hipparchus of Nicaea (180 – 125 BCE),
who is now consequently known as "the father of trigonometry."
Figure 3.38: Knowing $\tan\alpha$ allows us to determine $A_1C_1 = A_1B\tan\alpha$. In (b): the sides of a right
triangle relative to the angle $\alpha$ are labelled opposite, adjacent, and hypotenuse.

Why just the ratio $AC/AB$? All three sides of a triangle should be treated equally, and their
ratios are constants for all right triangles with the same angle $\alpha$. If so, from three sides we can
form six ratios! And voilà, we have six trigonometric functions. Quite often they are also referred
to as the six trigonometric ratios. They include sine, cosine and tangent and their reciprocals, and
are defined as (Fig. 3.38b):
$$\cos\alpha = \frac{\text{adjacent}}{\text{hypotenuse}} = \frac{AB}{BC}, \qquad \sec\alpha = \frac{BC}{AB} = \frac{1}{\cos\alpha}$$
$$\sin\alpha = \frac{\text{opposite}}{\text{hypotenuse}} = \frac{AC}{BC}, \qquad \csc\alpha = \frac{BC}{AC} = \frac{1}{\sin\alpha}$$
$$\tan\alpha = \frac{\text{opposite}}{\text{adjacent}} = \frac{AC}{AB}, \qquad \cot\alpha = \frac{AB}{AC} = \frac{1}{\tan\alpha}$$
The secant of $\alpha$ is 1 divided by the cosine of $\alpha$, the cosecant of $\alpha$ is defined to be 1 divided by


the sine of $\alpha$, and the cotangent (cot) of $\alpha$ is 1 divided by the tangent of $\alpha$. These three functions
(secant, cosecant and cotangent) are the reciprocals of the cosine, sine and tangent.
Where these names come from is to be explained in the next section.

3.4 Trigonometric functions: unit circle


Earlier forms of trigonometry can be traced back to the ancient Greeks, notably to the two
mathematicians Hipparchus and Ptolemy. This version of trigonometry was based on chords in
a circle. Precisely, Hipparchus’ trigonometry was based on the chord subtending a given arc in
a circle of fixed radius R, see Fig. 3.39. Indian mathematicians inherited this trigonometry and
modified it. Instead of using the chord they used half chord. Therefore, the Indian half-chord is
closely related to our sine.
Figure 3.39: The Greek chord and the Indian half-chord: $AB = \operatorname{crd}(2\alpha)$ and $AC = \frac{1}{2}AB$. When $R = 1$,
the Indian half-chord $AC$ is our sine.

The Sanskrit word for chord-half was jya-ardha, which was sometimes shortened to jiva. This
was brought into Arabic as jiba, and written in Arabic simply with two consonants jb, vowels
not being written. Later, Latin translators selected the word sinus to translate jb thinking that
the word was an arabic word jaib, which meant breast, and sinus had breast and bay as two of
its meanings. In English, sinus was imported as “sine.” This word history for sine is interesting
because it follows the development path of trigonometry from India, through the Arabic language
from Baghdad through Spain, into western Europe in the Latin language, and then to modern
languages such as English and the rest of the world.
Right triangles have a serious limitation. They are excellent for angles up to 90ı . How about
angles larger than that? And how about negative angles? We change now to a circle which solves
all these limitations.
We consider a unit circle (i.e., a circle with a unit radius) centered at the origin of the Cartesian
coordinate system (refer to Section 3.20.1 for details). Angles are measured from the positive
$x$ axis counterclockwise; thus 90° is straight up, 180° is to the left (Fig. 3.40). The circle is
divided into four quadrants: the first quadrant is for angles $\alpha \in [0°, 90°]$; those angles are called
acute angles. The second quadrant is for angles $\alpha \in [90°, 180°]$; those angles are obtuse angles.
Angles in the third and fourth quadrants, $\alpha \in [180°, 360°]$, are called reflex angles.


Figure 3.40: The unit circle: angles are measured counterclockwise. A point $A$ at angle $\alpha$ has coordinates
$(\cos\alpha, \sin\alpha)$; the circle crosses the axes at $(1,0)$, $(0,1)$, $(-1,0)$ and $(0,-1)$.

An angle $\alpha$ corresponds to a point $A$ on the circle. The $x$-coordinate of this point is
$\cos\alpha$ whereas the $y$-coordinate is $\sin\alpha$.

Mnemonics in trigonometry. The sine, cosine, and tangent ratios in a right triangle can be
remembered by representing them as strings of letters, for instance SOH-CAH-TOA in English:

Sine = Opposite/Hypotenuse
Cosine = Adjacent/Hypotenuse
Tangent = Opposite/Adjacent

One way to remember the letters is to sound them out phonetically.


However, the acronym SOH-CAH-TOA does not help us understand why the ratio is defined
this way. Armed only with “SOH CAH TOA,” it becomes confusing to find the values of
trigonometric ratios for angles whose measures are 90 degrees or greater, or negative. The
acronym is not a bad tool, but it is insufficient to help understand trigonometric ratios. We
believe it is better not to teach students this SOH-CAH-TOA (and the likes). Instead, as the
origin of the sine is a chord, which measures the height of a right triangle, sine must be the ratio
of opposite over hypotenuse. Cosine then uses the other side of the triangle, namely the adjacent.
And tangent is just a derived quantity: it is the ratio of sine over cosine. There is no need to
memorize anything about it.

3.5 Degree versus radian


Angles are measured either in degrees or in radians. Even though it is common to have different
units for a quantity (we have meters, miles, yards as units for length), we should understand
the origin and benefits of the different units for angles. Angles were first expressed in terms of
degree and a full circle is 360ı . Why 360? We never know why but it came from Babylonians,
probably because we have 365 days in a year.


The use of the degree as the measure for angles is quite arbitrary. Mathematicians found a better
way for this purpose: they used the length of an arc. Referring to Fig. 3.41, one radian (1 rad) is
the angle corresponding to an arc ($AB$ in the figure) whose length is the circle radius (denoted
by $r$). If the angle is two radians, the length of the arc is $2r$. Thus, if the angle is $\alpha$ (in radians),
the length of the arc is $\alpha r$. As a full circle has a perimeter of $2\pi r$, a full circle is $2\pi$ or 360°.
So, an angle of $\alpha$ (in degrees) is $\alpha\pi/180$ in radians: 90° is $\pi/2$, 180° is $\pi$, etc.

Figure 3.41: Definition of the radian as a unit of angle measurement.

With radians, the mathematics is simpler. For example, if $\alpha$ is in radians, the length of an arc is
simply $\alpha r$, where $r$ is the radius of the circle. On the other hand, the arc length would be
$(\pi\alpha/180)r$ if the angle were in degrees. In calculus, derivatives of trigonometric functions are also
simpler when angles are expressed in radians. For example, the derivative of $\sin x$ would be
$(\pi/180)\cos x$ instead of $\cos x$; see Section 4.4.8 for details.
The concept of radian measure, as opposed to the degree of an angle, is normally credited
to the English mathematician Roger Cotes (1682 – 1716) in 1714. He described the radian in
everything but name, and recognized its naturalness as a unit of angular measure. Prior to the
term radian becoming widespread, the unit was commonly called circular measure of an angle.
The idea of measuring angles by the length of the arc was already in use by other mathematicians.
For example, the Persian astronomer and mathematician al-Kashi (c. 1380 – 1429) used so-called
diameter parts as units, where one diameter part was 1=60 radian.
The term radian first appeared in print on 5 June 1873, in examination questions set by
the British engineer and physicist James Thomson (1822 – 1892) at Queen’s College, Belfast .
He had used the term as early as 1871, while in 1869 the Scottish mathematician Thomas
Muir (1844 – 1934) had vacillated between the terms rad, radial, and radian. In 1874, after a
consultation with James Thomson, Muir adopted radian. The name radian was not universally
adopted for some time after this. Longmans’ School Trigonometry still called the radian circular
measure when published in 1890.

3.6 Some first properties


It is obvious that these six trigonometric functions are not independent of each other. One can
derive the following relations between them quite straightforwardly:
$$\sin^2\alpha + \cos^2\alpha = 1$$
$$1 + \tan^2\alpha = \frac{1}{\cos^2\alpha} = \sec^2\alpha \qquad (3.6.1)$$
$$1 + \cot^2\alpha = \frac{1}{\sin^2\alpha} = \csc^2\alpha$$

James Thomson’s reputation is substantial though it is overshadowed by that of his younger brother William
Thomson or Lord Kelvin whose name is used for absolute temperatures.


where the second and third identities are obtained from the first by dividing both sides by $\cos^2\alpha$
and $\sin^2\alpha$, respectively.
Without actually doing any calculations, just starting from these definitions and some observations
(Fig. 3.42), we can see that $-1 \le \sin\alpha \le 1$, $-1 \le \cos\alpha \le 1$, and
$$\begin{cases}\sin(-\alpha) = -\sin\alpha\\ \cos(-\alpha) = \cos\alpha\end{cases}\quad
\begin{cases}\sin(\pi-\alpha) = \sin\alpha\\ \cos(\pi-\alpha) = -\cos\alpha\end{cases}\quad
\begin{cases}\sin(\pi/2-\alpha) = \cos\alpha\\ \cos(\pi/2-\alpha) = \sin\alpha\end{cases} \qquad (3.6.2)$$
The first pair means that the function $y = \sin x$ is odd and that $y = \cos x$ is even
(see Section 4.2.1 for more details). Why bother? Not only because we like classification, but
also because odd and even functions possess special properties (e.g. $\int_{-\pi}^{\pi}\sin x\,dx = 0$, a nice result, isn't
it?). The second pair in Eq. (3.6.2) allows us to compute the sine only for angles $0 \le \theta \le \pi/2$ (these
angles are called first quadrant angles); the sine of $\pi - \theta$ is then simply $\sin\theta$. From the third
pair we see that the value of the cosine $\cos(x)$ equals the value of $\sin(\pi/2 - x)$ for the
complementary angle. And that explains the name 'cosine': complement of sine.
To see why the tangent is so called, see Fig. 3.47. And then it is easy to understand the name
cotangent, as $\tan(\pi/2 - \alpha) = \cot\alpha$. Note that we did not list identities for tangent and cotangent
here for brevity, as $\tan\theta$ and $\cot\theta$ are functions of the sine and cosine.

Figure 3.42: Some simple facts about $\sin x$ and $\cos x$: $\sin(-\alpha) = -\sin\alpha$, $\cos(-\alpha) = \cos\alpha$;
$\sin(\pi-\alpha) = \sin\alpha$, $\cos(\pi-\alpha) = -\cos\alpha$; $\sin(\pi/2-\alpha) = \cos\alpha$, $\cos(\pi/2-\alpha) = \sin\alpha$.

If we start at a certain point on a circle (this point has an angle $\theta$) and we go a full round,
then we're just back to where we started. That means $\sin(\theta + 2\pi)$ is simply $\sin\theta$. But if we go
another round we also get back to the starting point. Thus, for $n$ being any integer, we
have:
$$\sin(\alpha + 2\pi n) = \sin\alpha$$
$$\cos(\alpha + 2\pi n) = \cos\alpha \qquad (3.6.3)$$
That's why when we solve trigonometric equations like $\cos x = \sqrt{2}/2$, the solution is $x =
\pm\pi/4 + 2\pi n$ with $n \in \mathbb{Z}$, not simply $x = \pm\pi/4$.


3.7 Sine table


Now, we are going to calculate the sines and cosines of angles from 0 degrees to 360 degrees.
You might be asking why bother if we all have calculators which are capable of doing this for us?
Note that we’re not interested herein in the results (i.e., the sine of a given angle), but interested
in how trigonometric functions are actually computed. Along the process, new mathematics
were invented (or discovered) to finish the job. Astronomers, for example, always need to know
the sine/cosine of angles from 0 degrees to 360 degrees. That was why the Greek astronomer,
geometer, and geographer Ptolemy in Egypt during the 2nd century AD, produced such a table
in Book I of his masterpiece Almagest.
As sine and cosine are related via $\sin^2\alpha + \cos^2\alpha = 1$, and the other
trigonometric functions are derived from sine/cosine, knowing the
sines is enough. So we set ourselves the task of building a sine
table for angles from 0° to 90°. We start simple, with only whole-numbered
angles, i.e., 1°, 2°, … Now, one small observation will save us
big time: as $\sin^2\alpha + \sin^2(\pi/2 - \alpha) = 1$, if we know $\sin\alpha$, we know
$\sin(\pi/2 - \alpha)$. Thus the work is cut in halfŽ.
As such a detailed sine table is of no value today, we present
the sines/cosines of a few 'common' angles in Table 3.3 for reference.
Among these values, we just need to calculate the sines for
$\theta \in \{45°, 60°\}$. The calculation for 45° and 60° is demonstrated in
Fig. 3.43; we simply used the definition of sine/cosine, and of course the famous Pythagorean
theorem. Because $\theta \in \{0°, 90°, 180°, 270°\}$ coincide with the vertices of the unit circle, their
sines are easy to obtain.


Ptolemy computed the sine of 36° or $\pi/5$ using a geometric reasoning based on Proposition
10 of Book XIII of Euclid's Elements (we present a different way later). Knowing the sine of
36° we then know the sine of 54°. Using the trigonometric identity $\sin(\alpha \pm \beta) = \sin\alpha\cos\beta \pm
\cos\alpha\sin\beta$, to be discussed in Section 3.8, we get the sine of 72° with $\alpha = \beta = 36°$, the sine of
18° with $\alpha = 72°$, $\beta = 54°$, and the sine of 75° with $\alpha = 30°$, $\beta = 45°$. With $\alpha = 75°$, $\beta = 72°$
we get $\sin 3°$. And from that we can get the sines of all multiples of 3°, i.e., 6°, 9°, 12°, etc. (for
example using $\sin 2x = 2\sin x\cos x$).
If we know $\sin 1°$, then we will know $\sin 2°$, $\sin 6°$, $\sin 5°$Ž, etc., and we're done. But Ptolemy
could not find $\sin 1°$ directly; he found an approximate method for it (see Section 3.11). The
Persian astronomer al-Kashi (c. 1380 – 1429), in his book The Treatise on the Chord and Sine,
computed $\sin 1°$ to any accuracy. In the process, he discovered the triple angle identity often
attributed to François Viète in the sixteenth century.
Using the triple-angle identity $\sin(3\alpha) = 3\sin\alpha - 4\sin^3\alpha$ (to be discussed in the next section§),
he related $\sin 1°$ to $\sin 3°$ (which he knew) via the following cubic equation:
$$\sin 3° = 3\sin 1° - 4\sin^3 1°$$


Ž For example, if we know sin 44°, then $\sin 46° = \sqrt{1 - \sin^2 44°}$.
Ž Note that we already have sin 3°.
§ Actually we derived this identity in Section 2.25.5 using complex numbers.


Table 3.3: Sines and cosines of some angles from 0 degrees to 360 degrees.

Angle θ [degree]   Angle [rad]   sin θ    cos θ
0                  0             0        1
30                 π/6           1/2      √3/2
45                 π/4           √2/2     √2/2
60                 π/3           √3/2     1/2
90                 π/2           1        0
180                π             0        -1
270                3π/2          -1       0
360                2π            0        1


Figure 3.43: Calculation of sine and cosine for $\theta = \pi/4$, $\theta = \pi/6$ and $\theta = \pi/3$. For $\pi/4$, we start with
the unit square and draw a diagonal, whose length is $\sqrt{2}$ by the Pythagorean theorem. We then
have two identical right triangles. Consider one right triangle: its acute angles are $\pi/4$. We know all the
sides of this right triangle, thus we know how to compute the sine/cosine of $\pi/4$. Similarly, for $\pi/3$, we
start with an equilateral triangle of side 2 (why?).

But the cubic would not be solved for another 125 years, by Cardano. Clearly, al-Kashi could not
wait that long. What did he do? With $x = \sin 1°$, he wrote
$$\sin 3° = 3x - 4x^3 \;\Longrightarrow\; x = \frac{\sin 3° + 4x^3}{3}$$
What is this? This is the fixed-point iteration method discussed in Section 2.11. With only four
iterations we get $\sin 1°$ to an accuracy of 12 decimal places: $\sin 1° = \sin 0.017453292520 =
0.017452406437$. al-Kashi gave us $\sin 1°$. Is there anything else? Look at the red numbers: what
do you see? It seems that we have $\sin x \approx x$, at least for $x = 1°$. This is even more important
than the value of $\sin 1°$. Why? Because if that is the case, we can replace $\sin x$–which is a complex
functionŽŽ–by the very simple $x$.
ŽŽ
In the sense computing the sine of an angle is hard.
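A small Python sketch of this fixed-point iteration (mine, for illustration; al-Kashi of course worked by hand):

```python
import math

# al-Kashi's fixed-point iteration x = (sin 3° + 4x^3)/3 for x = sin 1°.
sin3 = math.sin(math.radians(3))   # sin 3° is known from the constructions above
x = 0.0                            # starting guess
for _ in range(4):
    x = (sin3 + 4 * x**3) / 3
print(x)                           # ~0.017452406437
print(math.sin(math.radians(1)))   # the library value, for comparison
```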


3.8 Trigonometry identities


What is a mathematical identity? An example answers this question: $(a+b)^2 = a^2 + 2ab +
b^2$ is an identity, as it asserts the equality of two expressions (containing some variables) for
all values of these variables. Thus, a trigonometric identity is an identity involving trigonometric
functions; for example $\sin^2 x + \cos^2 x = 1$ is a trigonometric identity as it is correct for all $x$.
There are several trigonometric identities (actually there are too many, unfortunately) and
I once learnt them by heart. It was a terrible mistakeŽ. Below, we will list the most popular
identities and provide proofs for them. So, the point is: do not learn them by heart; instead,
understand how to construct them from scratch. As will be shown, all identities are derived from
one basic identity! This identity can be $\sin(\alpha + \beta)$ or $\cos(\alpha - \beta)$.
Even though this section covers many common trigonometric identities, the list given below
is not meant to be exhaustive. Later sections will present some interesting but less common
identities. Note also that for the three angles of a triangle there are many trigonometric identities,
e.g. $\cot\frac{x}{2} + \cot\frac{y}{2} + \cot\frac{z}{2} = \cot\frac{x}{2}\cot\frac{y}{2}\cot\frac{z}{2}$ for $x + y + z = \pi$. These identities will be
presented at the end of this section.

(angle addition and subtraction)
$$\sin(\alpha \pm \beta) = \sin\alpha\cos\beta \pm \cos\alpha\sin\beta$$
$$\cos(\alpha \pm \beta) = \cos\alpha\cos\beta \mp \sin\alpha\sin\beta$$
$$\tan(\alpha \pm \beta) = \frac{\tan\alpha \pm \tan\beta}{1 \mp \tan\alpha\tan\beta} \qquad (3.8.1)$$
$$\cot(\alpha \pm \beta) = \frac{\pm\cot\alpha\cot\beta - 1}{\cot\alpha \pm \cot\beta}$$
The proof of the angle addition formulae for sine and cosine is shown in Fig. 3.44. The idea
is to use the definition of sine and cosine, constructing right triangles containing $\alpha$
and $\beta$ and their sum. The choice $OC = 1$ simplifies the calculations. The formula for
$\sin(\alpha - \beta)$ can be obtained from the addition formula by replacing $\beta$ with $-\beta$ and noting
that $\sin(-\beta) = -\sin\beta$. Or we can prove the formula for $\cos(\alpha - \beta)$ using the unit circle, as
shown in Fig. 3.45.
The identity for the angle addition for the tangent is obtained directly from its definition and
the available formulae for sine and cosine:
$$\tan(\alpha+\beta) = \frac{\sin(\alpha+\beta)}{\cos(\alpha+\beta)} = \frac{\sin\alpha\cos\beta + \cos\alpha\sin\beta}{\cos\alpha\cos\beta - \sin\alpha\sin\beta} = \frac{\tan\alpha + \tan\beta}{1 - \tan\alpha\tan\beta} \qquad (3.8.2)$$
where in the last step we divided both the numerator and the denominator by $\cos\alpha\cos\beta$ so that
tangents would appear.
Ž
The French-American mathematician Serge Lang (1927 – 2005) in his interesting book Math: Encounters
with high school students [39] advised high school students to memorize formula and understand the proof. He
wrote that he himself did that. I think Lang’s advice is helpful for tests and exams where time matters. Lang was a
prolific writer of mathematical texts, often completing one on his summer vacation. Lang’s Algebra, a graduate-level
introduction to abstract algebra, was a highly influential text.


Figure 3.44: Proof of $\sin(\alpha + \beta)$ and $\cos(\alpha + \beta)$, noting that $OC = 1$: $\sin(\alpha+\beta) = CH = AD$ with
$AB = \sin\alpha\cos\beta$ and $BD = \cos\alpha\sin\beta$; $\cos(\alpha+\beta) = OH = OA - HA$ with $OA = \cos\alpha\cos\beta$ and
$HA = CD = \sin\alpha\sin\beta$.


Figure 3.45: Proof of $\cos(\alpha - \beta)$: expressing the distance $d$ between the points $P$ and $Q$ in two ways. In
the left figure $P = (\cos\alpha, \sin\alpha)$ and $Q = (\cos\beta, \sin\beta)$; in the right figure $P = (\cos(\alpha-\beta), \sin(\alpha-\beta))$
and $Q = (1, 0)$. Recall that $d^2 = (x_1 - x_2)^2 + (y_1 - y_2)^2$ is the squared distance between two points
$(x_1, y_1)$ and $(x_2, y_2)$. That exercise gives us $\cos(\alpha - \beta)$. From this result, we can get $\cos(\alpha + \beta)$, and
$\sin(\alpha - \beta)$ by writing $\sin(\alpha - \beta) = \cos(\pi/2 - (\alpha - \beta)) = \cos[(\pi/2 - \alpha) + \beta]$, then using the angle
addition formula for cosine.

From the angle addition formula for sine, it follows that $\sin(2\alpha) = \sin(\alpha + \alpha) =
2\sin\alpha\cos\alpha$. Similarly, one can get the double angle formula for cosine. If you do not like this geometry-based
derivation, don't forget we have another proof using complex numbers (Section 2.25).
Thus, we have the following double angle identities
(double-angle)
$$\sin(2\alpha) = 2\sin\alpha\cos\alpha$$
$$\cos(2\alpha) = \cos^2\alpha - \sin^2\alpha = 2\cos^2\alpha - 1 = 1 - 2\sin^2\alpha \qquad (3.8.3)$$
$$\tan(2\alpha) = \frac{2\tan\alpha}{1 - \tan^2\alpha}$$
The triple-angle formula for sine can be obtained from the angle addition formula as follows
$$\begin{aligned}\sin(3\alpha) &= \sin(2\alpha + \alpha)\\ &= \sin(2\alpha)\cos\alpha + \cos(2\alpha)\sin\alpha\\ &= 2\sin\alpha\cos^2\alpha + \sin\alpha(\cos^2\alpha - \sin^2\alpha)\\ &= 2\sin\alpha(1 - \sin^2\alpha) + \sin\alpha(1 - \sin^2\alpha - \sin^2\alpha) = 3\sin\alpha - 4\sin^3\alpha\end{aligned}$$


And the derivation of the triple-angle formula for tangent is straightforward from the definition of tangent:
$$\tan(3\alpha) = \frac{\sin(3\alpha)}{\cos(3\alpha)} = \frac{3\sin\alpha - 4\sin^3\alpha}{4\cos^3\alpha - 3\cos\alpha} = \frac{3\tan\alpha - \tan^3\alpha}{1 - 3\tan^2\alpha}$$
where in the last equality we divided both numerator and denominator by $\cos^3\alpha$ so that $\tan\alpha$
would appear. Admittedly, we did that because we already knew the result. But if you did not know
the result, you would do the same thing. Why? Because we believe in the pattern: $\sin(3\alpha)$ is
expressed in terms of powers of $\sin\alpha$ and the same pattern occurs for cosine. Why not tangent?
Usually, this belief pays off. Remember the story of Newton discovering the binomial theorem?
The triple angle identities are thus given by,
(triple-angle)
$$\sin(3\alpha) = 3\sin\alpha - 4\sin^3\alpha$$
$$\cos(3\alpha) = 4\cos^3\alpha - 3\cos\alpha \qquad (3.8.4)$$
$$\tan(3\alpha) = \frac{3\tan\alpha - \tan^3\alpha}{1 - 3\tan^2\alpha}$$

From the double-angle formula for cosine, $\cos(2\alpha) = \cos^2\alpha - \sin^2\alpha = 2\cos^2\alpha - 1$, we can derive
the identities for the half angle. A geometric proof is shown in Fig. 3.46. The proof is simple
but it requires some knowledge of Euclidean geometry. In particular, we need the central angle
theorem, of which a proof is given in Fig. 3.17. It is this theorem that gives us the angle $2\theta$ at
$O$ in Fig. 3.46.

Figure 3.46: Proof of the half-angle cosine formula using geometry. Consider a circle of radius $1/2$ and
a right triangle $ACB$ with $AB$ a diameter of the circle; the central angle at $O$ is $2\theta$. Note that $\sin\beta = \cos\theta$
as $\theta + \beta = \pi/2$, so $AH = AC\cos\theta = \sin\beta\cos\theta = \cos^2\theta$, while also $AH = OA + OH = \frac{1}{2} + \frac{1}{2}\cos 2\theta$.

(half-angle)
$$\cos\alpha = \sqrt{\frac{1 + \cos(2\alpha)}{2}} \qquad (3.8.5)$$
$$\sin\alpha = \sqrt{\frac{1 - \cos(2\alpha)}{2}}$$

Next come the so-called product identities:
(Product identities)
$$\sin\alpha\cos\beta = \frac{\sin(\alpha + \beta) + \sin(\alpha - \beta)}{2}$$
$$\cos\alpha\cos\beta = \frac{\cos(\alpha + \beta) + \cos(\alpha - \beta)}{2} \qquad (3.8.6)$$
$$\sin\alpha\sin\beta = \frac{\cos(\alpha - \beta) - \cos(\alpha + \beta)}{2}$$
These product identities are obtained from the addition/subtraction identities; for example,
$$\left.\begin{aligned}\sin(\alpha + \beta) &= \sin\alpha\cos\beta + \cos\alpha\sin\beta\\ \sin(\alpha - \beta) &= \sin\alpha\cos\beta - \cos\alpha\sin\beta\end{aligned}\right\}\;\Longrightarrow\; \sin(\alpha + \beta) + \sin(\alpha - \beta) = 2\sin\alpha\cos\beta$$

Another form of the product identities are the sum-to-product identities, given by
(Sum-product identities)
$$\sin\alpha + \sin\beta = 2\sin\frac{\alpha+\beta}{2}\cos\frac{\alpha-\beta}{2}$$
$$\cos\alpha + \cos\beta = 2\cos\frac{\alpha+\beta}{2}\cos\frac{\alpha-\beta}{2} \qquad (3.8.7)$$
$$\cos\alpha - \cos\beta = -2\sin\frac{\alpha+\beta}{2}\sin\frac{\alpha-\beta}{2}$$

Later on in Section 9.10 we will use these identities when discussing superposition of waves
and the beat phenomenon. When you hear the ‘wah wah wah’ sound of the two pitches, close to
each other, you are hearing the sum-to-product identities at work!
And finally, here are two identities relating sine/cosine to the tangent of the half angleŽŽ:
(Sine/cosine in terms of tan x/2)
$$\sin x = \frac{2u}{1 + u^2}, \qquad \cos x = \frac{1 - u^2}{1 + u^2} \qquad \left(u = \tan\frac{x}{2}\right) \qquad (3.8.8)$$

So, what have we just done? We proved the angle addition (or subtraction) identity, and all
the rest were derived from it using simple algebra.
Historically, the product identities, Eq. (3.8.6), were used to perform multiplication before logarithms
were invented. Here's how you could use the second one. If you want to multiply $x \times y$,
use a table to look up the angle $\alpha$ whose cosine is $x$ and the angle $\beta$ whose cosine is $y$. Look
up the cosines of the sum $\alpha + \beta$ and the difference $\alpha - \beta$. Average those two cosines. You get
the product $xy$! Three table look-ups, and computing a sum, a difference, and an average rather
than one multiplication. Tycho Brahe (1546–1601), among others, used this algorithm, known as
prosthaphaeresis.
ŽŽ The proof goes like this: $\sin x = 2\sin\frac{x}{2}\cos\frac{x}{2} = 2\tan\frac{x}{2}\cos^2\frac{x}{2}$.
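A tiny Python illustration of prosthaphaeresis (my own example numbers):

```python
import math

# Prosthaphaeresis: multiply x*y with cosines only, using
# cos(a)cos(b) = [cos(a+b) + cos(a-b)]/2 (the second product identity).
x, y = 0.53, 0.81
a, b = math.acos(x), math.acos(y)     # the "table look-ups"
product = (math.cos(a + b) + math.cos(a - b)) / 2
print(product, x * y)                 # both give 0.4293
```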


Derivation of $\cos(\alpha - \beta) = \cos\alpha\cos\beta + \sin\alpha\sin\beta$ using vector algebra.

If we know vector algebra (Section 11.1), we can derive this identity easily. Consider two
unit vectors $\boldsymbol{a}$ and $\boldsymbol{b}$. The first vector makes an angle $\alpha$ with the horizontal axis and the
second an angle $\beta$. So, we can express these two vectors as
$$\boldsymbol{a} = \cos\alpha\,\boldsymbol{i} + \sin\alpha\,\boldsymbol{j}, \qquad \boldsymbol{b} = \cos\beta\,\boldsymbol{i} + \sin\beta\,\boldsymbol{j}$$
Then, the dot product of these two vectors can be computed in two ways:
$$\boldsymbol{a}\cdot\boldsymbol{b} = \cos(\alpha - \beta), \qquad \boldsymbol{a}\cdot\boldsymbol{b} = \cos\alpha\cos\beta + \sin\alpha\sin\beta$$
So, we have $\cos(\alpha - \beta) = \cos\alpha\cos\beta + \sin\alpha\sin\beta$. How about the similar identity for
sine? It seems that using the cross product (which involves the sine) of two vectors is the way.
You could try it yourself.

Compute sine/cosine of π/5.

Let's denote $\theta = \pi/5$, or $5\theta = \pi$, so
$$2\theta = \pi - 3\theta$$
$$\sin(2\theta) = \sin(\pi - 3\theta) = \sin(3\theta)$$
$$2\sin\theta\cos\theta = 3\sin\theta - 4\sin^3\theta$$
$$4\cos^2\theta - 2\cos\theta - 1 = 0 \;\Longrightarrow\; \cos\theta = \frac{1 + \sqrt{5}}{4}$$
Did you notice something familiar? The cosine of 36° is related to the golden ratio $\varphi$. To see
why, check Fig. 2.15. I pass the baton to the interested reader to investigate that.

Pascal triangle again. If we compute $\tan n\alpha$ in terms of $\tan\alpha$ for $n \in \mathbb{N}$, we get the following
(only up to $n = 4$):
$$\tan\alpha = t$$
$$\tan 2\alpha = \frac{2t}{1 - t^2}$$
$$\tan 3\alpha = \frac{3t - t^3}{1 - 3t^2} \qquad (3.8.9)$$
$$\tan 4\alpha = \frac{4t - 4t^3}{1 - 6t^2 + t^4}$$
And see what? Binomial coefficients multiplying the powers $\tan^m\alpha$ show up. The binomial
coefficients, corresponding to the numbers in a row of Pascal's triangle, occur in the expression
in a zigzag pattern (i.e., coefficients at positions 1, 3, 5, … are in the denominator, coefficients
Phu Nguyen, Monash University © Draft version


Chapter 3. Geometry and trigonometry 289

at the positions 2; 4; 6; : : : are in the numerator, or vice versa), following the binomials in row
of Pascal’s triangle in the same order.

Bernoulli's imaginary trick. The way we obtained $\tan n\theta$ in terms of $\tan\theta$ works nicely for
small $n$. Is it possible to have a method that works for any $n$? Yes, Bernoulli presented such
a method, but it adopted the imaginary number $i^2 = -1$ and the new infinitesimal calculus that
Leibniz had just inventedŽ. Here is what he did:
$$\left.\begin{aligned}x &= \tan\theta\\ y &= \tan n\theta\end{aligned}\right\}\;\Longrightarrow\; \tan^{-1} y = n\tan^{-1} x$$
We refer to Section 3.10 for a discussion of inverse trigonometric functions (e.g. $\tan^{-1} x$). Briefly,
given an angle $\theta$, pressing the tangent button gives us $\tan\theta$, and pressing the $\tan^{-1}$ button gives us
back the angle. Now, he differentiated $\tan^{-1} y = n\tan^{-1} x$ to get
$$\frac{dy}{1 + y^2} = n\,\frac{dx}{1 + x^2}$$
Then he integrated the above equation indefinitely to get:
$$\int\frac{dy}{1 + y^2} = n\int\frac{dx}{1 + x^2} \qquad (3.8.10)$$
Now comes the trick of using $i$:
$$\frac{1}{1 + x^2} = \frac{1}{x^2 - i^2} = \frac{1}{(x - i)(x + i)} = \frac{1}{2i}\left(\frac{1}{x - i} - \frac{1}{x + i}\right)$$
So what he did is called factoring into imaginary components and, in the final step, a partial
fraction expansion. With that, it's easy to compute the integral $\int dx/(1 + x^2)$:
$$\int\frac{dx}{1 + x^2} = \frac{1}{2i}\left(\int\frac{dx}{x - i} - \int\frac{dx}{x + i}\right) = \frac{1}{2i}\left(\ln|x - i| - \ln|x + i|\right) = \frac{1}{2i}\ln\left|\frac{x - i}{x + i}\right|$$
With this result, he could proceed with Eq. (3.8.10):
$$\frac{1}{2i}\ln\left|\frac{y - i}{y + i}\right| = \frac{n}{2i}\ln\left|\frac{x - i}{x + i}\right| + C \;\Longleftrightarrow\; \ln\left|\frac{y - i}{y + i}\right| = n\ln\left|\frac{x - i}{x + i}\right| + C' \qquad (3.8.11)$$

He found $C'$ from the condition $x = y = 0$ when $\theta = 0$: $C' = \ln[(-1)^{n-1}]$. With this $C'$,
Eq. (3.8.11) becomes:
$$\ln\left(\frac{y - i}{y + i}\right) = \ln\left[\left(\frac{x - i}{x + i}\right)^n\right] + \ln[(-1)^{n-1}] = \ln\left[(-1)^{n-1}\left(\frac{x - i}{x + i}\right)^n\right] \qquad (3.8.12)$$
Ž
If you do not know calculus yet, skip this. Calculus is discussed in Chapter 4.


Thus, he obtained
$$\frac{y - i}{y + i} = (-1)^{n-1}\left(\frac{x - i}{x + i}\right)^n \qquad (3.8.13)$$
which gave him
$$\frac{y - i}{y + i} = +\left(\frac{x - i}{x + i}\right)^n \quad n = 1, 3, 5, \ldots$$
$$\frac{y - i}{y + i} = -\left(\frac{x - i}{x + i}\right)^n \quad n = 2, 4, 6, \ldots$$
And solving for $y$ (the above equations are just linear equations in $y$), Bernoulli obtained
a nice formula for $y$, or $\tan n\theta$ with $x = \tan\theta$:
$$\tan n\theta = i\,\frac{(x + i)^n + (x - i)^n}{(x + i)^n - (x - i)^n} \quad n = 1, 3, 5, \ldots$$
$$\tan n\theta = i\,\frac{(x + i)^n - (x - i)^n}{(x + i)^n + (x - i)^n} \quad n = 2, 4, 6, \ldots \qquad (3.8.14)$$
Now, we check this result by applying it to $n = 2$, and the above equation (the second one, of
course, as $n = 2$) indeed leads to the correct formula $\tan 2\theta = 2\tan\theta/(1 - \tan^2\theta)$.
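A quick numerical sanity check of Eq. (3.8.14) in Python (my own sketch; the angle 0.3 and the values of n are arbitrary choices):

```python
import math

# Check Bernoulli's formula (3.8.14) for tan(n*theta) against math.tan.
def tan_n(theta, n):
    x = math.tan(theta)
    p, q = (x + 1j)**n, (x - 1j)**n
    z = 1j * (p + q) / (p - q) if n % 2 == 1 else 1j * (p - q) / (p + q)
    return z.real                      # the imaginary part vanishes up to round-off

theta = 0.3
for n in (2, 3, 5):
    print(n, tan_n(theta, n), math.tan(n * theta))   # the two values agree
```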

Trigonometry identities for angles of plane triangles. Let's consider a plane triangle with
three angles denoted by $x$, $y$ and $z$ (in many books the notations $A$, $B$ and $C$ are used). We
thus have the constraint $x + y + z = \pi$. We then have many identities. For example,
$$\cot\frac{x}{2} + \cot\frac{y}{2} + \cot\frac{z}{2} = \cot\frac{x}{2}\cot\frac{y}{2}\cot\frac{z}{2}$$

Proof. The above formula is equivalent to the following
$$\tan\frac{x}{2}\tan\frac{y}{2} + \tan\frac{y}{2}\tan\frac{z}{2} + \tan\frac{z}{2}\tan\frac{x}{2} = 1$$
From $x + y + z = \pi$, we can relate the tangent of $(x + y)/2$ to the tangent of $z/2$; using the angle
addition formula for tangent, we arrive at:
$$\tan\left(\frac{x}{2} + \frac{y}{2}\right) = \cot\frac{z}{2} = \frac{1}{\tan\frac{z}{2}}
\;\Longrightarrow\;
\frac{\tan\frac{x}{2} + \tan\frac{y}{2}}{1 - \tan\frac{x}{2}\tan\frac{y}{2}} = \frac{1}{\tan\frac{z}{2}}$$
Cross-multiplying gives $\tan\frac{x}{2}\tan\frac{z}{2} + \tan\frac{y}{2}\tan\frac{z}{2} = 1 - \tan\frac{x}{2}\tan\frac{y}{2}$, which is the
required identity. $\square$



We list in what follows some trigonometric identities for triangle angles ($x + y + z = \pi$):

(a) Sine-related identities
$$\sin x + \sin y + \sin z = 4\cos\frac{x}{2}\cos\frac{y}{2}\cos\frac{z}{2}$$
$$\sin 2x + \sin 2y + \sin 2z = 4\sin x\sin y\sin z$$
(b) Cosine-related identities
$$\cos x + \cos y + \cos z = 4\sin\frac{x}{2}\sin\frac{y}{2}\sin\frac{z}{2} + 1 \qquad (3.8.15)$$
$$\cos 2x + \cos 2y + \cos 2z = -4\cos x\cos y\cos z - 1$$
(c) Tangent-related identities
$$\tan x + \tan y + \tan z = \tan x\tan y\tan z$$
(d) Cotangent-related identities
$$\cot x\cot y + \cot y\cot z + \cot z\cot x = 1$$
The proofs follow the same reasoning: use $x + y + z = \pi$ to eliminate one angle and apply the corresponding
identities of Section 3.8.

Proof. This is a proof of $\cos x + \cos y + \cos z = 4\sin\frac{x}{2}\sin\frac{y}{2}\sin\frac{z}{2} + 1$. From $x + y + z = \pi$,
we can relate the cosine of $z$ to the cosine of $x + y$, and by using the sum-to-product formula on
the term $\cos x + \cos y$ we make half angles appear. Also using the double angle formula
$\cos 2u = 2\cos^2 u - 1$:
$$\begin{aligned}\cos x + \cos y + \cos z &= 2\cos\left(\frac{x+y}{2}\right)\cos\left(\frac{x-y}{2}\right) - \cos(x + y)\\
&= 2\cos\left(\frac{x+y}{2}\right)\cos\left(\frac{x-y}{2}\right) - \left(2\cos^2\frac{x+y}{2} - 1\right)\\
&= 2\sin\frac{z}{2}\left[\cos\left(\frac{x-y}{2}\right) - \cos\left(\frac{x+y}{2}\right)\right] + 1\end{aligned}$$
Using the identity for $\cos(\cdot) - \cos(\cdot)$, namely $\cos\frac{x-y}{2} - \cos\frac{x+y}{2} = 2\sin\frac{x}{2}\sin\frac{y}{2}$, concludes the proof. $\square$

$\sin n\alpha$ for any $n$. In Section 2.25.5 we used de Moivre's formula to derive the formulas
for $\sin 2\alpha$ and $\sin 3\alpha$ in terms of powers of $\sin\alpha$. In principle, we can follow that route to derive the
formula for $\sin n\alpha$ for any $n$, but the process is tedious (try it with $\sin 5\alpha$ and you'll understand
what I mean). There should be an easier way.
The trick is in Eq. (2.25.21), which we re-write here:
$$\sin\alpha = \frac{e^{i\alpha} - e^{-i\alpha}}{2i} \qquad (3.8.16)$$

Using it for $n\alpha$ we have:
$$\begin{aligned}\sin n\alpha &= \frac{e^{in\alpha} - e^{-in\alpha}}{2i} = \frac{(e^{i\alpha})^n - (e^{-i\alpha})^n}{2i}\\
&= \frac{(\cos\alpha + i\sin\alpha)^n - (\cos\alpha - i\sin\alpha)^n}{2i} \quad (\text{use } e^{i\alpha} = \cos\alpha + i\sin\alpha)\\
&= \frac{\sum_{k=0}^{n}\binom{n}{k}\cos^{n-k}\alpha\,(i\sin\alpha)^k - \sum_{k=0}^{n}\binom{n}{k}\cos^{n-k}\alpha\,(-i\sin\alpha)^k}{2i} \qquad (3.8.17)\\
&= \sum_{k=0}^{n}\binom{n}{k}\cos^{n-k}\alpha\,\sin^k\alpha\;\frac{i^k - (-i)^k}{2i}\\
&= \frac{n}{1!}\cos^{n-1}\alpha\sin\alpha - \frac{n(n-1)(n-2)}{3!}\cos^{n-3}\alpha\sin^3\alpha + \cdots\end{aligned}$$
where in the third equality we used the binomial theorem to expand $(\cdots)^n$, and the red
term $\frac{i^k - (-i)^k}{2i}$ is equal to zero for $k = 0, 2, 4, \ldots$ and equal to $\pm 1$ (alternating, $+1$ for $k = 1$, $-1$ for $k = 3$, and so on) for $k = 1, 3, 5, \ldots$

$\cos n\alpha$ for any $n$. If we have something for sine, cosine is jealous. So, we do the same analysis
for cosine, and get:
$$\begin{aligned}\cos n\alpha &= \frac{e^{in\alpha} + e^{-in\alpha}}{2} = \sum_{k=0}^{n}\binom{n}{k}\cos^{n-k}\alpha\,\sin^k\alpha\;\frac{i^k + (-i)^k}{2}\\
&= \cos^n\alpha - \frac{n(n-1)}{2!}\cos^{n-2}\alpha\sin^2\alpha + \frac{n(n-1)(n-2)(n-3)}{4!}\cos^{n-4}\alpha\sin^4\alpha - \cdots\end{aligned} \qquad (3.8.18)$$
where again we used the binomial theorem to expand $(\cdots)^n$, and the red term $\frac{i^k + (-i)^k}{2}$ is equal to
zero for $k = 1, 3, 5, \ldots$, equal to one for $k = 0, 4, 8, \ldots$, and equal to minus one for $k = 2, 6, 10, \ldots$
With Eq. (3.8.18), we can write the formula for $\cos(n\alpha)$ for the first few values of $n$:
$$\cos(0\alpha) = 1$$
$$\cos(1\alpha) = \cos\alpha$$
$$\cos(2\alpha) = 2\cos^2\alpha - 1$$
$$\cos(3\alpha) = 4\cos^3\alpha - 3\cos\alpha \qquad (3.8.19)$$
$$\cos(4\alpha) = 8\cos^4\alpha - 8\cos^2\alpha + 1$$
$$\cos(5\alpha) = 16\cos^5\alpha - 20\cos^3\alpha + 5\cos\alpha$$
What is the purpose of doing this? The next step is to try to find a pattern in these formulas. One
question is: is it possible to compute $\cos(6\alpha)$ without resorting to Eq. (3.8.18)? Let's see how we
can get $\cos(2\alpha) = 2\cos^2\alpha - 1$ from $\cos(1\alpha) = \cos\alpha$: we multiply $\cos(1\alpha)$ by $2\cos\alpha$ and
subtract 1, and 1 is $\cos(0\alpha)$:
$$\cos(2\alpha) = 2\cos(\alpha)\cdot\cos(1\alpha) - \cos(0\alpha)$$


Thus, we can compute cos.k˛/ from cos.k 1/˛ and cos.k 2/˛! The formula is

cos.k˛/ D 2 cos ˛ cos.k 1/˛ cos.k 2/˛


Now, we have a recursive formula for cos n˛

<1; if n D 0
cos.n˛/ D cos ˛; if n D 1 (3.8.20)

2 cos ˛ cos.n 1/˛ cos.n 2/˛; if n  2

One application of this formula is to derive the Chebyshev polynomials of the first kindŽŽ
described in Section 12.3.2. Why this has to do with polynomials? Note that from the above
equation, cos.n˛/ is a polynomial in terms of cos ˛, e.g. cos 3˛ D 4.cos ˛/3 3 cos.˛/. That’s
why. If you forget what is a polynomial, check Section 2.29.
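A short Python sketch (mine) of the recursion in Eq. (3.8.20), checked against the built-in cosine:

```python
import math

# The recursion (3.8.20): cos(n*a) = 2*cos(a)*cos((n-1)*a) - cos((n-2)*a).
def cos_n(n, a):
    c_prev, c_curr = 1.0, math.cos(a)     # cos(0*a) and cos(1*a)
    if n == 0:
        return c_prev
    for _ in range(n - 1):
        c_prev, c_curr = c_curr, 2 * math.cos(a) * c_curr - c_prev
    return c_curr

a = 0.7
print(cos_n(5, a), math.cos(5 * a))       # both give cos(3.5)
```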

3.9 Inverse trigonometric functions


For each of the trigonometric functions there is a corresponding inverse trigonometric function.
Let’s take thepsine function as an example. Start with an angle, say1 xpD =4, we get y D
sin.=4/ D 2=2. Then, the inverse sine gives back x: =4 D sin . 2=2/. So, the inverse
sine function answers the question what is the angle of which the sine is y; it is sin 1 y. Similarly
we have cos 1 .x/; tan 1 .x/, etc. This notation was introduced by John Herschel (1792 – 1871),
an English polymath, mathematician, astronomer, chemist, inventor, experimental photographer
who invented the blueprint and did botanical work, in 1813.
Another notation exists for the inverse trigonometric functions. The most common
convention is to name inverse trigonometric functions using an arc- prefix:
$\arcsin(x)$, $\arccos(x)$, $\arctan(x)$, etc. For example, $\pi/4 = \arcsin(\sqrt{2}/2)$. This notation arises
from the following geometric relationship. When measuring in radians, an angle of $\theta$ radians
corresponds to an arc whose length is $r\theta$, where $r$ is the radius of the circle. Thus, in the unit
circle, "the arc whose cosine is $x$" is the same as "the angle whose cosine is $x$", because the length
of the arc of the circle is the same as the measure of the angle.

3.10 Inverse trigonometric identities


To each trigonometric identity there corresponds an inverse trigonometric identity. Herein I
present one such identity and its use to compute $\pi$. The trigonometric identity is the tangent of the
difference of two angles:
$$\tan(\alpha - \beta) = \frac{\tan\alpha - \tan\beta}{1 + \tan\alpha\tan\beta}$$

 Note that this is how one discovers the relation between $\cos(k\alpha)$, $\cos(k-1)\alpha$ and $\cos(k-2)\alpha$. When we
know such a formula exists, we can prove it in an easier way. I leave that as a trigonometry exercise.
ŽŽ If there are polynomials of the 1st kind, then where are those of the 2nd kind? They're polynomials related to $\sin n\alpha$.
Those related to the cosine are called the 1st kind probably because $\cos\alpha$ is the real part of $e^{i\alpha}$.


Starting from this identity, we introduce two new symbols $x$, $y$ which are the tangents of $\alpha$ and
$\beta$:
$$x = \tan\alpha,\; y = \tan\beta \;\Longrightarrow\; \frac{x - y}{1 + xy} = \tan(\alpha - \beta) \quad \text{[from the above identity]}$$
Thus, we get the following
$$\alpha - \beta = \arctan\frac{x - y}{1 + xy}$$
And by substituting $\alpha = \arctan x$ and $\beta = \arctan y$ into the above equation, we get the
following inverse trigonometric identity:
$$\arctan x - \arctan y = \arctan\frac{x - y}{1 + xy}$$
Introducing $x = a_1/b_1$, $y = a_2/b_2$, we get the same identity in a slightly different form (I have
included two versions: one for angle addition and one for angle difference):
$$\arctan\frac{a_1}{b_1} \pm \arctan\frac{a_2}{b_2} = \arctan\frac{a_1 b_2 \pm a_2 b_1}{b_1 b_2 \mp a_1 a_2} \qquad (3.10.1)$$

Machin's formula. John Machin (1686 – 1751) was a professor of astronomy at Gresham
College, London. He is best known for developing a quickly converging series for $\pi$ in 1706
and using it to compute $\pi$ to 100 decimal places. He derived the following formula, now known
as Machin's formula, using Eq. (3.10.1) (details given later):
$$\frac{\pi}{4} = 4\arctan\frac{1}{5} - \arctan\frac{1}{239} \qquad (3.10.2)$$
Then he combined his formula with the Taylor series expansion of the inverse tangent (see
Eq. (4.15.12) in Chapter 4) to compute $\pi$ to 100 decimal places (without a calculator of course; he
did not have that luxury). In passing we note that Brook Taylor was Machin's contemporary at
Cambridge University. Machin's formula remained the primary tool of $\pi$-hunters for centuries
(well into the computer era). For completeness, the details are given as follows.
The power series for $\arctan x$ is:
$$\arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots$$
Using it in Eq. (3.10.2) gives us a series to compute $\pi$:
$$\frac{\pi}{4} = 4\left(\frac{1}{5} - \frac{1}{3\cdot 5^3} + \frac{1}{5\cdot 5^5} - \frac{1}{7\cdot 5^7} + \cdots\right) - \left(\frac{1}{239} - \frac{1}{3\cdot 239^3} + \frac{1}{5\cdot 239^5} - \cdots\right) \qquad (3.10.3)$$
With only five terms in the arctangent series, this formula gives us eight correct decimals.
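In Python, the computation looks like this (a sketch of mine; five terms of each arctangent series, as in the text):

```python
# Machin's formula (3.10.2) with five terms of the arctan series (3.10.3).
def arctan_series(x, n_terms=5):
    return sum((-1)**k * x**(2 * k + 1) / (2 * k + 1) for k in range(n_terms))

pi_machin = 4 * (4 * arctan_series(1 / 5) - arctan_series(1 / 239))
print(pi_machin)          # 3.141592682..., an error of roughly 3e-8
```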


Proof. Derivation of Machin's formula Eq. (3.10.2) using Eq. (3.10.1). We start with $\arctan\frac{1}{5} +
\arctan\frac{1}{5}$ to get $2\arctan\frac{1}{5}$:
$$2\arctan\frac{1}{5} = \arctan\frac{1}{5} + \arctan\frac{1}{5} = \arctan\frac{1\cdot 5 + 1\cdot 5}{5\cdot 5 - 1\cdot 1} = \arctan\frac{5}{12} \qquad (3.10.4)$$
Now, with $2\arctan\frac{1}{5} + 2\arctan\frac{1}{5}$ we get $4\arctan\frac{1}{5}$:
$$4\arctan\frac{1}{5} = 2\arctan\frac{1}{5} + 2\arctan\frac{1}{5} = \arctan\frac{5}{12} + \arctan\frac{5}{12} \quad (\text{Eq. (3.10.4)})$$
$$= \arctan\frac{5\cdot 12 + 5\cdot 12}{12\cdot 12 - 5\cdot 5} = \arctan\frac{120}{119} \quad (\text{Eq. (3.10.1)})$$
Finally, we consider $4\arctan\frac{1}{5} - \frac{\pi}{4}$, writing $\pi/4$ as $\arctan\frac{1}{1}$:
$$4\arctan\frac{1}{5} - \frac{\pi}{4} = \arctan\frac{120}{119} - \arctan\frac{1}{1} = \arctan\frac{120\cdot 1 - 1\cdot 119}{119\cdot 1 + 120\cdot 1} = \arctan\frac{1}{239} \qquad\square$$


Compute π from thin air. Machin's formula for $\pi$ is great, but there is an unbelievable way to
get $\pi$, from thin air. To be precise, from $i^2 = -1$. Recall that we have (Section 2.25.7):
$$\frac{\pi}{4} = -\frac{i}{2}\ln(i)$$
A bit of algebra to convert $i$ to a fraction form:
$$\frac{\pi}{4} = -\frac{i}{2}\ln(i) = -\frac{i}{2}\ln\frac{1 + i}{1 - i} = -\frac{i}{2}\left(\ln(1 + i) - \ln(1 - i)\right) \qquad (3.10.5)$$
Now, we use the power series of the logarithm, written for a complex number $z$:
$$\ln(1 + z) = z - \frac{z^2}{2} + \frac{z^3}{3} - \frac{z^4}{4} + \cdots$$
Thus, we have
$$\ln(1 + i) = i - \frac{i^2}{2} + \frac{i^3}{3} - \frac{i^4}{4} + \cdots$$
$$\ln(1 - i) = -i - \frac{(-i)^2}{2} + \frac{(-i)^3}{3} - \frac{(-i)^4}{4} + \cdots \qquad (3.10.6)$$

Finally, substituting Eq. (3.10.6) into Eq. (3.10.5), we get $\pi$:
$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots = \sum_{n=1}^{\infty}(-1)^{n+1}\frac{1}{2n - 1}$$
Great. We got $\pi$ from $\sqrt{-1}$: a real number from an imaginary one! It seems impossible, so we
should check this result. That's why we have provided the last (red) expression, which can be
coded on a computer. The outcome of that exercise is that the more terms we use, the closer
to $\pi/4 = 0.7853981633\ldots$ we get. However this series is too slow, in the sense that we need
too many terms to get an accurate value of $\pi$. That's why Machin and other mathematicians
developed other formulas.
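Here is a small Python check of that expression (my own sketch), which also shows how slow the convergence is:

```python
# Summing the slow series pi/4 = 1 - 1/3 + 1/5 - 1/7 + ...
for n_terms in (10, 1000, 1_000_000):
    s = sum((-1)**(n + 1) / (2 * n - 1) for n in range(1, n_terms + 1))
    print(n_terms, 4 * s)    # approaches pi, but very slowly
```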
But still this is not Eq. (3.10.3). Don't worry. The German mathematician Karl Heinrich
Schellbach (1809–1890) did it in 1832. He used:
$$\pi = \frac{2}{i}\ln\left[\frac{(5 + i)^4(-239 + i)}{(5 - i)^4(-239 - i)}\right]$$
It is certain that Schellbach was aware of Machin's formula, and that was how he could think of
the crazy expression inside the brackets.

Derivation of Eq. (3.10.1) using complex numbers. If we consider two complex numbers
$b_1 + a_1 i$ with angle $\theta_1 = \arctan(a_1/b_1)$ and $b_2 + a_2 i$ with angle $\theta_2 = \arctan(a_2/b_2)$,
then their product is $b_1 b_2 - a_1 a_2 + (a_1 b_2 + a_2 b_1)i$ with angle $\theta = \arctan\frac{a_1 b_2 + a_2 b_1}{b_1 b_2 - a_1 a_2}$.
Then Eq. (3.10.1) is nothing but $\theta = \theta_1 + \theta_2$, a property of complex number multiplication.
And this is expected, as we started from the trigonometric identity for angle addition/difference.

3.11 Trigonometry inequalities


When Ptolemy was making his sine table, he could not find sin 1° directly, so he found an approximate method. His idea was to compare sin 1° with sin 3° using the inequality \sin 3° < 3\sin 1°, see Fig. 3.47. If you remember the trigonometric identity \sin 3x = 3\sin x - 4\sin^3 x, you'll understand this inequality.

As we now pay attention to inequalities of trigonometric functions, we turn to the unit circle with sine/cosine/tangent, Fig. 3.47, and we discover these inequalities:

\sin x < x < \tan x   (3.11.1)

where the first inequality (i.e., \sin x < x) is obtained by comparing the length of the line AA' (which is \sin x) with the length of the arc AB (which is x) in the left figure. The second inequality is obtained by comparing the area of the right triangle OHB, which is (\tan x)/2, with the area of the shaded region in the right figure (which is x/2).

There is nothing special about \sin 3° < 3\sin 1°; if we have this, we should have this:

\frac{\sin\alpha}{\sin\beta} < \frac{\alpha}{\beta},   for all \alpha > \beta,  \alpha, \beta \in [0, \pi/2]   (3.11.2)


Figure 3.47: Basic inequalities in trigonometry (unit circle with \sin x < x on the left and \tan x > x on the right). Note that x is measured in radians.

And of course we need a proof, as for now it is just our guess. Before presenting a proof, let's see how Eq. (3.11.2) was used by Ptolemy to compute sin 1°:

\alpha = (3/2)°, \beta = 1°:   \sin 1° > \frac{2}{3}\sin(3/2)°
\alpha = 1°, \beta = (3/4)°:   \sin 1° < \frac{4}{3}\sin(3/4)°

From \sin 3°, we can compute \sin(3/2)° and \sin(3/4)°. Thus, we get 0.017451298871915433 < \sin 1° < 0.01745279409512592. So, we obtain \sin 1° = 0.01745, accurate to only 5 decimal places. Can you improve this technique?
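The bounds above are easy to reproduce; here is a minimal Python sketch (mine, not Ptolemy's), assuming only that sin 3° is known and using the half-angle formula:

import math

sin3 = math.sin(math.radians(3))      # assume sin 3 degrees is known

def half(s):
    # half-angle formula: sin(x/2) = sqrt((1 - cos x)/2), with cos x = sqrt(1 - s^2)
    c = math.sqrt(1 - s*s)
    return math.sqrt((1 - c) / 2)

sin15  = half(sin3)     # sin(1.5 deg)
sin075 = half(sin15)    # sin(0.75 deg)
print(2/3*sin15, 4/3*sin075)   # 0.0174513..., 0.0174528... as in the text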

Proof. We're going to prove Eq. (3.11.2) using algebra and Eq. (3.11.1). There exists a geometric proof due to Aristarchus of Samos, an ancient Greek astronomer and mathematician who presented the first known heliocentric model, placing the Sun at the center of the known universe with the Earth revolving around it. Thus, this inequality is known as Aristarchus's inequality. I refer to Wikipedia for the geometric proof.

First, use algebra to transform the inequality into a 'better' form:

\frac{\sin\alpha}{\sin\beta} < \frac{\alpha}{\beta}
\beta\sin\alpha < \alpha\sin\beta   (all quantities are positive)
\beta(\sin\alpha - \sin\beta) < (\alpha - \beta)\sin\beta   (subtract \beta\sin\beta from both sides)
\frac{\sin\alpha - \sin\beta}{\alpha - \beta} < \frac{\sin\beta}{\beta}

The key step is of course the highlighted one, where we brought the term \beta\sin\beta into the game. Why that particular term? Because it led us to the term (\sin\alpha - \sin\beta)/(\alpha - \beta). As all steps are equivalent, we just need to prove the final inequality. We use the identity for \sin\alpha - \sin\beta to rewrite the


LHS of the last inequality (the blue term) and use \sin x < x:

\frac{\sin\alpha - \sin\beta}{\alpha - \beta} = \frac{2\sin\frac{\alpha-\beta}{2}\cos\frac{\alpha+\beta}{2}}{\alpha - \beta} < \frac{2\cdot\frac{\alpha-\beta}{2}}{\alpha - \beta}\cos\frac{\alpha+\beta}{2} = \cos\frac{\alpha+\beta}{2}

The next move is to get rid of \alpha in \cos\frac{\alpha+\beta}{2}. For that, we need the fact that \cos x < \cos y if x > y. Because \alpha > \beta, we have \frac{\alpha+\beta}{2} > \frac{\beta+\beta}{2} = \beta, thus

\frac{\sin\alpha - \sin\beta}{\alpha - \beta} < \cos\frac{\alpha+\beta}{2} < \cos\beta

The last step is to convert from \cos\beta to \sin\beta, noting that we have a tool not yet used, namely \tan\beta > \beta. Writing \tan\beta = \sin\beta/\cos\beta in that inequality gives \cos\beta < \sin\beta/\beta, and we're done.
Actually, if we know calculus, the proof is super easy (Fig. 3.48); it does not require us to be geniuses. The function f(x) = \sin x/x is a decreasing function for x \in [0, \pi] (compute its first derivative and use \tan x > x to see that the derivative is indeed negative). Now, considering two numbers \alpha, \beta in [0, \pi] such that \alpha > \beta, we immediately have f(\alpha) < f(\beta). Done. Alternatively, if we consider the function y = \sin x, we also obtain the inequality. Compared with Aristarchus's proof, which was based on circles and triangles, the calculus-based proof is straightforward. Why? Because in the old trigonometry sine was attached to angles of triangles, whereas in calculus it is free of angles/triangles. It is simply a function.

Figure 3.48: Calculus-based proofs of Aristarchus's inequality \sin\alpha/\sin\beta < \alpha/\beta. The proof using the \sin x function is based on the fact that the graph is concave: the term \sin\alpha/\alpha is the slope of the shaded angle, and the angle corresponding to \alpha is smaller than that of \beta.

\sin x \approx x. When we were building our sine table, we discovered that \sin x \approx x, at least when x = 1° = \pi/180. It turns out that for small x this is always true. And it stems from Eq. (3.11.1), which we rewrite as

\sin x < x < \tan x \iff \cos x < \frac{\sin x}{x} < 1

Now, let x approach zero: \cos x approaches 1, and \sin x/x is squeezed between \cos x and 1. This leads to:

\lim_{x\to 0}\frac{\sin x}{x} = 1   (3.11.3)


Some inequalities for angles of a triangle. Herein I present some well known inequalities involving the angles of a triangle. We label the three angles by A, B, C this time. For all inequalities, equality occurs when A = B = C, i.e., when the triangle is equilateral. They are:

(a) \sin A + \sin B + \sin C \le \frac{3\sqrt{3}}{2}
(b) \cos A + \cos B + \cos C \le \frac{3}{2}
(c) \cot A\cot B\cot C \le \frac{\sqrt{3}}{9}                    (3.11.4)
(d) \cot A + \cot B + \cot C \ge \sqrt{3}
(e) \sin^2 A + \sin^2 B + \sin^2 C \le \frac{9}{4}
(f) \cot^2 A + \cot^2 B + \cot^2 C \ge 1

Proof. We prove (a) using the Jensen inequality (check Section 4.5.2 if it's new to you), which states that for a concave function f(x), f\left(\frac{x+y+z}{3}\right) \ge \frac{1}{3}\left(f(x) + f(y) + f(z)\right). As the function y = \sin x for 0 \le x \le \pi is a concave function, we have:

\sin\frac{A+B+C}{3} \ge \frac{\sin A + \sin B + \sin C}{3}

Thus,

\sin A + \sin B + \sin C \le 3\sin\frac{\pi}{3} = \frac{3\sqrt{3}}{2}

Proof. You might be thinking the proof of (b) is similar to (a). Unfortunately, the cosine function is harder: its graph consists of two parts (one concave, one convex), see Fig. 3.50. Only for acute-angled triangles can we use the Jensen inequality as in (a). Hmm. We need another proof valid for all triangles. First, we convert the term \cos A + \cos B + \cos C to:

\cos A + \cos B + \cos C = 2\cos\frac{A+B}{2}\cos\frac{A-B}{2} + 1 - 2\sin^2\frac{C}{2} = 2\sin\frac{C}{2}\cos\frac{A-B}{2} + 1 - 2\sin^2\frac{C}{2}

Then, the inequality becomes:

E := -2\sin^2\frac{C}{2} + 2\cos\frac{A-B}{2}\sin\frac{C}{2} - \frac{1}{2} \le 0

This is a quadratic in \sin\frac{C}{2} with a negative leading coefficient (i.e., -2). To show that E \le 0 for all A, B, C, we just need to check its discriminant \Delta = 4\cos^2\frac{A-B}{2} - 4 \le 0. Since the discriminant is never positive, the quadratic never rises above the horizontal axis, and thus E is always smaller than or equal to 0. A consequence of this result is another inequality that reads

\sin\frac{A}{2}\sin\frac{B}{2}\sin\frac{C}{2} \le \frac{1}{8}

which is obtained from (b) and the identity \cos A + \cos B + \cos C = 4\sin\frac{A}{2}\sin\frac{B}{2}\sin\frac{C}{2} + 1.
Proof. We prove (c) using the Jensen inequality for the convex function \tan x. Note that we only need to consider acute triangles (because if one angle is not acute, its cotangent is negative whereas the other two cotangents are positive, and thus the inequality holds trivially):

\tan\frac{A+B+C}{3} \le \frac{\tan A + \tan B + \tan C}{3}

Thus,

\tan A + \tan B + \tan C \ge 3\tan\frac{\pi}{3} = 3\sqrt{3}

But we have \tan A + \tan B + \tan C = \tan A\tan B\tan C, thus

\tan A\tan B\tan C \ge 3\sqrt{3} \iff \cot A\cot B\cot C = \frac{1}{\tan A\tan B\tan C} \le \frac{\sqrt{3}}{9}

Proof. We prove (d) as follows. The key point is the identity \cot A\cot B + \cot B\cot C + \cot C\cot A = 1. We start with:

(\cot A + \cot B + \cot C)^2 = \cot^2 A + \cot^2 B + \cot^2 C + 2(\cot A\cot B + \cot B\cot C + \cot C\cot A)

And we can relate \cot^2 A + \cot^2 B + \cot^2 C to \cot A\cot B + \cot B\cot C + \cot C\cot A:

\cot^2 A + \cot^2 B \ge 2\cot A\cot B
\cot^2 B + \cot^2 C \ge 2\cot B\cot C
\cot^2 C + \cot^2 A \ge 2\cot C\cot A

Therefore,

\cot^2 A + \cot^2 B + \cot^2 C \ge \cot A\cot B + \cot B\cot C + \cot C\cot A

And then,

(\cot A + \cot B + \cot C)^2 \ge 3(\cot A\cot B + \cot B\cot C + \cot C\cot A) = 3

which gives \cot A + \cot B + \cot C \ge \sqrt{3}.

Proof. We prove (e) using some algebra and the inequality (b). First, we transform \sin^2 A + \sin^2 B + \sin^2 C using \cos 2A, \ldots:

\sin^2 A + \sin^2 B + \sin^2 C = \frac{3}{2} - \frac{1}{2}(\cos 2A + \cos 2B + \cos 2C)

Then, using Eq. (3.8.15), we get:

\sin^2 A + \sin^2 B + \sin^2 C = 2 + 2\cos A\cos B\cos C

If one angle (say A, without loss of generality) is not acute, then \cos A \le 0 while \cos B, \cos C > 0, thus \cos A\cos B\cos C \le 0. Therefore, \sin^2 A + \sin^2 B + \sin^2 C \le 2 < 9/4. If all angles are acute, \cos A, \cos B, \cos C > 0, and we can use the AM-GM inequality:

\sqrt[3]{\cos A\cos B\cos C} \le \frac{1}{3}(\cos A + \cos B + \cos C)

And using the inequality (b), we get:

\cos A\cos B\cos C \le \frac{1}{27}(\cos A + \cos B + \cos C)^3 \le \frac{1}{27}\cdot\frac{27}{8} = \frac{1}{8}

And the result follows immediately:

\sin^2 A + \sin^2 B + \sin^2 C \le 2 + \frac{1}{4} = \frac{9}{4}

Proof. We can prove (f) using the Cauchy-Schwarz inequality and the inequality (d): (1 + 1 + 1)(\cot^2 A + \cot^2 B + \cot^2 C) \ge (\cot A + \cot B + \cot C)^2 \ge 3.
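These six inequalities are easy to sanity-check numerically before trusting the proofs. A minimal Python sketch of my own (the helper name check is purely illustrative) that tests them on random triangles:

import math, random

def check(A, B, C, tol=1e-9):
    s, c, t = math.sin, math.cos, math.tan
    cot = lambda x: 1/t(x)
    assert s(A) + s(B) + s(C) <= 3*math.sqrt(3)/2 + tol          # (a)
    assert c(A) + c(B) + c(C) <= 1.5 + tol                       # (b)
    assert cot(A)*cot(B)*cot(C) <= math.sqrt(3)/9 + tol          # (c)
    assert cot(A) + cot(B) + cot(C) >= math.sqrt(3) - tol        # (d)
    assert s(A)**2 + s(B)**2 + s(C)**2 <= 2.25 + tol             # (e)
    assert cot(A)**2 + cot(B)**2 + cot(C)**2 >= 1 - tol          # (f)

for _ in range(10000):
    A = random.uniform(0.01, math.pi - 0.02)
    B = random.uniform(0.005, math.pi - A - 0.01)
    C = math.pi - A - B          # the three angles of a valid triangle
    check(A, B, C)
print("all six inequalities hold for 10000 random triangles")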
Cauchy's proof of the Basel problem. In Section 2.21.4 I have introduced the Basel problem and one calculus-based proof. Herein, I present Cauchy's proof using only elementary mathematics. The plan of his proof goes as follows:

• The starting point is the trigonometric inequality we all know:

  \sin\theta < \theta < \tan\theta,   0 < \theta < \pi/2

• The above inequality gives us an equivalent one:

  \cot^2\theta < \frac{1}{\theta^2} < 1 + \cot^2\theta   (3.11.5)

• Now, he introduced two new positive integer variables n and N such that

  \theta = \frac{n\pi}{2N+1},   1 \le n \le N   (3.11.6)

  This definition of \theta comes from the requirement that \theta < \pi/2. Now, Eq. (3.11.5) becomes

  \cot^2\frac{n\pi}{2N+1} < \left(\frac{2N+1}{n\pi}\right)^2 < 1 + \cot^2\frac{n\pi}{2N+1}   (3.11.7)

  Since the Basel problem is about the summation of the reciprocals of the squares of the natural numbers, i.e., \sum_n 1/n^2, he made 1/n^2 appear:

  \frac{\pi^2}{(2N+1)^2}\cot^2\frac{n\pi}{2N+1} < \frac{1}{n^2} < \frac{\pi^2}{(2N+1)^2} + \frac{\pi^2}{(2N+1)^2}\cot^2\frac{n\pi}{2N+1}   (3.11.8)


• The next step is, of course, to introduce \sum_n 1/n^2 by summing over n:

  \sum_{n=1}^{N}\frac{\pi^2}{(2N+1)^2}\cot^2\frac{n\pi}{2N+1} < \sum_{n=1}^{N}\frac{1}{n^2} < \sum_{n=1}^{N}\frac{\pi^2}{(2N+1)^2} + \sum_{n=1}^{N}\frac{\pi^2}{(2N+1)^2}\cot^2\frac{n\pi}{2N+1}   (3.11.9)

• Now, the Basel sum is an infinite series, so we have to consider N \to \infty (then the blue term, N\pi^2/(2N+1)^2, vanishes); I also introduce S for \sum_{n=1}^{\infty} 1/n^2:

  \lim_{N\to\infty}\frac{\pi^2}{(2N+1)^2}\sum_{n=1}^{N}\cot^2\frac{n\pi}{2N+1} \le S \le \lim_{N\to\infty}\frac{\pi^2}{(2N+1)^2}\sum_{n=1}^{N}\cot^2\frac{n\pi}{2N+1}   (3.11.10)

  What Cauchy needed now is to be able to evaluate the red sum \sum_{n=1}^{N}\cot^2\frac{n\pi}{2N+1}.

 The next move is to adopt de Moivre’s formula (Eq. (2.25.6)):

  \cos nx + i\sin nx = (\cos x + i\sin x)^n   (3.11.11)

  And with a bit of massage, we get \cot x into the game (divide both sides by \sin^n x):

  \frac{\cos nx + i\sin nx}{\sin^n x} = (\cot x + i)^n   (3.11.12)

  Now, we use the binomial theorem, Eq. (2.27.2), to expand (\cot x + i)^n:

  (\cot x + i)^n = \binom{n}{0}\cot^n x + \binom{n}{1}\cot^{n-1}x\, i + \cdots + \binom{n}{n-1}\cot x\, i^{n-1} + i^n

  And from that we can extract the imaginary part of (\cot x + i)^n:

  \mathrm{Im}(\cot x + i)^n = \binom{n}{1}\cot^{n-1}x - \binom{n}{3}\cot^{n-3}x + \binom{n}{5}\cot^{n-5}x - \cdots

  Using Eq. (3.11.12) and equating the imaginary parts of the two sides, we get:

  \frac{\sin nx}{\sin^n x} = \binom{n}{1}\cot^{n-1}x - \binom{n}{3}\cot^{n-3}x + \binom{n}{5}\cot^{n-5}x - \cdots   (3.11.13)

  This is by itself a trigonometric identity that holds for any n \in \mathbb{N} and x \in \mathbb{R}. Now we take this identity, fix a positive integer N and set n = 2N+1 and x_k = k\pi/(2N+1), for k = 1, 2, \ldots, N. Why that? Because the LHS of the identity is zero with this choice: \sin nx_k = \sin\frac{(2N+1)k\pi}{2N+1} = \sin k\pi = 0. Therefore Eq. (3.11.13) becomes

  0 = \binom{2N+1}{1}\cot^{2N}x_k - \binom{2N+1}{3}\cot^{2N-2}x_k + \cdots + (-1)^N\binom{2N+1}{2N+1}   (3.11.14)


  for k = 1, 2, \ldots, N. The numbers x_k are distinct numbers in the interval 0 < x_k < \pi/2, and the numbers t_k = \cot^2 x_k are therefore also N distinct numbers. What Eq. (3.11.14) means is that the numbers t_k are the roots of the following Nth degree polynomial:

  p(t) = \binom{2N+1}{1}t^N - \binom{2N+1}{3}t^{N-1} + \cdots + (-1)^N\binom{2N+1}{2N+1}   (3.11.15)

  Now, Vieta's formula (Section 2.29.7) links everything together: the sum of all the roots is the negative of the ratio of the second coefficient to the first one:

  \sum_{k=1}^{N} t_k = \frac{\binom{2N+1}{3}}{\binom{2N+1}{1}} = \frac{(2N)(2N-1)}{6}   (3.11.16)

  Replacing t_k by its definition and noting that x_k = k\pi/(2N+1), we get what is needed in Eq. (3.11.10):

  \sum_{k=1}^{N}\cot^2\frac{k\pi}{2N+1} = \frac{(2N)(2N-1)}{6}   (3.11.17)

  With that sum, Eq. (3.11.10) simplifies to:

  \lim_{N\to\infty}\frac{\pi^2}{(2N+1)^2}\frac{(2N)(2N-1)}{6} \le S \le \lim_{N\to\infty}\frac{\pi^2}{(2N+1)^2}\frac{(2N)(2N-1)}{6}   (3.11.18)

  Or

  \frac{\pi^2}{6} \le S \le \frac{\pi^2}{6}

  Thus, S is sandwiched between \pi^2/6 and \pi^2/6, so it must be \pi^2/6. And we come to the end of this amazing proof due to the great Cauchy. (A quick numerical check of the key sum and the sandwich is sketched below.)
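Here is the promised check, a small Python sketch of my own (not Cauchy's!) that verifies the identity Eq. (3.11.17) and shows the squeeze in action:

import math

def cot2_sum(N):
    # left-hand side of Eq. (3.11.17)
    return sum(1/math.tan(k*math.pi/(2*N + 1))**2 for k in range(1, N + 1))

N = 50
print(cot2_sum(N), (2*N)*(2*N - 1)/6)          # the two numbers agree

lower = math.pi**2/(2*N + 1)**2 * cot2_sum(N)
upper = lower + N*math.pi**2/(2*N + 1)**2      # includes the vanishing term
print(lower, math.pi**2/6, upper)              # pi^2/6 is squeezed in between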

3.12 Trigonometry equations


This chapter would be incomplete if we left out trigonometric equations; those similar to finding x such that \sin x + \cos x = 1, for example. Basically, solving trigonometric equations involves using some trigonometric identities, some algebra tricks, and the fact that -1 \le \sin x, \cos x \le 1.

Let's start with solving the equation \tan^2 x + \cos 4x = 0. There is certainly more than one way to solve this; I present one solution only. The idea is to look at the equation carefully and ask why \tan^2 x and not \tan^3 x, and why \cos 4x and not \cos 5x. With that observation, we have already half solved this problem: \tan^2 x can be converted to \cos 2x, and certainly \cos 4x can also be converted


to \cos 2x. Eventually we only have an equation in \cos 2x:

\frac{\sin^2 x}{\cos^2 x} + 2\cos^2 2x - 1 = 0
\frac{1 - \cos 2x}{1 + \cos 2x} + 2\cos^2 2x - 1 = 0
\frac{1 - u}{1 + u} + 2u^2 - 1 = 0   (u = \cos 2x)
u(u^2 + u - 1) = 0

And the rest is a piece of cake, isn't it^{Ž}?


We present one more trigonometric equation and that's it, no more. You can enjoy solving them, but honestly they do not teach you more mathematics (except that you would become more fluent in manipulating algebraic expressions, which is an important skill after all). The equation is \sin^{100} x + \cos^{100} x = 1. Obviously, as the equation involves a very high power (the 100th), it requires a special technique.

Of course you do not want to mess with the left hand side of the equation (it is too messy to do so). So, we have to play with the RHS, which is the number 1. But if you know trigonometry, you know that 1 = \sin^2 x + \cos^2 x. Thus the equation becomes the following, and now you can massage it and hope something useful will appear:

\sin^{100} x + \cos^{100} x = \sin^2 x + \cos^2 x
\sin^2 x(1 - \sin^{98} x) + \cos^2 x(1 - \cos^{98} x) = 0

where both terms on the left are non-negative. When the sum of two non-negative terms is zero, it is only possible when the two terms are both zero:

\sin^2 x(1 - \sin^{98} x) = 0,   \cos^2 x(1 - \cos^{98} x) = 0

which requires that \sin x = 0, \cos x = \pm 1 or \cos x = 0, \sin x = \pm 1. And now you can solve the scary-looking equation \sin^{2020} x + \cos^{2020} x = 1.

We think we should pay less attention to solving trigonometric equations, because up to this point we still do not know how to compute \sin x for any given x. All we know is just Table 3.3. When we use a calculator and press sin 0.1 to get 0.09983341664, how does the calculator compute that? See Section 3.17 for a solution, sort of.

^{Ž} Solving that equation yields u \in \{0, (-1 \pm \sqrt{5})/2\}. As u = \cos 2x is always larger than or equal to -1, we do not accept u = (-1 - \sqrt{5})/2.


Some trigonometry problems.

1. Compute the sum \sin^2 10° + \sin^2 20° + \sin^2 30° + \sin^2 40° + \cdots + \sin^2 90°.

2. Solve 8\sin x\cos^5 x - 8\sin^5 x\cos x = 1.

3. Compute \cos 36° - \cos 72°.

4. Solve \cos^2 x + \cos^2 2x + \cos^2 3x = 1 (IMO 1962).

5. Prove \cos\frac{\pi}{7} - \cos\frac{2\pi}{7} + \cos\frac{3\pi}{7} = \frac{1}{2} (IMO 1963).

Answers to the first three are 5, 7.5° and 0.5, respectively. Hints: for the first problem, follow Gauss (see Section 2.6.1 in case you have missed it) by grouping two terms together so that something special appears. For the third problem, do not first find \cos 36° and then \cos 36° - \cos 72°. With a little massage, you can compute \cos 36° - \cos 72° directly. For the final problem, remember how we computed \sin\frac{\pi}{5}?

3.13 Generalized Pythagoras theorem


For right (right-angled) triangles, we have the famous Pythagoras theorem c^2 = a^2 + b^2. For oblique triangles, the generalized Pythagoras theorem extends it. For a triangle with sides a, b and c, see Fig. 3.49a, the generalized Pythagoras theorem states that

a^2 = b^2 + c^2 - 2bc\cos A
b^2 = c^2 + a^2 - 2ca\cos B     (3.13.1)
c^2 = a^2 + b^2 - 2ab\cos C

In Fig. 3.49a, the proof of b^2 = c^2 + a^2 - 2ca\cos B is obtained by applying the Pythagoras theorem to the right triangle ADC^{Ž}. Now, we do some checking of the newly derived formula. First, when B is a right angle its cosine is zero, and we get the familiar b^2 = a^2 + c^2 again. Second, the term 2ca\cos B has the dimension of length squared, which is correct (if that term were 2a^2 b\cos B, the formula would be wrong because we cannot add a square of a length to a cube of a length; we cannot add area to volume). There is no need to prove the other two formulas: as a, b, c are symmetrical, from b^2 = c^2 + a^2 - 2ca\cos B we can get the other two by permuting the variables: a \to b, b \to c, c \to a.

The generalized Pythagoras theorem provides us an easy way to prove the converse of the Pythagorean theorem: if c^2 = a^2 + b^2, then \angle C = 90°. In any triangle we have c^2 = a^2 + b^2 - 2ab\cos C; now, given that c^2 = a^2 + b^2, we must have 2ab\cos C = 0, and since a, b > 0, this leads to \cos C = 0, or \angle C = 90°.

^{Ž} For this case we get b^2 = c^2 + a^2 + 2ca\cos(\pi - B), but note that \cos(\pi - B) = -\cos B.


Figure 3.49: Proof of the generalized Pythagoras theorem (a), the sine law (b) and its proof (c).

The generalized Pythagoras theorem is also known as the law of cosines, for it relates the lengths of the sides of a triangle to the cosine of one of its angles. If there is a law of cosines, then there should also be a law of sines. This law is written as (Fig. 3.49b)

\frac{a}{\sin A} = \frac{b}{\sin B} = \frac{c}{\sin C}   (3.13.2)

The proof is simple: see Fig. 3.49c. In the two right triangles ACH and BCH, compute the common altitude h = b\sin A = a\sin B. Hence, a/\sin A = b/\sin B. The law of sines can be used to compute the remaining sides of a triangle when two angles and a side are known, a technique known as triangulation.

In the law of sines we have a/\sin A = b/\sin B = c/\sin C, but what is this number a/\sin A? It must be the length of some segment (why?), and it turns out to be the diameter of the triangle's circumcircle. This result dates back to Ptolemy. In geometry, the circumscribed circle or circumcircle of a polygon is a circle that passes through all the vertices of the polygon. The center of this circle is called the circumcenter and its radius is called the circumradius^{Ž}.

3.14 Graph of trigonometry functions


Even though I postpone the discussion of the concept of mathematical functions to Section 4.2, I present here the graphs of some trigonometric functions, mostly for completeness of this chapter. Loosely speaking, a function is a device that receives a number (mostly a real number), called the input, and returns another number, the output. If we denote by x the input of the sine function, we write y = \sin x. By varying x from negative infinity to positive infinity (of course practically just a finite interval is considered, here [-4\pi, 4\pi]), we compute the corresponding ys, and plot all the points (x, y) to get the graphs shown in Fig. 3.50.

^{Ž} It is not hard to prove this result once the question has been asked. Draw a circle with O as the center and a triangle ABC with vertices on the circle. Then, join OA, OB and OC. Now, use the law of sines for \triangle OAB and proceed from that.


Figure 3.50: Graphs of the sine and cosine functions. Made with TikZ.

OK. We have used technology to do the plotting for us (as it does a better job than human beings), but we should be able to 'read' information from the graph. No computer is able to do that. First, the two graphs are confined to the interval [-1, 1] (because both sine and cosine are smaller than or equal to 1 and larger than or equal to -1). Second, where the sine is maximum or minimum the cosine is zero, and vice versa. Third, by focusing on the interval [0, 2\pi], one can see that the sine starts at zero, increases to 1 (at \pi/2), then decreases to zero (at \pi), continues decreasing until it gets to -1, then increases back to zero (at 2\pi). After that the graph repeats. Thus, sine is a periodic function, and its period T is 2\pi. The cosine function has the same period. The period of a periodic function f(x) is the smallest number T such that^{Ž}

f(x + T) = f(x),   \forall x   (3.14.1)

Why smallest? This is because if T is a period, then 2T, 3T, \ldots are also periods. So, we just use the smallest.

The graph of the tangent function is given in Fig. 3.51. It can be seen that the tangent function is periodic with a period of \pi, i.e., \tan(x + \pi) = \tan x, which can be proved using the trigonometric identity \tan(a + b) = \frac{\tan a + \tan b}{1 - \tan a\tan b}. As \tan x = \sin x/\cos x, the function is not defined for angles \bar{x} such that \cos\bar{x} = 0. Solving this equation yields \bar{x} = \pi/2 + k\pi, k = 0, \pm 1, \pm 2, \ldots The vertical lines at \bar{x} are the vertical asymptotes of the tangent curve.
We now shift our discussion to the graph of a nice function. It is given by

y(x) = \frac{\sin x}{x} = \frac{1}{x}\sin x

which is obtained by taking the sine function and the 1/x function and multiplying them. But hey, why this function? Because it shows up a lot in mathematics^{ŽŽ}. For example, in calculus we need to find the derivative of the sine function. Here is what we do, considering the function

^{Ž} As is always the case in mathematics, whenever we have a new object (herein the period), we have theorems (facts) about it. Here is one: if f_1(x) and f_2(x) are two functions of the same period T, then the function \alpha f_1(x) + \beta f_2(x) also has period T.
^{ŽŽ} Actually this function was named the sinc function by the British mathematician Philip Woodward (1919-2018) in his 1952 article "Information theory and inverse probability in telecommunication", in which he said that the function "occurs so often in Fourier analysis and its applications that it does seem to merit some notation of its own".


Figure 3.51: Graph of the tangent function. The vertical dashed lines at \bar{x} = \pi/2 + k\pi are the vertical asymptotes of the tangent curve. An asymptote is a line that the graph of a function approaches as either x or y goes to positive or negative infinity. There are three types of asymptotes: vertical, horizontal and oblique.

y = \sin t:

(\sin t)' = \lim_{x\to 0}\frac{\sin(t + x) - \sin t}{x} = \sin t\lim_{x\to 0}\frac{\cos x - 1}{x} + \cos t\lim_{x\to 0}\frac{\sin x}{x}

I refer to Section 4.4.8 if something is not clear. If this is not enough to get your attention, note that the function \sin x/x is very popular in signal processing. So if you are to enroll in an electrical engineering course, you will definitely see it.

In Fig. 3.52 I plot \sin x, 1/x and \sin x/x. What can we observe from the graph of f(x) = \sin x/x? First, it is symmetrical with respect to the y-axis (this is because f(-x) = f(x); or, as mathematicians call it, it is an even function). Second, similar to \sin x, \sin x/x is also oscillatory, however not between -1 and 1: the amplitude of this oscillation decreases as |x| gets larger. Can we find how this amplitude depends on x precisely?

Yes, we can:

-1 \le \sin x \le 1 \implies -\frac{1}{x} \le \frac{1}{x}\sin x \le \frac{1}{x}   (for x > 0)

This comes from the fact that if a \le b and c > 0 then ac \le bc. So, the above inequality for \sin x/x works only for x > 0. But due to the symmetry of this function, a similar bound holds for x < 0 as well. Now we see that \sin x/x can never exceed 1/x nor fall below -1/x; these two functions are therefore called the envelopes of \sin x/x, see Fig. 3.53a.
Is that everything about \sin x/x? No, no. There is at least one more thing: where are the stationary points of this function? For that, we need to use calculus, as algebra or geometry is not powerful enough for this task. From calculus we know that at a stationary point the derivative of

Figure 3.52: Graph of \sin x and 1/x (a) and graph of \sin x/x (b). What is interesting is that the area of the shaded region when x \to \infty is nothing else but exactly half of \pi. See the Dirichlet integral in Eq. (4.7.45) to see why.

Figure 3.53: Envelopes of \sin x/x are 1/x and -1/x (a), and solving \tan x = x graphically (b).

the function vanishes:

f'(x) = \frac{x\cos x - \sin x}{x^2} \implies f'(x) = 0:  \tan x = x

How are we going to solve this equation \tan x = x, or g(x) := \tan x - x = 0? Well, we do not know. So we fall back on a simple approach: the solutions of \tan x = x are the intersection points of the curve y = \tan x and the line y = x. From Fig. 3.53b we see that there is one solution x = 0, and infinitely many more solutions close to 3\pi/2, 5\pi/2, \ldots

But the graphical method cannot give accurate solutions. To get them we have to use approximate methods, and one popular method is the Newton-Raphson method described in Section 4.5.4, see Eq. (4.5.8). In this method one begins with a starting point x_0, and gets better approximations via:

x_{n+1} = x_n - \frac{g(x_n)}{g'(x_n)} = x_n - \frac{\tan x_n - x_n}{1/\cos^2 x_n - 1},   n = 0, 1, 2, \ldots   (3.14.2)


If you program this and use it, you will see that the method sometimes blows up, i.e., the iterates become very large numbers. This is due to the tangent function, which is very large for x near the points where \cos x = 0. So, we are better off using the equivalent but numerically better g(x) := x\cos x - \sin x (for which g'(x) = -x\sin x):

x_{n+1} = x_n + \frac{x_n\cos x_n - \sin x_n}{x_n\sin x_n}   (3.14.3)

With this and starting points close to 0, 3\pi/2, 5\pi/2, and 7\pi/2 we get the first four solutions given in Table 3.4. The third column gives the solutions in terms of multiples of \pi/2 to demonstrate the fact that the solutions get closer and closer to the asymptotes of the graph of the tangent function.

Table 3.4: The first four solutions of \tan x = x obtained with the Newton-Raphson method.

  n    x             x / (\pi/2)
  1    0.00000000    0
  2    4.49340946    2.86
  3    7.72525184    4.92
  4    10.9041216    6.94

Here are two lessons learned from studying the graph of the nice function \sin x/x:
 Not all equations can be solved exactly. However, one can always use numerical methods
to solve any equation approximately. Mathematicians do that, and particularly scientists and engineers do that all the time;

• Mathematically, \tan x - x = 0 is equivalent to x\cos x - \sin x = 0. However, it is easier to work with the latter, because all the functions involved are defined for all x; \tan x is not a nice function to work with: it is discontinuous at its vertical asymptotes! So some mathematical objects are easier to work with than others, exactly as with human beings.

A small sketch of this Newton-Raphson iteration is given below.
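Here is a minimal Python sketch (my own code, with an illustrative function name) of the iteration in Eq. (3.14.3), reproducing Table 3.4:

import math

def solve_tanx_eq_x(x0, iterations=20):
    # Newton-Raphson on g(x) = x cos x - sin x, with g'(x) = -x sin x
    x = x0
    for _ in range(iterations):
        x = x + (x*math.cos(x) - math.sin(x)) / (x*math.sin(x))
    return x

for guess in (4.5, 7.7, 10.9):   # near 3*pi/2, 5*pi/2, 7*pi/2 (x = 0 is the trivial root)
    root = solve_tanx_eq_x(guess)
    print(root, root/(math.pi/2))
# prints 4.49340946, 7.72525184, 10.90412166 and their multiples of pi/2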
Period of \sin 2x + \cos 3x. The problem that we're now interested in is: what is the period of a sum of trigonometric functions? Specifically, of \sin 2x + \cos 3x. There is one easy way: plotting the function. Fig. 3.54 reveals that the period of this function is 2\pi.

Of course, there is another way, without plotting the function. We know that the period of \sin x is 2\pi, and thus the period of \sin 2x is 2\pi/2 = \pi^{ŽŽ}. Similarly, the period of \cos 3x is 2\pi/3. Therefore, we have

\sin 2x repeats at: \pi, 2\pi, 3\pi, \ldots
\cos 3x repeats at: \frac{2\pi}{3}, 2\cdot\frac{2\pi}{3}, 3\cdot\frac{2\pi}{3}, \ldots

Thus, \sin 2x + \cos 3x will repeat for the first time (considering positive x only) when x = 2\pi; that is the period of this function.

^{ŽŽ} If this is not clear, here is one way to explain it. The function y = \sin 2x is obtained by horizontally shrinking y = \sin x by a factor of 2 (Fig. 4.9), and thus it has a period half that of \sin x.


y
sin.2x/ C cos.3x/
sin.2x/

x
0   3 2 5 3
2 2 2

cos.3x/

Figure 3.54: Plot of sin 2x (red), cos 3x (black) and sin 2x C cos 3x (blue).

3.15 Hyperbolic functions


In this section I present the hyperbolic functions \sinh x, \cosh x, as they are similar to the trigonometric functions discussed previously. There are several ways to introduce these functions, but for now I decided to use the decomposition of a function into an even part and an odd part.

Any function f(x) can be decomposed into two parts as

f(x) = \frac{1}{2}\left[f(x) + f(-x)\right] + \frac{1}{2}\left[f(x) - f(-x)\right]   (3.15.1)

in which the first term is an even function, i.e., g(-x) = g(x), and the second is an odd function, i.e., g(-x) = -g(x) (see Section 4.2.1).

Applying this decomposition to the exponential function y = e^x, we have:

e^x = \frac{1}{2}\left[e^x + e^{-x}\right] + \frac{1}{2}\left[e^x - e^{-x}\right]   (3.15.2)

From that we define the following two functions:

\sinh x = \frac{1}{2}(e^x - e^{-x})
\cosh x = \frac{1}{2}(e^x + e^{-x})   (3.15.3)

They are called the hyperbolic sine and cosine functions, which explains their symbols. I explain the origin of these names shortly. First, the graphs of these two functions, together with y = 0.5e^x and y = 0.5e^{-x}, are shown in Fig. 3.55a. The first thing we observe is that for large x, the hyperbolic cosine function is similar to y = 0.5e^x; this is because 0.5e^{-x} \to 0 when x is large. Second, the hyperbolic cosine curve is always above that of y = 0.5e^x. Third, \cosh x \ge 1. This can be explained using the Taylor series of e^x and e^{-x} (refer to Section 4.15.8 if you're not

familiar with Taylor series):

e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots
e^{-x} = 1 - x + \frac{x^2}{2!} - \frac{x^3}{3!} + \frac{x^4}{4!} - \cdots
\implies \frac{e^x + e^{-x}}{2} = 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \cdots \ge 1

From Eq. (3.15.3), it can be seen that \cosh^2 x - \sinh^2 x = 1. And we have more identities bearing similarity to the trigonometric identities we're familiar with. For example, we have

\cosh(a + b) = \cosh a\cosh b + \sinh a\sinh b
\sinh(a + b) = \sinh a\cosh b + \cosh a\sinh b   (3.15.4)

The proof is based on the following results:

e^a = \cosh a + \sinh a,   e^{-a} = \cosh a - \sinh a,   \cosh(a+b) = \frac{e^{a+b} + e^{-a-b}}{2},   \sinh(a+b) = \frac{e^{a+b} - e^{-a-b}}{2}
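As always, we can let a computer spot-check such identities before trusting them; a minimal Python sketch of my own:

import math, random

for _ in range(5):
    a, b = random.uniform(-3, 3), random.uniform(-3, 3)
    lhs = math.cosh(a + b)
    rhs = math.cosh(a)*math.cosh(b) + math.sinh(a)*math.sinh(b)
    print(abs(lhs - rhs) < 1e-10)   # True: Eq. (3.15.4) holds to rounding error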

(a) \cosh x   (b) \sinh x

Figure 3.55: Plot of the hyperbolic sine and cosine functions along with their exponential components 0.5e^x and 0.5e^{-x}.

Why called hyperbolic trigonometry? Remember the parametric equation of a unit circle centered at the origin^{Ž}? It is given by x = \sin t, y = \cos t. Similarly, from the identity \cosh^2 t - \sinh^2 t = 1, the hyperbola x^2 - y^2 = 1 is parameterized as x = \cosh t and y = \sinh t. That explains the name 'hyperbolic functions' (Fig. 3.56). Not sure what a hyperbola is? Check out Section 4.1.

sin2 t + cos2 t = 1 cosh2 t − sinh2 t = 1


y y
(cosh t, sinh t)
1
(sin t, cos t)

−1 t 1 (−1, 0) (1, 0)
x x
area = t/2 area = t/2

−1
x2 + y 2 x2 − y 2 = 1

Figure 3.56: Trigonometric functions sin x, cos x are related to a unit circle; they are circular trigonometry
functions. On the other hand, sinh x and cosh x are related to the right hyperbola x 2 y 2 D 1; they
are hyperbolic trigonometry functions. The meaning of the parameter t is that it is twice the area of the
shaded region. For the circle, that is easy to see. For the hyperbola, we need calculus. Check this youtube
channel out if you’re interested in the detail.

hyperbola? Check out Section 4.1.

Another derivation of hyperbolic functions. Start with Euler's identity e^{i\theta} = \cos\theta + i\sin\theta, written for \theta = x and \theta = -x:

e^{ix} = \cos x + i\sin x
e^{-ix} = \cos x - i\sin x

We then have (adding the above two equations):

\cos x = \frac{e^{ix} + e^{-ix}}{2}   (3.15.5)

Now we consider a complex variable z = x + iy, and use z in the above equation:

\cos(x + iy) = \frac{e^{i(x+iy)} + e^{-i(x+iy)}}{2}
             = \frac{e^{ix-y} + e^{-ix+y}}{2} = \frac{e^{ix}e^{-y} + e^{-ix}e^{y}}{2}
             = \frac{(\cos x + i\sin x)e^{-y} + (\cos x - i\sin x)e^{y}}{2}
             = \cos x\left(\frac{e^{y} + e^{-y}}{2}\right) - i\sin x\left(\frac{e^{y} - e^{-y}}{2}\right)

And you see the hyperbolic sine/cosine show up! With our definition of them in Eq. (3.15.3), we get \cos(x + iy) = \cos x\cosh y - i\sin x\sinh y. And a similar equation is awaiting the sine:

\sin(x + iy) = \sin x\cosh y + i\cos x\sinh y

^{Ž} Refer to Section 4.2.6 for a discussion on parametric curves.


Putting them together,

\cos(x + iy) = \cos x\cosh y - i\sin x\sinh y
\sin(x + iy) = \sin x\cosh y + i\cos x\sinh y

And they are quite similar to the real trigonometric identities for \sin(a + b) and \cos(a + b)! Now, putting x = 0 in the above, we get

\cos(iy) = \cosh y,   \sin(iy) = i\sinh y   (3.15.6)

which means that the cosine of an imaginary angle is real, but the sine of an imaginary angle is imaginary.

Can sine/cosine be larger than one? We all know that for real angles x, |\sin x| \le 1. But for complex angles z, might we have \cos z > 1? Let's find z such that \cos z = 2. We start with

\cos(x + iy) = \cos x\cosh y - i\sin x\sinh y

Then, \cos z = 2 is equivalent to

\cos(x + iy) = 2 \iff \cos x\cosh y - i\sin x\sinh y = 2 + 0i

And we obtain the following equations to solve for x, y:

\cos x\cosh y = 2
\sin x\sinh y = 0

From the second equation we get \sin x = 0; note that we're not interested in \sinh y = 0, i.e., y = 0, as we're looking for complex angles, not real ones. With \sin x = 0, we then have \cos x = \pm 1. But we remove the possibility \cos x = -1, as from the first equation we know that \cos x > 0 because \cosh y > 0 for all y. So, we have \cos x = 1 (or x = 2n\pi), and with that we have \cosh y = 2:

\cosh y = 2 \iff \frac{e^y + e^{-y}}{2} = 2

of which the solutions are y = \ln(2 \pm \sqrt{3}). Finally, the angle we're looking for is:

z = 2n\pi + i\ln(2 \pm \sqrt{3})
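We can verify this surprising result with Python's cmath module (a small check of my own):

import cmath, math

z = 2*math.pi + 1j*math.log(2 + math.sqrt(3))
print(cmath.cos(z))       # (2+0j), up to rounding error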

These hyperbolic functions are a creation of the human mind, but again they model natural phenomena satisfactorily. For example, in Section 10.2 we shall demonstrate that the hyperbolic cosine is exactly the shape of a hanging chain^{‘}.

^{‘} A hanging chain or cable assumes a parabola-like shape under its own weight when supported only at its ends.


3.16 Applications of trigonometry


I present some applications of trigonometry in this section. Some materials are borrowed from
[37], which is a great book to read. We shall see more applications of trigonometric functions in
later chapters.

3.16.1 Measuring the earth


It is obvious that without modern technologies such as GPS, measuring the radius of the earth must be done indirectly. It was Abu Rayhan Al-Biruni, the 10th century Islamic mathematical genius, who combined trigonometry and algebra to achieve this numerical feat. He first measured the height of a hill near the Fort of Nandana in today's Punjab province of Pakistan. He then climbed the hill and measured the dip angle of the horizon. Using trigonometry and algebra, he got a value equivalent to 3928.77 English miles, which is within about 1 percent of today's accepted radius of the earth. Using the law of sines, Eq. (3.13.2), for the right triangle OTM, we have

\frac{h + R}{\sin(\pi/2)} = \frac{R}{\sin(\pi/2 - \alpha)} \implies R = \frac{h\cos\alpha}{1 - \cos\alpha}

Al-Biruni tells us that the mountain's height is 305.1 meters and the angle \alpha is 0.5667°. Using these data and a calculator (which he did not have), we find that R = 6 238 km. And the earth's circumference is thus 39 194 km. Its actual value is 40 075 km (from Google).

How did Al-Biruni measure h and \alpha? First, he measured the angles of elevation of the mountain top, \theta_1 and \theta_2, at two different points lying on a straight line, using an astrolabe. Then he measured the distance d between these two points (Fig. 3.57).

Figure 3.57: Al-Biruni's measurement of the height of a mountain.

Finally, a simple application of trigonometry gives us the height h:

h = \frac{d\tan\theta_1\tan\theta_2}{\tan\theta_2 - \tan\theta_1}
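A quick check of Al-Biruni's numbers in Python (my own sketch, not part of the original text):

import math

h = 305.1                     # mountain height in metres (Al-Biruni's value)
alpha = math.radians(0.5667)  # dip angle of the horizon
R = h*math.cos(alpha)/(1 - math.cos(alpha))
print(R/1000, 2*math.pi*R/1000)   # about 6238 km and 39190 km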


3.16.2 Charting the earth


To demonstrate how ancient Greek astronomers (who were also mathematicians) used trigonometry to find distances between two points on the earth's surface, we first need a coordinate system for the earth so that we can locate precisely any point on it. We take our earth to be a perfect sphere. Two notes here: first, how did you know that the earth is a sphere? Because other people said so? It's better to find out why for yourself. Second, the earth is not a perfect sphere; we are making an approximation.

(a) (b)

Figure 3.58: Longitudes and latitudes: source from Britannica.

On this sphere there are three special points: the center O, the north pole N and the south pole S. We draw many circles with center at O passing through S and N (see Fig. 3.58). Each half of such a circle is called a line of longitude or meridian. Among the many such meridians, we define the prime meridian, which is the meridian at which longitude is defined to be 0°. The prime meridian divides the sphere into two equal parts: the eastern and western parts.

All points on a meridian have the same longitude, which leads to the introduction of another coordinate. To this end, parallel circles perpendicular to the meridians are drawn on the sphere. One special parallel is the equator, which divides the earth sphere into two equal parts: the northern and southern parts.
Now we can define precisely what longitude and latitude mean. Referring to Fig. 3.58, we
first define a special point A which is the intersection of the equator and the prime meridian.
Now, the longitude is the angle AOB in degrees measured from the prime meridian. Thus a
longitude is an angle ranging from 0°E to 180°E or 0°W to 180°W. Similarly, the latitude is the
angle BOC measured from the equator up (N) or down (S), ranging from 0°N to 90°N or 0°S
to 90°S.
Consider two cities located at P and Q having the same longitude, with P on the equator (Fig. 3.59). Now assume that the city located at Q has a latitude of \varphi. The question we're interested in is: how far is Q from P (how far from the equator)? The answer is the length of the arc PQ, which is part of a great circle of radius R, where R is the radius of the earth. Thus:

PQ = \frac{\pi\varphi}{180}R

Figure 3.59: Two cities located at P and Q having the same longitude; the arc PQ is part of a great circle. A great circle is the intersection of a sphere with a central plane, a plane through the center of that sphere. Note that OP is parallel to O'Q.

Now consider two cities located at Q and M having the same latitude \varphi (Fig. 3.59). What is the distance between them traveling along this latitude? This is the arc QM of the small circle centered at O'. If we can determine the radius of this small circle, then we're done. This radius is O'Q = R\cos\varphi (see the right picture in Fig. 3.59). Then the distance QM is given by

QM = \frac{\pi\Delta}{180}O'Q = \frac{\pi\Delta}{180}R\cos\varphi

where \Delta is the difference (assuming that these two points are both on the eastern or both on the western part) of the longitudes of Q and M. But is this distance the shortest path between Q and M? No! The shortest path is the great-circle distance. The great-circle distance or spherical distance is the shortest distance between two points on the surface of a sphere, measured along the surface of the sphere. At this moment, we accept this on faith.

Figure 3.60 illustrates how to find such a great-circle distance. The first step is to find r = O'Q as done before. Then, in triangle O'QM, using the cosine law, i.e., Eq. (3.13.1), we can compute the straight-line distance between Q and M, denoted by d:

d^2 = r^2 + r^2 - 2r^2\cos\Delta = 2r^2(1 - \cos\Delta) = 2R^2\cos^2\varphi\,(1 - \cos\Delta)

Then, using the cosine law again, but now for the triangle OQM, we determine the angle \alpha:

d^2 = R^2 + R^2 - 2R^2\cos\alpha \implies \alpha = \arccos\frac{2R^2 - d^2}{2R^2} = \arccos\left(1 - (1 - \cos\Delta)\cos^2\varphi\right)

Knowing the angle \alpha of the arc QM in the great circle, it's easy to compute its length:

QM = \frac{\pi\alpha}{180}R = \frac{\pi R}{180}\arccos\left(1 - (1 - \cos\Delta)\cos^2\varphi\right)
How about the great-circle distance between any two points on the surface of the earth? We
do not know (yet) as it requires spherical trigonometry.


Figure 3.60: Calculating the great-circle distance between Q and M. The great circle of which QM is an arc lies in the plane going through O and perpendicular to the vector \vec{OM} \times \vec{OQ}. This great circle is shown in the sphere on the left picture. Note that r = O'Q = R\cos\varphi.

3.17 Infinite series for sine

In this section I present how the infinite series for the sine was developed long before calculus. The idea is to consider a sector of a unit circle and calculate its area in two ways. The first is an exact way, and the second is an approximation following Archimedes' idea of exhausting this area by infinitely many triangles (see Section 3.2.2 if you're not familiar with the method of exhaustion).

Figure 3.61: Computing the approximate area of the sector OBAC using the colored triangles. D is the point such that its angle with OA is \theta/4.

The exact area of the sector OBAC is \theta/2 (see Fig. 3.61). This area is approximated as the sum of the areas of the triangles OBC, ABC and ABD. We first compute the areas of these triangles. The area of the triangle OBC is easy (recall that the circle has unit radius):

OBC = \frac{1}{2}\left(2\sin\frac{\theta}{2}\right)\cos\frac{\theta}{2} = \frac{1}{2}\sin\theta


The area of the triangle ABC is also straightforward:

ABC = \frac{1}{2}\left(2\sin\frac{\theta}{2}\right)\left(1 - \cos\frac{\theta}{2}\right) = 2\sin\frac{\theta}{2}\sin^2\frac{\theta}{4}   (3.17.1)

where the double angle formula for cosine was used. We now use the approximation \sin x \approx x for small x: thus the area of ABC is approximated as \theta^3/16.

Next we compute the area of the triangle ABD. If we work with Fig. 3.61a, then finding this area might be hard, but if we rotate OAB a bit counterclockwise (Fig. 3.61b), we'll see that ABD is computed just like ABC, but with \theta/2 in place of \theta; thus its area is:

ABD \approx \frac{1}{16}\left(\frac{\theta}{2}\right)^3 = \frac{\theta^3}{128}

Let's sum the areas of all these triangles (ABD counted twice), and we get:

A \approx \frac{1}{2}\sin\theta + \frac{\theta^3}{16} + \frac{\theta^3}{64}

We can see a pattern here, and thus the final formula for the area of the sector is:

A \approx \frac{1}{2}\sin\theta + \frac{\theta^3}{16} + \frac{\theta^3}{64} + \frac{\theta^3}{256} + \cdots

The added terms account for the areas not considered in our approximation of the sector area. The red terms look familiar: they form a geometric series, so we can sum them and get a more compact formula for A:

A \approx \frac{1}{2}\sin\theta + \frac{\theta^3}{16} + \frac{\theta^3}{64} + \frac{\theta^3}{256} + \cdots
  = \frac{1}{2}\sin\theta + \frac{\theta^3}{16}\left(1 + \frac{1}{4} + \frac{1}{16} + \cdots\right)
  = \frac{1}{2}\sin\theta + \frac{\theta^3}{12}   (geometric series)

Now we have two expressions for the same area, so we get the following equation, which leads to an approximation for \sin\theta:

\frac{\theta}{2} \approx \frac{1}{2}\sin\theta + \frac{\theta^3}{12} \implies \sin\theta \approx \theta - \frac{\theta^3}{6}

Want an even better approximation? Let's apply \sin x \approx x - x^3/6 to Eq. (3.17.1) to get ABC \approx \theta^3/16 - \theta^5/256, and correspondingly ABD \approx \theta^3/128 - \theta^5/8192 (the algebra is indeed a bit messy, so I have used a CAS to help

me with this tedious algebraic manipulation, see Section 3.23). And we repeat what we have just done to get:

\sin\theta \approx \theta - \frac{\theta^3}{6} + \frac{\theta^5}{120}

And of course we want to do better. What should be the next term after \theta^5/120? It is -\theta^7/x with x = 5040:

\sin\theta \approx \theta - \frac{\theta^3}{6} + \frac{\theta^5}{120} - \frac{\theta^7}{5040}

Are you asking if there is any relation between those numbers in the denominators and those exponents in the numerators? There is! If you have played with factorials (Section 2.26.2) enough, you would recognize that 6 = 3!, 120 = 5! and of course 5040 must be 7! (a pattern again!), thus

\sin\theta \approx \frac{\theta}{1!} - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \frac{\theta^7}{7!} + \cdots = \sum_{i=0}^{\infty}(-1)^i\frac{\theta^{2i+1}}{(2i+1)!}   (3.17.2)
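Here is a minimal Python sketch of my own that evaluates the truncated series Eq. (3.17.2) and compares it against math.sin:

import math

def sine_series(theta, terms=6):
    # sin(theta) ~ sum over i of (-1)^i theta^(2i+1)/(2i+1)!
    return sum((-1)**i * theta**(2*i + 1) / math.factorial(2*i + 1)
               for i in range(terms))

for theta in (0.1, 1.0, math.pi/2):
    print(theta, sine_series(theta), math.sin(theta))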

Can we develop a similar formula for cosine? Of course. But for that we need to wait until the
17th century to meet Euler and Taylor who gave us a systematic way to derive infinite series for
trigonometry functions. Refer to Sections 4.15.6 and 4.15.8 if you cannot wait.
Why was Eq. (3.17.2) a significant development in mathematics? Remember that we built a sine table in Section 3.7? It is useful, but only for integral angles, e.g. 30° or 45°. If the angle is not in the table, we have to use interpolation, which is of low accuracy. To have higher accuracy (and thus better solutions to navigation problems in the old days), ancient mathematicians had to find a formula that could give them the value of the sine for any angle. And Eq. (3.17.2) is one such formula; it involves only simple addition/subtraction/multiplication/division.

3.18 Unusual trigonometric identities


We all know that the sum of the first n whole numbers is given by:

1 + 2 + 3 + \cdots + n = \frac{n(n+1)}{2}   (3.18.1)

And some mathematician discovered the following identity (the material herein is from the delightful book Trigonometric Delights by Eli Maor [45]):

\sin\alpha + \sin 2\alpha + \sin 3\alpha + \cdots + \sin n\alpha = \frac{\sin(n\alpha/2)\sin((n+1)\alpha/2)}{\sin(\alpha/2)}   (3.18.2)

Even if you do not believe this identity, one thing is clear: there is a striking similarity between it and the sum in Eq. (3.18.1). We should first test Eq. (3.18.2) for a few \alpha to be confident that it's correct (a small numerical check is sketched below). Since the values we choose for \alpha are random, if Eq. (3.18.2) is correct for them, it should be correct for others and hence for all \alpha such that \sin(\alpha/2) \ne 0.
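Here is the promised test, a small Python sketch of my own:

import math, random

def lhs(alpha, n):
    return sum(math.sin(k*alpha) for k in range(1, n + 1))

def rhs(alpha, n):
    return math.sin(n*alpha/2)*math.sin((n + 1)*alpha/2)/math.sin(alpha/2)

for _ in range(5):
    alpha, n = random.uniform(0.1, 3.0), random.randint(1, 50)
    print(abs(lhs(alpha, n) - rhs(alpha, n)) < 1e-10)   # True every time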


Proof. Proof of Eq. (3.18.2). Remember how the 10-year-old Gauss computed the sum of the first n whole numbers? We follow him here. Denoting by S the sum on the LHS of Eq. (3.18.2), we write

S = \sin\alpha + \sin 2\alpha + \cdots + \sin(n-1)\alpha + \sin n\alpha
S = \sin n\alpha + \sin(n-1)\alpha + \cdots + \sin 2\alpha + \sin\alpha

Then, we sum these two equations:

2S = (\sin\alpha + \sin n\alpha) + (\sin 2\alpha + \sin(n-1)\alpha) + \cdots + (\sin(n-1)\alpha + \sin 2\alpha) + (\sin n\alpha + \sin\alpha)

And now, of course, we use the sum-to-product trigonometric identity \sin a + \sin b = 2\sin\frac{a+b}{2}\cos\frac{a-b}{2} for each pair (because it helps the factorization):

2S = 2\sin\frac{(n+1)\alpha}{2}\cos\frac{(1-n)\alpha}{2} + 2\sin\frac{(n+1)\alpha}{2}\cos\frac{(3-n)\alpha}{2} + \cdots + 2\sin\frac{(n+1)\alpha}{2}\cos\frac{(n-3)\alpha}{2} + 2\sin\frac{(n+1)\alpha}{2}\cos\frac{(n-1)\alpha}{2}

A common factor appears, so we factor the above as:

2S = 2\sin\frac{(n+1)\alpha}{2}\left[\cos\frac{(1-n)\alpha}{2} + \cos\frac{(3-n)\alpha}{2} + \cdots + \cos\frac{(n-3)\alpha}{2} + \cos\frac{(n-1)\alpha}{2}\right]   (3.18.3)

So far so good. The next move is the key, and we find it thanks to Eq. (3.18.2). So, this is definitely not the way the author of this identity came up with it (because he did not know of the identity before discovering it). In Eq. (3.18.2) we see the term \sin(\alpha/2), so we multiply Eq. (3.18.3) by it:

2S\sin\frac{\alpha}{2} = \sin\frac{(n+1)\alpha}{2}\left[2\sin\frac{\alpha}{2}\cos\frac{(1-n)\alpha}{2} + 2\sin\frac{\alpha}{2}\cos\frac{(3-n)\alpha}{2} + \cdots + 2\sin\frac{\alpha}{2}\cos\frac{(n-3)\alpha}{2} + 2\sin\frac{\alpha}{2}\cos\frac{(n-1)\alpha}{2}\right]

Now we want to simplify the term in the bracket. To this end, we use the product-to-sum trigonometric identity 2\sin\alpha\cos\beta = \sin(\alpha + \beta) + \sin(\alpha - \beta):

2S\sin\frac{\alpha}{2} = \sin\frac{(n+1)\alpha}{2}\left[\left(\sin\frac{n\alpha}{2} + \sin\frac{(2-n)\alpha}{2}\right) + \left(\sin\frac{(n-2)\alpha}{2} + \sin\frac{(4-n)\alpha}{2}\right) + \cdots + \left(\sin\frac{(4-n)\alpha}{2} + \sin\frac{(n-2)\alpha}{2}\right) + \left(\sin\frac{(2-n)\alpha}{2} + \sin\frac{n\alpha}{2}\right)\right]

And luckily for us, all terms in the bracket cancel out in pairs (a term \sin(m\alpha/2) is cancelled by a term \sin(-m\alpha/2)), except the two red terms \sin(n\alpha/2). It's a bit hard to see how the other terms are cancelled; one way is to write this out for n = 3 and n = 4 to see that it is indeed the case. Now, the above equation becomes

2S\sin\frac{\alpha}{2} = 2\sin\frac{(n+1)\alpha}{2}\sin\frac{n\alpha}{2}

And from that we can get our identity.


If we have one identity for the sine, we should have one for the cosine, and from those, one for the tangent:

\sin\alpha + \sin 2\alpha + \sin 3\alpha + \cdots + \sin n\alpha = \frac{\sin(n\alpha/2)\sin((n+1)\alpha/2)}{\sin(\alpha/2)}
\cos\alpha + \cos 2\alpha + \cos 3\alpha + \cdots + \cos n\alpha = \frac{\sin(n\alpha/2)\cos((n+1)\alpha/2)}{\sin(\alpha/2)}   (3.18.4)
\frac{\sin\alpha + \sin 2\alpha + \sin 3\alpha + \cdots + \sin n\alpha}{\cos\alpha + \cos 2\alpha + \cos 3\alpha + \cdots + \cos n\alpha} = \tan\frac{(n+1)\alpha}{2}
So, it is quite impressive that we were able to prove Eq. (3.18.2). But how could someone discover this crazy identity? Here might be the way these identities were discovered. Let's compute the following sum:

A = e^{i\alpha} + e^{i2\alpha} + \cdots + e^{in\alpha}   (3.18.5)

Why is this sum related to Eq. (3.18.4)? Because Euler's identity tells us that e^{i\alpha} = \cos\alpha + i\sin\alpha:

A = (\cos\alpha + i\sin\alpha) + (\cos 2\alpha + i\sin 2\alpha) + \cdots + (\cos n\alpha + i\sin n\alpha)
  = (\cos\alpha + \cos 2\alpha + \cdots + \cos n\alpha) + i(\sin\alpha + \sin 2\alpha + \cdots + \sin n\alpha)   (3.18.6)

The sums in our identities show up, both for the sine and the cosine! That's the power of complex numbers. Now here is the plan: we will compute A in another way, and from that get its real and imaginary parts. Then we compare that result with Eq. (3.18.6): equating the imaginary parts gives us the sine formula, and equating the real parts gives us the cosine formula.

It can be seen that A is a geometric series, so it's not hard to compute it:

A = e^{i\alpha} + e^{i2\alpha} + \cdots + e^{in\alpha} = e^{i\alpha}\left(1 + e^{i\alpha} + e^{i2\alpha} + \cdots + e^{i(n-1)\alpha}\right) = \frac{e^{i\alpha}(1 - e^{in\alpha})}{1 - e^{i\alpha}}   (3.18.7)
Of course, now we bring back sine and cosine (because that's what we need), and A becomes:

A = \frac{e^{i\alpha}(1 - e^{in\alpha})}{1 - e^{i\alpha}} = (\cos\alpha + i\sin\alpha)\frac{1 - \cos n\alpha - i\sin n\alpha}{1 - \cos\alpha - i\sin\alpha}
  = (\cos\alpha + i\sin\alpha)\frac{(1 - \cos n\alpha - i\sin n\alpha)(1 - \cos\alpha + i\sin\alpha)}{(1 - \cos\alpha - i\sin\alpha)(1 - \cos\alpha + i\sin\alpha)}   (3.18.8)
  = (\cos\alpha + i\sin\alpha)\frac{(1 - \cos n\alpha - i\sin n\alpha)(1 - \cos\alpha + i\sin\alpha)}{2(1 - \cos\alpha)}

What we have just done in the second equality is the standard way to remove i from the denominator; now we can get the real and imaginary parts of A. Let's focus on the imaginary part:

\mathrm{Im}\,A = \frac{(\sin\alpha + \sin n\alpha) - (\sin\alpha\cos n\alpha + \sin n\alpha\cos\alpha)}{2(1 - \cos\alpha)} = \frac{\sin(n\alpha/2)\sin((n+1)\alpha/2)}{\sin(\alpha/2)}   (3.18.9)


Now comparing Eq. (3.18.6) with Eq. (3.18.9), we can get the sine identity.
Hey, but wait. Euclid would ask where is geometry? We can construct the sum sin ˛ C
sin 2˛ C : : : and cos ˛ C cos 2˛ C : : : as in Fig. 3.62. To ease the presentation we considered
only the case n D 3. It can be seen that sin ˛ C sin 2˛ C : : : equals the y-coordinate of P3 . Now
if we can compute d and ˇ, then we’re done.
Figure 3.62: Starting at the origin (which we have labeled O), we draw a line segment OP_1 of unit length forming an angle \alpha with the positive x-axis. At P_1 we draw a second line segment of unit length forming an angle \alpha with the first segment and thus an angle 2\alpha with the positive x-axis. Continuing in this manner n times, we arrive at the point P_n (which is P_3 in the illustration), whose coordinates we shall denote by X and Y. Obviously, Y is what we're looking for, i.e., Y = \sum_{k=1}^{n}\sin k\alpha.

Indeed, O, P_1, P_2, \ldots are vertices of a polygon inscribed in a circle of radius r. Thus,

d = 2r\sin\frac{n\alpha}{2}

And in the triangle OCP_1 we have something similar: 1 = 2r\sin\frac{\alpha}{2}. Therefore, d = \frac{\sin(n\alpha/2)}{\sin(\alpha/2)}. Now, the angle \beta subtends the chord P_1 P_n, and is therefore equal to half the central angle that subtends the same chord (Fig. 3.17):

\beta = \frac{1}{2}(n\alpha - \alpha) \implies \alpha + \beta = \frac{(n+1)\alpha}{2}

Now, we can determine the sum of sines straightforwardly:

\sin\alpha + \sin 2\alpha + \sin 3\alpha + \cdots + \sin n\alpha = d\sin(\alpha + \beta) = \frac{\sin(n\alpha/2)}{\sin(\alpha/2)}\sin\frac{(n+1)\alpha}{2}
I emphasize that there are no real-life applications of Eq. (3.18.4). If you're asking why we bothered with these formulas, the answer is simple: we had fun playing with them. Is there anything more important than that in life, especially when we're young? Moreover, once again we see the connection between geometry, algebra and complex numbers. And we saw the telescoping sum again.


Example 3.1
This example is taken from the 2021 Oxford MAT admission test: compute the following sum

S = \sin^2(1°) + \sin^2(2°) + \sin^2(3°) + \cdots + \sin^2(89°) + \sin^2(90°)   (3.18.10)

One solution is, using \sin^2 x = 0.5(1 - \cos 2x), to write S as

S = \frac{1}{2}(90 - S_1),   S_1 = \cos(2°) + \cos(4°) + \cdots + \cos(180°)

Then, using the second identity in Eq. (3.18.4) with n = 90 and \alpha = 2°, we can compute S_1:

S_1 = \frac{\sin 90°\cos 91°}{\sin 1°} = -1

And thus,

S = 45.5

How to make sure the solution is correct? Write a small program to check (one is sketched after this example)!

Admittedly, no one can remember the second identity in Eq. (3.18.4)! There must be another, easier way. And indeed there is. If we write the sum in this way (still remember how the ten-year-old Gauss computed the sum of the first 100 integers?):

S = \left(\sin^2(1°) + \sin^2(89°)\right) + \left(\sin^2(2°) + \sin^2(88°)\right) + \cdots + \left(\sin^2(44°) + \sin^2(46°)\right) + \sin^2(45°) + \sin^2(90°)   (3.18.11)

There are 44 terms of the form \sin^2(x°) + \sin^2(90° - x°), and each such term is equal to one (why?), and the red and blue terms are easy, thus

S = 44\cdot 1 + 1 + \left(\frac{\sqrt{2}}{2}\right)^2 = 45.5
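And here is the small program promised above (my own sketch):

import math

S = sum(math.sin(math.radians(k))**2 for k in range(1, 91))
print(S)   # 45.5 up to rounding, confirming both solutions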

3.19 Spherical trigonometry


Spherical trigonometry is the study of curved triangles called spherical
triangles, triangles drawn on the surface of a sphere. The sides of a spher-
ical triangle are arcs of great circles. A great circle is the intersection of a
sphere with a central plane, a plane through the center of that sphere. The
subject is practical, for example, because we live on a sphere. Spherical
trigonometry is of great importance for calculations in astronomy, geodesy,
and navigation. For details, we refer to the textbook of Glen Van Brum-
melen [11]. Glen Robert Van Brummelen (born 1965) is a Canadian historian of mathematics
specializing in historical applications of mathematics to astronomy. In his words, he is the “best


trigonometry historian, and the worst trigonometry historian” (as he is the only one).

3.19.1 Area of spherical triangles


The first problem we want to solve in spherical trigonometry is to determine the area of a spherical triangle knowing its interior angles. An easier problem is to determine the area of a lune. A lune is a two-sided polygon on a sphere where the sides are two halves of great circles (Fig. 3.63a). Let's denote by \alpha the angle of the lune, and by r the radius of the sphere. The area of the lune is then 2\alpha r^2; when \alpha = 2\pi the lune is simply the entire sphere surface, of which the area is 4\pi r^2.

(a) spherical lune (b) spherical triangle

Figure 3.63: Spherical lune and triangle.

Now, to compute the area of the spherical triangle ABC shown in Fig. 3.63b, we extend the three sides to get full great circles. Doing so gives us three lunes, of which the total surface area is (2\alpha + 2\beta + 2\gamma)r^2. But this surface area is also the area of half of the sphere plus twice the area of ABC. Thus,

(2\alpha + 2\beta + 2\gamma)r^2 = 2\pi r^2 + 2\,\mathrm{area}(ABC)

And from that we can get the formula for the area of ABC:

\mathrm{area}(ABC) = \left[(\alpha + \beta + \gamma) - \pi\right]r^2   (3.19.1)

In other words, for a unit sphere the area is the angle sum minus \pi. Because the sum of the interior angles of every planar triangle is \pi, we can restate the above formula in yet another way: area = angle sum minus the angle sum of a planar triangle.
This formula is known as the Harriot-Girard theorem as it was published by the French
mathematician Albert Girard (1595–1632) in 1626 but has also been attributed to the English
mathematician Thomas Harriot (1560–1621) who discovered it in 1603 but he had the habit of
not publishing anything. Girard’s work on trigonometry is the first to use the abbreviations sin,
cos, tan.

3.19.2 Area of spherical polygons


Having obtained the formula to compute the area of spherical triangles, it is a small step to
consider the area of spherical polygons. We simply divide the polygon into triangles and use the


Harriot-Girard theorem for them. Let's start with a four-sided polygon divided into two triangles as shown in Fig. 3.64a. Applying the Harriot-Girard theorem to these triangles and adding up their areas, we get

\mathrm{area} = (a_1 + a_2 + a_3 + a_4 - 2\pi)r^2

Doing the same exercise for five-sided polygons (Fig. 3.64b), we get \mathrm{area} = (a_1 + a_2 + a_3 + a_4 + a_5 - 3\pi)r^2. Obviously we can see a pattern here, and

\mathrm{area} = \left[a_1 + a_2 + \cdots + a_n - (n - 2)\pi\right]r^2   (3.19.2)

But (n - 2)\pi is the sum of the angles of a planar convex n-sided polygon. So we can restate the above formula in yet another way for a unit sphere: area = angle sum minus the angle sum of a planar convex polygon. A small function implementing Eq. (3.19.2) is sketched below, after Fig. 3.64.

(a) four sided polygon (b) five sided polygon

Figure 3.64: Surface area of spherical polygons: divide and conquer approach. The polygon is divided
into spherical triangles and then the Harriot-Girard theorem is used for all the triangles.
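As suggested above, Eq. (3.19.2) translates directly into code; a minimal Python sketch of my own (the function name spherical_polygon_area is just an illustrative choice):

import math

def spherical_polygon_area(angles_rad, r=1.0):
    # Eq. (3.19.2): area = [sum of interior angles - (n-2)*pi] * r^2
    n = len(angles_rad)
    return (sum(angles_rad) - (n - 2)*math.pi) * r**2

# an octant of the unit sphere is a triangle with three right angles:
print(spherical_polygon_area([math.pi/2]*3))   # pi/2, i.e. one eighth of 4*pi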

Legendre's proof of Euler's polyhedra formula. Let's start with a convex polyhedron with V vertices, E edges, and F faces, each of which is a p-gon. Let x be any point in the interior of the polyhedron. As shown in the figure, construct a sphere centered at x that surrounds the polyhedron completely. Because the units are irrelevant, we choose them so that the sphere has radius one. Project the polyhedron onto the sphere using rays emanating from x. Now imagine that the polyhedron is a wireframe model and that x is a light bulb. The projection is the shadow of the wire frame on the surface of the encompassing sphere. In this way the faces of the polyhedron become spherical polygons.

Then, Legendre computed the surface area of the unit sphere in two ways. The first way is of course 4\pi, and the second way is: this area is the sum of the areas of all the spherical polygons. Now, Eq. (3.19.2) comes into play:

4\pi = \sum_{\mathrm{faces}}\left[a_1 + a_2 + \cdots + a_p - (p - 2)\pi\right]

Phu Nguyen, Monash University © Draft version


Chapter 3. Geometry and trigonometry 327

Noting that the angles at each vertex total to 2 and there are V vertices, and Fp D 2E, so we
have
4 D 2V Fp C 2F H) 2 D V E CF
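Euler's formula itself is easy to verify by counting. A tiny Python check (my own illustration; the vertex/edge/face counts are standard) for a few familiar polyhedra:

```python
# (V, E, F) for some convex polyhedra.
polyhedra = {
    "tetrahedron": (4, 6, 4),
    "cube": (8, 12, 6),
    "octahedron": (6, 12, 8),
    "icosahedron": (12, 30, 20),
}
for name, (V, E, F) in polyhedra.items():
    print(name, V - E + F)   # always 2
```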

3.20 Analytic geometry


As long as algebra and geometry have been separated, their progress have been
slow and their uses limited; but when these two sciences have been united, they
have lent each mutual forces, and have marched together towards perfection.
(Joseph-Louis Lagrange)

Simply put, analytic geometry or coordinate geometry is geometry with numbers. In analytic geometry, points are represented by ordered pairs of numbers called coordinates: e.g. a point is specified by (x, y) where x, y ∈ ℝ. Curves are represented by equations. For example, the circle of radius one with center at the point (0, 0) is described by the equation x² + y² = 1. We then manipulate these coordinates and equations to study geometric figures, explore their properties, etc.
This section briefly introduces this powerful branch of mathematics. We start with a discussion of the concept of coordinates in Section 3.20.1. In Section 3.20.2 the equation of a line is derived and we revisit the Thales theorem and prove it using equations. Section 3.20.3 introduces constructible numbers, which Section 3.20.4 uses to settle two classical construction problems; Section 3.21 then treats solving polynomial equations algebraically.

3.20.1 Cartesian coordinate system


In the 17th century, René Descartes (Latinized name: Cartesius) and Pierre de Fermat developed Cartesian coordinates. The name came from Descartes's work La Géométrie, published in 1637 as an appendix to Discours de la méthode (Discourse on the Method). This is one of the most influential scientific works of the 17th century. For example, it was one of only two books that the young Isaac Newton owned. Note that Fermat never published his work on this topic. On a plane we draw two fixed perpendicular oriented lines (called axes) meeting at a point called the origin (marked as a dot in the accompanying figure, which shows the points P(2, 3) and Q(−2, −2) on a coordinate plane).


The convention is to call the horizontal axis the x-axis and the vertical axis the y axis. Mark
points along these two axes according to their distance from the origin, like the markings on a
ruler: positive numbers to the right and up, negative to the left and down. Then, every point is
uniquely defined by an ordered pair of numerical coordinates, which are the signed distances to
the point from these two axes. For example .2; 3/ is the point with a distance to the y axis of 2
and a distance to the x axis of 3. The coordinates of the origin are then .0; 0/.

Phu Nguyen, Monash University © Draft version


Chapter 3. Geometry and trigonometry 328
An arbitrary point on the plane is usually denoted by (x, y). The first coordinate (x) is called the abscissa, whereas the second coordinate (y) is called the ordinate. The Pythagorean theorem allows us to determine the distance between two points on this Cartesian plane. For example, the distance between the origin (0, 0) and (2, 3) is the length of the segment joining the two points; that length is √(2² + 3²). The distance between (a1, b1) and (a2, b2) is (see the right triangle PQH in the accompanying figure)
$$|PQ| = \sqrt{(a_2 - a_1)^2 + (b_2 - b_1)^2} \qquad (3.20.1)$$
Using the Cartesian coordinate system, geometric shapes (such as curves) can be described by Cartesian equations: algebraic equations involving the coordinates of the points lying on the shape. For example, a circle of radius 2, centered at the origin of the plane, may be described as the set of all points whose coordinates x and y satisfy the equation √(x² + y²) = 2, or x² + y² = 4.
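A minimal Python sketch (mine, not the book's; the helper name is hypothetical) of the distance formula (3.20.1) and of testing whether a point lies on the circle x² + y² = 4:

```python
import math

def distance(p, q):
    """Distance between points p = (a1, b1) and q = (a2, b2), Eq. (3.20.1)."""
    return math.sqrt((q[0] - p[0])**2 + (q[1] - p[1])**2)

print(distance((0, 0), (2, 3)))   # sqrt(13)
# A point lies on the circle of radius 2 centred at the origin exactly
# when its distance to (0, 0) equals 2:
point = (math.sqrt(2), math.sqrt(2))
print(math.isclose(distance((0, 0), point), 2.0))   # True
```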
The invention of Cartesian coordinates revolutionized mathematics by providing the first sys-
tematic link between Euclidean geometry and algebra. Cartesian coordinates are the foundation
of analytic geometry, and provide enlightening geometric interpretations for many other branches
of mathematics, such as linear algebra, complex analysis, differential geometry, calculus, and
more. A familiar example is the concept of the graph of a function. Cartesian coordinates are also
essential tools for most applied disciplines that deal with geometry, including astronomy, physics,
engineering and many more. They are the most common coordinate system used in computer
graphics, computer-aided geometric design and other geometry-related data processing.

History note 3.4: René Descartes (31 March 1596 – 11 February 1650)
René Descartes (Latinized: Renatus Cartesius) was a French philoso-
pher, mathematician, and scientist who spent a large portion of his
working life in the Dutch Republic, initially serving the Dutch States
Army of Maurice of Nassau, Prince of Orange and the Stadtholder of
the United Provinces. One of the most notable intellectual figures of
the Dutch Golden Age, Descartes is also widely regarded as one of the
founders of modern philosophy. His mother died when he was very
young, so he and his brothers were sent to live with his grandmother. His father believed
that a good education was important, so Descartes was sent off to boarding school at a
young age.
In 1637, Descartes published his Discours de la méthode in which he explained his rationalist approach to the interpretation of nature. La méthode contained three appendices: La dioptrique, Les météores, and La géométrie. The last of these, The Geometry, was Descartes' only published mathematical work. Approximately 100 pages in length, The Geometry was not a large work, but it presented a new approach in mathematical thinking. Descartes boasted in his introduction that "Any problem in geometry can easily be reduced to such terms that a knowledge of the length of certain straight lines is sufficient for construction." But Descartes' La géométrie was difficult to understand and follow. It was
written in French, not the language of scholarly communication at the time, and Descartes’
writing style was often obscure in its meaning. In 1649, Frans van Schooten (1615–1660),
a Dutch mathematician, published a Latin translation of Descartes’ Geometry, adding his
own clarifying explanations and commentaries.

3.20.2 Lines and Thales theorem revisited


If we look at the x-axis, which is obviously a line, we see that all the points on it have zero y-coordinate. We say that y = 0 is the equation of the x-axis. We now set ourselves the task of deriving the equation of a line passing through two points. There are different ways to do this, but we start humbly with a line going through the origin O and a point P(x_P, y_P), as shown in Fig. 3.65a.

Figure 3.65: Equation of lines: (a) a line through the origin O and a point P(x_P, y_P), making an angle α with the x-axis; (b) a line through two arbitrary points A(x_A, y_A) and B(x_B, y_B), with auxiliary coordinates x′y′ centered at A.

From the two similar triangles OQH and OPK (AA test), we get x/x_P = y/y_P. From that, we get
$$\text{line through the origin and } P(x_P, y_P):\quad y = \frac{y_P}{x_P}x, \quad\text{or}\quad y = ax,\quad a := \frac{y_P}{x_P} \qquad (3.20.2)$$
We call a the slope of the line; it is the tangent of the angle (α) between the x-axis and the line, measured counter-clockwise. Obviously, the slope can be negative. Thus, y = ax describes the line going through the origin with a slope of a.
Having obtained this, we now move to the general case of a line passing through two arbitrary points A(x_A, y_A) and B(x_B, y_B), as shown in Fig. 3.65b. There is more than one way to derive the equation, but I present one in which we can reuse Eq. (3.20.2), and in the process we learn the idea of using whatever coordinate system makes our life easier.
To reuse Eq. (3.20.2), we introduce a new set of coordinates x′y′ with the origin at A. In this coordinate system, the equation of the line AB is y′ = (y′_B/x′_B)x′ (because it is a line through the origin). We return to the original coordinate system using the relation between the two:
$$x = x' + x_A, \qquad y = y' + y_A$$


Thus,
$$y' = \frac{y'_B}{x'_B}x' \implies y - y_A = \frac{y_B - y_A}{x_B - x_A}(x - x_A) \qquad (3.20.3)$$
The term (y_B − y_A)/(x_B − x_A) is the slope of the line (Fig. 3.65b). And this line is of the form y = ax + b. Having this equation available, it is easy to check whether two lines are parallel or perpendicular. If two lines y = ax + b and y = a′x + b′ have the same slope (i.e., a = a′), then they are parallel (assuming that b ≠ b′).
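A short Python helper (my own sketch; the function name is not from the text) implementing Eq. (3.20.3): given two points it returns the slope a and intercept b of y = ax + b, so that parallelism can be tested by comparing slopes.

```python
def line_through(A, B):
    """Return (slope, intercept) of the line y = a x + b through A and B
    (the two points are assumed not to lie on a vertical line)."""
    a = (B[1] - A[1]) / (B[0] - A[0])   # slope, as in Eq. (3.20.3)
    b = A[1] - a * A[0]                 # intercept from y - yA = a (x - xA)
    return a, b

a1, b1 = line_through((0, 1), (2, 5))   # y = 2x + 1
a2, b2 = line_through((1, 0), (3, 4))   # y = 2x - 2
print((a1, b1), (a2, b2), a1 == a2)     # same slope: the lines are parallel
```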

Revisiting the Thales theorem. We have proved the Thales theorem in Fig. 3.17. Now, we prove it using coordinates. To this end, we consider a unit circle centered at the origin of the Cartesian coordinate system. Furthermore, we consider the diameter AB with A(−1, 0) and B(1, 0). Now, let P(x, y) be any point on the circle. Our task is to prove that AP is perpendicular to PB. First, as P is on the circle, its coordinates (x, y) satisfy the equation x² + y² = 1. We now use the fact that two lines are perpendicular to each other if the product of their slopes is minus one. We thus compute the slopes using Eq. (3.20.3):
$$\text{slope of } AP:\ s_1 = \frac{y - 0}{x + 1}, \qquad \text{slope of } BP:\ s_2 = \frac{y - 0}{x - 1}$$
(For P in the upper half of the circle, s1 > 0 and s2 < 0.) The product of the slopes is
$$s_1 s_2 = \frac{y}{x+1}\cdot\frac{y}{x-1} = \frac{y^2}{x^2 - 1} = -1 \quad (\text{since } x^2 + y^2 = 1)$$

3.20.3 Constructible numbers


It is time now to see how Gauss proved the constructibility of the 17-gon in 1796, at the age of 19. His work is based on analytic geometry, complex numbers, theory of equations and number theory. But first, we need a definition of what a constructible n-gon is. Our starting points are: point O with coordinates (0, 0) and point A1 with coordinates (1, 0). From them we can build the unit circle and the two coordinate axes Ox and Oy. It is now easy to see that to construct an n-gon we need to create the second vertex A2. To this end, we just need to construct a point M at a distance OM = cos(2π/n) from O. From M, draw a line perpendicular to Ox; this line intersects the unit circle at A2, the second vertex of our polygon. A circle centered at A2 with radius A1A2 will intersect the unit circle at A3, the third vertex, and so on.

Definition 3.20.1: Constructible n-gon

A regular n-gon is said to be constructible if the number cos(2π/n) is constructible.


But when is the point M at a distance cos(2π/n) from O constructible? Let's look at the expression of cos(2π/n) for n = 3, 4, 5, 7 and see how it relates to the constructibility of the corresponding n-gon:
$$\begin{aligned}
n = 3:&\quad \cos\tfrac{2\pi}{3} = -\tfrac{1}{2} &&(\text{constructible})\\
n = 4:&\quad \cos\tfrac{2\pi}{4} = 0 &&(\text{constructible})\\
n = 5:&\quad \cos\tfrac{2\pi}{5} = \tfrac{\sqrt{5}-1}{4} &&(\text{constructible})\\
n = 7:&\quad \cos\tfrac{2\pi}{7} = \sqrt[3]{\tfrac{7}{2}\bigl(1 + 3\sqrt{-3}\bigr)} + \cdots &&(\text{non-constructible})
\end{aligned} \qquad (3.20.4)$$
I wrote that the 7-gon is not constructible based only on the fact that no one was able to provide such a construction until 1796. Now, this summary of the situation suggests that if cos(2π/n) can be written as an expression involving the four fundamental arithmetic operations (addition/subtraction, multiplication/division) and square roots of positive integers, then it seems that the n-gon is constructible.
n gon is constructible.
Our task now is to see what numbers we can obtain by straightedge/compass construction from the two starting points (0, 0) and (1, 0): points O and A1 in Fig. 3.66a. First, it is fairly easy to construct the points (2, 0), (3, 0), etc. by drawing some circles. For example, a circle centered at A1 with radius 1 gives us the point M(2, 0). Thus, all integers are constructible. Second, how about √5? It is constructible thanks to the altitude theorem (Fig. 3.66b). Third, if we apply this square-root construction to b = √a, we get √b, which is the fourth root of a.

Figure 3.66: Constructing the integers and the square root of a > 0: (a) the circle centered at A1 with radius 1 gives M(2, 0); (b) given points A, H, B on a line with |AH| = a and |HB| = 1, construct a circle centered at the midpoint of AB with AB as its diameter. Then draw the line through H perpendicular to AB; this line intersects the circle at C, and CH is exactly √a (thanks to the altitude theorem).

What about √3 + √9? It is constructible, because we can use straightedge/compass to add two numbers a and b (if they have already been constructed). Similarly, we can construct a − b. The next question is: how about √2·√3? Can we construct this number? And √5/2? Yes, we can (Fig. 3.67).

Figure 3.67: Constructing the numbers ab and a/b based on the Thales theorem on similar triangles. Segments of lengths a, b and 1 are given. To this end, we just need to construct a line going through a given point (here B) and parallel to a given line (AC). How to do that? If we can construct a perpendicular to a given line, we can construct parallel lines.

What Gauss did in 1796 was to compute explicitly cos(2π/17). He showed that
$$\cos\frac{2\pi}{17} = \frac{1}{16}\left(\sqrt{17} - 1 + \sqrt{34 - 2\sqrt{17}}\right) + \frac{1}{8}\sqrt{17 + 3\sqrt{17} - \sqrt{34 - 2\sqrt{17}} - 2\sqrt{34 + 2\sqrt{17}}} \qquad (3.20.5)$$
which involves only addition/subtraction, multiplication/division and square roots of integers. Thus, the 17-gon is constructible.
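Gauss's expression is easy to check numerically. The following Python snippet (my own check of Eq. (3.20.5), not part of the original text) compares the nested-radical expression with cos(2π/17):

```python
import math

s17 = math.sqrt(17)
value = (s17 - 1 + math.sqrt(34 - 2 * s17)) / 16 \
        + math.sqrt(17 + 3 * s17
                    - math.sqrt(34 - 2 * s17)
                    - 2 * math.sqrt(34 + 2 * s17)) / 8
print(value)                        # 0.93247222...
print(math.cos(2 * math.pi / 17))   # the same number
```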
Thus we have found a lot of constructible numbers. Every integer, every rational number, and
irrational numbers such as cos 2=17 are constructible. But are all real numbers constructible?
The answer is no, as you might have guessed correctly, thanks to the following theorem due to
Descartes.

Theorem 3.20.1: Descartes’s constructible number theorem


A real number is constructible if and only if it can be obtained from the integers using addition,
subtraction, multiplication, division, and the extraction of square roots.

How to prove this theorem? It is based on the equation of line y D ax C b and of circle
.x a/2 C .y b/2 D c 2 . An intersection of a line and a circle is then the solution of a
quadratic equation, which involves only addition, subtraction, multiplication, division, and the
extraction of square roots.

Quadratic irrationals. To be able to do algebra with constructible numbers, we need to find a form for them, similar to how an even number has the form 2n for n ∈ ℤ. Let a, b, c, d, e, f be some rationals,

then the following constructible numbers are called quadratic irrationals


$$\begin{aligned}
&a + b\sqrt{c} &&(\text{has an order of one})\\
&d + \sqrt{a + b\sqrt{c}} &&(\text{has an order of two})\\
&d + e\sqrt{f + \sqrt{a + b\sqrt{c}}} &&(\text{has an order of three})
\end{aligned} \qquad (3.20.6)$$

where the order of a quadratic irrational is the maximum depth of the nested square roots. A rational number is a quadratic irrational of order zero. Now comes the key idea: any quadratic irrational can be written as X + Y√Z, where X, Y and Z are quadratic irrationals and √Z has a higher order than either X or Y. For example, we can write the fourth root of a + b as
$$\sqrt[4]{a+b} = 0 + 1\cdot\sqrt{0 + \sqrt{a+b}}: \qquad X = 0,\; Y = 1,\; Z = 0 + \sqrt{a+b}$$
noting that Z = √(a + b) is a quadratic irrational: Z = 0 + 1·√(a + b).
With this, we’re now able to see how Wantzel attacked the doubling the cube and trisecting
the angle problems.

3.20.4 Wantzel’s solution on two classical problems of antiquity


Herein I present Wantzel's proof that it is impossible to double the cube and to trisect the angle using straightedge and compass. Let's first deal with doubling the cube. It amounts to asking whether the cubic equation x³ − 2 = 0 has a constructible solution. Wantzel showed that this equation cannot have such a root; thus it is impossible to double the cube with straightedge and compass.
First, we notice that x³ − 2 = 0 does not have rational rootsŽ. Second, assume that this equation has a constructible root; this root must then be of the form A + B√C, where A, B and C are quadratic irrationals and √C has a higher order than either A or B. We further assume that B ≠ 0; we shall see why shortly. Introducing A + B√C into x³ − 2 = 0, we get
$$A^3 + 3A^2B\sqrt{C} + 3AB^2C + B^3C\sqrt{C} - 2 = 0$$
What do we do with this equation? On the left-hand side we have a quadratic irrational, which is not in the right form yet, so we rewrite it:
$$(A^3 + 3AB^2C - 2) + (B^3C + 3A^2B)\sqrt{C} = 0$$
This leads to (both brackets must vanish, for otherwise we could solve for √C in terms of quantities of lower order, a contradiction)
$$A^3 + 3AB^2C - 2 = 0, \qquad B^3C + 3A^2B = 0$$
If B = 0, the above becomes A³ − 2 = 0, which is just the cubic equation that we started with. By looking at the exponents of B in the above two equations, we can see that −B also satisfies themŽŽ. Thus, if A + B√C is a root, so is A − B√C. Using Vieta's theorem about the sum of the roots (Section 2.29.7), we then have
$$(A + B\sqrt{C}) + (A - B\sqrt{C}) + x_3 = 0 \implies x_3 = -2A$$

Ž Remember the rational root theorem of Descartes discussed in Section 2.29.3? It is the key here.

But this cubic equation does not have rational roots, so A must be a quadratic irrational. Thus, we can write x3 = D + E√F. Applying the same logic to this root D + E√F, we also find that x4 = −2D is a root of the cubic equation. Now, what we have achieved is to show that the cubic equation has four roots:
$$x_1 = A + B\sqrt{C}, \quad x_2 = A - B\sqrt{C}, \quad x_3 = -2A, \quad x_4 = -2D$$
These four roots are distinct, for they are of different orders. And a cubic cannot have four different roots. A contradiction! Thus, the cubic equation x³ − 2 = 0 cannot have a constructible root. In other words, it is impossible to double the cube.
Now, we turn our attention to the problem of trisecting the angle. When we say that it is possible to trisect an angle, we mean that we can trisect any angle. Therefore, to show that it is impossible to do so, we just need one counterexample. And Wantzel chose an angle of 60°. We need to find the equation for cos 20°:
$$\frac{1}{2} = 4\cos^3 20° - 3\cos 20°$$
With x = 2 cos 20°, the above becomes x³ − 3x − 1 = 0. Now, instead of x³ − 2 = 0, we have another cubic equation (which again has no rational roots). Using the same argument, we can conclude that it is impossible to trisect the angle.
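A quick numerical illustration (mine, not Wantzel's) that 2 cos 20° is indeed a root of x³ − 3x − 1 = 0, and that the rational root candidates ±1 fail:

```python
import math

x = 2 * math.cos(math.radians(20))
print(x**3 - 3*x - 1)                       # ~0 up to rounding: 2*cos(20 deg) is a root
print([c**3 - 3*c - 1 for c in (1, -1)])    # [-3, 1]: no rational roots among the candidates
```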

3.21 Solving polynomial equations algebraically


It has been shown that the vertices of a regular n-gon are the nth roots of unity. Therefore, the geometric problem of constructing an n-gon is related to solving the polynomial equation zⁿ − 1 = 0 for z ∈ ℂ. An understanding of geometry hence requires an understanding of equations. This section is a brief introduction to the topic of finding the algebraic solutions of polynomial equations.
To make the terms clear, here is one example. The algebraic solutions of the quadratic equation ax² + bx + c = 0 are
$$x_{1,2} = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \qquad (3.21.1)$$
And our aim is to study solutions to the cubics, quartics and so on. All the solutions must be written in a form similar to Eq. (3.21.1). That is, the solutions are written using only the four fundamental arithmetic operations +, −, ×, ÷ and extraction of roots.
ŽŽ You can plug −B into the first equation; you get the same equation because of B². If you plug −B into the second one, you get −(B³C + 3A²B), which is also zero.


3.21.1 Solving quadratic equations


By plotting the function y(x) = ax² + bx + c on a Cartesian plane, as done in Fig. 3.68, we can see that the curve either intersects the x-axis twice, touches it at a single point, or does not intersect it at all. Thus, the quadratic equation ax² + bx + c = 0 has (1) two real solutions (possibly coinciding) or (2) two complex solutions and no real solutions; do not forget that x² = −1 has the two roots x = ±i. So, any quadratic equation has at most two solutions.

Figure 3.68: Graph of y(x) = ax² + bx + c: (a) in the original xy coordinates, with roots x1, x2 and the lowest point P on the symmetry axis x = −b/2a; (b) in shifted coordinates with origin O′, where the parabola takes the simple form u² + d.

The next step is to examine why the roots are of the form in Eq. (3.21.1). First, there is a special point on the parabola y(x) = ax² + bx + c: the point P, the lowest point on the curve (assuming a > 0). We can find this point by algebra. Using the complete-the-square technique, we can massage the expression ax² + bx + c as
$$x^2 + \frac{b}{a}x + \frac{c}{a} = \left(x + \frac{b}{2a}\right)^2 + \frac{c}{a} - \frac{b^2}{4a^2} \ \ge\ \frac{c}{a} - \frac{b^2}{4a^2} \quad \forall x$$
Thus, the x-coordinate of P is −b/2a. The parabola is symmetric with respect to the vertical line drawn through P. Because of this symmetry, the two roots must have the form −b/2a plus/minus something. We also understand the geometry behind the change of variable x = u − b/2a: we use a different coordinate system with the origin at O′ (Fig. 3.68b). In this new coordinate system, the parabola has the simple form u² + d (because its lowest point has an x-coordinate of 0).

3.21.2 Solving cubic equations


We reconsider the depressed cubic equation given in Eq. (2.13.6),
$$x^3 + px = q \qquad (3.21.2)$$
However, our task now is to derive all three roots of this equation via formulas like Eq. (3.21.1). The approach is of course based on Cardano's solution technique, in combination with the fact that an nth-order polynomial equation has n complex roots.


We start with a change of variable
$$x = u + v, \qquad v = -\frac{p}{3u}$$
to get what Lagrange called the resolvent equation of the original cubic equation:
$$u^3 - \frac{p^3}{27u^3} = q \iff u^6 - qu^3 - \frac{p^3}{27} = 0 \qquad (3.21.3)$$
which is a sixth-degree equation, but disguised as a quadratic equation with t = u³: t² − qt − p³/27 = 0. Solving this quadratic equation we obtain
$$t_1 = \frac{q}{2} + \sqrt{\frac{q^2}{4} + \frac{p^3}{27}}, \qquad t_2 = \frac{q}{2} - \sqrt{\frac{q^2}{4} + \frac{p^3}{27}} \qquad (3.21.4)$$
Do not forget that Viète told us: t1 t2 = −p³/27.
Having now t (two of them), we use t = u³ to get u. Note that we have to be careful not to miss any roots: we must find six values of u. From t1, we get the first solution u1 = ∛t1. How to get the complex roots? Referring to the accompanying figure (the three cube roots of t1 in the complex plane), we see that u2 = ∛t1·ω, where ω = cos 2π/3 + i sin 2π/3 (a cube root of unity). We get another one, u3 = ∛t1·ω². Doing the same thing for t2, finally all six roots of the resolvent equation are
$$u_1 = \sqrt[3]{t_1}, \quad u_2 = u_1\omega, \quad u_3 = u_1\omega^2, \qquad u_4 = \sqrt[3]{t_2}, \quad u_5 = u_4\omega, \quad u_6 = u_4\omega^2 \qquad (3.21.5)$$
We have u1 u4 = ∛(t1 t2) = −p/3. Now, we determine x1 and x4, the roots of the original depressed cubic equation corresponding to u1 and u4:
$$x_1 = u_1 - \frac{p}{3u_1}, \qquad x_4 = u_4 - \frac{p}{3u_4}$$
Using u1 u4 = −p/3, we then have x1 = x4 = u1 + u4. Thus, these two roots are identical! And this root is nothing but the sum of u1 and u4, the roots of the resolvent equation. We can get Cardano's formula for this root:
$$x_1 = \sqrt[3]{\frac{q}{2} + \sqrt{\frac{q^2}{4} + \frac{p^3}{27}}} + \sqrt[3]{\frac{q}{2} - \sqrt{\frac{q^2}{4} + \frac{p^3}{27}}}$$
Similarly, we have
$$x_2 = u_2 - \frac{p}{3u_2}, \quad u_2 u_6 = u_1 u_4 = -\frac{p}{3} \implies x_2 = u_2 + u_6 = x_6$$
$$x_3 = u_3 - \frac{p}{3u_3}, \quad u_3 u_5 = u_1 u_4 = -\frac{p}{3} \implies x_3 = u_3 + u_5 = x_5 \qquad (3.21.6)$$
What have we just achieved? We have expressed all of the roots of the cubic equation in terms of the roots of the resolvent:
$$x_1 = u_1 + u_4, \qquad x_2 = u_2 + u_6, \qquad x_3 = u_3 + u_5 \qquad (3.21.7)$$

Example 3.2
Solving the cubic equation x³ − 6x = 4. This one was solved by Euler in the 18th century. We are going to solve this equation in two ways. In the first way, we use Descartes' rational root test to find the easy roots. In the second way, we use the formula in Eq. (3.21.7).

• First way: assuming that we have roots of the form x = p/q, Descartes tells us that p | (−4) and q | 1. Thus, we have p ∈ {±1, ±2, ±4} and q ∈ {±1}, and therefore x ∈ {±1, ±2, ±4}. Plugging those values into the equation we see that x = −2 is a root. Now, it is easy to solve the cubic equation using the factor theorem:
$$x^3 - 6x - 4 = 0 \iff (x + 2)(x^2 - 2x - 2) = 0$$
Now we just need to solve the quadratic equation x² − 2x − 2 = 0. Altogether, we have three distinct real roots:
$$x_1 = -2, \qquad x_2 = 1 + \sqrt{3}, \qquad x_3 = 1 - \sqrt{3}$$

• Second way: we first compute t1, t2 with p = −6 and q = 4: t1 = 2 + 2i and t2 = 2 − 2i (noting that q²/4 + p³/27 = −4 is negative). Writing t1, t2 in polar form, we can then compute u1 and u4:
$$u_1 = \sqrt{2}\,e^{i\pi/12}, \qquad u_4 = \sqrt{2}\,e^{-i\pi/12}$$
Thus, we get the first solution x1 = u1 + u4 = 2√2 cos(π/12). We can compute cos(π/12) from cos(π/6) = √3/2:
$$x_1 = 2\sqrt{2}\sqrt{\frac{1}{2} + \frac{\sqrt{3}}{4}} = 1 + \sqrt{3}$$
In the same manner, it is possible to find the other roots. For example, x2 = u2 + u6 = u1ω + u4ω² with ω = cos 2π/3 + i sin 2π/3.
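The second way can be cross-checked numerically. Here is a small Python sketch (my own, using the standard cmath module; variable names are mine) that builds the resolvent roots and then Eq. (3.21.7):

```python
import cmath, math

p, q = -6, 4
disc = q**2 / 4 + p**3 / 27          # -4.0: negative, so three real roots
t1 = q / 2 + cmath.sqrt(disc)        # 2 + 2j, the resolvent root t1
u1 = t1 ** (1 / 3)                   # principal cube root: sqrt(2) e^{i pi/12}
u4 = -p / (3 * u1)                   # matching resolvent root, since u1*u4 = -p/3
omega = cmath.exp(2j * math.pi / 3)  # primitive cube root of unity
roots = [u1 + u4, u1 * omega + u4 * omega**2, u1 * omega**2 + u4 * omega]
print([round(r.real, 10) for r in roots])    # [2.7320508076, -2.0, -0.7320508076]
print([abs(r.imag) < 1e-12 for r in roots])  # imaginary parts vanish (up to rounding)
```

The printed values are 1 + √3, −2 and 1 − √3, in agreement with the first way.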

3.21.3 Studying roots of cubic using its graph


We have seen from Example 3.2 that when 4p³ + 27q² < 0 there are three real roots. This section presents an analysis using the tangent line concept (from calculus) to understand the reason why. We consider the depressed cubic equation x³ + px = q. Now, we rewrite this equation as x³ = −px + q, and we study the relation of two curves: one is the cubic y = x³ and one is the line y = −px + q. Where these two curves intersect lie the roots of the cubic equation.
First, on a Cartesian plane we plot the simplest cubic C: y = x³ (see the accompanying figure). Now we focus on the tangents to C, because they're special. To this end, consider an arbitrary point P on C with coordinates (x0, x0³); x0 can be anything from −∞ to ∞. The tangent line to C at P isŽ:
$$T:\quad y = x_0^3 + 3x_0^2(x - x_0) = (3x_0^2)x - 2x_0^3$$
Thus the slope of the tangent T is 3x0² and its intercept is −2x0³. Given a slope k > 0, there are two values of x0 such that 3x0² = k: there are two tangents that are parallel to each other. Why are these two tangents special in our study? Because if a line has the same slope as them and lies inside the strip bounded by them, this line intersects C at three distinct points. Now, the line y = −px + q will have the same slope as the tangents if −p = 3x0², and to have three distinct roots we need |q| < 2x0³ (taking x0 > 0). Thus, we have
$$-p = 3x_0^2, \quad |q| < 2x_0^3 \implies q^2 < 4x_0^6 = 4\left(\frac{-p}{3}\right)^3 \iff 4p^3 + 27q^2 < 0$$

As the number of roots depends on the sign of 4p³ + 27q², this quantity is called the discriminant of the depressed cubic equation: Δ = 4p³ + 27q². We then have:

• Δ < 0: three distinct real roots;

• Δ = 0: a repeated root (all roots are real, and at least two of them coincide);

• Δ > 0: one real root and two complex conjugate roots.
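This classification translates directly into a few lines of Python (my own function name, not the book's):

```python
def cubic_root_type(p, q):
    """Classify the roots of the depressed cubic x^3 + p x = q."""
    delta = 4 * p**3 + 27 * q**2
    if delta < 0:
        return "three distinct real roots"
    if delta == 0:
        return "a repeated real root"
    return "one real root and two complex conjugate roots"

print(cubic_root_type(-6, 4))   # Example 3.2: three distinct real roots
print(cubic_root_type(-3, 2))   # delta = 0: x^3 - 3x - 2 = (x + 1)^2 (x - 2)
```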

3.22 Non-Euclidean geometries

3.23 Computer algebra systems


A computer algebra system (CAS) or symbolic algebra system (SAS) is any mathematical soft-
ware with the ability to manipulate mathematical expressions in a way similar to the traditional
manual computations of mathematicians and scientists. The main general-purpose computer al-
gebra systems are Maple, Mathematica (proprietary software) and Axiom, Maxima, Magma,
SageMath (free).

Ž
See Section 4.4.6 for detail.


The primary goal of a Computer Algebra system is to automate te-


dious and sometimes difficult algebraic manipulation tasks. Computer
algebra systems have not only changed how mathematics is taught at
many schools and universities, but have provided a flexible tool for
mathematicians worldwide. Computer algebra systems can be used to
simplify rational functions, factor polynomials, find the solutions to a
system of equations, and various other manipulations. In calculus, they
can be used to find the limit, derivative and integrals of functions, all
done symbolically. Computer algebra systems began to appear in the
1960s and evolved out of two quite different sources—the requirements of theoretical physicists
and research into artificial intelligence.
Below, some basic use of a CAS with the Python SymPy package is illustrated. Appendix A.10 presents a short introduction to this package.
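The original illustration is a figure that is not reproduced here; the following minimal SymPy session (my own, in the same spirit) shows the kind of manipulations a CAS automates: factoring, exact solving, limits, derivatives and integrals.

```python
import sympy as sp

x = sp.symbols('x')
print(sp.factor(x**3 - 6*x - 4))          # (x + 2)*(x**2 - 2*x - 2), cf. Example 3.2
print(sp.solve(x**2 + x - 1, x))          # exact roots involving sqrt(5)
print(sp.limit(sp.sin(x) / x, x, 0))      # 1
print(sp.diff(sp.sin(x) * sp.exp(x), x))  # exp(x)*sin(x) + exp(x)*cos(x)
print(sp.integrate(1 / (1 + x**2), x))    # atan(x)
```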

3.24 Review
This chapter has presented geometry and trigonometry as usually taught in high schools but with
less focusing on rote memorization of many trigonometric identities. Briefly, trigonometry was
developed as a tool to solve astronomical problems. It was then modified and further developed
to solve plane triangle problems–those arising in navigation, and surveying. And eventually it
became a branch of mathematics i.e., it is studied for its own sake.
Now that we know a bit of algebra and a bit of geometry and trigonometry, it is time to meet
calculus. About calculus, the Hungarian-American mathematician, physicist, John von Neumann
said

The calculus was the first achievement of modern mathematics and it is difficult to
overestimate its importance. I think it defines more unequivocally than anything
else the inception of modern mathematics; and the system of mathematical analysis,
which is its logical development, still constitutes the greatest technical advance in
exact thinking.



Chapter 4
Calculus

Contents
4.1 Conic sections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343
4.2 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
4.3 Integral calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360
4.4 Differential calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368
4.5 Applications of derivative . . . . . . . . . . . . . . . . . . . . . . . . . . . 395
4.6 The fundamental theorem of calculus . . . . . . . . . . . . . . . . . . . . 406
4.7 Integration techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . 411
4.8 Improper integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 432
4.9 Applications of integration . . . . . . . . . . . . . . . . . . . . . . . . . . 433
4.10 Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 444
4.11 Some theorems on differentiable functions . . . . . . . . . . . . . . . . . 460
4.12 Parametric curves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463
4.13 Polar coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 468
4.14 Bézier curves: fascinating parametric curves . . . . . . . . . . . . . . . . 475
4.15 Infinite series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 479
4.16 Applications of Taylor’ series . . . . . . . . . . . . . . . . . . . . . . . . 496
4.17 Bernoulli numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 499
4.18 Euler-Maclaurin summation formula . . . . . . . . . . . . . . . . . . . . 502
4.19 Fourier series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 505
4.20 Special functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 511
4.21 Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 513


Ancient geometers (i.e., mathematicians working on geometrical problems) were obsessed with two problems: (1) finding the area of planar shapes (e.g. the area of a circle) and (2) finding the tangent to a curve, i.e., the line that touches the curve at only one point. Although some results were obtained, mostly by Archimedes with the method of exhaustion, a universal method that could be applied to any curve was not available until the early seventeenth century.
The mathematicians of the seventeenth century were equipped with more powerful math-
ematics; they had the symbolic algebra of Viète and the analytic geometry of Descartes and
Fermat. Furthermore, the work of Kepler on the motion of heavenly objects and Galileo on the
motion of earthly objects has put the study of motion into the scene. The seventeenth mathe-
maticians no longer saw static objects (such as curves) as motionless. They saw curves as the
trajectory of the motion of a particle. With these new tools and dynamics view, they again solved
the two above geometrical problems, old results were confirmed and new results were obtained.
The pinnacle of the mathematical developments of this century was the introduction, by Newton and Leibniz, of two concepts: the derivative and the integral. The former provides the final answer to the tangent problem and the latter is the solution to the area problem. What is more, a connection between the derivative and the integral was discovered, what we now call the fundamental theorem of calculus.
With this new mathematics, called the calculus, problems that once required the genius of Archimedes can be solved by any high school student. A powerful thing was developed. And as in many other cases in mathematics, the calculus turns out to be a very effective tool for solving many other problems: those involving change. That's why Richard Feynman, the 1965 Nobel-winning theoretical physicist, once said "Calculus is the language God talks". Feynman was probably referring to the fact that physical laws are written in the language of calculus. Steven
Strogatz in his interesting book Infinite Powers [69] wrote ‘Without calculus, we wouldn’t
have cell phones, computers, or microwave ovens. We wouldn’t have radio. Or television. Or
ultrasound for expectant mothers, or GPS for lost travelers. We wouldn’t have split the atom,
unraveled the human genome, or put astronauts on the moon.’
Thus it is not surprising that calculus occupies an important part in the mathematics curricu-
lum in both high schools and universities. Sadly, as taught in schools, calculus is packed with many theorems, formulas and tricks. The truth is, the essence of calculus is quite simple: calculus is often seen as the mathematics of change. The ball rolls down an inclined plane: change of position in time, or motion. A curve is something that changes direction. Those were the two types of change that motivated the development of the calculus.
But calculus does not work with all kinds of change. It only works with change of continuous
quantities. Mathematicians (and physicists as well) assume that space and time are continuous.
For example, given a length we can cut it in two halves, cut one half into two halves, and so on
to infinity. What we get from this infinite division? A very very small quantity which is not zero
but smaller than any positive real numbers. We call such quantity an infinitesimal.
How does calculus work? It works based on one single principle: the principle of infinity–a
term coined by Strogatz. Take as example the problem of finding a tangent to a curve. This
curve is divided into infinitely many line segments, each has an infinitesimal length. With that, a
tangent to a curve at any point is simply the slope of the line segment connecting that point to


the next point (infinitesimally nearby). The slope of a line? We know it.
This chapter is devoted to calculus of functions of single variable. I use primarily the follow-
ing books for the material presented herein:

 Infinite Powers by Steven Strogatz§ [69]. I recommend anyone to read this book before
taking any calculus class;

 Calculus by Gilbert Strang‘ [67];

 What is mathematics?: an elementary approach to ideas and methods by Richard Courant,


Herbert RobbinsŽŽ , Ian Stewart [15];

 The historical development of the calculus by Charles Edwards [20]

 Calculus: An Intuitive and Physical Approach by Moris Kline [36]

Our plan in this chapter is as follows. First, in Section 4.1, we briefly discuss the analytic
geometry with the introduction of the Cartesian coordinate system, the association of any curve
with an equation. Second, the concept of function is introduced (Section 4.2). Then, integral
calculus of which the most important concept is an integral is treated in Section 4.3. That is
followed by a presentation of the differential calculus of which the most vital concept is a
derivative (Section 4.4). We then present some applications of the derivative in Section 4.5. The
connection between integral and derivative is treated in Section 4.6, followed by methods to
compute integrals in Section 4.7.
Section 4.9 gives some applications of integration. A proper definition of the limit of a
function is then stated in Section 4.10. Some theorems in calculus are presented in Section 4.11.
Polar coordinates are discussed in Section 4.13. Bézier curves–a topic not provided in high school
and even college program–is shown in Section 4.14. Infinite series and in particular Taylor series
are the topics of Section 4.15. Applications of Taylor series are given in Section 4.16. Fourier
series are given in Section 4.19, and Section 4.20 closes the chapter with a brief look at some special functions.
§
Steven Henry Strogatz (born 1959) is an American mathematician and the Jacob Gould Schurman Professor of
Applied Mathematics at Cornell University. He is known for his work on nonlinear systems, including contributions
to the study of synchronization in dynamical systems, for his research in a variety of areas of applied mathematics,
including mathematical biology and complex network theory. Strogatz is probably famous for his writings for the
general public, one can cite Sync, The joy of x, Infinite Powers.

William Gilbert Strang (born 1934) is an American mathematician, with contributions to finite element theory,
the calculus of variations, wavelet analysis and linear algebra. He has made many contributions to mathematics
education, including publishing seven mathematics textbooks and one monograph.
ŽŽ
Herbert Ellis Robbins (1915 – 2001) was an American mathematician and statistician. He did research in
topology, measure theory, statistics, and a variety of other fields. The Robbins lemma, used in empirical Bayes
methods, is named after him. Robbins algebras are named after him because of a conjecture that he posed concerning
Boolean algebras.

Morris Kline (1908 – 1992) was a professor of Mathematics, a writer on the history, philosophy, and teaching
of mathematics, and also a popularizer of mathematical subjects.


4.1 Conic sections


Conic sections received their name because they can be represented by a cross section of a plane
cutting through a cone (Fig. 4.1). A conic section is a curve on a plane that is defined by a
2nd-degree polynomial equation in two variables. Conic sections are classified into four groups:
parabolas, circles, ellipses, and hyperbolas. The conic sections were named and studied as long
ago as 200 B.C.E., when Apollonius of Perga undertook a systematic study of their properties.

Figure 4.1: Conic sections: parabolas, circles, ellipses, and hyperbolas.

Two well-known conics are the circle and the ellipse. They arise when the intersection of the
cone and plane is a closed curve (Fig. 4.2a). The circle is a special case of the ellipse in which
the plane is perpendicular to the axis of the cone. If the plane is parallel to a generator line of
the cone, the conic is called a parabola. Finally, if the intersection is an open curve and the plane
is not parallel to generator lines of the cone, the figure is a hyperbola.

Figure 4.2: (a) a conic section obtained when the intersection of the plane and the cone is a closed curve; (b) conic-section orbits of two bodies interacting gravitationally.

Conic sections are observed in the paths taken by celestial bodies (e.g. planets). When two
massive objects interact according to Newton’s law of universal gravitation, their orbits are conic
sections if their common center of mass is considered to be at rest. If they are bound together,
they will both trace out ellipses; if they are moving apart, they will both follow parabolas or
hyperbolas (Fig. 4.2b).
Straight lines use 1, x, y. The next curves use x², xy, y², which are conics. It is important to see both the curves and their equations. This section presents the analytic geometry of René Descartes and Pierre de Fermat, in which the geometry of the curve is connected to the analysis of the associated equation. Numbers are assigned to points; we speak about the point (1, 2). Euclid and Archimedes might not have understood, as Strang put it.

4.1.1 Circles
Definition 4.1.1
A circle is a set of points whose distance to a special point–the center–is constant.

From this definition, we can derive the equation of a circle. Let's denote the center by (x_c, y_c) and the radius by r, and consider a point (x, y) on the circle. The fact that the distance from this point to the center is r is written as (using Eq. (3.20.1))
$$\sqrt{(x - x_c)^2 + (y - y_c)^2} = r \implies (x - x_c)^2 + (y - y_c)^2 = r^2$$
Upon expansion, we get the following form:
$$x^2 + y^2 - 2x_c x - 2y_c y + x_c^2 + y_c^2 - r^2 = 0 \qquad (4.1.1)$$
When x_c = y_c = 0, i.e., the center of the circle is at the origin, the equation of the circle is much simplified:
$$x^2 + y^2 = r^2 \qquad (4.1.2)$$

4.1.2 Ellipses
An ellipse is a plane curve surrounding two focal points, such that for all points on the curve,
the sum of the two distances to the focal points is a constant. It generalizes a circle, which is the
special type of ellipse in which the two focal points are the same.

Definition 4.1.2
The ellipse is the set of all points .x; y/ such that the sum of the distances from .x; y/ to the
foci is constant.

We are going to use the definition of an ellipse to derive its equation. Assume that the ellipse is centered at the origin, and its foci are located at F1(−c, 0) and F2(c, 0); see Fig. 4.3. The two vertices on the horizontal axis are A1(a, 0) and A2(−a, 0).
It is clear that the distances from A1 (or A2) to the two foci sum to 2a, and that is the constant mentioned in the definition. So, pick any point P(x, y), and compute the sum of its distances to the foci


Figure 4.3: An ellipse centered at the origin, with foci F1(−c, 0) and F2(c, 0), vertices A1(a, 0) and A2(−a, 0), co-vertices B1(0, b) and B2(0, −b), and a point P(x, y) at distances d1 and d2 from the foci. The major axis of an ellipse is its longest diameter: a line
segment that runs through the center and both foci, with ends at the widest points of the perimeter. The
semi-major axis is one half of the major axis. The semi-minor axis is a line segment that is perpendicular
with the semi-major axis and has one end at the center.

d1 + d2, set it to 2a, and do some algebraic manipulations:
$$\begin{aligned}
d_1 + d_2 &= 2a &&(\text{definition of ellipse})\\
\sqrt{(x+c)^2 + y^2} + \sqrt{(x-c)^2 + y^2} &= 2a &&(\text{definition of distance})\\
\sqrt{(x+c)^2 + y^2} &= 2a - \sqrt{(x-c)^2 + y^2}\\
(x+c)^2 + y^2 &= 4a^2 + (x-c)^2 + y^2 - 4a\sqrt{(x-c)^2 + y^2}\\
a\sqrt{(x-c)^2 + y^2} &= a^2 - xc\\
(a^2 - c^2)x^2 + a^2 y^2 &= a^2(a^2 - c^2)\\
\frac{x^2}{a^2} + \frac{y^2}{a^2 - c^2} &= 1
\end{aligned}$$
All steps from the third equality onward are just algebra, to remove the square roots. Now, the final step is to bring b into play by considering that the distances from B1 to the foci also sum to 2a (from the very definition of an ellipse). This gives us b² + c² = a². So, we have
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1, \qquad b^2 + c^2 = a^2 \qquad (4.1.3)$$
from which an ellipse reduces to a circle when a = b.
Ellipses are common in physics, astronomy and engineering. For example, the orbit of each
planet in the solar system is approximately an ellipse with the Sun at one focus point. The same
is true for moons orbiting planets and all other systems of two astronomical bodies. The shapes
of planets and stars are often well described by ellipsoids.

Area of ellipse. If we know the area of a circle is πr², then what is the area of an ellipse? We can guess the formula without actually computing it. This area must be of the form f(a, b), with f(a, b) = f(b, a) and f(a, a) = πa². The simplest such form is f(a, b) = πab. So, the area of an ellipse is πab (this is only a plausibility argument, not a proof; the result can be confirmed by integration).

Reflecting property of ellipses. The ellipse reflection property says that rays of light emanating
from one focus, and then reflected off the ellipse, will pass through the other focus. Now, apart
from being mathematically interesting, what makes this property so fascinating? Well, there
are several reasons. Most notable of which is its significance to physics, primarily optics and
acoustics. Both light and sound are affected in this way. In fact there are many famous buildings
designed to exploit this property. Such buildings are referred to as whisper galleries or whisper
chambers. St. Paul’s Cathedral in London, England was designed by architect and mathematician
Sir Christopher Wren (1632–1723) and contains one such whisper gallery. The effect that such
a room creates is that if one person is standing at one of the foci, a person standing at the other
focus can hear even the slightest whisper spoken by the other. We refer to Section 4.4.2 for a
proof.

4.1.3 Parabolas
When you kick a soccer ball (or shoot an arrow, fire a missile or throw a stone) it arcs up into
the air and comes down again ... following the path of a parabola. A parabola is a curve where
any point is at an equal distance from a fixed point (called the focus), and from a fixed straight
line (called the directrix).

Figure 4.4: A parabola is a curve where any point is at an equal distance from a fixed point (the focus F) and a fixed straight line (the directrix l); that is, |PF| = |Pl|. In (a) the focus is F(a, b) and the directrix is y = k; in (b) the vertex V is placed at the origin, the focus is F(0, b) and the directrix is y = −b. To compute the distance from P to l we compute the distance between P and its projection on l, which is (x, k). The vertex V is the lowest point on the parabola.

In Fig. 4.4a, we label the focus as F with coordinates (a, b), and a horizontal directrix l with the equation y = k (of course we can also have parabolas with a vertical directrix). Now, we pick a point P(x, y), and the definition of a parabola tells us that |PF| = |Pl| (i.e., the distance from P to F is equal to the distance from P to the directrix l). Computing |PF| and |Pl| and equating them, we have
$$\begin{aligned}
\sqrt{(y-k)^2} &= \sqrt{(x-a)^2 + (y-b)^2}\\
y^2 - 2yk + k^2 &= x^2 - 2ax + a^2 + y^2 - 2yb + b^2\\
y &= \frac{(x-a)^2}{2(b-k)} + \frac{b+k}{2}
\end{aligned} \qquad (4.1.4)$$
One can see that (b + k)/2 is the ordinate of the vertex of the parabola. To simplify the equation, we can put the origin at V, as done in Fig. 4.4b; then we have a = 0 and k = −b, thus
$$y = \frac{x^2}{4b} \qquad\text{or}\qquad x^2 = 4by$$
Reflecting property of parabola. The parabola reflection property says that rays of light emanating from the focus are reflected off the parabola into a path parallel to the y-axis (and vice versa). To prove this property, see the accompanying figure. We consider a parabola with its vertex at the origin and focus F(0, b). We then consider a point P(x1, y1) on the parabola. Through P we draw a tangent line that intersects the y-axis at T. We can write the equation for this tangent line (see Section 4.4.6), and thus determine the ordinate of T; it is −y1. From optics (Section 4.4.2) we know that the light follows a path such that α = β. So all we need to prove is that PF makes an angle with the tangent exactly equal to α (i.e., consistent with the physics of light). This is indeed the case, as the triangle TFP is an isosceles triangle (proved by checking that |TF| = |FP|, all coordinates being known).
known).
What are some applications of this nice property of the parabola? A solar collector and a TV
dish are parabolic; they concentrate sun rays and TV signals onto a point–a heat cell or a receiver
collects them at the focus. Car headlights turn the idea around: the light starts from the focus
and emits outward. Is this reflection property related to that of an ellipse? Yes, for the parabola
one focus is at infinity.

4.1.4 Hyperbolas
Definition 4.1.3
A hyperbola is the set of all points .x; y/ in a plane such that the difference of the distances
between .x; y/ and the two foci is a positive constant.

Notice that the definition of a hyperbola is very similar to that of an ellipse. The distinction
is that the hyperbola is defined in terms of the difference of two distances, whereas the ellipse is
defined in terms of the sum of two distances. So, the equation of a hyperbola is very similar to


the equation of an ellipse (instead of a plus sign we have a minus sign):
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1 \qquad (4.1.5)$$
What does the graph of a hyperbola look like? First, we need to re-write the equation in the usual form y = f(x):
$$y = \pm\frac{b}{a}\sqrt{x^2 - a^2}, \qquad |x| \ge a$$
Thus, there are two branches, one for x ≥ a and one for x ≤ −a. When x → ∞, y → ∞. But we can do better than that: we have x² − a² ≈ x² when x → ±∞. Thus, for x → ±∞, y approaches ±(b/a)x. These two lines are therefore called the asymptotes of the hyperbola. We can see all of this in Fig. 4.5a for a particular case with a = 5 and b = 3. When a = b, the asymptotes are perpendicular, and we get a rectangular or right hyperbola (Fig. 4.5b).
asymptotes are perpendicular, and we get a rectangular or right hyperbola (Fig. 4.5b).

Figure 4.5: Graph of hyperbolas: (a) a = 5, b = 3, with branches y = ±(3/5)√(x² − 25) and asymptotes y = ±(3/5)x; (b) a = b = 5, with asymptotes y = ±x: when a = b, we get a rectangular hyperbola.

4.1.5 General form of conic sections


Any conic section, namely an ellipse, circle, parabola or hyperbola, can be described by the following equation:
$$Ax^2 + Cy^2 + Dx + Ey + F = 0 \qquad (4.1.6)$$
But this equation is not complete in the sense that it lacks the term xy. Actually, the most general form of a conic section is the following, with an xy term (mathematics is fair, isn't it?):
$$Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0 \qquad (4.1.7)$$


The proof is based on the fact that we can transform Eq. (4.1.7) into Eq. (4.1.6) by a specific rotation of axes, to be described in what follows. First we consider axes Ox and Oy. We then rotate these axes by an angle θ counterclockwise to obtain OX and OY. Consider a point P which has coordinates (x, y) in the xy system and (X, Y) in the rotated system. The aim now is to relate these two sets of coordinates. From the accompanying figure (with r = |OP| and φ the angle between OP and the OX axis), we have these results:
$$X = r\cos\varphi, \quad Y = r\sin\varphi; \qquad x = r\cos(\varphi + \theta), \quad y = r\sin(\varphi + \theta)$$
Using the trigonometric identities for sin(a + b) and cos(a + b), we can write x, y in terms of X, Y as
$$x = X\cos\theta - Y\sin\theta, \qquad y = X\sin\theta + Y\cos\theta \qquad (4.1.8)$$
and by solving for X, Y, we have
$$X = x\cos\theta + y\sin\theta, \qquad Y = -x\sin\theta + y\cos\theta$$
Substituting Eq. (4.1.8) into Eq. (4.1.7) we get an equation in terms of X, Y:
$$A(X\cos\theta - Y\sin\theta)^2 + B(X\cos\theta - Y\sin\theta)(X\sin\theta + Y\cos\theta) + C(X\sin\theta + Y\cos\theta)^2 + D(X\cos\theta - Y\sin\theta) + E(X\sin\theta + Y\cos\theta) + F = 0 \qquad (4.1.9)$$
which has this form:
$$A'X^2 + B'XY + C'Y^2 + D'X + E'Y + F = 0$$
We're interested in the XY term, whose coefficient is given by
$$B' = B(\cos^2\theta - \sin^2\theta) + 2(C - A)\sin\theta\cos\theta = B\cos 2\theta + (C - A)\sin 2\theta$$
The condition B′ = 0 (so that no cross term XY is present) gives us
$$B\cos 2\theta + (C - A)\sin 2\theta = 0 \implies \cot 2\theta = \frac{A - C}{B}$$

Example 4.1
Now we show that the equation xy = 1 is a hyperbola. This is of the form in Eq. (4.1.7) with A = C = 0 and B = 1. Thus, cot 2θ = 0, hence θ = π/4. With this rotation angle, using Eq. (4.1.8) we can write x, y in terms of X, Y as
$$x = \frac{\sqrt{2}}{2}X - \frac{\sqrt{2}}{2}Y, \qquad y = \frac{\sqrt{2}}{2}X + \frac{\sqrt{2}}{2}Y$$
And therefore xy = 1 becomes
$$\frac{X^2}{2} - \frac{Y^2}{2} = 1$$
which is obviously a hyperbola.

We also compute A′ and C′ for later use:
$$A' = A\cos^2\theta + B\sin\theta\cos\theta + C\sin^2\theta, \qquad C' = A\sin^2\theta - B\sin\theta\cos\theta + C\cos^2\theta$$
Isn't it remarkable that even though A′, B′, C′ are different from A, B, C, certain quantities do not change. For example, the sum A′ + C′ is invariant:
$$A' + C' = A + C$$
We also have another invariant, the so-called discriminant of the equation, given by B′² − 4A′C′:
$$\begin{aligned}
B'^2 - 4A'C' &= \bigl(B(\cos^2\theta - \sin^2\theta) + 2(C-A)\sin\theta\cos\theta\bigr)^2\\
&\quad - 4\bigl(A\cos^2\theta + B\sin\theta\cos\theta + C\sin^2\theta\bigr)\bigl(A\sin^2\theta - B\sin\theta\cos\theta + C\cos^2\theta\bigr)\\
&= B^2(\cos^2\theta - \sin^2\theta)^2 + (4BC - 4AB)(\cos^3\theta\sin\theta - \sin^3\theta\cos\theta) + 4(C-A)^2\sin^2\theta\cos^2\theta\\
&\quad - 4A^2\cos^2\theta\sin^2\theta + 4AB\cos^3\theta\sin\theta - 4AC\cos^4\theta - 4AB\sin^3\theta\cos\theta\\
&\quad + 4B^2\sin^2\theta\cos^2\theta - 4BC\sin\theta\cos^3\theta - 4CA\sin^4\theta + 4CB\sin^3\theta\cos\theta - 4C^2\cos^2\theta\sin^2\theta\\
&= B^2(\cos^4\theta + \sin^4\theta) + 2B^2\cos^2\theta\sin^2\theta - 8AC\sin^2\theta\cos^2\theta - 4AC(\cos^4\theta + \sin^4\theta)\\
&= (B^2 - 4AC)(\cos^4\theta + \sin^4\theta) + 2(B^2 - 4AC)\sin^2\theta\cos^2\theta\\
&= (B^2 - 4AC)(\cos^4\theta + \sin^4\theta + 2\sin^2\theta\cos^2\theta)\\
&= (B^2 - 4AC)(\cos^2\theta + \sin^2\theta)^2 = B^2 - 4AC
\end{aligned}$$

We have shown the proof to demonstrate the fact that sometimes mathematics can be boring
with tedious manipulations of algebraic expressions. But to be able to do something significant,
we have to be patient and resilient. Intelligence alone is not enough. Albert Einstein once said
“It is not that I’m so smart. But I stay with the questions much longer.” And Voltaire–one of the
greatest of all French Enlightenment thinkers and writers– also said "no problem can withstand
the assault of sustained thinking".
And yet, there are tools called computer algebra systems such as Maple, Mathematica
and SageMath that can do symbolic calculations, so these tools can help us with some tedious
symbolic calculations such as the one we have just done. See Section 3.23 for details.
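Indeed, the tedious expansion above is a one-liner for a CAS. The following SymPy sketch (mine) verifies both invariants symbolically:

```python
import sympy as sp

A, B, C, t = sp.symbols('A B C theta')
Ap = A*sp.cos(t)**2 + B*sp.sin(t)*sp.cos(t) + C*sp.sin(t)**2
Cp = A*sp.sin(t)**2 - B*sp.sin(t)*sp.cos(t) + C*sp.cos(t)**2
Bp = B*(sp.cos(t)**2 - sp.sin(t)**2) + 2*(C - A)*sp.sin(t)*sp.cos(t)
print(sp.simplify(Ap + Cp - (A + C)))                 # 0: A' + C' = A + C
print(sp.simplify(Bp**2 - 4*Ap*Cp - (B**2 - 4*A*C)))  # 0: the discriminant is invariant
```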


Now, you might ask: why bother with this term B′² − 4A′C′? To answer that question, we ask you to tell us which conic section is associated with the equation
$$5x^2 + y^2 + y - 8 = 0$$
Well, you can massage the equation (completing the square) to arrive at
$$5x^2 + \left(y + \tfrac{1}{2}\right)^2 = \tfrac{33}{4}$$
which is an ellipse. Very good! How about 5x² − 23xy + y² + y − 8 = 0? As the completing-the-square trick does not work due to the cross term xy, you might think of using a program to plot this equation, and the type of the curve comes out immediately. But this question is not about the final result; it is about finding a way (using only paper and pencil) to classify conic sections. Another way (not so good) is to rotate the axes so that B′ = 0, and then use the complete-the-square technique. It works, but not so elegantly.
Now you have seen that we can rotate a conic section Ax² + Bxy + Cy² + Dx + Ey + F = 0 to get the same conic written in the simpler form A′X² + C′Y² + D′X + E′Y + F = 0. And we have shown that B² − 4AC = −4A′C′. It can be shown that, using this form A′X² + C′Y² + D′X + E′Y + F = 0, one can deduce the type of the conic based on the sign of −4A′C′; thus for the general conic Ax² + Bxy + Cy² + Dx + Ey + F = 0, we have this theorem:
$$\begin{cases}
B^2 - 4AC > 0: & \text{hyperbola}\\
B^2 - 4AC < 0: & \text{ellipse}\\
B^2 - 4AC = 0: & \text{parabola}
\end{cases} \qquad (4.1.10)$$
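As a minimal sketch (the function name is mine, and degenerate conics such as a pair of lines are ignored), the theorem translates directly into code and settles the question about 5x² − 23xy + y² + y − 8 = 0 immediately:

```python
def classify_conic(A, B, C):
    """Classify Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0 by its discriminant,
    Eq. (4.1.10) (non-degenerate conics assumed)."""
    disc = B**2 - 4*A*C
    if disc > 0:
        return "hyperbola"
    if disc < 0:
        return "ellipse"
    return "parabola"

print(classify_conic(5, 0, 1))     # 5x^2 + y^2 + y - 8 = 0        -> ellipse
print(classify_conic(5, -23, 1))   # 5x^2 - 23xy + y^2 + y - 8 = 0 -> hyperbola
print(classify_conic(0, 1, 0))     # xy = 1 (Example 4.1)          -> hyperbola
```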

4.2 Functions
Consider now Galileo’s experiments on balls rolling down a ramp. He measured how far a
ball went in a certain amount of time. If we denote time by t and distance by s, then we have a
relation between s and t . As s and t are varying quantities, they are called variables. The relation
between these two variables is a function. Loosely stated for the moment, a function is a relation
between variables.
The most effective mathematical representation of a function is what we call a formula. For
example, the distance the ball traveled is written as s D t 2 . The formula immediately gives us
the distance at any time; for example by plugging t D 2 into the formula the distance traveled is
4ŽŽ . As s depends on t , t is an independent variable and s a dependent variable. And we speak
of s D t 2 as s is a function of t .
As we see more and more functions it is convenient to have a notation specifically invented
for functions. Euler used the notation s D f .t/, reads f of t , to describe all functions of single
variable t. When the independent variable is not time, mathematicians use y D f .x/. You
should think of y D f .x/ as f acts on x to yield y. Note that this short notation y D f .x/
ŽŽ
Units are not important here and thus skipped.


represents all functions that take one number x and return another number y! It can be y D x,
y D sin x etc.
In the function y = f(x), for each value of x we have a corresponding value of y (= f(x)). But what are the possible values of x? That varies from function to function. For y = x, x can be any real number (mathematicians like to write x ∈ ℝ for this). For y = √x, x must be a real number that is equal to or larger than zero (we do not discuss complex numbers in calculus in this chapter). That's why when we talk about a function we need to be clear about the range of the input (called the domain of the function) and also the range of the output. The notation for that is f : ℝ → ℝ for any function that takes a real number and returns a real number.
Now we consider three common functions: a linear function y D f .x/ D x, a power
function y D x 2 and an exponential function y D 2x . For various values of the input x,
Table 4.1 presents the corresponding outputs. It is obvious that it is hard to get something out
of this table, algebra is not sufficient. We need to bring in geometry to get insights. A picture is
worth 1000 words. That’s why we plot the points .x; f .x// in a Cartesian plane and connect the
points by lines and we get the so-called graphs of functions. See Fig. 4.6 for the graphs of the
three functions under consideration.

Table 4.1: Tabulated values of three functions: y = x, y = x^2 and y = 2^x (the numbers in parentheses are the increments from the previous row).

    x    y = x    y = x^2    y = 2^x
    0    0        0          1
    1    1        1 (1)      2 (1)
    2    2        4 (3)      4 (2)
    3    3        9 (5)      8 (4)
    4    4        16 (7)     16 (8)
    5    5        25 (9)     32 (16)
    6    6        36 (11)    64 (32)

With a graph you can actually see how the function is changing, where its zeroes and in-
flection points are, how it behaves at each point, what are its minima etc. Compare looking at a
graph to looking at a picture of someone and looking at an equation to reading a description of
that person.
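
If you want to reproduce Table 4.1 and the graphs of Fig. 4.6 yourself, a short Python sketch is enough (matplotlib is my choice here, not something the text prescribes):

    import numpy as np
    import matplotlib.pyplot as plt

    # Table 4.1: tabulated values of y = x, y = x^2 and y = 2^x
    for x in range(7):
        print(x, x, x**2, 2**x)

    # Fig. 4.6: many points give smooth curves
    xs = np.linspace(-3, 3, 200)
    for name, f in [("y = x", lambda x: x),
                    ("y = x^2", lambda x: x**2),
                    ("y = 2^x", lambda x: 2.0**x)]:
        plt.plot(xs, f(xs), label=name)
    plt.legend(); plt.ylim(-2, 4); plt.show()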

4.2.1 Even and odd functions
Just as with numbers, where we have even numbers and odd numbers, we also have even and odd functions. If we plot an even function y = f(x), we observe that it is symmetrical with respect to (sometimes the abbreviation w.r.t. is used) the y-axis; the part on one side of the vertical axis is a reflection of the part on the other side, see Fig. 4.7. This means that f(-x) = f(x). On the other hand, the graph of an odd function has rotational symmetry with respect to the


[Figure 4.6: Graph of the functions y = x, y = x^2 and y = 2^x. The dots are the points (x, f(x)), which are then connected by lines to get a graph. Of course, many points are used to get smooth curves.]

origin, meaning that its graph remains unchanged after a rotation of 180 degrees about the origin. So, even functions and odd functions are functions which satisfy particular symmetry relations. Mathematicians define even and odd functions as:

Definition 4.2.1
(a) A function f(x): ℝ → ℝ is an even function if for any x ∈ ℝ: f(-x) = f(x).
(b) A function f(x): ℝ → ℝ is an odd function if for any x ∈ ℝ: f(-x) = -f(x).

[Figure 4.7: Graphs of some even and odd functions: (a) the even function f(x) = x^4 - 8x^2 + 16, for which f(-x*) = f(x*); (b) the odd function f(x) = sin x, for which f(-x*) = -f(x*). Typical even functions are y = x^{2n}, y = cos x and typical odd functions are y = x^{2n+1}, y = sin x.]

Decomposition of a function. Any function f(x) can be decomposed into a sum of an even function and an odd function, as

    f(x) = f^e(x) + f^o(x)        (4.2.1)


where the even/odd functions can be found as

    f(x)  = f^e(x) + f^o(x)                 f^e(x) = 0.5[f(x) + f(-x)]
                                  ==>                                        (4.2.2)
    f(-x) = f^e(x) - f^o(x)                 f^o(x) = 0.5[f(x) - f(-x)]

Why is such a decomposition worth studying? One example: as the integral is defined as an area, from Fig. 4.7 we can deduce the following results:

    \int_{-a}^{a} f^e(x)\,dx = 2\int_{0}^{a} f^e(x)\,dx,  \qquad  \int_{-a}^{a} f^o(x)\,dx = 0        (4.2.3)
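
Eqs. (4.2.2) and (4.2.3) are easy to check numerically; below is a sketch (the choice f(x) = e^x and the crude rectangle-rule quadrature are mine, purely for illustration):

    import numpy as np

    f = np.exp                            # e^x is neither even nor odd
    fe = lambda x: 0.5*(f(x) + f(-x))     # even part, Eq. (4.2.2)
    fo = lambda x: 0.5*(f(x) - f(-x))     # odd part,  Eq. (4.2.2)

    a, n = 2.0, 200000
    x = np.linspace(-a, a, n)
    dx = x[1] - x[0]
    print(np.sum(fo(x))*dx)                              # ~0: the odd part integrates to zero
    print(np.sum(fe(x))*dx, 2*np.sum(fe(x[x >= 0]))*dx)  # the two agree, as in Eq. (4.2.3)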

4.2.2 Transformation of functions
For ordinary objects, we can do certain kinds of transformations to them, such as a vertical translation (e.g. bringing a book upstairs) or a horizontal translation. Mathematicians do the same thing, but of course to their mathematical objects, which are functions in this discussion. Fig. 4.8 shows vertical/horizontal translations of y = x^2. And this seemingly useless stuff will prove to be useful when we study waves in Section 9.10. To mathematicians, a traveling wave is just a function moving in space, just like what we're doing now to y = x^2.

[Figure 4.8: Translation of a function y = f(x): the vertical translation f(x) + c displaces the function a distance c upward (c > 0), and downward if c < 0. Horizontal translation to the right is f(x - c) and to the left is f(x + c), for c > 0. Note: the original function is y = x^2, plotted as the blue curve.]

And just as we stretch (or squeeze/shrink) a solid object, mathematicians stretch and squeeze functions. They can do a horizontal stretching with the transformation f(cx) (c < 1) and a vertical stretching with cf(x) (c > 1). Fig. 4.9 illustrates these scaling transformations for y = sin x.

4.2.3 Function of function
Function composition (or function of function) is an operation that takes two functions f and g and produces a function h such that h(x) = g(f(x)). Intuitively, composing functions is a chaining process in which the output of function f feeds the input of function g. This is one way for mathematicians to create new functions from old ones.


[Figure 4.9: Stretching and shrinking mathematical functions: (a) horizontal shrink, sin x versus sin 2x; (b) vertical stretching, sin x versus 2 sin x.]

The notation (g ∘ f)(x) is used to represent a composition of two functions:

    (g ∘ f)(x) := g(f(x))        (4.2.4)

For example, consider the two functions g(x) = sin x and f(x) = x^2; we obtain the composite function sin x^2. If we do the reverse, i.e. (f ∘ g)(x), we get sin^2 x. So (g ∘ f)(x) ≠ (f ∘ g)(x). It is interesting to know that, later on in the linear algebra chapter, we will see that this fact is why matrix-matrix multiplication is not commutative (Section 11.6).
How about chaining three functions h(x), g(x) and f(x)? It is built on top of composing two functions:

    [h ∘ (g ∘ f)](x) = h([g ∘ f](x)) = h(g(f(x)))        (4.2.5)

where we have used Eq. (4.2.4) in the first equality. It can be seen that (verify it for yourself)

    [h ∘ (g ∘ f)](x) = [(h ∘ g) ∘ f](x)

That is, function composition is not commutative but it is associative (similar to (ab)c = a(bc) for reals a, b, c).
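
A tiny sketch makes the non-commutativity concrete (compose is my own helper, not a standard-library function):

    import math

    def compose(g, f):
        """Return the composite function x -> g(f(x)), i.e. g o f."""
        return lambda x: g(f(x))

    g, f = math.sin, lambda x: x**2
    gf, fg = compose(g, f), compose(f, g)   # sin(x^2) and (sin x)^2
    print(gf(1.0), fg(1.0))                 # 0.84147... vs 0.70807...: g o f != f o g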

4.2.4 Domain, co-domain and range of a function
If we consider the two functions y = x^2 and y = √x, we see that the first function is an easy guy who accepts any value of x. On the other hand, the second function is picky; it only accepts non-negative numbers, i.e., x ≥ 0 (we confine our discussion in this chapter mostly to functions of real numbers; functions of complex numbers are left to Chapter 7). So, to specify all the possible values of the input, mathematicians invented the concept of the domain of a function. The domain of a function is the set of all inputs.
If we have something for the input, you know that we also have something for the output: that is called the co-domain. The co-domain is the set of outputs. Thus, the co-domain of y = x^2


and y = √x is ℝ, i.e., all real numbers. However, we know that y = √x can only output non-negative reals. OK, so mathematicians invented another concept: the range of a function, which is the subset of its co-domain containing the actual outputs. For example, if f(x) = x^2, then its co-domain is all real numbers but its range is only the non-negative reals. Using Venn diagrams (check Section 5.5 if you're not sure about Venn diagrams) we can visualize these concepts (Fig. 4.10).

[Figure 4.10: Venn diagram for the domain, co-domain and range of a function: x in the domain is mapped to f(x) in the co-domain; the range is the part of the co-domain that is actually reached.]

Example 4.2
One example is sufficient to demonstrate how to find the domain of a function:

    f(x) = \frac{2x - 1}{1 - \sqrt{x - 5}}

As we forbid division by zero and only real numbers are considered, the function only makes sense when:

    1 - \sqrt{x - 5} \neq 0  \quad\text{and}\quad  x - 5 \ge 0  \;\Longrightarrow\;  x \neq 6 \text{ and } x \ge 5

To say that x is a number that is larger than or equal to 5 and different from 6, we can write x ≠ 6 and x ≥ 5. Mathematicians seem to write it this way: x ∈ [5, 6) ∪ (6, ∞). This is because they're thinking this way: consider the number line starting from 5, and make a cut at 6 (we do not want it). Thus the line is broken into two pieces, [5, 6) and (6, ∞). The symbol ∪ in A ∪ B means the union of the two sets A and B. A square bracket means the endpoint is included; a parenthesis means it is not. An open interval (a, b), for instance, does not include the endpoints a and b and is defined by a < x < b.
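
A quick numerical probe of the domain found above; this sketch simply tries to evaluate f and reports where the evaluation fails (the sample points are arbitrary):

    import math

    def f(x):
        return (2*x - 1) / (1 - math.sqrt(x - 5))

    for x in [4.0, 5.0, 5.5, 6.0, 7.0]:
        try:
            print(x, f(x))
        except (ValueError, ZeroDivisionError) as err:
            print(x, "is not in the domain:", err)
    # only x = 4 (negative number under the root) and x = 6 (division by zero) fail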

4.2.5 Inverse functions
Given a function y = f(x) which produces a number y from an input x, the inverse function x = g(y) undoes the function f by giving back the number x that we started with. That is, if y = f(x), then x = g(f(x)). The inverse function of f is commonly denoted by f^{-1} (not 1/f). We now illustrate some examples. Corresponding to a power function and an exponential


function, we have their inverses: the root function and the logarithm function; see Fig. 4.11.

[Figure 4.11: Illustration of some inverse functions: the square root undoes the square (x = 2 → x^2 = 4 → √4 = 2) and the base-3 logarithm undoes 3^x (x = 2 → 3^x = 9 → log_3 9 = 2).]

Let's assume that we know the integral of y = x^2 between 0 and 1; it is written as ∫_0^1 x^2 dx (Section 4.3.2 will discuss this weird symbol). What, then, is ∫_0^1 √v dv? As the two functions x^2 and √u are inverses of each other, it follows that the sum of these two integrals equals one (Fig. 4.12). So, knowing one integral yields the other.

[Figure 4.12: As the sum of ∫_0^1 x^2 dx (the area under y = x^2) and ∫_0^1 √v dv (the area under u = √v) is one, knowing the former will give us the latter.]

Invertible functions. If we plot the function y = sin x on the interval 0 ≤ x ≤ π, then there are two x coordinates, x_1 and x_2, having the same y coordinate, for 0 ≤ y ≤ 1. Thus the inverse of y = sin x does not exist on the interval 0 ≤ x ≤ π. This leads to the following definition. A function f: A → B is one-to-one (or injective) if distinct inputs give distinct outputs. That is, f is one-to-one if, whenever x_1 ≠ x_2, then f(x_1) ≠ f(x_2).


4.2.6 Parametric curves
When we express the equation of a circle of radius r centered at the origin in the form y = f(x), it reveals one big limitation: we need two equations, y = ±√(r^2 - x^2), to fully describe the circle. By introducing an additional variable, usually denoted by t, it is possible to express the full circle by just one equation

    x(t) = r cos t,    y(t) = r sin t,    0 ≤ t ≤ 2π        (4.2.6)

Curves represented by an equation of the form (x(t), y(t)) are called parametric curves. And the variable t is called a parameter. Why is the symbol t used for the parameter? Think of a particle moving on a plane; as it moves it traces out a curve, and its position is specified by its coordinates, which are functions of time.
How do we get the graph of a parametric curve? That is simple: for each value of t, we compute x(t) and y(t), which constitute a point in the xy plane. The locus of all such points is the graph. Fig. 4.13 shows some parametric curves. We have more to say about parametric curves in Section 4.12.

[Figure 4.13: Two parametric curves: (a) the spiral (t cos t, t sin t) and (b) the cardioid (a(2 cos t - cos 2t), a(2 sin t - sin 2t)), which we have met in Fig. 1.1. In Section 4.13.3 I will discuss how to derive the equation of the cardioid.]
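
Parametric curves are easy to draw because we only ever evaluate x(t) and y(t); here is a matplotlib sketch for the two curves of Fig. 4.13 (a = 1 is an arbitrary choice of mine):

    import numpy as np
    import matplotlib.pyplot as plt

    t = np.linspace(0, 6*np.pi, 2000)
    plt.plot(t*np.cos(t), t*np.sin(t), label="spiral (t cos t, t sin t)")

    a = 1.0
    t = np.linspace(0, 2*np.pi, 2000)
    plt.plot(a*(2*np.cos(t) - np.cos(2*t)), a*(2*np.sin(t) - np.sin(2*t)),
             label="cardioid")
    plt.axis("equal"); plt.legend(); plt.show()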

4.2.7 History of functions


M. Kline credits Galileo with the first statements of dependency of one quantity on another, e.g., "The times of descent along inclined planes of the same height, but of different slopes, are to each other as the lengths of these slopes." In 1714, Leibniz already used the word "function" to mean quantities that depend on a variable. The notation we use today, f(x), was introduced by Euler in 1734 [35].


4.2.8 Some exercises about functions
Let f(x): [0, 1] → ℝ be given by f(x) = 4^x/(4^x + 2). Compute the following sum

    S = f\left(\frac{1}{40}\right) + f\left(\frac{2}{40}\right) + \cdots + f\left(\frac{39}{40}\right) - f\left(\frac{1}{2}\right)

This is from the JEE-Advanced 2021 exam. The Joint Entrance Examination - Advanced (JEE-Advanced) is an academic examination held annually in India.
OK. Assume that we're not sitting any exam, and the ultimate goal is just to get the sum; then it is super easy. Write a few lines of code and you'll see that S = 19. But what if we actually had to do this without a calculator, let alone a PC? What are we going to do? We pay attention to the expression for S and we observe a regularity:

    S = f\left(\frac{1}{40}\right) + f\left(\frac{2}{40}\right) + \cdots + f\left(\frac{38}{40}\right) + f\left(\frac{39}{40}\right) - f\left(\frac{1}{2}\right)

The arguments pair up so that each pair sums to one (e.g. 1/40 + 39/40 = 1, 2/40 + 38/40 = 1, 3/40 + 37/40 = 1, etc.). Let's compute f(x) + f(1 - x), then, and hope that something good is there. To test this idea, we compute f(0) + f(1) (as these are easy), and it gives us 1: very promising. Moving on to f(x) + f(1 - x), using the rule a^{y-z} = a^y/a^z so that 4^{1-x} = 4/4^x:

    f(x) + f(1 - x) = \frac{4^x}{4^x + 2} + \frac{4^{1-x}}{4^{1-x} + 2} = \frac{4^x}{4^x + 2} + \frac{4/4^x}{4/4^x + 2} = \frac{4^x}{4^x + 2} + \frac{2}{4^x + 2} = 1

The sum is always 1. Then S consists of 19 sums of the form f(x) + f(1 - x), each of which is nothing but one, plus f(20/40) minus f(1/2), and these last two cancel. The final result is thus simply 19. (With this, you can now try a Canadian math problem from 1995: let f(x) = 9^x/(9^x + 3) and compute S = \sum_{n=1}^{1995} f(n/1996).)
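
For completeness, here are the "few lines of code" mentioned above (a sketch; floating-point arithmetic gives the answers up to rounding), together with the Canadian variant:

    def f(x):
        return 4**x / (4**x + 2)

    S = sum(f(i/40) for i in range(1, 40)) - f(1/2)
    print(S)                                        # ~19.0, i.e. S = 19

    # the 1995 Canadian problem mentioned above, same trick; answer is 997.5
    g = lambda x: 9**x / (9**x + 3)
    print(sum(g(n/1996) for n in range(1, 1996)))   # ~997.5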
A second problem we discuss is the following:

    f_0(x) = \frac{x}{x + 1},  \qquad  f_{n+1}(x) = (f_0 \circ f_n)(x),  \quad  n = 0, 1, 2, \ldots

Find f_n(x). How are we going to solve it? We have a rule to find f_{n+1}(x) for whole numbers. Let's try to compute a few of them, e.g. f_1(x), f_2(x), ..., to see what we get:

    n = 0:  f_1(x) = (f_0 \circ f_0)(x) = \frac{x/(x+1)}{x/(x+1) + 1} = \frac{x}{2x + 1}
    n = 1:  f_2(x) = (f_0 \circ f_1)(x) = \frac{x/(2x+1)}{x/(2x+1) + 1} = \frac{x}{3x + 1}
    n = 2:  f_3(x) = (f_0 \circ f_2)(x) = \frac{x/(3x+1)}{x/(3x+1) + 1} = \frac{x}{4x + 1}

What we just did is: starting from f_0(x) = x/(x+1), using f_{n+1}(x) = (f_0 \circ f_n)(x) we compute f_1(x) (recall that (f_0 \circ f_n)(x) is a composite function), then using f_1(x) we compute f_2(x) and


so on. Lucky for us, we see a pattern (it has to have a pattern; why? because this is a test, and it must be answered within a short amount of time!). Observe the coefficients 2, 3, 4 multiplying x in the denominators, and we can write

    f_n(x) = \frac{x}{(n + 1)x + 1}

Now we prove this formula using ... proof by induction (what else?). The formula works for n = 0. Now we assume it works for n = k:

    f_k(x) = \frac{x}{(k + 1)x + 1}

And we're going to prove that it's also correct for n = k + 1, i.e.,

    f_{k+1}(x) = \frac{x}{(k + 2)x + 1}

You can check this.
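
You can also let the computer do the bookkeeping; here is a sympy sketch (my own, assuming sympy is available) that generates f_1, ..., f_5 and checks them against the conjectured formula:

    import sympy as sp

    x = sp.symbols('x')
    f0 = x / (x + 1)

    fn = f0                                   # fn currently holds f_0
    for n in range(5):
        fn = sp.simplify(f0.subs(x, fn))      # f_{n+1} = (f0 o f_n)(x)
        formula = x / ((n + 2)*x + 1)         # conjectured f_{n+1}
        print(n + 1, fn, sp.simplify(fn - formula) == 0)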

4.3 Integral calculus


Integral calculus was originally concerned with the quadrature of closed curves, that is, the problem of determining the area enclosed by a closed curve (e.g. what is the area of a circle of radius r?). This problem is certainly very old, dating back to the times of the Greek mathematicians. But why did mathematicians bother with this problem, and what really is the area of a shape? The problem is as simple as this: there are two pieces of land, one of a rectangular shape and the other of a triangular shape. The question is which piece is larger, and by that we mean which piece takes up more space. This measurement is called area.
After the area of two-dimensional objects, we move to the volume of three-dimensional objects: cube, cone, pyramid, etc. (Section 4.3.1).
Although ancient mathematicians came up with ingenious ways to determine the areas of various shapes, their methods were not universal: each shape required a different solution! Mathematicians in the 17th century came up with a smarter way to solve this area problem. In Section 4.3.2 a definition of a definite integral is given, and in Section 4.3.3 we compute the first definite integral using that definition.

4.3.1 Volumes of simple solids


After area comes the volume of simple 3D solids (Fig. 4.14). The volume of a cube of side a is a^3, and that of a cuboid of sides a, b, c is abc. This is simple. But what is the volume of a cylinder? A cylinder is an interesting object: it's kind of round and it's kind of straight.
The method for working out the volume of a cylinder of height h and radius r is the same: chop it into thin vertical slices (Fig. 4.15). Each slice is a thin cuboid of height h and base area A_i. The volume of the cylinder is the total volume of all the cuboids. As the number of slices gets larger, the total area of the bases of these cuboids (i.e. Σ_i A_i) approaches the area of the base of the cylinder, which is πr^2. Thus, the volume of the cylinder is πr^2 h: base area multiplied by the height.


[Figure 4.14: Volume of a cube (a^3), a cuboid (abc) and a cylinder (πr^2 h). The volumes of these solids are always: base area times height.]

[Figure 4.15: Method of exhaustion to determine the volume of a cylinder with radius r and height h: vol = Σ_i h A_i = h (Σ_i A_i) → h πr^2. The volume of the cylinder is πr^2 h: base area multiplied by the height.]

[Figure 4.16: Pyramid, cone and sphere.]

So far so good. The next problem of ancient mathematics was to determine the volume of the pyramid, the cone and the sphere (Fig. 4.16). Their arguments to find these volumes are fascinating.

Volume of a pyramid. The pyramid can be seen as a 3D analogue of the triangle, so to get its volume we can go back to how the area of a triangle was computed. We put a triangle in a rectangular case (Fig. 3.10) and asked how much space of the case the triangle takes up. It turns out that it takes up one half of the area of the rectangle. Now we do the same thing: we put a pyramid inside a case which is a cuboid, and we guess that the pyramid's volume is a constant times the volume of the cuboid. And we know the volume of the cuboid. What should the constant be? Is it 1/2? No: this is 3D, and the constant is 1/3. Thus, the volume of a pyramid of base area A and height h is (1/3)Ah. But


it is not a proof!

[Figure 4.17: Area of a triangle: effects of dilation and shearing. (a) A square of side a cut by its diagonals into four triangles, each of area a^2/4 = 0.5 a (a/2). (b) Dilating to height h and then shearing gives the area 0.5 a h for a general triangle.]

Let's get back to 2D and start with a square of side a. Its area is a^2. Drawing the two diagonals, we get four equal triangles, each with an area of a^2/4 (Fig. 4.17a). Thus, if a triangle has base a and height a/2, its area is (1/2)(a/2)(a). What if the height is not a/2? Does the formula still work? Yes, we did that in Fig. 3.10. Thus, if the height is a then the area is 0.5a^2, which is twice the old area. What we're seeing is this: if we dilate the triangle in the vertical direction by a factor α, the new area is α times the old one. Finally, we want a general triangle (the rightmost figure in Fig. 4.17b): the area is still 0.5bh. What does it tell us? It indicates that shearing a shape does not change its area (Fig. 4.18).

[Figure 4.18: Shearing a shape does not change its area: easy to see for rectangles/squares; as CC' = DD', the areas of ABCD and ABC'D' are equal (a). For a general shape (b), think of it as made of many, many tiny squares. When we shear it, each square does not change area, thus the area of the shape, which is the total area of all the squares, does not change. What a nice argument! This can also be proved using linear algebra, in particular linear transformations and the determinant. See Chapter 11.]

For a pyramid, the argument is exactly the same: see Fig. 4.19.

Volume of a cone. A cone looks similar to a pyramid, so is its volume also (1/3)Ah? It is. How to prove it? Use the volume of a pyramid. A cone is made of many, many pyramids whose bases make up the base of the cone. All these pyramids have the same height h. Thus each pyramid has a volume of (1/3)A_i h. The total volume of these pyramids is (1/3)h Σ_i A_i, but Σ_i A_i = A, the base area of the cone. This is exactly the strategy used to get the volume of a cylinder (see Fig. 4.15).


[Figure 4.19: The volume of a pyramid of base A and height h is (1/3)Ah. A cube of side a contains six equal pyramids, each of volume a^3/6. This is (1/3)A h where A = a^2 is the area of the base and h = a/2 is the height. Then a vertical dilation to any h and a shearing lead to the formula (1/3)A h for any pyramid.]

Volume of a ball. Finally, we address the volume of a ball of radius one. Our strategy is to divide the top half of this ball into many, many cylinders of the same height. We know the volume of a cylinder, and we just add up all these volumes (see Fig. 4.20). Note that this is not the only way to divide the sphere: we can equally use cylinders that fit inside the sphere. But this does not matter when we consider n → ∞.

[Figure 4.20: Calculation of the volume of half of a sphere of unit radius. The hemisphere is divided into n cylinders of height AB = 1/n; in (a) n = 3 and in (b) n = 8. The cylinders are numbered from bottom to top starting from 0; cylinder i has radius r_i = √(1 - (i/n)^2) from the Pythagorean theorem (for the right triangle OAB). Cylinder i has a volume of π r_i^2 (1/n).]

Referring to Fig. 4.20a, cylinder i has a volume of π r_i^2 (1/n). Thus the volume V of the ball, which is twice the total volume of the n cylinders (they fill only the upper half) when n is very large, is:

    V = 2\pi \lim_{n\to\infty} \frac{1}{n}\left[1 + \left(1 - \left(\frac{1}{n}\right)^2\right) + \left(1 - \left(\frac{2}{n}\right)^2\right) + \cdots + \left(1 - \left(\frac{n-1}{n}\right)^2\right)\right]

Now we have a problem of algebra. Let's first massage the above expression a bit, and we shall see something familiar:

    V = 2\pi \lim_{n\to\infty} \frac{1}{n}\left[n - \frac{1}{n^2}\left(1^2 + 2^2 + \cdots + (n-1)^2\right)\right]


A sum of the first n - 1 squares! We know how to get this sum; see Eq. (2.6.8). So we have

    V = 2\pi \lim_{n\to\infty}\left[1 - \frac{(n-1)(2n-1)}{6n^2}\right] = 2\pi\left[1 - \lim_{n\to\infty}\frac{(n-1)(2n-1)}{6n^2}\right] = \frac{4\pi}{3}        (4.3.1)
Therefore, the volume of a sphere of radius r is (4/3)πr^3. We knew beforehand that the volume must be of the form c r^3, but the fact that c is 4π/3 is surprising. Archimedes was the first to find this constant. In his work On the Sphere and Cylinder, Archimedes proved that the ratio of the volume of a sphere to the volume of the cylinder that contains it is 2:3. In the same work he also proved that the ratio of the surface area of a sphere to the surface area of the cylinder that contains it, together with its circular ends, is also 2:3. Because expressions for the volume and surface area of a cylinder were known before his time, Archimedes' results established the first exact expressions for the volume and surface area of a sphere.
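
The limiting process behind Eq. (4.3.1) can be watched on a computer; here is a sketch that stacks the n cylinders of Fig. 4.20 for a unit ball:

    import math

    def ball_volume(n):
        """Twice the total volume of n stacked cylinders covering the upper half."""
        half = sum(math.pi * (1 - (i/n)**2) * (1/n) for i in range(n))
        return 2 * half

    for n in (10, 100, 1000, 10000):
        print(n, ball_volume(n), 4*math.pi/3)   # converges to 4*pi/3 ~ 4.18879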

4.3.2 Definition of an integral


Let's consider a general curve described by the function y = f(x), and suppose we want to calculate the area of the region bounded by this curve, the line y = 0 and the lines x = a and x = b. The idea is, following Archimedes, to chop this region into a large number of thin rectangles or slices (Fig. 4.21), compute the areas of all these slices and add them up. It is then obvious that using more slices results in a better estimate of the area. And when the number of slices goes to infinity, the total area of all the slices is exactly the area under the curve.
To make the above statement more precise, let's call n the number of slices, and A_n the total area of n slices. We start with 1 slice, then 2 slices, 3 slices, etc. up to infinity. Thus we get a sequence (A_n) = {A_1, A_2, ..., A_n}. If A is the area under the curve, then this sequence approaches A when n approaches infinity. So we define A to be the limit of the area sequence (A_n):

    A := \lim_{n\to\infty} A_n        (4.3.2)

What we need to do now is to compute A_n. Luckily that's simple, and it should be, because it is our choice how to make this chop! For simplicity, assume that the rectangles have the same base Δx = (b - a)/n (Fig. 4.22). That is, we place n + 1 equally spaced points x_0, x_1, ..., x_n over the interval [a, b]; we then have n sub-intervals [x_i, x_{i+1}]. Actually we have two ways to build the slices: one way is to use the left point x_i of [x_i, x_{i+1}] (similar to an inscribed polygon in a circle); the second way is to use the right point x_{i+1} (similar to a circumscribed polygon). (Note that i here is not the imaginary unit with i^2 = -1.) The area A is now written as:

    A := \lim_{n\to\infty} \sum_{i=0}^{n-1} \Delta x\, f(x_i) := \lim_{n\to\infty} \sum_{i=1}^{n} \Delta x\, f(x_i) = \int_a^b f(x)\,dx = \int_a^b dA        (4.3.3)
The notation ∫ was introduced by Gottfried Wilhelm Leibniz to represent an elongated S (for sum); he first used the symbol omn, short for omnia, which is Latin for sum. (Newton, on the other hand, did not care about notation and never had a systematic notation for the integral.) The function f(x) under the integral sign is called the integrand. The points a and b are called


[Figure 4.21: Approximating the area under the curve y = f(x) by many thin rectangles: (a) 10 slices, (b) 15 slices, (c) 20 slices, (d) 100 slices.]

the limits of integration, and [a, b] is called the interval of integration. The modern notation for the definite integral, with limits above and below the integral sign (a and b), was first used by Joseph Fourier in the Mémoires of the French Academy around 1819-20. The sum in Eq. (4.3.3) is called a Riemann sum, named after the nineteenth-century German mathematician Bernhard Riemann.

4.3.3 Calculation of integrals using the definition


Let's start by calculating the area under the parabola y = x^2 with a = 0 (the easiest area problem) and see if we get the same result that Archimedes got a long time ago. The steps are (with a = 0,

[Figure 4.22: The area of the region bounded by the curve y = f(x), the line y = 0 and the lines x = a and x = b, computed by chopping that region into an infinite number of thin slices. The interval [a, b] is divided into n sub-intervals [x_i, x_{i+1}] where x_i = a + i(b - a)/n, i = 0, 1, ..., n. We can use either the left point (left figure) or the right point of the sub-intervals to define the height of a slice. And of course, we can also pick any point inside a sub-interval to define the slice height.]

Δx = b/n, x_i = ib/n):

    \int_0^b x^2 dx = \lim_{n\to\infty} \sum_{i=1}^{n} \left(\frac{ib}{n}\right)^2 \frac{b}{n}        (definition)
                    = b^3 \lim_{n\to\infty} \frac{1}{n^3} \sum_{i=1}^{n} i^2                           (algebra)
                    = b^3 \lim_{n\to\infty} \frac{n(n+1)(2n+1)}{6n^3}                                  (\sum_{i=1}^{n} i^2 from Eq. (2.6.13))
                    = b^3 \lim_{n\to\infty} \left(\frac{1}{3} + \frac{1}{2n} + \frac{1}{6n^2}\right) = \frac{b^3}{3}
The terms 1/(2n) and 1/(6n^2) vanish when n approaches infinity; they are infinitely small. The expression before taking the limit is quite messy (many terms), but in the limit a simple result, b^3/3, is obtained. This is similar to how ancient mathematicians found the area of the circle (Fig. 3.33). By the way, these vanishing terms account for the small triangles above the curve (the right figure in Fig. 4.22). If b = 1, the area is 1/3, which agrees with Archimedes' finding.
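
The same computation can also be done numerically instead of symbolically; here is a sketch of the Riemann sum with right endpoints for ∫_0^1 x^2 dx:

    def riemann_sum(f, a, b, n):
        """Right-endpoint Riemann sum with n equal slices."""
        dx = (b - a) / n
        return sum(f(a + i*dx) * dx for i in range(1, n + 1))

    for n in (10, 100, 1000, 10000):
        print(n, riemann_sum(lambda x: x**2, 0.0, 1.0, n))   # approaches 1/3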
Let's do another integral, for y = x^3, and hope that we can see a pattern for y = x^p with p a positive integer (because we do not want to repeat this 'boring' procedure for y = x^4, y = x^5, etc.; mathematics would be less interesting then):

    \int_0^b x^3 dx = \lim_{n\to\infty} \sum_{i=1}^{n} \left(\frac{ib}{n}\right)^3 \frac{b}{n} = \lim_{n\to\infty} \frac{b^4}{n^4} \sum_{i=1}^{n} i^3 = \lim_{n\to\infty} \frac{b^4}{4}\,\frac{n^4 + 2n^3 + n^2}{n^4} = \frac{b^4}{4}        (4.3.4)

and we have used Eq. (2.6.12) to compute \sum_{i=1}^{n} i^3. We are seeing a pattern here, and thus for any positive integer p we have the following results

    \int_0^b x^p dx = \frac{b^{1+p}}{1+p}  \;\Longrightarrow\;  \int_0^a x^p dx = \frac{a^{1+p}}{1+p}  \;\Longrightarrow\;  \int_a^b x^p dx = \frac{b^{1+p}}{1+p} - \frac{a^{1+p}}{1+p}        (4.3.5)


which is based on Eq. (2.6.14).


The next step is to do the integration for y = x^{1/m}. As we know the integral of y = x^m (the inverse of y = x^{1/m}), and the sum of the two areas is known (see Fig. 4.12), it is possible to compute \int_0^b x^{1/m} dx. That sum is b \cdot b^m = b^{1+m}, one area is \int_0^b x^m dx = b^{1+m}/(1+m), and the other area is \int_0^{b^m} v^{1/m} dv. So we have

    b^{1+m} = \int_0^b x^m dx + \int_0^{b^m} v^{1/m} dv  \;\Longrightarrow\;  \int_0^{b^m} v^{1/m} dv = \frac{m}{1+m} b^{1+m}

So, replacing b by b^{1/m}, we are able to compute \int_0^b x^{1/m} dx:

    \int_0^b x^{1/m} dx = \frac{m}{1+m} b^{1/m+1},  \quad  (m \neq -1)        (4.3.6)

One way to be sure is to use m = 2, b = 1 and Fig. 4.12 to check the new result. Obviously the integral of the hyperbola y = 1/x (m = -1) cannot be computed using Eq. (4.3.6), as it involves the nonsensical 1/0.

4.3.4 Rules of integration


The following rules of integration follow from the definition. Alternatively, they can be verified geometrically, as shown in Fig. 4.23. They are:

    \int_a^b f(x)dx = \int_a^c f(x)dx + \int_c^b f(x)dx
    \int_a^a f(x)dx = 0  \;\Longrightarrow\;  \int_c^a f(x)dx = -\int_a^c f(x)dx
    \int_a^b [\alpha f(x) + \beta g(x)]dx = \alpha \int_a^b f(x)dx + \beta \int_a^b g(x)dx        (4.3.7)
    \int_a^b f(x)dx > 0  \quad\text{if } f(x) > 0 \text{ for all } x \in [a, b]

The first rule means that we can split the integration interval into sub-intervals, do the integration over the sub-intervals and sum the results. The second rule indicates that if we reverse the integration limits, the sign of the integral changes. The third rule is actually a combination of two rules: \int_a^b \alpha f(x)dx = \alpha \int_a^b f(x)dx and \int_a^b [f(x) + g(x)]dx = \int_a^b f(x)dx + \int_a^b g(x)dx. The fourth means that if the integrand is positive within an interval, then the integral over this interval is positive.
Another rule (or property) of integrals is the following

    \text{if } h(x) \le f(x) \le g(x) \; (a \le x \le b)  \;\Longrightarrow\;  \int_a^b h(x)dx \le \int_a^b f(x)dx \le \int_a^b g(x)dx        (4.3.8)

One application of Eq. (4.3.8) is to prove sin x ≤ x (for x ≥ 0):

    \cos t \le 1  \;\Longrightarrow\;  \int_0^x \cos t\, dt \le \int_0^x dt  \;\Longrightarrow\;  \sin x \le x        (4.3.9)


[Figure 4.23: Some properties of the integral ∫_a^b f(x)dx: splitting the interval at c (left) and positivity of the integral of a positive integrand (right).]

4.3.5 Indefinite integrals


When the limits of an integral are not fixed, we have an indefinite integral, which is a function. Usually, we assume the lower limit is fixed and the upper limit is a variable:

    F(x) = \int_a^x f(u)\,du = \int_a^x f(t)\,dt        (4.3.10)

I have used the two notations f(u)du and f(t)dt to illustrate that u or t can be thought of as dummy variables; any variable (other than x) can be used.
That's all we can do with integral calculus, for now. We are not even able to compute the area of a circle using the integral! We need the other part of calculus, differential calculus, which is the topic of the next section.

4.4 Differential calculus


While integration was an idea from antiquity, the idea of the derivative is relatively new. This section presents the basic ideas of differential calculus. I first present, in Section 4.4.1, how Fermat solved a maxima problem using the idea behind the concept of derivative that we know today. As Fermat (and all the mathematicians of his time) did not know the concept of limit, which is the foundation of calculus, his maths was not rigorous, but it worked in the sense that it provided correct results. The motivation for including Fermat's work is to show that mathematics was not developed as it is now presented in textbooks, where everything works nicely. Far from it: there were setbacks, doubts, criticisms and so on. Then, in Section 4.4.3, we talk about uniform and non-uniform speeds as a motivation for the concept of derivative introduced in Section 4.4.4. As we have already met the limit concept in Section 2.22, I immediately use limits to define the derivative of a function. But I will postpone a detailed discussion of what a limit is until Section 4.10, to show that, without limits, 17th-century mathematicians with their intuition could proceed without rigor. This style of presentation will, hopefully, comfort many students. It took hundreds of years for the best mathematicians to develop the calculus that we know today. It is OK for us to be confused, to make mistakes and to have low grades.


4.4.1 Maxima of Fermat


As an elementary example of a maxima problem that Pierre de Fermat solved back in the 17th century, we consider this problem: prove that, of all rectangles with a given perimeter, it is the square that has the largest area. This problem belongs to the so-called optimization problems. And Fermat's solution, discussed below, presented the first steps in the development of the concept of a derivative.
Let's denote the perimeter by a and one side of the rectangle by x; the other side is hence y = a/2 - x. Therefore the area, which is xy, is

    M(x) = \frac{ax}{2} - x^2        (4.4.1)

Before presenting Fermat's solution, let's solve it the easy way: M(x) is a concave parabola with an inverted bowl shape, thus it has a highest point. We can rewrite M in the following form

    M = \left(\frac{a}{4}\right)^2 - \left(x - \frac{a}{4}\right)^2  \;\Longrightarrow\;  M \le \left(\frac{a}{4}\right)^2        (4.4.2)

thus M is maximum when the squared term vanishes, that is when x = a/4. Then y = a/4, and a square has the largest area among all rectangles with a given perimeter. One thing to notice here is that this algebraic way works only for this particular problem. We need something more powerful which is, hopefully, applicable to all optimization problems, not just Eq. (4.4.1).
Fermat's reasoning was this: if x is the value that renders M maximum, then adding a small number ε to x would not change M (why this is so is explained in the note below). This gives us the equation M(x + ε) = M(x), and with Eq. (4.4.1) we get:

    \frac{a(x + \epsilon)}{2} - (x + \epsilon)^2 = \frac{ax}{2} - x^2        (4.4.3)

which leads to another equation, obtained by dividing the above equation by ε (this can be done because ε ≠ 0):

    \frac{a}{2} - 2x - \epsilon = 0        (4.4.4)

Then he used ε = 0 to get x:

    \frac{a}{2} - 2x = 0  \;\Longrightarrow\;  x = \frac{a}{4}        (4.4.5)

To someone who knows calculus, it is easy to recognize that Eq. (4.4.5) is exactly M'(x) = 0 in our modern notation, where M'(x) is the first derivative of M(x). Thus Fermat was very close to the discovery of the derivative concept.
(Why would adding a small ε not change M? Imagine you are climbing up a hill. When you are not at the top, each move changes your altitude. But when you are already at the top, a move will not change your altitude. Actually, it changes, but only insignificantly, assuming that your step is not giant.)
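
Fermat's recipe (form M(x + ε) - M(x), divide by ε, then set ε = 0) is mechanical enough to replay symbolically; here is a sympy sketch, assuming sympy is available:

    import sympy as sp

    x, eps, a = sp.symbols('x epsilon a')
    M = a*x/2 - x**2                                        # the area, Eq. (4.4.1)

    quotient = sp.expand((M.subs(x, x + eps) - M) / eps)    # divide by eps (eps != 0)
    print(quotient)                                         # a/2 - 2*x - epsilon, cf. Eq. (4.4.4)
    print(sp.solve(sp.Eq(quotient.subs(eps, 0), 0), x))     # [a/4], as Fermat found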


It is important to clearly understand what Fermat did in the above process. First, he introduced a quantity ε which is initially non-zero. Second, he manipulated this ε as if it were an ordinary number. Finally, he set it to zero. So this ε is something and nothing simultaneously! Newton's and Leibniz's derivative, also based on similar procedures, thus lacked a rigorous foundation for 150 years, until Cauchy and Weierstrass introduced the concept of limit (Section 4.10). But Fermat's solution is correct!

Isoperimetric problems. We have seen that, among rectangles with a given perimeter, the square is the one with the largest area. A general version of this so-called isoperimetric problem is: if you have a piece of rope with a fixed length, what shape should you make with it in order to enclose the largest possible area? Here we are trying to choose a function f(x) to maximize an integral giving the area enclosed by f(x), given the constraint that the length P of the curve is fixed. We used the derivative of a function to find the solution to the rectangle problem. But it is not useful for this general isoperimetric problem, as we do not know the function f(x). Solving it requires a new kind of mathematics known as variational calculus, developed in the 17th and 18th centuries by the likes of the Bernoulli brothers, Euler and Lagrange. See Chapter 10 for details on variational calculus.
As we're on optimization problems, let me introduce another optimization problem in the next section. This is to demonstrate that optimization problems are everywhere. We shall see that not only do we try to optimize things (maximize income, minimize cost and so on), but so does nature.

4.4.2 Heron’s shortest distance


One of the first non-trivial optimization problems was solved by Heron of Alexandria, who lived about 10-75 CE. Heron's 'shortest distance' problem is as follows. Given two points A and B on one side of a straight line, find the point C on the line such that |AC| + |CB| is as small as possible, where |AC| is the distance between A and C.

[Figure 4.24: Heron's shortest distance problem: we need to find the best location for point C. With AC = √(a^2 + x^2) and CB = √(b^2 + (c - x)^2), the total distance is f(x) = √(a^2 + x^2) + √(b^2 + (c - x)^2), and f'(x_0) = 0 gives x_0 and hence C. Note that a, b, c are constants; only x, measuring the horizontal distance from A to C, changes.]


Before presenting Heron's smart solution, assume that we know calculus; then this problem is simply an exercise in differential calculus. We express the distance |AC| + |CB| as a function of x (the position of point C that we're after), then calculate f'(x) and set it to zero. That's it. The derivative of the distance function is (Fig. 4.24)

    f'(x) = \frac{x}{\sqrt{a^2 + x^2}} - \frac{c - x}{\sqrt{b^2 + (c - x)^2}}

Thus, setting the derivative to zero gives

    f'(x) = 0  \;\Longrightarrow\;  xb = (c - x)a  \;\Longrightarrow\;  \frac{a}{x} = \frac{b}{c - x}        (4.4.6)

What is this result saying? It is exactly the law of light reflection, if we see the problem as light moving from A, reflecting off a surface at C and traveling to B: the angle of incidence (i.e., the angle whose tangent is a/x) equals the angle of reflection, which was discovered by Euclid some 300 years earlier.
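
Eq. (4.4.6) is easy to confirm numerically; a brute-force sketch with arbitrarily chosen a = 1, b = 2, c = 4 (the minimizer should be x = ac/(a + b) = 4/3):

    import math

    a, b, c = 1.0, 2.0, 4.0
    f = lambda x: math.sqrt(a**2 + x**2) + math.sqrt(b**2 + (c - x)**2)

    xs = [i*c/100000 for i in range(100001)]   # brute-force search on [0, c]
    x0 = min(xs, key=f)
    print(x0, a/x0, b/(c - x0))                # x0 ~ 1.3333 and a/x0 ~ b/(c - x0)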
So what exactly did Heron achieve? He basically demonstrated that reflected light takes the shortest path, or the shortest time, assuming light has a finite speed. Why is this a significant achievement? Because this was the first evidence showing that our universe is lazy. When it does something, it always selects its way so that a certain quantity (e.g. time, distance, energy, action) is minimum. Think about it for a while and you will be fascinated by this idea. As human beings, we do something in many ways and from these trials we select the best (optimal) way. Does nature do the same thing? It seems not. Then why does it know how to select the best way? To learn more about this topic, I recommend the book The Lazy Universe by Coopersmith [14].

Heron’s proof of the shortest distance problem. Referring to Fig. 4.25, Heron created a new
point B 0 which is the reflection of point B through the horizontal line. Then, the solution–point
C –is the intersection of line AB 0 and the horizontal line. An elegant solution, no question. But
it lacks generality. On the other hand, the calculus-based solution is universally applicable to
almost any optimization problem and it does not require the user to be a genius. With calculus,
things become routine.
But wait, how did Heron know to create point B’? Inspiration, experience, trial and error,
dumb luck. That’s the art of mathematics, creating these beautiful little poems of thought, the
sonnets of pure reason.

Algebra vs geometry. This problem illustrates the differences between algebra and geometry.
Geometry is intuitive and visual. It appeals to the right side of the brain. With geometry,
beginning an argument requires strokes of genius (like drawing the point B’). On the other
hand, algebra is mechanical and systematic. Algebra is left-brained.

Proof of the reflection property of the ellipse. The reflective property of an ellipse is simply this: a ray of light starting at one focus will bounce off the ellipse and go through the other focus. Referring


A
B

a AC C CB D AB 0
b AC 0 C C 0 B D AC 0 C C 0 B 0
C ˛ AC 0 C C 0 B 0 > AB 0
˛
C0

B0

Figure 4.25: Heron’s genius solution. It uses one property of triangle: the sum of two edges is larger than
the remaining one. The crux of the proof is to create point B 0 .

to Fig. 4.26, we need to prove that a light ray starting from F_1 and coming to P bounces off the ellipse E and gets reflected to F_2. Using the result of Heron's shortest distance property, what we have to prove is: F_1P and F_2P make equal angles with the tangent line to E at P, i.e., θ_2 = θ_3.
Extend F_1P past P to F_2' such that F_2P = PF_2'. Let L be the line that bisects the angle F_2'PF_2, so that θ_1 = θ_2. Thus L is the perpendicular bisector of F_2F_2'. What we need to prove now is that L is the tangent to E at P. Why? Because then we have θ_1 = θ_3 (vertical angles), but θ_1 = θ_2 (as L is the bisector), so θ_2 = θ_3.

[Figure 4.26: Proof of the reflection property of the ellipse (PF_1 + PF_2 = 2a): θ_2 = θ_3.]

Proof of the fact that L is tangent to E at P. We use proof by contradiction. Suppose that L is not tangent to E at P. Then L meets E at another point Q. As Q is on L, we have QF_2 = QF_2'. By the definition of the ellipse, we have F_1Q + F_2Q = F_1P + F_2P. Consider now the triangle F_1QF_2'; we have

    F_1Q + F_2'Q = F_1Q + QF_2 = F_1P + F_2P = F_1P + PF_2' = F_1F_2'

which violates the triangle inequality in the triangle F_1QF_2'. So L must be the tangent to E at P. ∎


4.4.3 Uniform vs non-uniform speed


To understand the concept of derivative, one can either consider the problem of finding a tangent to a curve at a certain point on the curve, or the problem of determining the velocity of an object at a certain moment in time if it is moving with a non-uniform velocity. I take the second approach, as there is change inherently in this problem. This was also how Newton developed his fluxions. (The modern definition of a function had not yet been created when Newton developed his calculus. The context for Newton's calculus was a particle 'flowing', or tracing out, a curve in the x-y plane. The x and y coordinates of the moving particle are fluents or flowing quantities. The horizontal and vertical velocities are the fluxions, which we call derivatives, of x and y, respectively, associated with the flux of time.) Note that Newton was not only a mathematician, but also a physicist.
Let's start simple, with a car moving at a constant speed. If it has gone 30 kilometers in 1 hour, we say that its speed is 30 kilometers per hour. To measure this speed, we divide the distance the car has traveled by the elapsed time. If s measures the distance and t measures time, then

    \text{uniform speed} = \frac{\text{distance}}{\text{time interval}} = \frac{\Delta s}{\Delta t}        (4.4.7)

The ratio Δs/Δt is called a time rate of change of position, i.e., change of position per unit time. Sometimes it is simply referred to as the rate of change of position. Note that Δ does not stand for any number. Δ stands for 'the change in', that and nothing else. Thus, Δs (read 'delta s') is used to indicate a change in s and Δt (read 'delta t') is used to indicate a change in t.
But life would be boring if everything is moving at constant speed. Then, one would need no
differential calculus. Luckily, non-uniform motions are ubiquitous. Kepler discovered that the
planets moved non-uniformly around their ellipses with the Sun as focus, sometimes hesitating
far from the Sun, sometimes accelerating near the Sun. Likewise, Galileo’s projectiles moved at
ever-changing speeds on their parabolic arcs. They slowed down as they climbed, paused at the
top, then sped up as they fell back to earth. The same was true for pendulums. And a car which
travels 30 miles in an hour does not travel at a speed of 30 miles an hour. If its owner lives in a
big town, the car travels slowly while it is getting out of the town, and makes up for it by doing
50 on the arterial road in the country.
How could one quantify motions in which speed changes from moment to moment? That was the task that Newton set for himself, and to answer that question he invented calculus. We are trying here to reproduce his work. We use Galileo's experiment of a ball rolling down an inclined plane (Table 4.2, generated from s = t^2) and set out to find the ball's speed at any time instant; the notation for that is v(t), where v is for velocity.

Table 4.2: Galileo's experiment of a ball rolling down an inclined plane.

    time [second]     0   1   2   3   4    5    6
    distance [feet]   0   1   4   9   16   25   36

Let us first try to find out how fast the ball is going after one second. First of all, it is easy to


see that the ball continually goes faster and faster. In the first second it goes only 1 foot; in the next second 3 feet; in the third second 5 feet, and so on. As the average speed during the first second is 1 foot per second, the speed of the ball at 1 second must be larger than that. Similarly, the average speed during the second second is 3 feet per second, thus the speed of the ball at 1 second must be smaller than that. So we know 1 < v(1) < 3.
Can we do better? Yes, if we have a table similar to Table 4.2 but with many more data points, not just at whole seconds. For example, if we consider 0.9 s, 1 s and 1.1 s (Table 4.3), we can get 1.9 < v(1) < 2.1. And if we consider 0.99 s, 1 s and 1.01 s, we get 1.99 < v(1) < 2.01. And if we take thousandths of a second, we find the speed lies between 1.999 and 2.001. And if we keep refining the time interval, we find that the only speed satisfying all of this is 2 feet per second. Doing the same thing, we find the speeds at the whole seconds in Table 4.4. If s = t^2, then v = 2t.

Table 4.3: Galileo's experiment of a ball rolling down an inclined plane with time increments of 0.1 s.

    time [second]     0.9    1   1.1
    distance [feet]   0.81   1   1.21

Table 4.4: Galileo's experiment of a ball rolling down an inclined plane: instantaneous speed.

    time [second]    0   1   2   3   4   5    6
    speed [feet/s]   0   2   4   6   8   10   12

So the speed at any moment will not differ very much from the average speed during the previous tenth of a second. It will differ even less from the average speed for the previous thousandth of a second. In other words, if we take the average speed for smaller and smaller lengths of time, we shall get nearer and nearer, as near as we like, to the true speed. Therefore, the instantaneous speed, i.e., the speed at a time instant, is defined as the value that the sequence of average speeds approaches when the time interval approaches zero. We show this sequence of average speeds in Table 4.5 for the time instant t_0 = 2 s. Note that this table presents not only the average speeds between the time instants t_0 + h and t_0, but also those between t_0 - h and t_0. Both sequences converge to the same speed of 4, which is physically reasonable. Later on, we will learn that these correspond to the right and left limits.
But saying 'the value that the sequence of average speeds approaches when the time interval approaches zero' is verbose; we have a symbol for that, discussed in Section 2.22. Yes, that value (i.e., the instantaneous speed) is the limit of the average speeds when the time interval approaches zero. Thus, the instantaneous speed is defined succinctly as

    \text{instantaneous speed } s'(t) \text{ or } \dot{s} \equiv \lim_{\Delta t \to 0} \frac{\Delta s}{\Delta t}        (4.4.8)


Table 4.5: Limit of average speeds when the time interval h is shrunk to zero.

    h        [(t_0+h)^2 - t_0^2]/h    [(t_0-h)^2 - t_0^2]/(-h)
    10^-1    4.100000000000           3.900000000000
    10^-2    4.010000000000           3.990000000000
    10^-3    4.001000000000           3.998999999999
    10^-4    4.000100000008           3.999900000000
    10^-5    4.000010000027           3.999990000025
    10^-6    4.000001000648           3.999998999582

where, we recall, the notation Δs is used to indicate a change in s; here it indicates the distance traveled during Δt. And we use the symbol s'(t) to denote this instantaneous speed and call it the derivative of s(t). Newton's notation for this derivative is ṡ, and it is still being used, especially in physics. This instantaneous speed is the number that the speedometer of your car measures.
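
Table 4.5 can be regenerated with a few lines of Python (a sketch for s(t) = t^2 at t_0 = 2):

    s = lambda t: t**2
    t0 = 2.0

    for k in range(1, 7):
        h = 10.0**(-k)
        forward  = (s(t0 + h) - s(t0)) / h      # average speed over [t0, t0 + h]
        backward = (s(t0 - h) - s(t0)) / (-h)   # average speed over [t0 - h, t0]
        print(f"{h:.0e}  {forward:.12f}  {backward:.12f}")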

4.4.4 The derivative of a function


Leaving behind distances and speeds, if we have a function f(x), then its derivative at a point x_0, denoted by f'(x_0), is defined as

    f'(x_0) = \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} = \lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x}        (4.4.9)

In words, the derivative f'(x_0) is the limit of the ratio of the change of f (denoted by Δf) to the change of x (denoted by Δx) as Δx approaches zero. The term Δf/Δx is called a difference quotient.
Instead of focusing on a specific value x_0, we can determine the derivative of f(x) at an arbitrary point x, which is denoted by f'(x). For each x we have a corresponding number f'(x); thus f'(x) is a function in itself. Often we use h in place of Δx because it is shorter. Thus, the derivative is also written as

    f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}

Notations for the derivative. There are many notations for the derivative: (1) Newton's notation ḟ, (2) Leibniz's notation dy/dx, and (3) Lagrange's notation f'(x). Let's discuss Lagrange's notation first, as it is the easiest. Note that given a function y = f(x), its derivative is also a function, which Lagrange called a derived function of f(x). That's the origin of the name 'derivative' we use today. Lagrange's notation is short, and thus very convenient.


How about Leibniz's notation? I emphasize that when Leibniz developed the concept of derivative, the concept of limit was not available (it only came to life about 200 years after Newton and Leibniz!). Leibniz was clear that the derivative was obtained when Δf and Δx were very small, thus he used df and dx, which he called infinitesimals (infinitely small quantities) or differentials. An infinitesimal is a hazy thing. It is supposed to be the tiniest number we can possibly imagine that isn't actually zero. In other words, an infinitesimal is smaller than everything but greater than nothing (0). On the other hand, the notation dy/dx has these advantages: (i) it reminds us that the derivative is the rate of change Δy/Δx when Δx → 0 (the d's remind us of the limit process), (ii) it reveals the unit of the derivative immediately, as it is written as a ratio, while f'(x) does not. But the major advantage is that we can use the differentials dy and dx separately and perform algebraic operations on them just like ordinary numbers.

4.4.5 Infinitesimals and differentials


To understand Leibniz's infinitesimals, surprisingly, a simple question is a big help. We all know that 2^3 = 8, but what is 2.001^3? We're not interested in the final result, but in the structure of the result. Using a calculator, we get

    2.001^3 = 8.012006001

So, it is 8 plus a bit. That makes sense: a tiny change from 2 to 2.001 results in a tiny change from 8 to 8.012006001 (a change of 0.012006001). What is interesting is that we can decompose this change into a sum of three parts, as follows

    0.012006001 = 0.012 + 0.000006 + 0.000000001

which is a small part plus a super-small part plus a super-super-small part.


We can use algebra to understand this structure of the result. Let's consider x_0 (instead of 2 as we did previously) and a change Δx, and ask what (x_0 + Δx)^3 is. It is given by (we can multiply it out, or use Pascal's triangle if we're lazy):

    (x_0 + \Delta x)^3 = x_0^3 + 3x_0^2 \Delta x + 3x_0 (\Delta x)^2 + (\Delta x)^3

Thus the change (x_0 + Δx)^3 - x_0^3 is:

    (x_0 + \Delta x)^3 - x_0^3 = 3x_0^2 \Delta x + 3x_0 (\Delta x)^2 + (\Delta x)^3

And putting x_0 = 2 into the above equation, we have

    (2 + \Delta x)^3 - 2^3 = 12\Delta x + 6(\Delta x)^2 + (\Delta x)^3        (4.4.10)

Now we can see why the change consists of three parts of different sizes. The small but dominant part is 12Δx = 12(.001) = .012. The remaining parts 6(Δx)^2 and (Δx)^3 account for the super-small .000006 and the super-super-small .000000001. The more factors of Δx there are in a part,


the smaller it is. That's why the parts are graded in size. Every additional multiplication by the tiny factor Δx makes a small part even smaller.
Now comes the power of Leibniz's notation dx and dy. In Eq. (4.4.10), if we replace Δx by dx and call dy the change due to dx, and of course neglect the super-small and super-super-small parts (i.e. (dx)^2 and (dx)^3), then we have a nice formula:

    dy := (2 + dx)^3 - 2^3 = 12\,dx        (4.4.11)

which allows us to write

    \frac{dy}{dx} = 12,  \quad\text{which is nothing but the derivative of } x^3 \text{ at } x = 2

For Leibniz, dx and dy exist (though sometimes he doubted their existence); in other words, to him they are fundamental mathematical objects and the derivative just happens to be their ratio. Many mathematicians object to this way of doing calculus. But I do not care. Euler, the Bernoullis and many other mathematicians used these dx and dy successfully. If you want rigor, that is fine: then use limits.

Differential operator. Yet another notation for the derivative of y = f(x) at x_0 is:

    \frac{d}{dx} f(x)\Big|_{x = x_0} = \lim_{h\to 0} \frac{f(x_0 + h) - f(x_0)}{h}

This notation adopts the so-called differential operator d/dx. What is an operator? Think of the square root of a number: feed in a number x, and the square-root operator √ gives another number, √x. Similarly, feed in a function f(x), and the operator d/dx gives another function, the derivative f'(x). For the time being, just think of this operator as another notation that works better aesthetically (not objectively) for functions whose expression is lengthy. Compare the following two notations and decide for yourself:

    \left(\frac{x^2 + 3x + 5}{\sqrt{x^3 - 3x + 1}}\right)'  \qquad\text{versus}\qquad  \frac{d}{dx}\left(\frac{x^2 + 3x + 5}{\sqrt{x^3 - 3x + 1}}\right)

Later on, we shall see that mathematicians consider this operator as a legitimate mathematical object and study its behavior. That is, they remove the functions from the picture and think of the differentiation process itself (differentiation is the process of finding the derivative).

Nonstandard analysis. The history of calculus is fraught with philosophical debates about the meaning and logical validity of fluxions and infinitesimals. The standard way to resolve these debates is to define the operations of calculus using the limit concept rather than infinitesimals, and that resulted in so-called real analysis. On the other hand, in 1960, Abraham Robinson (1918-1974)


developed nonstandard analysis, a mathematically rigorous system that reformulates the calculus using a logically rigorous notion of infinitesimal numbers, whereby infinitesimal and infinite numbers are reincorporated into modern mathematics. This is beyond the scope of the book and my capacity, as I cannot afford to learn another kind of number, known as the hyperreals (too many already!).

Surface of a ball. Equipped with infinitesimals, we're now able to determine the surface area of a sphere (of radius r). Thanks to Archimedes, we know that it is 4πr^2. But why? To derive this result, let's first go back to the circumference of a circle: we want to derive the formula for the circumference given the area formula. The idea is to consider an annulus of radii r and r + h, as shown in Fig. 4.27. It is obvious that the area of the annulus is the area of the bigger circle minus that of the smaller circle:

    A_{\text{annulus}} = \pi(r + h)^2 - \pi r^2 = \pi[(r + h)^2 - r^2] = \pi(2rh + h^2)

Now we consider a tiny h, in which case h^2 is negligible, so A_annulus = (2πr)h. But this thin annulus is nothing but a rectangle of length C (the circumference of our circle) and width h. Therefore, C = 2πr.
[Figure 4.27: Area of an annulus of radii r and r + h: interesting things appear when considering small h.]

We now play the same game for spheres: we consider two spheres, one of radius r and a bigger one of radius r + h, and compute the volume of the region between the two spheres:

    \Delta V = \frac{4}{3}\pi(r + h)^3 - \frac{4}{3}\pi r^3 = \frac{4}{3}\pi(3r^2 h + 3rh^2 + h^3)

When h is super tiny, we then have

    \Delta V = \frac{4}{3}\pi(3r^2 h) = (4\pi r^2) h

Therefore, the surface area of a sphere is 4πr^2. It is not 4.01πr^2; it is exactly 4πr^2. In other words, the surface area of a sphere is exactly four times the area of its great circle.

4.4.6 The geometric meaning of the derivative


We have used algebra to define the derivative of a function; let's now consider the geometric meaning of this important concept. To this end we use Descartes' analytic geometry, plotting the graph of the function y = f(x) in the Cartesian xy plane. We then consider a point P on the curve

with coordinates .x0 ; f .x0 //, cf. Fig. 4.28a. To have change, we consider another point Q on
the curve with coordinates .x0 C h; f .x0 C h//. Then we have the average rate of change of the
function at P , that is f =h where f D f .x0 C h/ f .x0 /. This rate of change is nothing but
the slope of the secant PQ (see the shaded triangle). Now, the process of considering smaller
and smaller h, to get the derivative, is amount to considering points Q0 ; Q00 which are closer and
closer to P (Fig. 4.28b). The secants PQ, PQ0 , PQ00 , ... approach the line PP 0 which touches
the curve y D f .x/ at P . PP 0 is the tangent to the curve at P . The average rate of change f =h
approaches df =dx–the derivative of f .x/ at x0 .
When h approaches 0, the secants approach the tangent and their slopes approach the deriva-
tive. Thus, the derivative of the function at P is the slope of the tangent to f .x/ at the same
point. That’s the geometric meaning of the derivative.

[Figure 4.28: The derivative of a function y = f(x) evaluated at x_0 is the slope of the tangent to the curve at (x_0, f(x_0)). The ratio Δf/h is the slope of the secant PQ. When h → 0, this ratio becomes the slope of the tangent (blue line) to the curve at P.]

Now we derive the equation of this tangent. It is the line going through the point $P(x_0, y_0)$ with a slope equal to $f'(x_0)$; thus the equation of the tangent is:

$$\text{tangent to } y = f(x) \text{ at } P(x_0, y_0):\quad y = f(x_0) + f'(x_0)(x - x_0) \tag{4.4.12}$$

And this leads to the so-called linear approximation to a function, discussed later in Section 4.5.3. The idea is to replace a curve–which is hard to work with–by its tangent (which is a line and easier to work with).
We now understand the concept of the derivative of a function, algebraically and geometrically. Now it is time to actually compute the derivatives of the functions that we know: polynomials, trigonometric, exponential etc.


4.4.7 Derivative of $f(x) = x^n$


We start simple first. Let's compute the derivative of $y = x^2$ at $x_0$, using the definition in Eq. (4.4.9):

$$f'(x_0) = \lim_{h\to 0}\frac{f(x_0+h)-f(x_0)}{h} = \lim_{h\to 0}\frac{(x_0+h)^2 - x_0^2}{h} \quad\text{(def.)}$$
$$= \lim_{h\to 0}\frac{2x_0 h + h^2}{h} \quad\text{(purely algebra)} \tag{4.4.13}$$
$$= \lim_{h\to 0}(2x_0 + h) \quad\text{(algebra, $h$ is not zero!)}$$
$$= 2x_0 \quad\text{(when $h$ is approaching 0)}$$
The algebra was simple but there are some points worthy of further discussion. First, if we used $h = 0$ in the difference quotient $(2x_0h + h^2)/h$ we would get the form $0/0$–which is mathematically meaningless. This is so because to get the derivative, which is a rate of change, we must at least allow $h$ to be different from zero (so that some change is happening). That's why the derivative was not defined as the difference quotient at $h = 0$. Instead, it is defined as the limit of this quotient as $h$ approaches zero. Think of the instantaneous speed (Table 4.5), and things become clear.
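As a quick numerical illustration (a minimal Python sketch of mine, not in the original text), we can watch the difference quotient of $x^2$ at $x_0 = 3$ approach $2x_0 = 6$ as $h$ shrinks:

```python
def diff_quotient(f, x0, h):
    """Average rate of change of f on [x0, x0 + h]."""
    return (f(x0 + h) - f(x0)) / h

f = lambda x: x**2
for h in [1.0, 0.1, 0.01, 0.001, 1e-6]:
    print(h, diff_quotient(f, 3.0, h))   # tends to 6 = 2*x0
```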
As always, it is good to have a geometric interpretation. What we are looking for is the change of $x^2$ when there is a tiny change in $x$. We immediately think of $x^2$ as the area of a square of side $x$ (Fig. 4.29). Then, a tiny change $dx$ leads to a change in area of $2x\,dx$, because the change $(dx)^2$ is so small that it can be neglected.
So, it's up to you whether you prefer the limit approach or the infinitesimal one. If you prefer rigor, then using limits is the way to go. But if you just do not care what infinitesimals mean (whether they exist, for example), then use $dx$ and $dy$ freely like Leibniz, Euler, and many seventeenth century mathematicians did. And the results are the same!

Figure 4.29: Geometric derivation of the derivative of $x^2$. The change $(dx)^2$ is super small compared with $2x\,dx$ and thus it will approach zero when $dx$ is approaching zero.

Similar steps give us the derivatives of $x^3$, $x^4$: $(x^3)' = 3x^2$, and $(x^4)' = 4x^3$. It is hard to resist writing this general formula for all positive integers $n$†:

$$(x^n)' = nx^{n-1} \tag{4.4.14}$$

How about the derivative when $n$ is negative? Let's start with $f(x) = x^{-1} = 1/x$. Using the definition, we can compute its derivative as

$$\left(\frac{1}{x}\right)' = \lim_{h\to 0}\frac{\dfrac{1}{x+h} - \dfrac{1}{x}}{h} = \lim_{h\to 0}\frac{-1}{x(x+h)} = -\frac{1}{x^2}$$
Let’s see whether we can have a geometry based derivation y
of this result. We plot the function 1=x and pick two points on y D 1=x
the curve close to each other: one point is C D .x; 1=x/ and the
other point is D D .x C dx; 1=.x C dx//. As the areas of the two H C
rectangles OACH and OBDG are equal (equal 1), the areas of dy  dxdy
the two rectangle strips (those are shaded) must be equal. We can D
G
compute these two areas and equate them, then we will obtain an x
1=.x C dx/
equation relating dx and dy; just note that HG D dy because
x
dy is negative. So we can write O A dx B

 
1 dy 1
. dy/.x/ D .dx/ H) x 2 dy D dx H) D
x C dx dx x2
In the algebra, we removed the term x.dy/.dx/ as super super small quantity in the same manner
discussed in Section 4.4.5ŽŽ . Geometrically, dydx is proportional to the the area of the gray
shaded region in the figure; obviously this area shrinks to zero when dx is tiny. What we just
got means that the formula in Eq. (4.4.14) (i.e., .x n /0 D nx n 1 ) still holds for negative powers
at least for n D 1. p
Now, we compute the derivative of the square root function, i.e., $\sqrt{x}$. We assume that (once again believing in mathematical patterns) Eq. (4.4.14) still applies for fractional exponents, so we write

$$(\sqrt{x})' = \left(x^{1/2}\right)' = \frac{1}{2}x^{-1/2} = \frac{1}{2\sqrt{x}} \tag{4.4.15}$$

Let's see if we can get the same result by geometry. As $\sqrt{x}$ is the inverse of $x^2$, we use the area concept. We consider a square of side $\sqrt{x}$; its area is thus $x$. We consider a change in the side, $d(\sqrt{x})$, and see how the square's area changes, see Fig. 4.30. The change in the area is $dx$, and it is written as

$$dx = 2\sqrt{x}\,d(\sqrt{x}) \;\Longrightarrow\; \frac{d(\sqrt{x})}{dx} = \frac{1}{2\sqrt{x}} \tag{4.4.16}$$
† If you need a proof so that you can have a good sleep at night, then follow Eq. (4.4.13) and use the binomial theorem.
‡ This is to demonstrate that we can use $dx$ and $dy$ as ordinary numbers. But keep in mind that all of this works because of properties of limits.


Figure 4.30: Geometric derivation of the derivative of $\sqrt{x}$.
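If you would rather let a computer algebra system confirm that the power rule survives negative and fractional exponents, here is a minimal SymPy sketch (my addition, not part of the original derivation):

```python
import sympy as sp

x, n = sp.symbols('x n', positive=True)

# the general power rule (x^n)' = n*x^(n-1), checked symbolically
print(sp.simplify(sp.diff(x**n, x) - n*x**(n - 1)))   # 0

# the two special cases derived geometrically above
print(sp.diff(1/x, x))             # -1/x**2
print(sp.diff(sp.sqrt(x), x))      # 1/(2*sqrt(x))
```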

4.4.8 Derivative of trigonometric functions


I present the derivatives of sine/cosine in this section. Once we know these derivatives, the derivatives of tangent and the other trigonometric functions are straightforward to compute, as those functions are built from sine/cosine. Let's start with a direct application of the definition of the derivative to $\sin x$:

$$(\sin x)' = \lim_{h\to 0}\frac{\sin(x+h) - \sin x}{h} = \lim_{h\to 0}\frac{\sin x\cos h + \sin h\cos x - \sin x}{h} = \sin x\lim_{h\to 0}\frac{\cos h - 1}{h} + \cos x\lim_{h\to 0}\frac{\sin h}{h}$$

We need the following limits (a proof of the first will be given shortly; for the second limit, check Eq. (3.11.3)):

$$\lim_{h\to 0}\frac{\cos h - 1}{h} = 0, \qquad \lim_{h\to 0}\frac{\sin h}{h} = 1 \tag{4.4.17}$$

which leads to

$$(\sin x)' = \cos x \tag{4.4.18}$$

We can do the same thing to get the derivative of cosine. But we can also use trigonometric identities and the chain rule (to be discussed next) to obtain the cosine derivative:

$$(\cos x)' = \frac{d}{dx}\left[\sin\left(\frac{\pi}{2} - x\right)\right] = -\cos\left(\frac{\pi}{2} - x\right) = -\sin x \tag{4.4.19}$$

A geometric derivation of the derivative of $\sin x$, shown in Fig. 4.31, is easier and does not require the two limits in Eq. (4.4.17).

Using the quotient rule, we can compute the derivative of $\tan x$:

$$(\tan x)' = \left(\frac{\sin x}{\cos x}\right)' = \frac{\cos^2 x + \sin^2 x}{\cos^2 x} = \frac{1}{\cos^2 x} \tag{4.4.20}$$


Figure 4.31: Geometric derivation of the derivative of the sine/cosine functions by considering a unit circle and a point $A$ with coordinates $(\cos x, \sin x)$. For a small variation in angle $dx$, we have $AC = dx$ and $AC \perp OA$ because $AC$ is tangential to the circle. Then, $d(\sin x) = AB = dx\cos x$ from the right triangle $ABC$. Note that angles are in radians. If that is not the case, $AC = \pi\,dx/180$, and the derivative of $\sin x$ would be $(\pi/180)\cos x$.

Proof. Herein, we prove that the limit of $(\cos h - 1)/h$ equals zero. The proof is based on the limit of $\sin h/h$ and a bit of algebra:

$$\lim_{h\to 0}\frac{\cos h - 1}{h} = \lim_{h\to 0}\frac{-\sin^2 h}{h(1+\cos h)} = -\left(\lim_{h\to 0}\frac{\sin h}{h}\right)\left(\lim_{h\to 0}\frac{\sin h}{1+\cos h}\right) = -1\cdot\frac{0}{2} = 0$$


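Numerically, the two limits in Eq. (4.4.17) and the resulting derivative are easy to check (a small Python sketch I added for illustration):

```python
import math

for h in [0.1, 0.01, 0.001, 1e-5]:
    print(h, (math.cos(h) - 1) / h, math.sin(h) / h)
# the first column tends to 0, the second to 1

# the difference quotient of sin at x = 0.7 tends to cos(0.7)
x, h = 0.7, 1e-6
print((math.sin(x + h) - math.sin(x)) / h, math.cos(x))
```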
4.4.9 Rules of derivative


Let’s summarize
p what we know about derivative. We know the derivatives of x n for positive
integers n, of x, of x 1 , of trigonometric functions, exponential functions ax or e x and loga-
rithm functions (to be discussed). What about the derivative of x 2 sin x, x 2 3=cos x ? For them, we
need to use the rules of differentiation. With these rules, only derivatives of elementary functions
are needed, derivative of other functions (inverse functions, composite functions) are calculated
using these rules. They are first summarized in what follows for easy reference (a; b 2 R)
Œa0 D0 (constant function rule)
Œaf .x/ C bg.x/0 0 0
D af .x/ C bg .x/ (sum rule)
Œf .x/g.x/0 0 0
D f .x/g.x/ C g .x/f .x/ (product rule)
 0
1 f0
D (reciprocal rule)
f .x/ f2
 
f .x/ 0 f 0 g fg 0
D (quotient rule)
g.x/ g2
f .g.x///0 D f 0 .g.x//g 0 .x/ (chain rule)


Among these rules the chain rule is the hardest (and is left to the next section); the other rules are quite easy. The function $y = a$ is called a constant function because $y = a$ for all $x$. Obviously we cannot have change with this boring function, thus its derivative is zero.
If we follow Eq. (4.4.13) we can see that the derivative of $3x^2$ is $3(2x)$. A bit of thinking will convince us that the derivative of $af(x)$ is $af'(x)$, which can be verified using the definition of the derivative, Eq. (4.4.9). Again, following the steps in Eq. (4.4.13), the derivative of $x^3 + x^4$ is $3x^2 + 4x^3$, and this leads to: the derivative of $f(x) + g(x)$ is $f'(x) + g'(x)$; the derivative of the sum of two functions is the sum of the derivatives. Now, $af(x)$ is a function and $bg(x)$ is a function, thus the derivative of $af(x) + bg(x)$ is $(af(x))' + (bg(x))'$, which is $af'(x) + bg'(x)$. And this is our first rule‡.
The sum rule says that the derivative of the sum of two functions is the sum of the derivatives. Thus Leibniz believed that the derivative of the product of two functions is the product of the derivatives. It took him no time (with an easy example, say $x^3(2x+3)$) to figure out that his guess was wrong, and eventually he came up with the correct rule. The proof of the product rule is given in the figure beside the text. The idea is to consider a rectangle of sides $f$ and $g$ with an area of $fg$. (Thus, implicitly, this proof applies to positive functions only.) Now assume that we have an infinitesimal change $dx$, which results in a change in $f$, denoted by $df = f'(x)dx$, and a change in $g$, denoted by $dg = g'(x)dx$. We need to compute the change in the area of this rectangle. It is $g\,df + f\,dg + df\,dg$, which is $g\,df + f\,dg$ as $(df)(dg)$ is minuscule. Thus the change in the area, which is the change in $fg$, is $[g f'(x) + f g'(x)]dx$. That concludes our geometric proof.
The proof of the reciprocal rule starts with the function $f(x)\cdot\dfrac{1}{f(x)} = 1$. Applying the product rule to this constant function, we get

$$0 = f'(x)\frac{1}{f(x)} + f(x)\frac{d}{dx}\left(\frac{1}{f(x)}\right) \;\Longrightarrow\; \frac{d}{dx}\left(\frac{1}{f(x)}\right) = -\frac{f'(x)}{f^2(x)}$$

The quotient rule is obtained from the product rule and the reciprocal rule as shown in
Eq. (4.4.21)
   
d f d 1
D f
dx g dx g
 
df 1 d 1
D Cf (4.4.21)
dx g dx g
 
df 1 dg=dx f 0 g fg 0
D f D
dx g g2 g2
‡ Note that this rule covers many special cases. For example, taking $a = 1$, $b = -1$, we have $[f(x) - g(x)]' = f'(x) - g'(x)$. Again, subtraction is secondary for we can deal with it via addition. Furthermore, even though our rule is stated for two functions only, it can be extended to any number of functions. For instance, $[f(x) + g(x) + h(x)]' = f'(x) + g'(x) + h'(x)$; this is so because we can see $f(x) + g(x)$ as a new function $w(x)$, and we can use the sum rule for the two functions $w(x)$ and $h(x)$.
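These rules are easy to confirm with a computer algebra system; the following SymPy snippet (my own sketch, not from the text) checks the product and quotient rules on arbitrary symbolic functions:

```python
import sympy as sp

x = sp.symbols('x')
f = sp.Function('f')(x)
g = sp.Function('g')(x)

product_rule  = sp.diff(f*g, x) - (sp.diff(f, x)*g + sp.diff(g, x)*f)
quotient_rule = sp.diff(f/g, x) - (sp.diff(f, x)*g - f*sp.diff(g, x))/g**2

print(sp.simplify(product_rule))    # 0
print(sp.simplify(quotient_rule))   # 0
```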


4.4.10 The chain rule: derivative of composite functions


The chain rule is for the derivative of composite functions. For example, what is the derivative of $f(x) = \sin x^2$? Or, generally, of $f(g(x))$? In the case of $f = \sin x^2$, we have $f = \sin(y)$ and $y = x^2$. We know the derivative of $f$ w.r.t. $y$ and the derivative of $y$ w.r.t. $x$. But the question is the derivative of $f$ w.r.t. $x$. Using Leibniz's $dy$ and $dx$, it is easy to derive the rule:

$$\frac{\Delta f}{\Delta x} = \frac{\Delta f}{\Delta y}\frac{\Delta y}{\Delta x} \;\Longrightarrow\; \frac{df}{dx} = \frac{df}{dy}\frac{dy}{dx} \tag{4.4.22}$$

which means that the derivative of $f$ w.r.t. $x$ is equal to the derivative of $f$ w.r.t. $y$ multiplied by the derivative of $y$ w.r.t. $x$.
Thus, for $f = \sin x^2$, its derivative is $(\cos x^2)\,2x$. We should at least check Eq. (4.4.22). For example, consider $y = x^6$; then $y' = 6x^5$. Now, we write $y = x^6 = (x^2)^3$; the chain rule then yields $y' = (3x^4)(2x) = 6x^5$. Not a proof, but for me it is enough.
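A quick way to build confidence in Eq. (4.4.22) is to compare it against a brute-force difference quotient (a small Python check I added; it is not a proof either):

```python
import math

def numerical_derivative(f, x, h=1e-7):
    # central difference: a good approximation of f'(x) for small h
    return (f(x + h) - f(x - h)) / (2 * h)

f = lambda x: math.sin(x**2)
x = 1.3
print(numerical_derivative(f, x))     # ~ -0.3092
print(math.cos(x**2) * 2 * x)         # chain rule: cos(x^2) * 2x, the same value
```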

4.4.11 Derivative of inverse functions


We have discussed inverse functions in Section 4.2.5. Calculus is always about derivative and
integration. In this section, we discuss how to find the derivative of an inverse function. Given a
function y D f .x/, the inverse function is x D f 1 .y/. And our aim is to find dx=dy.
We write x D f 1 .y/ D f 1 .f .x// and differentiate (w.r.t x) two sides of this equation.
On the RHS, we use the chain rule:
df 1 df
x D f 1 .f .x// H) 1 D
dy dx
So, we have the rule for the derivative of an inverse function:
dx 1
D (4.4.23)
dy dy=dx
p
Let’s check this rule with y D x 2 and x D y. We compute dx=dy using Eq. (4.4.23):
p
dx=dy D 1=.dy=dx/ D 1=.2x/ D 1=.2 y/. And this result is identical to the derivative of
p
x D y.

4.4.12 Derivatives of inverses of trigonometric functions


Using Eq. (4.4.23) we can compute the derivatives of the inverse trigonometric functions. We summarize the results in Table 4.6; the proof for $\arcsin x$ is given below.
We present the proof of the derivative of $\arcsin x$. Write $y = \sin x$; then we have $dy/dx = \cos x$. The inverse function is $x = \arcsin y$. Using the rule for the derivative of an inverse function:

$$\frac{dx}{dy} = \frac{1}{dy/dx} = \frac{1}{\cos x} = \frac{1}{\sqrt{1-y^2}} \tag{4.4.24}$$

where in the final step we have converted from $x$ to $y$, as $dx/dy$ is a function of $y$. Now, considering the function $y = \arcsin x$, we have $dy/dx = 1/\sqrt{1-x^2}$. Proofs for the other inverse trigonometric functions follow similarly.


Table 4.6: Derivatives of the inverses of trigonometric functions.

  $f(x)$      $f^{-1}(x)$                $df^{-1}/dx$
  $\sin x$    $\arcsin x$                $1/\sqrt{1-x^2}$
  $\cos x$    $\arccos x$                $-1/\sqrt{1-x^2}$
  $\tan x$    $\arctan x$                $1/(1+x^2)$
  $\cot x$    $\operatorname{arccot} x$  $-1/(1+x^2)$

4.4.13 Derivatives of $a^x$ and the number $e$


In this section, we seek the derivative of exponential functions. To start with, consider the function $2^t$. As will be seen, using the definition to analytically find the derivative of $2^t$ is quite hard. So, we use computations to see the pattern: we compute the derivative of $2^t$ at $t = 1, 2, 3, 4$ using $dt = 1$; the results given in Table 4.7 indicate that the derivative at a point $t$ is the function itself evaluated at the same point.

Table 4.7: Derivative of 2t using a finite increment dt D 1.


2t C1 2t
t 2t dt

1 2 2
2 4 4
3 8 8
4 16 16

We know that this cannot be true, as $dt$ is too big. But it gives us a hint that the derivative should be related to $2^t$. So, we use the definition of the derivative and do some algebra, this time so that $2^t$ shows up:

$$\frac{d(2^t)}{dt} = \lim_{dt\to 0}\frac{2^{t+dt} - 2^t}{dt} = \lim_{dt\to 0}\frac{2^t(2^{dt}-1)}{dt} = 2^t\lim_{dt\to 0}\frac{2^{dt}-1}{dt}$$

And once again, we face the eternal question of calculus: does the limit in the above equation exist and, if so, what is its value? Herein, we skip the first question (we know from textbooks that the limit does exist) and focus on finding the value. Using different values of $dt$, we can see that the limit is about 0.6931474 (Table 4.8).
So, the derivative of $2^t$ is $2^t$ multiplied by a constant. We can generalize this result, as there is nothing special about the number 2. For the exponential function $y = a^t$, the derivative is given by

$$\frac{d(a^t)}{dt} = k\,a^t \tag{4.4.25}$$

Table 4.8: The ratio $(2^{dt}-1)/dt$ for decreasing values of $dt$.

  $dt$        $(2^{dt}-1)/dt$
  0.1         0.7177346253629313
  0.01        0.6955550056718884
  0.001       0.6933874625807412
  0.00001     0.6931495828199629
  0.000001    0.6931474207938493

where $k$ is a constant. To find its value, we compute $k$ for a few cases, $a = 2, 3, 4, 6, 8$, to find a pattern. The results are given in Table 4.9. If this $k$ can be expressed as a function of $a$, say $f(a)$, then we have $f(4) = f(2^2) = 2f(2)$, and $f(8) = f(2^3) = 3f(2)$. What function has this property? A logarithm! But a logarithm of which base, we do not know yet.

Table 4.9: $(a^t)' = k\,a^t$; to find $k$ we compute $(a^{dt}-1)/dt$ with $dt = 10^{-7}$.

  $a$                  2          3          4          6          8
  $(a^{dt}-1)/dt$      0.693147   1.098612   1.386294   1.791760   2.079442

Note that $1.386294 \approx 0.693147\times 2$ and $2.079442 \approx 0.693147\times 3$.

Instead of finding $k$, we can turn the problem around and ask whether there exists an exponential function whose derivative is itself; in other words, $k = 1$. From Table 4.9, we guess that there exists a number $c$ within $[2, 3]$ such that the derivative of $c^x$ is $c^x$. It turns out that this function is $f(t) = e^t$, where $e$ is the Euler number (its value is approximately 2.718) that we have found in the context of continuously compounded interest (Section 2.28). Indeed,

$$\frac{d(e^t)}{dt} = e^t\lim_{dt\to 0}\frac{e^{dt}-1}{dt} = e^t \tag{4.4.26}$$
This is because $e$ is defined as the number that satisfies the following limit:

$$\lim_{dt\to 0}\frac{e^{dt}-1}{dt} = 1 \tag{4.4.27}$$

You can see where this definition of $e$ comes from by looking at Eq. (2.24.6) (in the context in which Briggs calculated his famous logarithm tables). It can be shown that this definition is equivalent to the definition of $e$ via continuously compounded interest:

$$e = \lim_{dt\to 0}\left(1 + dt\right)^{1/dt} = \lim_{n\to\infty}\left(1 + \frac{1}{n}\right)^n \tag{4.4.28}$$


which is exactly Eq. (2.28.1).


Now, we can find the constant $k$ in Eq. (4.4.25): $k = \ln a$, where $\ln x$ is the natural logarithm function of base $e$, which is the inverse function of $e^x$ (we will discuss $\ln x$ in the next section):

$$\frac{d(a^t)}{dt} = (\ln a)\,a^t \tag{4.4.29}$$
Proof. The proof of the derivative of $a^t$ is simple. Since we know the derivative of $e^x$, we write $a^x$ in terms of $e^x$. So, we write $a = e^{\ln a}$, thus

$$a^t = e^{(\ln a)t} \;\Longrightarrow\; \frac{d(a^t)}{dt} = \frac{d\left(e^{(\ln a)t}\right)}{dt} = (\ln a)\,e^{(\ln a)t} = (\ln a)\,a^t$$

where we have used the chain rule of differentiation.
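The pattern found numerically in Table 4.9 is now explained: the mysterious constant is $\ln a$. A short Python check (mine, for illustration) makes this concrete:

```python
import math

dt = 1e-7
for a in [2, 3, 4, 6, 8]:
    k = (a**dt - 1) / dt                            # the constant in (a^t)' = k*a^t
    print(a, round(k, 6), round(math.log(a), 6))    # k matches ln(a)
```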
Are there other functions whose derivatives are the functions themselves? No; the only functions that have this property are of the form $y = ce^x$. The function $y = e^x$ is thus, essentially, the only function whose derivative and integral are the function itself. About it, there is a joke that goes like this.

An insane mathematician gets on a bus and starts threatening everybody: "I'll integrate you! I'll differentiate you!!!" Everybody gets scared and runs away. Only one lady stays. The guy comes up to her and says: "Aren't you scared? I'll integrate you, I'll differentiate you!!!" The lady calmly answers: "No, I am not scared, I am $e^x$."

4.4.14 Logarithm functions


So we have discovered (re-discovered, to be exact) the number $e$, and thus we can define the exponential function $y = e^x$ with the remarkable property that its derivative is itself. It is then straightforward to define the natural logarithm function $x = \ln y$ (if you always think of inverse functions or operations). Historically, it was not the case, because $y = e^x$ was not known at the time when the Flemish Jesuits and mathematicians Grégoire de Saint-Vincent (1584–1667) and Alphonse Antonio de Sarasa (1618–1667) discovered the natural logarithm function while working on the quadrature of the rectangular hyperbola $xy = 1$ in 1647.
We start the discussion by noting that the area under the function $y = 1/x$ defied almost all 17th century mathematicians (including giants such as Descartes and Fermat). This indicates that the integral of $y = 1/x$ might be a new function. To find out about this function, we are going to define a function $f(x)$ as

$$f(x) := \int_1^x \frac{1}{u}\,du \tag{4.4.30}$$

And the task now is to find out what $f(x)$ is. The way to do that is to compute some values of this function and see where they lead us. To this end, I use the definition of the integral to compute $f(x)$ for some values of $x$. The results are given in Table 4.10. In the definition of the integral (Eq. (4.3.3)) we have to go to infinity, but herein I have used the mid-point rule with 20 000 sub-divisions to compute these integrals. That is, $n = 20\,000$ in Eq. (4.3.3); this is similar to how

Archimedes computed $\pi$.‡ This is obviously not the way Saint-Vincent computed the integral in Eq. (4.4.30) (since he did not have computers). He also used the definition of the integral as the sum of areas of many thin slices, but his slices were special; see this YouTube link for details.
Table 4.10: The area under $y = 1/u$ from 1 to $x$: $f(x) = \int_1^x du/u$ for some values of $x$.

  $x$              2.0           4.0           8.0           16.0
  $f(x)$           0.69314718    1.38629436    2.07944154    2.77258870
  $\Delta f(x)$                  0.69314718    0.69314718    0.69314718
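The mid-point rule mentioned above is easy to code; the following Python sketch (my reconstruction of the computation, not the author's original script) reproduces the numbers in Table 4.10:

```python
def midpoint_integral(f, a, b, n=20_000):
    """Approximate the integral of f on [a, b] with n mid-point rectangles."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

for x in [2.0, 4.0, 8.0, 16.0]:
    print(x, round(midpoint_integral(lambda u: 1/u, 1.0, x), 8))
# 0.69314718, 1.38629436, 2.07944154, 2.77258870 -- these are ln 2, ln 4, ln 8, ln 16
```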

Anything special about this table? Ah yes. In the first row we have a geometric progression $2, 4, 8, 16$, and in the second row we have an arithmetic progression (indicated by the constant $\Delta f(x)$ in the last row). Which function has this property? A logarithm!† If this is not clear, you might need to check Table 2.16 again. You can check from the values in the table that

$$f(8) = f(4\times 2) = f(4) + f(2), \qquad \int_1^2 \frac{du}{u} = \int_2^4 \frac{du}{u} = \int_4^8 \frac{du}{u}$$

From this we anticipate the following result (see Fig. 4.32):

$$\int_a^b \frac{du}{u} = \int_{\alpha a}^{\alpha b}\frac{du}{u} \tag{4.4.31}$$

How are we going to prove this? Well, it depends on what tools you want to use. If you assume that you already know the chain rule, then a simple substitution proves that the two integrals in Eq. (4.4.31) are equal. If you assume you were in the 16th century, then the proof would be a bit harder. You can use the definition of the integral, Eq. (4.3.3), for $\int_{\alpha a}^{\alpha b} du/u$ and see that $\alpha$ cancels out, and thus that integral equals $\int_a^b du/u$.

Figure 4.32: The hyperbola $y = 1/x$: the areas under the curve from $a$ to $b$ and from $\alpha a$ to $\alpha b$ (with $\alpha > 1$) are equal.

‡ Refer to Section 12.4.1 if you're not sure what the mid-point rule is.
† By 1661 Huygens understood this relation between the rectangular hyperbola and the logarithm.


OK. So $\int_1^x du/u$ is a logarithm, but of what base? Can the base be 2 or 3? Of course not; there is nothing special about these two numbers. To find out the base, let's denote it by $y$; then we have:

$$\log_y(x) = \int_1^x \frac{1}{u}\,du$$

The task is now to determine the value of $y$. Now, we consider a small $x$ and the area under the hyperbola from 1 to $1+x$; we can then write

$$\log_y(1+x) = \int_1^{1+x} \frac{1}{u}\,du$$
Why do this? Because we can compute that integral: $\int_1^{1+x}\frac{1}{u}\,du \approx x$ for small $x$ (draw a figure and you'll see it). Therefore,

$$\log_y(1+x) \approx x \;\Rightarrow\; \lim_{x\to 0}\frac{1}{x}\log_y(1+x) = 1 \;\Longrightarrow\; \lim_{x\to 0}\log_y(1+x)^{1/x} = 1 \;\Longrightarrow\; \boxed{\lim_{x\to 0}(1+x)^{1/x} = y}$$

The boxed formula looks familiar: it is exactly Eq. (2.28.1) that defines $e$. So, $y$ is nothing but $e$.
Thus, mathematicians define the natural logarithm function $y = \ln x$ as:

$$\ln x := \int_1^x \frac{du}{u} \tag{4.4.32}$$

In 1668 Nicolaus Mercator published Logarithmotechnia, where he used the term "natural logarithm" for the first time for logarithms to base $e$. Before that, this logarithm was called the hyperbolic logarithm, as it comes from the area under a hyperbola. The geometric meaning of $e$ is given in Fig. 4.33, with a comparison with that of $\pi$.
Figure 4.33: Geometric meaning of $\pi$ and $e$: the former is related to the area of a circle ($x^2 + y^2 = 1$) and the latter to the area under a hyperbola ($xy = 1$). Both shapes are conic sections.

And all the properties of the logarithm (such as $\ln ab = \ln a + \ln b$) that Napier and Briggs discovered (Section 2.24) in a completely different way should follow naturally from this definition. With Fig. 4.34, we can prove $\ln ab = \ln a + \ln b$ as follows:

$$\ln ab = \int_1^{ab}\frac{du}{u} = \int_1^{b}\frac{du}{u} + \int_b^{ab}\frac{du}{u} = \ln b + \ln a \tag{4.4.33}$$


Figure 4.34: Proof of $\ln ab = \ln a + \ln b$.

Figure 4.35: (a) Graph of $y = \ln x$ and $y = e^x$–monotonically increasing functions; (b) graph of $\sin 4x/x$, a non-monotonic function.

where use was made of Eq. (4.4.31) to convert $\int_b^{ab} du/u$ to $\int_1^{a} du/u = \ln a$.
I defer the discussion of the derivative of logarithm functions to Section 4.4.18. Fig. 4.35 presents the graphs of the exponential and logarithm functions. Both are monotonically increasing functions. This is so because their derivatives are always positive. A function $f(x)$ is called monotonically increasing if for all $x$ and $y$ such that $x \le y$ one has $f(x) \le f(y)$. If the order $\le$ in the definition of monotonicity is replaced by the strict order $<$, one obtains a stronger requirement, and a function with this property is called strictly increasing.

4.4.15 Derivative of hyperbolic and inverse hyperbolic functions


The hyperbolic sine and cosine functions have been introduced in Section 3.15. In what follows, I list all the hyperbolic functions in one place for convenience:

$$\sinh x = \frac{1}{2}(e^x - e^{-x}), \qquad \operatorname{csch} x = \frac{1}{\sinh x} = \frac{2e^x}{e^{2x}-1}$$
$$\cosh x = \frac{1}{2}(e^x + e^{-x}), \qquad \operatorname{sech} x = \frac{1}{\cosh x} = \frac{2e^x}{e^{2x}+1} \tag{4.4.34}$$
$$\tanh x = \frac{\sinh x}{\cosh x} = \frac{e^{2x}-1}{e^{2x}+1}, \qquad \coth x = \frac{\cosh x}{\sinh x} = \frac{e^{2x}+1}{e^{2x}-1}$$

The first derivatives of some hyperbolic functions are

$$\frac{d}{dx}(\sinh x) = \cosh x$$
$$\frac{d}{dx}(\cosh x) = \sinh x \tag{4.4.35}$$
$$\frac{d}{dx}(\tanh x) = \frac{d}{dx}\left(\frac{\sinh x}{\cosh x}\right) = \frac{\cosh^2 x - \sinh^2 x}{\cosh^2 x} = \frac{1}{\cosh^2 x}$$

Note the striking similarity with the circular trigonometric functions–only an occasional minus/plus difference.
We’re now too familiar with the concept of inverse operators/functions. So it is natural
to consider inverse hyperbolic functions. For brevity, we consider only y D sinh 1 x and
y D cosh 1 x. Let’s computep the derivative of y D sinh 1 x. We have x D sinh y, and
thus dx=dy D cosh y D 1 C x 2  . So,

d 1
 1 1
sinh x D Dp (4.4.36)
dx dx=dy 1 C x2
1
If someone tell you that sinh x is actually a logarithm of x:
 p 
1
y D sinh x D ln x C 1 C x 2

Do you believe him? Yes. Because the sine hyperbolic function is defined in terms of the
exponential e x , it is reasonable that its inverse is related to ln x–the inverse of e x . The proof is
simple:

ey e y p
x D sinh y D H) .e y /2 .2x/e y 1 D 0 H) e y D x C 1 C x2
2
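Both claims are easy to double-check with SymPy (a throwaway sketch of mine, not from the book):

```python
import sympy as sp

x = sp.symbols('x', real=True)

# derivative of the inverse hyperbolic sine
print(sp.diff(sp.asinh(x), x))          # 1/sqrt(x**2 + 1)

# asinh really is a logarithm in disguise
print(sp.asinh(x).rewrite(sp.log))      # log(x + sqrt(x**2 + 1))
```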

4.4.16 High order derivatives


Given a function $y = f(x)$, the derivative $f'(x)$ is also a function, so it is natural to compute the derivative of $f'(x)$, which is the derivative of a derivative. What we get is the so-called second derivative, usually denoted by $f''(x)$‡:

$$f''(x) = \frac{d}{dx}\left(\frac{df}{dx}\right) = \frac{d^2 f}{dx^2} = \lim_{h\to 0}\frac{f'(x+h) - f'(x)}{h} \tag{4.4.37}$$

Consider $y = 2x$: the first derivative is $y' = 2$, and the second derivative is $y'' = 0$. Thus, a line has a zero second derivative. For a parabola $y = x^2$, we have $y' = 2x$ and $y'' = 2$. A parabola has a constant non-zero second derivative. A line has a constant slope, it does not bend and thus
† Using the identity $\cosh^2 x - \sinh^2 x = 1$.
‡ This makes $f'(x)$ the first derivative of $f(x)$.


its second derivative is zero. On the other hand, a parabola has a varying slope; it bends and therefore its second derivative is non-zero.
The most popular second derivative is probably the acceleration, which is the second derivative of the position function of a moving object, $x(t)$: $a = \ddot{x}$ following Newton's notation, or $a = d^2x/dt^2$ following Leibniz. Equivalently, the acceleration is the derivative of the velocity. Historically, acceleration was the first second derivative ever. Newton's laws of motion are presented in Section 7.10.
Have you ever wondered why mathematicians use the notation $d^2f/dx^2$ and not $df^2/dx^2$? In other words, why is the number 2 treated differently in the numerator and the denominator? I do not have a rigorous answer, but using the notion of acceleration helps. The acceleration is given by

$$a = \frac{d^2x}{dt^2} \;\Longrightarrow\; \text{correct unit: } \mathrm{m/s^2}$$

If $a$ were written as $a = dx^2/dt^2$, its unit would be $(\mathrm{m/s})^2$, which is wrong.

Going along this direction, we also have the third derivative, e.g. the third derivative of $x^3$ is 6, and the fourth derivative, and so on; usually we denote the $n$th-order derivative of a function $f(x)$ by $f^{(n)}(x)$. But wait, how about derivatives of non-integer order, like $d^{1/2}f(x)/dx^{1/2}$? That question led to the development of the so-called fractional calculus.

Fractional derivative. Regarding the $n$th-order derivative of a function, $f^{(n)}(x)$, in a 1695 letter l'Hôpital asked Leibniz about the possibility that $n$ could be something other than an integer, such as $n = 1/2$. Leibniz responded that "It will lead to a paradox, from which one day useful consequences will be drawn." Leibniz was correct, but it would take centuries until it became clear just how correct he was.
There are two ways to think of $f^{(n)}(x)$. The first is the one we all learn in basic calculus: it's the function that we obtain when we repeatedly differentiate $f$, $n$ times. The second is more subtle: we interpret it as an operator whose action on the function $f(x)$ is determined by the parameter $n$. What l'Hôpital was asking is how this operator behaves when $n$ is not an integer. The most natural way to answer this question is to interpret differentiation (and integration) as transformations that take $f$ and turn it into a new function.
That's all I know about the fractional derivative and fractional calculus. I have presented them here to illustrate the fact that if we break the rules (the order of differentiation is usually a positive integer) we can make new mathematics.

4.4.17 Implicit functions and implicit differentiation


We have discussed the derivative of explicit functions like $y = f(x)$; now it is time to deal with the derivative of implicit functions. For example, given $x^2 + y^2 = 25$, what is $dy/dx$? While it is possible, for this particular case, to write $y = \pm\sqrt{25 - x^2}$ and proceed as usual, it is easy to see that for other implicit functions, e.g. $y^5 + xy = 3$, it is impossible to solve for $y$ in terms of $x$. Thus, we need another way, known as implicit differentiation, which requires no explicit expression of $y(x)$.
The best way to introduce implicit differentiation is probably to solve so-called related rates problems. One such problem is given in Fig. 4.36.


Figure 4.36: A problem on related rates: a balloon is flying straight up with a constant speed of $dy/dt = 3$ m/s. As it does so, the distance $z$ from the balloon to an observer at $A$, located 100 meters horizontally from the launch point, is changing. The question: find $dz/dt$ when $y = 50$ m.

We need to relate $z(t)$ to $y(t)$, and then differentiate with respect to time:

$$[z(t)]^2 = 100^2 + [y(t)]^2 \;\Longrightarrow\; 2z\frac{dz}{dt} = 2y\frac{dy}{dt} \;\Longrightarrow\; \frac{dz}{dt} = 3\frac{y}{z}$$

When the balloon is 50 m above the ground, $z = 50\sqrt{5}$ m, so at that time $dz/dt = 3(50)/(50\sqrt{5}) = 3\sqrt{5}/5$ m/s. The problem is easy using the chain rule, and it is so because time is present in the problem.
time is present in the problem.
Now we come back to this problem: given x 2 C y 2 D 25, what is dy=dx? We can imagine
a point with coordinate .x.t/; y.t// moving along the circle of radius 5 centered at the origin.
Then, we just do the differentiation w.r.t time:
dx dy dy x
Œx.t /2 C Œy.t/2 D 25 H) 2x C 2y D 0 H) 2xdx C 2ydy D 0 H) D
dt dt dx y
p
Is this result correct? If we write y D 25 x 2 (for the upper part of the circle), then dy=dx D
x=y , the same result obtained using implicit differentiation. You can see that dt disappears.

Why $dy/dx = -x/y$? This simply means that the tangent to a circle at any point is perpendicular to the line joining the center to that point. You need to draw a figure and use the fact that $(y/x)(-x/y) = -1$ to see this.
We can apply this to a more complex function†:

$$y^5 + xy = 3 \;\Longrightarrow\; 5y^4\,dy + y\,dx + x\,dy = 0 \;\Longrightarrow\; \frac{dy}{dx} = -\frac{y}{5y^4 + x}$$

As we're now used to this, I have removed $dt$ in the process in the above equation.

4.4.18 Derivative of logarithms


What is the derivative of $y = \log_a x$? We can use implicit differentiation to find it. Let's first convert to an exponential function (as we know how to differentiate that):

$$y = \log_a x \;\Longrightarrow\; x = a^y$$

† If you do not like $dx$, $dy$ as independent objects, it is fine to just use $dy/dx$, and you will get the same result.


Differentiating the above, we get

$$dx = (\ln a)\,a^y\,dy \;\Longrightarrow\; \frac{dy}{dx} = \frac{1}{(\ln a)\,a^y} = \frac{1}{\ln a}\frac{1}{x}$$

With $a = e$, $\ln e = 1$; thus the derivative of $\ln x$ is simply $1/x$. For the sake of convenience, I summarize these two results in the following equation:

$$\frac{d}{dx}(\log_a x) = \frac{1}{\ln a}\frac{1}{x}, \qquad \frac{d}{dx}(\ln x) = \frac{1}{x} \tag{4.4.38}$$
Logarithmic differentiation is a useful technique to differentiate complicated functions. For example, what is the derivative of the following function:

$$y = \frac{x^{3/4}\sqrt{x^2+1}}{(3x+2)^5}$$

Taking the natural logarithm of both sides of this equation, we get:

$$\ln y = \frac{3}{4}\ln x + \frac{1}{2}\ln\left(x^2+1\right) - 5\ln(3x+2)$$

Differentiating this equation we have:

$$\frac{dy}{y} = \frac{3}{4}\frac{dx}{x} + \frac{1}{2}\frac{2x\,dx}{x^2+1} - \frac{15\,dx}{3x+2}$$

Solving this for $dy/dx$ and replacing $y$ by its definition, we get the final result:

$$\frac{dy}{dx} = y\left(\frac{3}{4x} + \frac{x}{x^2+1} - \frac{15}{3x+2}\right) = \frac{x^{3/4}\sqrt{x^2+1}}{(3x+2)^5}\left(\frac{3}{4x} + \frac{x}{x^2+1} - \frac{15}{3x+2}\right)$$

The method of differentiating functions by first taking logarithms and then differentiating is called logarithmic differentiation.
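As a sanity check (my own snippet, not part of the text), SymPy confirms that the bracketed factor above is exactly $y'/y$:

```python
import sympy as sp

x = sp.symbols('x', positive=True)
y = x**sp.Rational(3, 4) * sp.sqrt(x**2 + 1) / (3*x + 2)**5

logderivative = sp.simplify(sp.diff(y, x) / y)      # this is (ln y)' = y'/y
claimed = sp.Rational(3, 4)/x + x/(x**2 + 1) - 15/(3*x + 2)
print(sp.simplify(logderivative - claimed))         # 0
```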

4.5 Applications of derivative


This section presents some applications of the derivative. First, the use of the derivative to solve maxima problems (i.e., finding the maximum or minimum value of a function) is given in Section 4.5.1. Then, convexity and Jensen's inequality are treated in Section 4.5.2. Linear approximation–the idea of replacing a complex function by a linear one–is the subject of Section 4.5.3. Finally, Newton's method to numerically solve any equation of the form $f(x) = 0$ is presented in Section 4.5.4. All these problems are solved thanks to the concept of the derivative.

4.5.1 Maxima and minima


Probably the most important application of derivative is to find minima and maxima of a func-
tion. This is an optimization problem–a very important problem in virtually all science and
engineering fields. In an optimization problem we have an objective function y D f .x/, which
we want to minimize. Of course, in the realm of calculus, we consider only continuous functions
with at least continuous first derivative.


Considering the graph of an arbitrary function $y = f(x)$ in the figure beside the text, we can identify special points: local maximum (local minimum) and global maximum (global minimum). By a local maximum at a point $x^*$ we mean that the function at that point is largest in a neighborhood of the point: $f(x^*) \ge f(x^* + h)$ for small negative and positive $h$. You can be the smartest kid in your class (local maximum) but there might be a smarter kid in another class.
To discover the rules regarding maxima/minima, let's consider the following fourth order polynomial:

$$f(x) = \frac{x^4}{4} - 2x^3 + \frac{11x^2}{2} - 6x$$

And we compute the first and second derivatives of this function‡:

$$f'(x) = x^3 - 6x^2 + 11x - 6 = (x-1)(x-2)(x-3)$$
$$f''(x) = 3x^2 - 12x + 11$$

The graphs of the function $f(x)$, the first derivative $f'(x)$ and the second derivative $f''(x)$ are shown in Fig. 4.37. We can see that:

Figure 4.37: Graph of a fourth-order polynomial $f(x)$ with its first derivative $f'(x)$ and second derivative $f''(x)$; note that $f''(1) > 0$ and $f''(2) < 0$.

‡ As the first derivative represents the slope of the tangent to the curve, at a minimum or maximum point $x_0$ the tangent is horizontal. In other words, at these points the first derivative vanishes. Thus, it is natural to consider the first derivative of $f(x)$ to study its maxima and minima. The second derivative is needed when it comes to deciding whether at $x_0$ the function attains a maximum or a minimum. Think of a bowl (similar to $y = x^2$, with a second derivative of 2): it has a minimum point (the bottom of the bowl). Now, turn the bowl upside down ($y = -x^2$, with a negative $y''$): this bowl now has a maximum point. It is as simple as that.


• The function is decreasing within an interval in which $f'(x) < 0$. This makes sense noting that $f'(x)$ is the rate of change of $f(x)$–when it is negative the function must be decreasing;
• The function is increasing within an interval in which $f'(x) > 0$;
• At a point $x_0$ where $f'(x_0) = 0$, the function is neither increasing nor decreasing; it is stationary–the tangent is horizontal. There ($x_0 = 1, 2, 3$), the function attains either a local minimum or a local maximum. It is only a local minimum or maximum, for there are locations where the function gets a larger/smaller value. The derivative at a point contains local information about the function around that point (which makes sense from the very definition of the derivative);
• A stationary point $x_0$ is a local minimum when $f''(x_0) > 0$; the tangent is below the function, or the curve is concave up. Around that point the curve has the shape of a cup $\cup$;
• A stationary point $x_0$ is a local maximum when $f''(x_0) < 0$; the tangent is above the function, or the curve is concave down. Around that point the curve has the shape of a cap $\cap$.

Finding a minimum or a maximum of a function is essentially a comparison problem. In principle we need to evaluate the function at all points and pick out the minimum/maximum. Differential calculus provides us with a more efficient way consisting of two steps: (1) find the stationary points, where the first derivative of the function is zero, and (2) evaluate the second derivative at these points.
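For the quartic above, these two steps are easy to automate (a SymPy sketch I added to make the recipe concrete):

```python
import sympy as sp

x = sp.symbols('x')
f = x**4/4 - 2*x**3 + sp.Rational(11, 2)*x**2 - 6*x

stationary = sp.solve(sp.diff(f, x), x)            # [1, 2, 3]
for x0 in stationary:
    curvature = sp.diff(f, x, 2).subs(x, x0)
    kind = 'local min' if curvature > 0 else 'local max'
    print(x0, curvature, kind)                     # 1 and 3 are minima, 2 is a maximum
```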

Snell’s law of refraction. We now use the derivative to derive the Snell’s law of refractionŽ .
This law is a formula used to describe the relationship between the angles of incidence and
refraction, when referring to light (or other waves) passing through a boundary between two
different isotropic media such as water/air. Fermat derived this law in 1673 based on his principle
of least time. According to this principle, the light follows a path that minimizes the travel time;
or light is lazy.
Referring to Fig. 4.38, we now compute the time required for the light to go from A to B,
which is the sum of the time for it to travel from A to O then from O to B:
p p
a2 C x 2 b 2 C .d x/2
t D tAO C tOB D C
v1 v2
Calculating the first derivative of t and set it to zero gives us
x d x sin ˛1 sin ˛2
p Dp ; H) D (4.5.1)
a2 C x2v 1 b 2 C .d x/2 v2 v1 v2

Note that the angles are measured with respect to the normal to the boundary, not to the boundary.
Now, introducing the refractive index n, defined as n D c=v where c denotes the speed of light
Ž
Named after the Dutch astronomer and mathematician Willebrord Snellius (1580-1626).


in vacuum. Thus, the refractive index describes how fast light travels through a medium. For example, the refractive index of water is 1.333, meaning that light travels 1.333 times slower in water than in a vacuum. Now, Eq. (4.5.1) becomes:

$$\frac{\sin\alpha_1}{v_1} = \frac{\sin\alpha_2}{v_2} \quad\text{or}\quad n_1\sin\alpha_1 = n_2\sin\alpha_2 \tag{4.5.2}$$

That's Snell's law of refraction.
Figure 4.38: Snell's law of refraction: $v_1$ and $v_2$ are the velocities of light in medium 1 and medium 2. As the velocity is lower in the second medium, the angle of refraction $\alpha_2$ is smaller than the angle of incidence $\alpha_1$. $A$ and $B$ are two points fixed in space (i.e., $a, b, d$ are constants). A light ray travelling from $A$ to $B$ can meet the boundary (red line) of the two media at any point; assume that this point is $O$, which is $x$ apart from $A$ in the horizontal direction.

4.5.2 Convexity and Jensen’s inequality


When the second derivative at a stationary point is positive, the curve has a $\cup$ shape. Now we discuss this property some more. We consider a function $y = f(x)$ for $a \le x \le b$ (alternatively, we write $x \in [a, b]$). The function $f(x)$ is said to be convex if its graph in the interval $[a, b]$ is below the secant line joining the two points $(a, f(a))$ and $(b, f(b))$; they are labeled $A$ and $B$ in Fig. 4.39a.
To quantify this we consider an arbitrary point $x$ in $[a, b]$; it is given by $x = (1-t)a + tb$, $t \in [0, 1]$. We now label the point $P(x, f(x))$ on the curve $y = f(x)$. Corresponding to this $x$ we have the point $Q$ on the secant $AB$; the $y$-coordinate of $Q$ is $(1-t)f(a) + tf(b)$†. And the fact that $P$ is below $Q$ is written as:
$$f((1-t)a + tb) \le (1-t)f(a) + tf(b) \tag{4.5.3}$$

A few examples are helpful; with $t = 0.5$ and $t = 2/3$, we have:

$$f\left(\frac{a+b}{2}\right) \le \frac{f(a)+f(b)}{2}, \quad\text{or}\quad f\left(\frac{2}{3}a + \frac{1}{3}b\right) \le \frac{2}{3}f(a) + \frac{1}{3}f(b)$$
† You need to know the equation of a line/segment given two of its points; see Section 11.1.3.


Figure 4.39: Convex function (left) and non-convex function (right). The function $f(x)$ is said to be convex if its graph in the interval $[a, b]$ is below the secant line joining the two end points $(a, f(a))$ and $(b, f(b))$.

We are going to generalize this inequality. We re-write Eq. (4.5.3) as

$$f(t_1 x_1 + t_2 x_2) \le t_1 f(x_1) + t_2 f(x_2), \qquad t_1 + t_2 = 1 \tag{4.5.4}$$

And we ask the question: does this nice inequality hold for 3 points? The answer is yes. So, we
need to prove this:

f .t1 x1 C t2 x2 C t3 x3 /  t1 f .x1 / C t2 f .x2 / C t3 f .x3 /; t1 C t2 C t3 D 1

Proof. We use Eq. (4.5.4) to prove the above inequality. First, we need to split 3 terms into 2
terms (to apply Eq. (4.5.4)):
 
t2 x2 C t3 x3
f .t1 x1 C t2 x2 C t3 x3 / D f t1 x1 C .t2 C t3 /
t C t3
2 
t2 x2 C t3 x3
 t1 f .x1 / C .t2 C t3 /f (Eq. (4.5.4))
t2 C t3

After that we apply again Eq. (4.5.4) to f .t2 x2 Ct3 x3=t2 Ct3 /:
 
t2 t3
f .t1 x1 C t2 x2 C t3 x3 /  t1 f .x1 / C .t2 C t3 / f .x2 / C f .x3 /
t2 C t3 t2 C t3
 t1 f .t1 x1 / C t2 f .x2 / C t3 f .x3 /


And nothing can stop us from generalizing this inequality to the case of $n$ points:

$$f\left(\sum_{i=1}^n t_i x_i\right) \le \sum_{i=1}^n t_i f(x_i), \qquad \sum_{i=1}^n t_i = 1 \tag{4.5.5}$$

And this is known as the Jensen inequality, named after the Danish mathematician Johan Jensen (1859 – 1925). Jensen was a successful engineer for the Copenhagen Telephone Company and became head of the technical department in 1890. All his mathematics research was carried out in his spare time. Of course, if the function is concave, the inequality is reversed.
To avoid explicitly stating $\sum t_i = 1$, another form of the Jensen inequality is:

$$f\left(\frac{\sum_{i=1}^n a_i x_i}{\sum_{i=1}^n a_i}\right) \le \frac{\sum_{i=1}^n a_i f(x_i)}{\sum_{i=1}^n a_i} \tag{4.5.6}$$

where $t_i = a_i/\sum a_i$, and $a_i > 0$ are weights.
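A convex function such as $f(x) = x^2$ makes Eq. (4.5.5) easy to test numerically (a throwaway Python check of mine):

```python
import random

f = lambda x: x**2                       # a convex function
xs = [random.uniform(-5, 5) for _ in range(6)]
ws = [random.uniform(0.1, 1) for _ in range(6)]
ts = [w / sum(ws) for w in ws]           # weights that sum to 1

lhs = f(sum(t*x for t, x in zip(ts, xs)))
rhs = sum(t*f(x) for t, x in zip(ts, xs))
print(lhs <= rhs)                        # True, as Jensen's inequality promises
```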

Geometric interpretation of Jensen’s inequality. For n D 2, we have a geometry interpretation


of the Jensen inequality (Fig. 4.39). How about the case of more than 2 points? It turns out that
there is also a geometry interpretation, but it requires a concept from physics: the center of mass.
Let’s consider a convex function y D f .x/ on an interval. For a visual demonstration we
consider the case n D 4. We place four point masses (with masses m1 ; m2 ; m3 ; m4 , respectively)
on the curve of y D f .x/, of which the x-coordinates are xi ; i D 1; : : : ; 4 (see Fig. 4.40). The
coordinates of the center of mass of these four point masses are (refer to Section 7.8.7 if your
memory on this was rusty):
P4 P
i D1 mi xi mi f .xi /
xCM D P4 ; yCM D P
iD1 mi
mi
Note that in the equation for yCM the limits of summation were skipped for the sake of brevity.
The nice thing is that the center of mass is always inside the polygon with vertices being the
point masses–the shaded region in Fig. 4.40 (I refer Section 7.8.7 for a proof of this). This leads
immediately to yCM  f .mi xi =m/.

AM-GM inequality. We discussed the AM-GM inequality in Section 2.20.2, together with Cauchy's forward-backward-induction based proof. Now, we show that the AM-GM inequality is simply a special case of the Jensen inequality. The function $y = \log x$ for $x > 0$ is concave (using the fact that $f''(x) < 0$, or looking at its graph), thus applying Eq. (4.5.6) with $a_i = 1$ (and with the inequality reversed because of concavity):

$$\log\left(\frac{x_1 + x_2 + \cdots + x_n}{n}\right) \ge \frac{\log(x_1) + \log(x_2) + \cdots + \log(x_n)}{n}$$

Using the property of the logarithm that $\log ab = \log a + \log b$ on the RHS of the above, we are led to the AM-GM inequality:

$$\log\left(\frac{x_1 + x_2 + \cdots + x_n}{n}\right) \ge \log\left(x_1 x_2\cdots x_n\right)^{1/n} \;\Longrightarrow\; \frac{x_1 + x_2 + \cdots + x_n}{n} \ge \left(x_1 x_2\cdots x_n\right)^{1/n}$$


Figure 4.40: Geometric interpretation of the Jensen inequality for the case of more than 2 points.

where use was made of the fact that $\log x$ is an increasing function (Fig. 4.35), i.e., if $\log a \ge \log b$ then $a \ge b$.

Why convex functions are important. Convex functions are important because they have nice properties. Given a convex function on an interval, if a local minimum (maximum) is found, it is also the global minimum (maximum). And this leads to convex optimization. Convex optimization is the problem of minimizing a convex function over convex constraints. It is a class of optimization problems for which there are fast and robust optimization algorithms, both in theory and in practice.
Now that you have the tool, let's solve this problem: given three positive real numbers $a, b, c$, prove that

$$a^a b^b c^c \ge (abc)^{\frac{a+b+c}{3}}$$

The art of using the Jensen inequality lies in choosing which function to use. If you know which $f(x)$ to use, then the problem becomes easy.

4.5.3 Linear approximation


Although this may seem a paradox, all exact science is dominated by the idea of
approximation. (Bertrand Russell)

We now discuss a useful application of the derivative: approximating a complex function by a linear function. The situation is that we have a complicated function whose graph is a curve. Focus now on a specific point $x_0$: if we zoom in closely at this point we see not a curve but a segment! That segment has a slope equal to $f'(x_0)$. Thus, in the neighborhood of $x_0$, we replace $f(x)$ (which is usually complex) by a line with an equation of the following form:

$$Y(x) = f(x_0) + f'(x_0)(x - x_0) \tag{4.5.7}$$

Of course working with a line is much easier than with a complex curve. Fig. 4.41 shows this approximation.
Figure 4.41: Linear approximation of a function $f(x)$ using the first derivative $f'(x_0)$. In the shaded right triangle, we have $\tan\alpha = (Y - f(x_0))/(x - x_0)$, but $f'(x_0) = \tan\alpha$. And voilà, we get our linear approximation.

At $x_0$ we have $Y(x_0) = f(x_0)$, but the approximation gets worse for $x$ far away from $x_0$ (for we follow the tangent line, not the real curve). This is obvious. We need to know the error of this approximation; the error is $e = Y(x) - f(x)$. Let's try a function and play with the error; we can spot a pattern from this activity. We use $y = \sqrt{x}$ and $x_0 = 100$ (nothing special about this point except that its square root is 10). We compute the square root of $100 + h$ for $h = \{1.0, 0.1, 0.01, 0.001\}$ using Eq. (4.5.7), which yields $\sqrt{100+h} \approx 10 + h/20$, and the error associated with the approximation is $e(h) := Y(100+h) - \sqrt{100+h}$.
Table 4.11: Linear approximations of $\sqrt{x}$ at $x_0 = 100$ for various $h$.

  $h$       $Y = 10 + h/20$    $e(h)$
  1.0       10.05              1.243789e-04
  0.1       10.005             1.249375e-06
  0.01      10.0005            1.249938e-08
  0.001     10.00005           1.249987e-10

The results are given in Table 4.11. Looking at this table we can see that $e(h) \sim h^2$: when $h$ decreases by $1/10$, the error decreases by $1/100$. We can also get this error estimate by squaring $\sqrt{100+h} \approx 10 + h/20$:

$$\sqrt{100+h} \approx 10 + \frac{h}{20} \;\Longrightarrow\; 100 + h \approx 100 + h + \frac{h^2}{400}$$
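Table 4.11 is easy to regenerate yourself (a few lines of Python I added; the numbers match the table up to rounding):

```python
import math

for h in [1.0, 0.1, 0.01, 0.001]:
    Y = 10 + h/20                      # linear approximation of sqrt(100 + h)
    e = Y - math.sqrt(100 + h)         # its error
    print(h, Y, f"{e:.6e}")            # the error shrinks like h**2
```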
Some common linear approximations near $x = 0$ are $e^x \approx 1 + x$ and $\sin x \approx x$†; the approximation for the sine function is used in solving the oscillation of a pendulum.

† For $\sin x$, $x_0 = 0$, so $Y = \sin(0) + \cos(0)(x - 0) = x$.


4.5.4 Newton’s method for solving f .x/ D 0


Newton’s method for solving an equation of the form f .x/ D 0 (e.g. sin x C x 2 D 0:5) uses
the first derivative of f .x/. The method belongs to a general class of iterative methods. In
an iterative method, a starting point x0 is selected, then, a better value of the solution x1 is
computed using the information evaluated at x0 . This iterative process produces a sequence of
x0 ; x1 ; x2 ; : : :which converges to the root x  .
Figure 4.42: Newton's method for solving $f(x) = 0$ iteratively. We have a curve $y = f(x)$ and it intersects the $x$ axis at $x^*$; that's the solution of $f(x) = 0$. We start with a guess $x_0$, and we get $x_1$ using the tangent line to the curve at $(x_0, f(x_0))$. Then we repeat the process until we get to $x^*$.

The idea of Newton's method is illustrated in Fig. 4.42. At the point $(x_n, f(x_n))$, we draw a line tangent to the curve and find $x_{n+1}$ as the intersection of this line and the $x$-axis. Thus, $x_{n+1}$ is determined using $x_n$, $f(x_n)$ and $f'(x_n)$:

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} \tag{4.5.8}$$

p We can use Newton’s


Extraction of square roots. method to extract square roots of any positive
number. Let’s write x D a as f .x/ D x 2
a D 0. Using Eq. (4.5.8), we have
 
xn2 a xn a 1 a
xnC1 D xn D C D xn C (4.5.9)
2xn 2 2xn 2 xn
Note that the final expression was used by Babylonians thousands years before Newton. The
result of the calculation given in Table 4.12 demonstrates that Newton method converges
quickly. More precisely it converged quadratically when close to the solution: the last three
rows indicate that the error is halved each iteration.

Coding Newton’s method. Let’s solve this equation f .x/ D cos x x D 0 using a computer.
That is we do not compute f 0 .x/ explicitly and use Eq. (4.5.8) to get:
cos xn xn
xnC1 D xn C
1 C sin xn


Table 4.12: Solving $x = \sqrt{2}$ with $x_0 = 1$.

  $n$    $x_n$          error $= \sqrt{2} - x_n$
  1      1.0            4.14e-01
  2      1.5            -8.58e-02
  3      1.416666666    -2.45e-03
  4      1.414215686    -2.12e-06
  5      1.414213562    -1.59e-12
That is too restrictive. We want to write a function that requires only a function $f(x)$ and a tolerance. That's it. It will give us the solution for any input function. The idea is to use an approximation for the derivative, see Section 12.2.1. The code is given in Listing A.6. In any field (pure or applied maths, science or engineering), coding has become an essential skill. So, it is better to learn coding when you're young. That's why I have inserted many codes throughout the note.
Is Newton's method applicable only to one equation with one unknown, $f(x) = 0$? No! It is used to solve systems of equations with billions of unknowns, see Section 7.4. Actually, it is used every day by scientists and engineers. One big application is nonlinear finite element analysis to design machines, buildings, airplanes, you name it.
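Listing A.6 is not reproduced in this chapter, but a derivative-free Newton solver along the lines just described can be sketched in a few lines of Python (my own version, using a central-difference approximation of the derivative as mentioned above; it is not the author's Listing A.6):

```python
import math

def newton(f, x0, tol=1e-12, h=1e-7, max_iter=50):
    """Solve f(x) = 0 starting from x0, approximating f' by central differences."""
    x = x0
    for _ in range(max_iter):
        dfdx = (f(x + h) - f(x - h)) / (2 * h)
        x_new = x - f(x) / dfdx
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

print(newton(lambda x: math.cos(x) - x, 1.0))   # ~0.7390851332
```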

Exploring Newton’s method. With a program implementing the Newton method p we can
play with it, just to see what happens. For example, in the problemp of finding 2 by solving
x 2
2 D 0, if we start with x0 D 1, then the method gives us 2. Not that we want! But it
is also a root of x 2
2 D 0. Thus, the method depends on the initial guess (Fig. 4.43). To find a
good x0 for f .x/ D 0 we can use a graphic method: we plot y D f .x/ and locate the points it
intersects with the x-axis roughly, and use that for x0 .

Figure 4.43: Newton's method is sensitive to the initial guess: (a) $x_0 = -3$; (b) $x_0 = +3$.


Newton’s method on the complex plane. We discussed complex numbers in Section 2.25,
but we seldom use them. Let’s see if we can use Newton’s method to solve f .z/ D 0 such as
z 4 1 D 0 where z is a complex number. Just assume that we can treat functions of a complex
variable just as functions of a real variable, then

f .zn /
znC1 D zn (4.5.10)
f 0 .zn /

Let’s solve the simplest complex equation z 2 C 1 D 0, this equation has two solutions z D ˙i .
With the initial guess z0 D 1 C 0:5i Newton’s method converges to z D i (Table 4.13). So, the
method works for complex numbers too. Surprise? But happy. If z0 D 1 i , the method gives us
the other solution z D i (not shown here). If we pose this question we can discover something

Table 4.13: Solving $z^2 + 1 = 0$ with $z_0 = 1 + 0.5i$. See Listing A.7 for the code.

  $n$    $z_n$
  1      $0.1 + 0.45i$
  2      $-0.185294 + 1.28382i$
  3      $-0.0375831 + 1.02343i$
  4      $-0.000874587 + 0.99961i$
  5      $3.40826\times 10^{-7} + 1.0i$
  6      $1.04591\times 10^{-13} + 1.0i$

interesting. The question is: if $f(z) = 0$ has multiple roots, then which initial guesses $z_0$ converge to which roots? A computer can help us visualize this. Assume that we know the exact roots and that they are stored in a vector $z_{\text{exact}} = [\bar{z}_1, \bar{z}_2, \ldots]$. Corresponding to these exact roots are some colors, one color for each root. Then the steps are (a sketch of the resulting code is given after this list):

1. A large number of points on the complex plane is considered.

2. For each of these points, with coordinates $(x, y)$, form a complex number $z_0 = x + iy$. Use Eq. (4.5.10) with $z_0$ to find one root $z$. Then find the position of $z$ in $z_{\text{exact}}$, and thus the associated color. The point $(x, y)$ is now assigned that color.

3. Now we have a matrix of which each element is a color; we can plot this matrix as an image.

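Listing A.7 itself is not reproduced here; the following NumPy/Matplotlib sketch (my own simplified rendition of the three steps above, for $f(z) = z^3 - 1$) shows the idea:

```python
import numpy as np
import matplotlib.pyplot as plt

# exact roots of z^3 - 1 = 0: one "color" (index) per root
roots = np.array([1, -0.5 + 0.5j*np.sqrt(3), -0.5 - 0.5j*np.sqrt(3)])

n = 500
xs = np.linspace(-2, 2, n)
ys = np.linspace(-2, 2, n)
X, Y = np.meshgrid(xs, ys)
Z = X + 1j*Y                          # one starting point z0 per pixel

for _ in range(40):                   # Newton iterations, Eq. (4.5.10)
    Z = Z - (Z**3 - 1) / (3 * Z**2)

# index of the nearest exact root -> the basin of attraction of each pixel
basin = np.argmin(np.abs(Z[..., None] - roots), axis=-1)
plt.imshow(basin, extent=(-2, 2, -2, 2))
plt.show()
```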
You can find the code in Listing A.7. Let's apply it to $f(z) = z^3 - 1 = 0$. The roots of $f(z)$ are: $1$, $-1/2 + i\sqrt{3}/2$, and $-1/2 - i\sqrt{3}/2$. Three roots and thus three colors. Points in the green region converge to the root $\bar{z}_1 = 1$, those in the purple region to the root $\bar{z}_2 = -1/2 + i\sqrt{3}/2$, and the ones in the red region converge to the remaining root. These three domains are separated by a


Figure 4.44: Newton's fractal for $z^3 - 1 = 0$.

boundary which is known as the Newton fractal. We see complex numbers very close together converging to different solutions, arranged in an intricate pattern.
Arthur Cayley (1821 – 1895) was a prolific British mathematician who worked mostly on
algebra. He helped found the modern British school of pure mathematics. In 1879 he published
a theorem for the basin of attraction for quadratic complex polynomials. Cayley also considered
complex cubics, but was unable to find an obvious division for the basins of attraction. It was
only later in the early 20th century that French mathematicians Pierre Joseph Louis Fatou (1878 –
1929) and Gaston Maurice Julia (1893 – 1978) began to understand the nature of complex cubic
polynomials. With computers, from the 1980s mathematicians were finally able to create pictures of the basins of attraction of complex cubic functions.

4.6 The fundamental theorem of calculus


We have defined the integral of the function $y = f(x)$ over the interval $[a, b]$ as the area under the curve described by $y = f(x)$. By approximating this area by the area of many, many thin slices, we arrived at a definition of the integral as a limit of a sum of many, many small parts. But this definition is not powerful, as it only allowed us to compute the simple integral $\int_a^b x^n\,dx$. How about other functions? How about even the area of a circle?
It turns out that the answer is lying in front of us; we just do not see it. Let's get back to the distance-speed problem. From the distance traveled, $s(t)$, we can determine the instantaneous speed by differentiating the distance: $v(t) = ds/dt$. How about the inverse? Given the (non-uniform) speed $v(t)$, can we determine the distance? Let's do that.
Assume that the time interval of interest is $[0, T]$ where $T$ is a fixed number (e.g. 5 hours). Let's call $ds$ an infinitesimal distance; then the total distance is simply the sum of all these $ds$, or symbolically $\int_0^T ds$. But $ds = v\,dt$, so the distance is $\int_0^T v\,dt$. So, the distance is the area under the speed curve $v(t)$. Actually, this is not unexpected (Fig. 4.45).


Figure 4.45: Distance is the area under the speed curve.

If we now consider $T$ a variable, then $s(T)$ is a function of $T$:

$$s(T) = \int_0^T v(t)\,dt \tag{4.6.1}$$

But the derivative of $s(T)$ is $v(T)$, thus we have

$$\frac{d}{dT}\int_0^T v(t)\,dt = v(T) \tag{4.6.2}$$
which says that differentiating an integral of a function gives back the function. In other words, differentiation undoes integration. Actually, we have seen this before†:

$$\int_0^x t^2\,dt = \frac{x^3}{3}, \qquad \frac{d}{dx}\left(\frac{x^3}{3}\right) = x^2$$

Now, we suspect that this relation between differentiation and integration holds for any function. We set ourselves the task of examining this. Is the following true?

$$\frac{d}{dx}\int_0^x f(t)\,dt = f(x) \tag{4.6.3}$$

† For the indefinite integral see Section 4.3.3, with $b$ replaced by $x$.


The integral $\int_0^x f(t)\,dt$ is the area under the curve $y = f(t)$ from $0$ to $x$. By considering a tiny change $dx$, we can see that the change in this area is $f(x)\,dx$ (Fig. 4.46). Therefore, the derivative of the area is $f(x)$. We have proved Eq. (4.6.3) using the differential $dx$.


Figure 4.46: Geometric proof of Eq. (4.6.3). The key point is to think of the area problem dynamically. Imagine sliding $x$ to the right at a constant speed. You could even think of $x$ as time; Newton often did. Then the area of the crossed region changes continuously as $x$ moves. Because that area depends on $x$, it should be regarded as a function of $x$. Now consider a tiny change of $x$, denoted by $dx$. The area is increased by a tall, thin rectangle of height $f(x)$ and infinitesimal width $dx$; this tiny rectangle has an infinitesimal area $f(x)\,dx$. (Actually there is also a tiny triangle on top, but it is negligible compared with the rectangle.) Thus, the rate at which the area accumulates is $f(x)$. And this leads to Eq. (4.6.3).

Assume that the speed is $v(t) = 8t - t^2$; what is the distance $\int_0^T v(t)\,dt$? We do not know how to evaluate this integral (not using the definition of the integral, of course), but we know that it is a function $s(T)$ such that $ds/dT = v(T) = 8T - T^2$, from Eq. (4.6.2). A function like $s(T)$ is called an anti-derivative. We have just met something new here. Before, we were given a function, say $y = x^3$, and we were asked (or required) to find its derivative: $(x^3)' = 3x^2$. Now, we are facing the inverse problem: $(?)' = 3x^2$, that is, finding the function of which the derivative is $3x^2$. We know that function: it is $x^3$. Thus, $x^3$ is one anti-derivative of $3x^2$. I used the words one anti-derivative because we have other anti-derivatives. In fact, there are infinitely many anti-derivatives of $3x^2$; they are $x^3 + C$, where $C$ is called a constant of integration. It is there because the derivative of a constant is zero. Graphically, $x^3 + C$ is just a vertical translation of the curve $y = x^3$; the tangent to $x^3 + C$ at every point has the same slope as that of $x^3$.

Coming back now to $s(T)$, we can thus write:
$$\int_0^T (8t - t^2)\,dt = s(T) \quad\text{and}\quad \frac{ds}{dT} = 8T - T^2 \implies s(T) = \int_0^T (8t - t^2)\,dt = 4T^2 - \frac{T^3}{3} + C \quad (4.6.4)$$
To find the integration constant $C$, we use the fact that $s(0) = 0$, so $C = 0$.


It is straightforward to use Eq. (4.6.4) for determining the distance traveled between $t_1$ and $t_2$ (we are really computing the general definite integral $\int_a^b f(x)\,dx$ here):
$$\begin{aligned}\int_{t_1}^{t_2}(8t - t^2)\,dt &= s(t_2) - s(t_1)\\ &= \left(4t_2^2 - \frac{t_2^3}{3} + C\right) - \left(4t_1^2 - \frac{t_1^3}{3} + C\right)\\ &= \left(4t_2^2 - \frac{t_2^3}{3}\right) - \left(4t_1^2 - \frac{t_1^3}{3}\right)\end{aligned} \quad (4.6.5)$$

There is nothing special about distance and speed; we have, for any function $f(x)$, the following result
$$\int_a^b f(x)\,dx = F(b) - F(a) \quad\text{with}\quad \frac{dF}{dx} = f(x) \quad (4.6.6)$$
which is known as the fundamental theorem of calculus, often abbreviated as FTC. So, to find a definite integral we just need to find one anti-derivative of the integrand, evaluate it at the two end points and subtract. It is this theorem that makes the problem of finding the area under a curve a trivial exercise for modern high school students. Notice that the same problem once required the genius of the likes of Archimedes.
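To see the theorem in action, here is a small numerical check, a sketch in Julia (the interval $[1,3]$ and the midpoint rule are my choices for illustration): a Riemann sum of $f(x) = 8x - x^2$ should approach $F(3) - F(1)$ with $F(x) = 4x^2 - x^3/3$.

```julia
# FTC check: a Riemann sum of f over [a,b] should approach F(b) - F(a),
# where F is an anti-derivative of f.
f(x) = 8x - x^2
F(x) = 4x^2 - x^3/3          # one anti-derivative of f

a, b = 1.0, 3.0
for n in (10, 100, 1_000, 10_000)
    dx = (b - a) / n
    riemann = sum(f(a + (i - 0.5) * dx) * dx for i in 1:n)  # midpoint rule
    println("n = $n: Riemann sum = $riemann, F(b) - F(a) = $(F(b) - F(a))")  # -> 70/3 = 23.333...
end
```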
While it is easy to understand Eq. (4.6.5), as the distance traveled between $t_1$ and $t_2$ must be $s(t_2) - s(t_1)$, it is hard to believe that a definite integral, which is the sum of all the tiny rectangles, eventually equals $F(b) - F(a)$; only the end points matter. But this can be seen if we use Leibniz's differential notation:
$$\begin{aligned}\int_a^b f(x)\,dx &= \int_a^b \frac{dF}{dx}\,dx = \int_a^b dF\\ &= (F_2 - F_1) + (F_3 - F_2) + (F_4 - F_3) + \cdots + (F_n - F_{n-1})\\ &= F_n - F_1 = F(b) - F(a)\end{aligned}$$
The term $(F_2 - F_1) + (F_3 - F_2) + (F_4 - F_3) + \cdots + (F_n - F_{n-1})$ is a sum of differences: every intermediate value cancels, leaving only the first and the last. The same person (Leibniz) who often worked with such sums (see Section 2.21.6) was the one who discovered the fundamental theorem of calculus. Newton, on the other hand, discovered the exact same theorem via a different way: the way of motion. And Newton is the father of mechanics, the science of motion!


History note 4.1: Sir Isaac Newton (25 December 1642 – 20 March 1726/27)
Sir Isaac Newton was an English mathematician, physicist, astronomer,
and theologian (described in his own day as a "natural philosopher")
who is widely recognized as one of the most influential scientists of
all time and as a key figure in the scientific revolution. His book
Philosophiæ Naturalis Principia Mathematica (Mathematical Principles of Natural Philosophy), first published in 1687, established classical mechanics. Newton also made seminal contributions to optics,
and shares credit with Gottfried Wilhelm Leibniz for developing the
infinitesimal calculus.
Newton was born prematurely in 1642 at his family’s home near the town of Grantham,
several months after the death of his father, an illiterate farmer. When Newton was three,
his mother wed a wealthy clergyman, who didn’t want a stepson. Newton’s mother went
to live with her new husband in another village, leaving behind her young son in the care
of his grandparents.
In 1705, Newton was knighted by Queen Anne. By that time, he’d become wealthy after
inheriting his mother’s property following her death in 1679 and also had published
two major works, 1687’s “Mathematical Principles of Natural Philosophy” (commonly
called the “Principia”) and 1704’s “Opticks.” After the celebrated scientist died at age
84 on March 20, 1727, he was buried in Westminster Abbey, the resting place of English
monarchs as well as such notable non-royals as Charles Darwin, Charles Dickens and
explorer David Livingstone.

History note 4.2: Gottfried Wilhelm (von) Leibniz (1646-1716)


Gottfried Wilhelm Leibniz was a German philosopher, mathematician, and political adviser, important both as a metaphysician and as a logician, and distinguished also for his independent invention of the differential and integral calculus. As a child, he was educated in the Nicolai School but was largely self-taught in the library of his father, Friedrich Leibniz, a professor of moral philosophy at Leipzig, who had died in 1652. At Easter time in 1661, he entered the University of Leipzig as a law student; there he came into contact with the thought of scientists and philosophers who had revolutionized their fields, figures such as Galileo and René Descartes. During the 1670s (slightly later than
Newton’s early work), Leibniz developed a very similar theory of calculus, apparently
completely independently. Within the short period of about two months he had developed
a complete theory of differential calculus and integral calculus. Unlike Newton, however,
he was more than happy to publish his work, and so Europe first heard about calculus from
Leibniz in 1684, and not from Newton (who published nothing on the subject until 1693).
When the Royal Society was asked to adjudicate between the rival claims of the two men over the development of the calculus, they gave credit for the first discovery to Newton, and credit for the first publication to Leibniz. However, the Royal Society, under the rather biased presidency of Newton, later also accused Leibniz of plagiarism, a slur from which
Leibniz never really recovered. Ironically, it was Leibniz’s mathematics that eventually
triumphed, and his notation and his way of writing calculus, not Newton’s more clumsy
notation, is the one still used in mathematics today.

4.7 Integration techniques


Let's see how many ways we can compute integrals (indefinite or definite) using paper and pencil. The first way is to use the definition of the integral as the limit of the sum of the areas of many thin rectangles. The fundamental theorem of calculus saves us from going down this difficult track. Therefore, the second way is to find an anti-derivative of the integrand function. Anti-derivatives of many common functions have been determined and tabulated, so we just do a 'table look-up'. Clearly these tables cannot cover all functions, so we need a third way (or a fourth). This section presents integration techniques for functions whose anti-derivatives are not found in tables.

4.7.1 Integration by substitution


We will not find the anti-derivative of $\cos(x^2)\,2x$ in any table. However, $\int \cos(x^2)\,2x\,dx$ can be computed quite straightforwardly. Similarly, it is also possible to compute $\int \sqrt{1 + x^2}\,2x\,dx$. The two integrals are given by
$$\int \cos(x^2)\,2x\,dx = \sin(x^2) + C, \qquad \int \sqrt{1 + x^2}\,2x\,dx = \frac{2}{3}(1 + x^2)^{3/2} + C \quad (4.7.1)$$
And you can verify the above equation by differentiating the RHS: you get the integrands on the LHS. If you look at these two integrals carefully, you will recognize that they are of this form:
$$\int_a^b f(g(x))g'(x)\,dx = \int_\alpha^\beta f(u)\,du, \qquad u = g(x) \quad (4.7.2)$$
So, we do a change of variable $u = g(x)$, which leads to $du = g'(x)\,dx$; then the LHS of Eq. (4.7.2) becomes the RHS, i.e. $\int_a^b f(g(x))g'(x)\,dx = \int_\alpha^\beta f(u)\,du$. Of course, $\alpha = g(a)$ and $\beta = g(b)$. Eq. (4.7.2) is called integration by substitution and it is based on the chain rule of differentiation. Nothing new here: one fact of differentiation leads to another corresponding fact of integration, because they are related.

Now we can understand Eq. (4.7.1). Let's consider the first integral; we do the substitution $u = x^2$, hence $du = 2x\,dx$, then:
$$\int \cos(x^2)\,2x\,dx = \int \cos(u)\,du = \sin u + C = \sin(x^2) + C$$
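As a quick sanity check of the first line of Eq. (4.7.1), here is a hedged numerical sketch in Julia (the interval $[0,1]$ and the midpoint rule are choices made only for illustration):

```julia
# Check the first integral in Eq. (4.7.1) numerically on [0, 1]:
# a midpoint Riemann sum of cos(x^2)*2x should approach sin(1) - sin(0).
f(x) = cos(x^2) * 2x
n  = 100_000
dx = 1.0 / n
riemann = sum(f((i - 0.5) * dx) * dx for i in 1:n)
println(riemann)          # ≈ 0.841470984...
println(sin(1) - sin(0))  # 0.8414709848078965
```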


Proof. Proof of integration by substitution as given in Eq. (4.7.2). We start with a composite function $F(g(x))$, as we want to use the chain rule. We compute the derivative of this function:
$$\frac{d}{dx}F(g(x)) = F'(g(x))g'(x) \quad (4.7.3)$$
Now we integrate the two sides of the above equation to get:
$$\int_a^b \frac{d}{dx}F(g(x))\,dx = \int_a^b F'(g(x))g'(x)\,dx$$
(if we have two identical functions, the areas under the two curves described by these two functions are the same; that is what the above equation means). Now, the FTC tells us that
$$\int_a^b \frac{d}{dx}F(g(x))\,dx = F(g(b)) - F(g(a)) \quad (4.7.4)$$
Introducing two new numbers $\alpha = g(a)$ and $\beta = g(b)$, then as a result of the FTC, where $u = g(x)$, we have:
$$F(\beta) - F(\alpha) = \int_\alpha^\beta F'(u)\,du \quad (4.7.5)$$
From Eqs. (4.7.4) and (4.7.5) we obtain
$$\int_\alpha^\beta F'(u)\,du = \int_a^b \frac{d}{dx}F(g(x))\,dx = \int_a^b F'(g(x))g'(x)\,dx$$
To make $f(x)$ appear, just introduce $f(x) = F'(x)$; then the above equation becomes
$$\int_a^b f(g(x))g'(x)\,dx = \int_\alpha^\beta f(u)\,du \qquad \blacksquare$$


So, the substitution rule guides us to replace a hard integral by a simpler one. The main challenge is to find an appropriate substitution. For certain integrals, e.g. $\int\sqrt{1 - x^2}\,dx$, the new variable is clear: $x = \sin\theta$, just to get rid of the square root. I present such trigonometric substitutions in Section 4.7.7. For most cases, finding a good substitution is a matter in which practice and ingenuity, rather than systematic methods, come into their own.

Example 4.3
Let's compute the following integral
$$I = \int_0^\pi \frac{2x^3 - 3\pi x^2}{(1 + \sin x)^2}\,dx$$
which is from the 2015 Cambridge STEP 2†.

What change of variable should be used? After many unsuccessful attempts, we find that $u = \pi - x$ looks promising:
$$u = \pi - x \implies du = -dx, \quad 1 + \sin x = 1 + \sin u$$
Now we compute the numerator in terms of $u$:
$$\begin{cases} x^3 = (\pi - u)^3 = \pi^3 - 3\pi^2 u + 3\pi u^2 - u^3\\ x^2 = (\pi - u)^2 = \pi^2 - 2\pi u + u^2 \end{cases} \implies 2x^3 - 3\pi x^2 = -\pi^3 + 3\pi u^2 - 2u^3$$
And the integration limits do not change; therefore $I$ becomes:
$$I = \int_0^\pi \frac{-\pi^3 + 3\pi u^2 - 2u^3}{(1 + \sin u)^2}\,du = -\pi^3\int_0^\pi \frac{du}{(1 + \sin u)^2} - \int_0^\pi \frac{2u^3 - 3\pi u^2}{(1 + \sin u)^2}\,du$$
And what is the red term (the last integral)? It is $I$, so we have an equation for $I$, and solving it gives us a new form for $I$:
$$I = -\frac{\pi^3}{2}\int_0^\pi \frac{du}{(1 + \sin u)^2}$$
We stop here, as the new integral seems solvable. What we want to say here is that this integral was designed so that the substitution $u = \pi - x$ works. If we slightly modify the integral as follows
$$I_1 = \int_0^{\pi/2} \frac{2x^3 - 3\pi x^2}{(1 + \sin x)^2}\,dx, \quad I_2 = \int_0^\pi \frac{2x^3 - 3x^2}{(1 + \sin x)^2}\,dx, \quad I_3 = \int_0^\pi \frac{3x^3 - 3\pi x^2}{(1 + \sin x)^2}\,dx$$
our substitution would not work! That's why it was just a trick, even though a favorite one of examiners. How do we integrate these integrals then? We fall back on the very definition of the integral as the sum of many, many thin rectangles, but we use the computer to do the boring sum. This is called numerical integration (see Section 12.4 if you're interested; that's how scientists and engineers do integrals).

†Sixth Term Examination Papers in Mathematics, often referred to as STEP, are university admissions tests for undergraduate Mathematics courses developed by the University of Cambridge. STEP papers are typically taken post-interview, as part of a conditional offer of an undergraduate place. There are also a number of candidates who sit STEP papers as a challenge. The papers are designed to test the ability to answer questions similar in style to undergraduate Mathematics.
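As a small taste of such numerical integration, the sketch below (in Julia; the midpoint rule and the number of slices are my choices) evaluates the original STEP integral of Example 4.3 directly and also via the reduced form derived there; the two numbers should agree.

```julia
# Midpoint-rule evaluation of the STEP integral I of Example 4.3 and of the
# reduced form I = -(pi^3/2) * ∫_0^pi du/(1+sin u)^2; they should agree.
function midpoint(f, a, b; n=200_000)
    dx = (b - a) / n
    sum(f(a + (i - 0.5) * dx) * dx for i in 1:n)
end

I_direct  = midpoint(x -> (2x^3 - 3pi * x^2) / (1 + sin(x))^2, 0, pi)
I_reduced = -pi^3 / 2 * midpoint(u -> 1 / (1 + sin(u))^2, 0, pi)
println((I_direct, I_reduced))   # both ≈ -20.67
```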

4.7.2 Integration by parts

Integration by parts is based on the product rule of differentiation, which reads
$$[u(x)v(x)]' = u'(x)v(x) + v'(x)u(x)$$
Integrating both sides of the above equation gives us
$$u(x)v(x) = \int u'(x)v(x)\,dx + \int v'(x)u(x)\,dx \quad (4.7.6)$$
So, instead of calculating the integral $\int u'(x)v(x)\,dx$, we compute $\int v'(x)u(x)\,dx$, which should be simpler. Basically we transfer the derivative from $u$ to $v$. The hard part is to recognize which factor should be $u(x)$ and which $v(x)$. Some examples are provided to show how to use this technique.

Example 4.4
We want to determine $\int \ln x\,dx$. Start with $x\ln x$ and differentiate it (then $\ln x$ will show up), and we are done:
$$(x\ln x)' = \ln x + 1 \implies \int \ln x\,dx = x\ln x - x + C$$
Next, consider the integral $\int x\cos x\,dx$. Start with $x\sin x$ (for $(\sin x)' = \cos x$):
$$(x\sin x)' = \sin x + x\cos x \implies \int x\cos x\,dx = x\sin x - \int \sin x\,dx = x\sin x + \cos x + C$$

Example 4.5
Let's consider $\int x^2 e^x\,dx$. This one is interesting, as we will need to do integration by parts two times. First, recognize that the derivative of $e^x$ is itself, so we consider the function $x^2 e^x$; its derivative will produce $x^2 e^x$ (the integrand) plus another term with a lower power of $x$ (which is good). So,
$$(x^2 e^x)' = 2x e^x + x^2 e^x \implies \int x^2 e^x\,dx = x^2 e^x - 2\int x e^x\,dx$$
Now, we have an easier problem to solve: the integral of $x e^x$. Repeating the same step, we write
$$(x e^x)' = e^x + x e^x \implies \int x e^x\,dx = x e^x - \int e^x\,dx = x e^x - e^x$$
And, voilà, the result is
$$\int x^2 e^x\,dx = x^2 e^x - 2x e^x + 2e^x + C \quad (4.7.7)$$
Should we stop here and move to other integrals? If we stop here and someone comes to ask us to compute the integral $\int x^5 e^x\,dx$, or even $\int x^{20} e^x\,dx$, we would struggle to solve them. There is a structure behind Eq. (4.7.7), which we will come back to in Section 4.7.4.
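Before moving on, here is a quick numerical check of Eq. (4.7.7), a sketch in Julia (the interval $[0,1]$ and the midpoint rule are illustration choices): the Riemann sum should match $F(1) - F(0) = e - 2$.

```julia
# Check Eq. (4.7.7) numerically: compare a midpoint Riemann sum of x^2*e^x
# on [0, 1] with F(1) - F(0), where F is the anti-derivative found above.
F(x) = x^2 * exp(x) - 2x * exp(x) + 2 * exp(x)
n, dx = 100_000, 1.0e-5
xm(i) = (i - 0.5) * dx            # midpoint of the i-th slice
riemann = sum(xm(i)^2 * exp(xm(i)) * dx for i in 1:n)
println(riemann)      # ≈ 0.718281828...
println(F(1) - F(0))  # e - 2 ≈ 0.7182818284590451
```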


4.7.3 Trigonometric integrals: sine/cosine


This section presents the following integrals (on the left column are particular examples of the
integrals in the right column):
Z Z
2 3
sin x cos xdx sinp x cosq xdx (p or q is odd)
Z Z
sin5 xdx sinp x cosq xdx (p is odd, q D 0)
Z Z
sin2 x cos2 xdx sinp x cosq xdx (p or q is even)
Z 2 Z 2
sin 8x cos 6xdx sin px cos qxdx (p; q D 0; 1; 2; : : :)
Z0 2 Z0 2
sin 8x sin 7xdx sin px sin qxdx (p ¤ q)
Z 2
0 Z 2
0
2
sin 8xdx sin px sin qxdx (p D q)
0 0

We restrict the discussion in this section to nonnegative p and q. The next section is devoted to
negative exponents, and you can see it is about integration of tangents and secants. The integrals
in the last three rows are very important; they aren’t exercises on integrals. They are the basics
of Fourier series (Section 4.19).
Before computing these integrals, we Rwould like to calculate the last one without actually
2
calculating it. We know immediately that 0 sin2 8xdx D . Why? This is because:
Z 2 Z 2 Z 2
2 2
sin 8xdx C cos 8xdx D dx D 2 (4.7.8)
0 0 0

R 2 R 2
And 0 sin2 8xdx D 0 cos2 8xdx because of symmetry.
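These orthogonality-type results are easy to check numerically. The following sketch in Julia (midpoint rule; the number of slices is arbitrary) is only an illustration, not a proof:

```julia
# Numerical check of the integrals quoted above, using a midpoint rule on
# [0, 2pi]: ∫ sin(8x)cos(6x) dx = 0, ∫ sin(8x)sin(7x) dx = 0, ∫ sin(8x)^2 dx = pi.
function midpoint(f, a, b; n=100_000)
    dx = (b - a) / n
    sum(f(a + (i - 0.5) * dx) * dx for i in 1:n)
end

println(midpoint(x -> sin(8x) * cos(6x), 0, 2pi))  # ≈ 0
println(midpoint(x -> sin(8x) * sin(7x), 0, 2pi))  # ≈ 0
println(midpoint(x -> sin(8x)^2,         0, 2pi))  # ≈ pi ≈ 3.14159
```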

Example 4.6
Let's compute $\int \sin^2 x\cos^3 x\,dx$. As $\sin^2 x + \cos^2 x = 1$, we can always replace an even power of cosine ($\cos^2 x$) in terms of $\sin^2 x$. We are left with $\cos x\,dx$, which is fortunately $d(\sin x)$. So,
$$\sin^2 x\cos^3 x\,dx = \sin^2 x\cos^2 x\,d(\sin x) = \sin^2 x(1 - \sin^2 x)\,d(\sin x) = (\sin^2 x - \sin^4 x)\,d(\sin x) \quad (4.7.9)$$
Therefore (the following is actually substitution with $u = \sin x$)
$$\int \sin^2 x\cos^3 x\,dx = \int (\sin^2 x - \sin^4 x)\,d(\sin x) = \frac{1}{3}\sin^3 x - \frac{1}{5}\sin^5 x + C \quad (4.7.10)$$


Example 4.7
How about $\int \sin^5 x\,dx$? The same idea: $\sin^5 x = \sin^4 x\sin x$, and $\sin x\,dx = -d(\cos x)$. The details are:
$$\sin^5 x\,dx = -\sin^4 x\,d(\cos x) = -(1 - \cos^2 x)^2\,d(\cos x) = (-1 + 2\cos^2 x - \cos^4 x)\,d(\cos x)$$
Then, substitution, now with $u = \cos x$, gives
$$\int \sin^5 x\,dx = \int (-1 + 2\cos^2 x - \cos^4 x)\,d(\cos x) = -\cos x + \frac{2}{3}\cos^3 x - \frac{1}{5}\cos^5 x + C$$
These two examples cover the integral $\int \sin^p x\cos^q x\,dx$ where $p, q \ge 0$ and either $p$ or $q$ is odd. The next example is a case where the exponents are even.

Example 4.8
Consider the integral $\int \cos^4 x\,dx$. We can do integration by parts or use trigonometric identities to lower the exponent. Here is the second approach:
$$\cos^4 x = \left(\frac{1 + \cos 2x}{2}\right)^2 = \frac{1 + 2\cos 2x + \cos^2 2x}{4} = \frac{1}{4} + \frac{\cos 2x}{2} + \frac{1 + \cos 4x}{8}$$
Thus, the integral is given by
$$\int \cos^4 x\,dx = \int\left(\frac{1}{4} + \frac{\cos 2x}{2} + \frac{1 + \cos 4x}{8}\right)dx = \frac{3x}{8} + \frac{\sin 2x}{4} + \frac{\sin 4x}{32} + C$$

Example 4.9
Let's compute the integral $\int \sin^2 x\cos^2 x\,dx$. Again, we use trigonometric identities to lower the powers, this time for both sine and cosine:
$$\sin^2 x\cos^2 x = \frac{1 - \cos 2x}{2}\cdot\frac{1 + \cos 2x}{2} = \frac{1 - \cos^2 2x}{4} = \frac{1 - \cos 4x}{8} \implies \int \sin^2 x\cos^2 x\,dx = \frac{x}{8} - \frac{\sin 4x}{32} + C \quad (4.7.11)$$

Example 4.10
Consider the integral $\int_0^{2\pi} \sin 8x\cos 6x\,dx$. The best way is to use the product-to-sum identity, see Eq. (3.8.6), to replace the product with a sum of two sines:
$$\sin 8x\cos 6x = \frac{1}{2}\left(\sin 14x + \sin 2x\right) \implies \int_0^{2\pi} \sin 8x\cos 6x\,dx = \frac{1}{2}\int_0^{2\pi}(\sin 14x + \sin 2x)\,dx = 0$$
The result is zero because of the nature of the sine function: over a whole number of periods the positive area equals the negative area, so $\int_0^{2\pi}\sin nx\,dx = 0$ for any positive integer $n$. The last integral we consider is $\int_0^{2\pi}\sin 8x\sin 6x\,dx$. At this moment, this integral is a piece of cake: since $\sin 8x\sin 6x = \frac{1}{2}(\cos 2x - \cos 14x)$, we have
$$\int_0^{2\pi} \sin 8x\sin 6x\,dx = \frac{1}{2}\int_0^{2\pi}(\cos 2x - \cos 14x)\,dx = 0$$

4.7.4 Repeated integration by parts


To illustrate the technique, let's consider $\int \sin^n x\,dx$, $n \in \mathbb{N}$. By using integration by parts, we obtain a recursive formula for this integral:
$$\begin{aligned}
(\sin^{n-1}x\cos x)' &= (n-1)\sin^{n-2}x\cos^2 x - \sin^n x\\
\implies \int \sin^n x\,dx &= (n-1)\int \sin^{n-2}x\cos^2 x\,dx - \sin^{n-1}x\cos x\\
\implies \int \sin^n x\,dx &= (n-1)\int \sin^{n-2}x(1 - \sin^2 x)\,dx - \sin^{n-1}x\cos x\\
\implies \int \sin^n x\,dx &= \frac{n-1}{n}\int \sin^{n-2}x\,dx - \frac{1}{n}\sin^{n-1}x\cos x
\end{aligned} \quad (4.7.12)$$
Thus, each integration by parts lowers the power of $\sin x$ from $n$ to $n-2$; another integration by parts gets to $\int \sin^{n-4}x\,dx$. We proceed until we get either $\int \sin x\,dx$ if $n$ is odd or $\int dx$ if $n$ is even.

Now, we show that Eq. (4.7.12) can lead to an infinite product for $\pi$. Using the above but with integration limits $0$ and $\pi/2$, we have (the term $[-\frac{1}{n}\sin^{n-1}x\cos x]_0^{\pi/2} = 0$)
$$\int_0^{\pi/2}\sin^n x\,dx = \frac{n-1}{n}\int_0^{\pi/2}\sin^{n-2}x\,dx \quad (4.7.13)$$
Now, consider two cases: $n$ is even and $n$ is odd.


For the former case ($n = 2m$), repeated application of Eq. (4.7.13) gives us‡
$$\begin{aligned}
\int_0^{\pi/2}\sin^{2m}x\,dx &= \frac{2m-1}{2m}\int_0^{\pi/2}\sin^{2m-2}x\,dx\\
&= \left(\frac{2m-1}{2m}\right)\left(\frac{2m-3}{2m-2}\right)\int_0^{\pi/2}\sin^{2m-4}x\,dx \quad (4.7.14)\\
&= \left(\frac{2m-1}{2m}\right)\left(\frac{2m-3}{2m-2}\right)\left(\frac{2m-5}{2m-4}\right)\cdots\left(\frac{3}{4}\right)\left(\frac{1}{2}\right)\frac{\pi}{2}
\end{aligned}$$
And for odd powers $n = 2m+1$, we have
$$\int_0^{\pi/2}\sin^{2m+1}x\,dx = \left(\frac{2m}{2m+1}\right)\left(\frac{2m-2}{2m-1}\right)\cdots\left(\frac{4}{5}\right)\left(\frac{2}{3}\right) \quad (4.7.15)$$
From Eqs. (4.7.14) and (4.7.15), we obtain, by dividing the former equation by the latter,
$$\frac{\pi}{2} = \frac{2\cdot 2\cdot 4\cdot 4\cdots 2m\cdot 2m\cdots}{1\cdot 3\cdot 3\cdot 5\cdots (2m-1)\cdot(2m+1)\cdots} \quad (4.7.16)$$
where we used the fact that $\int_0^{\pi/2}\sin^{2m}x\,dx \big/ \int_0^{\pi/2}\sin^{2m+1}x\,dx \to 1$ when $m$ approaches infinity (a proof follows). This infinite product for $\pi/2$ is known as Wallis' infinite product, as John Wallis was the first to discover it (and what I am presenting is basically his derivation). In Section 4.15.7, I will present Euler's derivation of this infinite product.

‡To find out the numbers $3/4$ and $1/2$ in the last equality, just use $m = 3$. The number $\pi/2$ is nothing but $\int_0^{\pi/2}dx$ when $m$ has been reduced to 1.
Proof. We are proving the following result:
$$\lim_{m\to\infty}\frac{\int_0^{\pi/2}\sin^{2m}x\,dx}{\int_0^{\pi/2}\sin^{2m+1}x\,dx} = 1$$
As $0 \le x \le \pi/2$, we have
$$0 \le \sin x \le 1 \implies \sin^{2m+1}x \le \sin^{2m}x \le \sin^{2m-1}x$$
Thus, integrating these functions from $0$ to $\pi/2$ we get (see Eq. (4.3.8) if not clear)
$$\int_0^{\pi/2}\sin^{2m+1}x\,dx \le \int_0^{\pi/2}\sin^{2m}x\,dx \le \int_0^{\pi/2}\sin^{2m-1}x\,dx$$
A bit of rearrangement leads to
$$1 \le \frac{\int_0^{\pi/2}\sin^{2m}x\,dx}{\int_0^{\pi/2}\sin^{2m+1}x\,dx} \le \frac{\int_0^{\pi/2}\sin^{2m-1}x\,dx}{\int_0^{\pi/2}\sin^{2m+1}x\,dx}$$
Now, let's denote the ratio on the RHS of the above equation by $A$; we want to compute it. First, Eq. (4.7.12) is used to get
$$\int_0^{\pi/2}\sin^{2m+1}x\,dx = \frac{2m}{2m+1}\int_0^{\pi/2}\sin^{2m-1}x\,dx$$
Thus, $A$ is given by
$$A = \frac{\int_0^{\pi/2}\sin^{2m-1}x\,dx}{\int_0^{\pi/2}\sin^{2m+1}x\,dx} = \frac{2m+1}{2m}\cdot\frac{\int_0^{\pi/2}\sin^{2m-1}x\,dx}{\int_0^{\pi/2}\sin^{2m-1}x\,dx} = 1 + \frac{1}{2m}$$
From this, $A$ approaches 1 when $m$ approaches infinity; squeezed between 1 and $A$, the ratio of interest therefore also approaches 1. $\blacksquare$
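A short numerical illustration of Wallis' product (a Julia sketch; the values of $m$ are arbitrary): the partial products indeed creep towards $\pi/2$.

```julia
# Partial Wallis products: p_m = prod over k = 1..m of (2k/(2k-1)) * (2k/(2k+1)),
# which should approach pi/2 ≈ 1.5708 as m grows.
wallis(m) = prod((2k / (2k - 1)) * (2k / (2k + 1)) for k in 1:m)

for m in (10, 100, 1_000, 100_000)
    println("m = $m: ", wallis(m))
end
println("pi/2 = ", pi / 2)
```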


4.7.5 What is $\int_0^\infty x^4 e^{-x}\,dx$?

In Eq. (4.7.7), using integration by parts twice, we got the following result:
$$\int x^2 e^x\,dx = x^2 e^x - 2x e^x + 2e^x \quad (4.7.17)$$
We can see the structure in the RHS: $x^2 \to 2x \to 2$; that is the result of repeated differentiation of $x^2$. The alternating signs $+/-/+$ are due to the minus sign appearing in each integration by parts.

With this understanding, without actually doing the integration, we know that
$$\int x^4 e^x\,dx = x^4 e^x - 4x^3 e^x + 12x^2 e^x - 24x e^x + 24e^x$$
We can check this using SymPy.

Now we move to the integral $\int_0^\infty x^4 e^{-x}\,dx$. First, replacing $e^x$ by $e^{-x}$ we have the following results:
$$\int x^2 e^{-x}\,dx = -x^2 e^{-x} - 2x e^{-x} - 2e^{-x}, \qquad \int x^4 e^{-x}\,dx = -x^4 e^{-x} - 4x^3 e^{-x} - 12x^2 e^{-x} - 24x e^{-x} - 24e^{-x}$$
Focus now on the second integral, but with the special integration limits of $0$ and infinity:
$$\int_0^\infty x^4 e^{-x}\,dx = \left[-x^4 e^{-x} - 4x^3 e^{-x} - 12x^2 e^{-x} - 24x e^{-x}\right]_0^\infty - 4!\,e^{-x}\Big|_0^\infty \quad (4.7.18)$$
All the terms in the brackets are zero and $-e^{-x}\big|_0^\infty = 1$, thus we obtain a very interesting result:
$$\int_0^\infty x^4 e^{-x}\,dx = 4! \quad (4.7.19)$$
This is a stunning result. Can you see why? We will come back to it later in Section 4.20.2.
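Here is a quick numerical confirmation (a Julia sketch; truncating the infinite upper limit at 50 is an assumption that works because the integrand decays like $e^{-x}$):

```julia
# Numerical check of Eq. (4.7.19): ∫_0^∞ x^4 e^(-x) dx = 4! = 24.
# The integrand decays fast, so truncating the upper limit at 50 is enough.
f(x) = x^4 * exp(-x)
a, b, n = 0.0, 50.0, 1_000_000
dx = (b - a) / n
println(sum(f(a + (i - 0.5) * dx) * dx for i in 1:n))   # ≈ 24.0
println(factorial(4))                                    # 24
```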


4.7.6 Trigonometric integrals: tangents and secants


This section discusses the integrals $\int \tan^m x\,dx$ and $\int \sec x\,dx$. Why these two functions together? Because they are related via $\sec^2 x - \tan^2 x = 1$.

The plan is to start simple with $m = 1$, $m = 2$, then $m = 3$, and use the pattern observed in these results to get, hopefully, the integral for any $m \ge 3$. For $m = 1$, the integral is $\int \sin x/\cos x\,dx$, which is of the form $\int \sin^p x\cos^q x\,dx$ with $p = 1$ and $q = -1$ (both odd). So, in the same manner as in Section 4.7.3 we proceed as follows:
$$\int \tan x\,dx = \int \frac{\sin x}{\cos x}\,dx = -\int \frac{d(\cos x)}{\cos x} = -\ln|\cos x| + C \quad (4.7.20)$$
Now, we move to the square of the tangent, and we relate it to the secant (using $1 + \tan^2 x = \sec^2 x$ and $(\tan x)' = \sec^2 x$):
$$\int \tan^2 x\,dx = \int (\sec^2 x - 1)\,dx = \tan x - x + C \quad (4.7.21)$$
How about $\int \tan^3 x\,dx$? We write $\tan^3 x = \tan^2 x\tan x = (\sec^2 x - 1)\tan x$; thus we can relate this integral to $\int \tan x\,dx$, which we know, and something that is easy:
$$\begin{aligned}
\int \tan^3 x\,dx &= \int (\sec^2 x - 1)\tan x\,dx = \int \sec^2 x\tan x\,dx - \int \tan x\,dx\\
&= \int \tan x\,d(\tan x) + \ln|\cos x| \quad (4.7.22)\\
&= \frac{\tan^2 x}{2} + \ln|\cos x| \quad\text{(substitution } u = \tan x\text{)}
\end{aligned}$$
Now, we see the pattern and can do the general integral $\int \tan^m x\,dx$:
$$\int \tan^m x\,dx = \int \tan^{m-2}x\tan^2 x\,dx = \int (\sec^2 x - 1)\tan^{m-2}x\,dx = \frac{\tan^{m-1}x}{m-1} - \int \tan^{m-2}x\,dx \quad (4.7.23)$$
That is, we have a formula for $\int \tan^m x\,dx$ that requires $\int \tan^{m-2}x\,dx$, which in turn involves $\int \tan^{m-4}x\,dx$, and so on. Depending on $m$ being odd or even, this leads us to either $\int \tan x\,dx$ or $\int \tan^2 x\,dx$, which we know how to integrate.
Ok. Let's move to the secant function. How are we going to compute the integral $\int \sec x\,dx$? Replacing $\sec x = 1/\cos x$ would not help. Think of its friend $\tan x$; we do this:
$$\int \sec x\,dx = \int \frac{\sec x}{1}\,dx = \int \frac{\sec x}{\sec^2 x - \tan^2 x}\,dx$$
We succeeded in bringing in the two friends. Now the next step is just algebra:
$$\int \sec x\,dx = \int \frac{\sec x}{(\sec x - \tan x)(\sec x + \tan x)}\,dx = \frac{1}{2}\int\left(\frac{1}{\sec x - \tan x} + \frac{1}{\sec x + \tan x}\right)dx$$
Now, we switch to $\sin x$ and $\cos x$, as we see something familiar when doing so:
$$\begin{aligned}
\int \sec x\,dx &= \frac{1}{2}\int\left(\frac{\cos x}{1 - \sin x} + \frac{\cos x}{1 + \sin x}\right)dx\\
&= \frac{1}{2}\int\left(-\frac{d(1 - \sin x)}{1 - \sin x} + \frac{d(1 + \sin x)}{1 + \sin x}\right)\\
&= \frac{1}{2}\left(\ln(1 + \sin x) - \ln(1 - \sin x)\right) = \frac{1}{2}\ln\frac{1 + \sin x}{1 - \sin x}
\end{aligned}$$
We could stop here. However, we can further simplify the result, noting that
$$\frac{1 + \sin x}{1 - \sin x} = \frac{\sin^2\frac{x}{2} + \cos^2\frac{x}{2} + 2\sin\frac{x}{2}\cos\frac{x}{2}}{\sin^2\frac{x}{2} + \cos^2\frac{x}{2} - 2\sin\frac{x}{2}\cos\frac{x}{2}} = \left(\frac{\sin\frac{x}{2} + \cos\frac{x}{2}}{\cos\frac{x}{2} - \sin\frac{x}{2}}\right)^2 = \left(\frac{1 + \sin x}{\cos x}\right)^2 = (\sec x + \tan x)^2$$
And finally, the integral of $\sec x$ is:
$$\int \sec x\,dx = \frac{1}{2}\ln\frac{1 + \sin x}{1 - \sin x} = \frac{1}{2}\ln(\sec x + \tan x)^2 = \ln|\sec x + \tan x| + C$$
This one was hard, but $\int \sec^2 x\,dx$ is easy: $\sec^2 x = (\tan x)'$, so the integral is simply $\tan x + C$. How about $\int \sec^3 x\,dx$? We do the same thing we did for $\int \cos^3 x\,dx$:
$$\int \sec^3 x\,dx = \int \sec^2 x\sec x\,dx = \int (1 + \tan^2 x)\sec x\,dx = \int \sec x\,dx + \int \tan^2 x\sec x\,dx$$
For the integral $\int \tan^2 x\sec x\,dx$, we use integration by parts with $u = \sec x$ and $v = \tan x$. Finally,
$$\int \sec^3 x\,dx = 0.5\left(\sec x\tan x + \ln|\sec x + \tan x|\right) + C \quad (4.7.24)$$
Why bother with this integral? Because it is the answer to the problem of calculating the length of a segment of a parabola (Section 4.9.1).

4.7.7 Integration by trigonometric substitution


Trigonometric substitutions are useful for dealing with integrals involving terms such as $\sqrt{a^2 - x^2}$, $\sqrt{x^2 - a^2}$, $1/(x^2 + a^2)$, or $\sqrt{x^2 + a^2}$. These substitutions remove the square roots, and usually lead to simpler integrals.


Example 4.11
For example, consider the following definite integral
$$\int_0^4 \frac{dx}{\sqrt{16 - x^2}}$$
With the substitution $x = 4\sin\theta$, we have
$$x = 4\sin\theta \implies \begin{cases} dx = 4\cos\theta\,d\theta\\ \sqrt{16 - x^2} = \sqrt{16(1 - \sin^2\theta)} = 4\cos\theta\\ 0 \le \theta \le \pi/2\end{cases}$$
And thus the integral becomes
$$\int_0^4 \frac{dx}{\sqrt{16 - x^2}} = \int_0^{\pi/2}\frac{4\cos\theta}{4\cos\theta}\,d\theta = \frac{\pi}{2}$$

Example 4.12
Simple. But how about the following?
$$\int_4^8 \frac{dx}{\sqrt{x^2 - 16}}$$
The substitution $x = 4\sin\theta$ would not work: $\sqrt{x^2 - 16} = \sqrt{16(\sin^2\theta - 1)}$, which is meaningless, as the radicand is negative. So, we use another trigonometric function: the secant. The details are
$$x = 4\sec\theta \implies \begin{cases} dx = 4\tan\theta\sec\theta\,d\theta\\ \sqrt{x^2 - 16} = \sqrt{16(\sec^2\theta - 1)} = 4\tan\theta\\ 0 \le \theta \le \pi/3\end{cases} \quad (4.7.25)$$
And the original integral is simplified to
$$\int_4^8 \frac{dx}{\sqrt{x^2 - 16}} = \int_0^{\pi/3}\frac{4\tan\theta\sec\theta}{4\tan\theta}\,d\theta = \int_0^{\pi/3}\sec\theta\,d\theta = \ln(\sec\theta + \tan\theta)\Big|_0^{\pi/3}$$

Example 4.13
Now comes another trigonometric substitution, using the tangent function. The following integral
$$\int_0^\infty \frac{dx}{16 + x^2} \quad (4.7.26)$$
with
$$x = 4\tan\theta \implies \begin{cases} dx = 4\sec^2\theta\,d\theta\\ 16 + x^2 = 16(1 + \tan^2\theta) = 16\sec^2\theta\\ 0 \le \theta \le \pi/2\end{cases} \quad (4.7.27)$$
is simplified to
$$\int_0^\infty \frac{dx}{16 + x^2} = \int_0^{\pi/2}\frac{4\sec^2\theta}{16\sec^2\theta}\,d\theta = \frac{\theta}{4}\Big|_0^{\pi/2} = \frac{\pi}{8}$$
Sometimes we see an integral which is a disguised form of $\int \frac{dx}{16 + x^2}$, for example:
$$\int \frac{dx}{5x^2 - 10x + 25}$$
In this case, we just need to complete the square in the denominator and then use a tangent substitution. So, the steps are:
$$\int \frac{dx}{5x^2 - 10x + 25} = \frac{1}{5}\int \frac{dx}{x^2 - 2x + 5} = \frac{1}{5}\int \frac{d(x-1)}{(x-1)^2 + 4} = \frac{1}{5}\int \frac{du}{u^2 + 4} = \frac{1}{10}\tan^{-1}\left(\frac{x-1}{2}\right) + C$$
The second step is completing the square; the third step rewrites it in the familiar form of Eq. (4.7.26).

Example 4.14
Now we consider integrands of the form $\sqrt{x^2 + a^2}$, namely the integral
$$I = \int \sqrt{x^2 + a^2}\,dx, \quad a > 0 \quad (4.7.28)$$
with
$$x = a\tan\theta \implies \begin{cases} dx = a\sec^2\theta\,d\theta\\ \sqrt{x^2 + a^2} = a\sec\theta\end{cases}$$
So, $I$ becomes $a^2\int \sec^3\theta\,d\theta$; using the result in Eq. (4.7.24), we then have
$$\int \sqrt{x^2 + a^2}\,dx = a^2\int \sec^3\theta\,d\theta = 0.5a^2\left(\sec\theta\tan\theta + \ln|\sec\theta + \tan\theta|\right)$$
Now, we need to bring back $x$, noting that $\tan\theta = x/a$ and $\sec\theta = \sqrt{x^2 + a^2}/a$:
$$\int \sqrt{x^2 + a^2}\,dx = \frac{1}{2}x\sqrt{x^2 + a^2} + \frac{a^2}{2}\ln\left|x + \sqrt{x^2 + a^2}\right| + C \quad (4.7.29)$$
Is $x = a\tan\theta$ the only substitution for this problem? Euler did not think so. This is what he used:
$$\sqrt{x^2 + a^2} = x + t \implies x = \frac{a^2 - t^2}{2t},\quad dx = -\frac{a^2 + t^2}{2t^2}\,dt \implies I = \int\left(\frac{a^2 - t^2}{2t} + t\right)\left(-\frac{a^2 + t^2}{2t^2}\right)dt$$
Now, the original integral in terms of $x$ turns into an integral in terms of $t$. And this integral can be computed with ease; the result is some $g(t)$. What is interesting is to prove that $g(t)$ is identical to Eq. (4.7.29). That is a nice exercise in algebra. And it also demonstrates the beauty of mathematics: something as messy as $g(t(x))$ is nothing but the nice function in Eq. (4.7.29).

Example 4.15
I present one final trigonometric substitution, with which we can evaluate integrals of any rational function of $\sin x$ and $\cos x$. For example,
$$\int \frac{dx}{3 - 5\sin x}, \qquad \int \frac{dx}{1 + \sin x - \cos x}$$
The substitution (discovered by the German mathematician Karl Weierstrass (1815-1897)) is
$$u = \tan\frac{x}{2}, \qquad dx = \frac{2\,du}{1 + u^2}$$
This works because, as given in Eq. (3.8.8), we can express $\sin x$ and $\cos x$ in terms of $u$:
$$\sin x = \frac{2u}{1 + u^2}, \qquad \cos x = \frac{1 - u^2}{1 + u^2}$$
Then, $\int \frac{dx}{3 - 5\sin x}$ becomes:
$$\int \frac{dx}{3 - 5\sin x} = 2\int \frac{du}{3u^2 - 10u + 3} \quad (4.7.30)$$
This integral is of the form $\int P(u)/Q(u)\,du$, and we discuss how to integrate such rational functions in the next section.

It is always a good idea to pause and summarize what we have achieved. I provide such a summary in Table 4.14.

4.7.8 Integration of $P(x)/Q(x)$ using partial fractions


This section is about the integration of rational functions, those of the form P .x/=Q.x/ where
P .x/ and Q.x/ are polynomials (Section 2.29). The most important thing is that we can always
integrate these rationals using elementary functions that we know.


Table 4.14: Summary of trigonometric substitutions.

  form                                  substitution         $dx$                               new form
  $\sqrt{a^2 - x^2}$                    $x = a\sin\theta$    $a\cos\theta\,d\theta$             $a\cos\theta$
  $\sqrt{x^2 - a^2}$                    $x = a\sec\theta$    $a\tan\theta\sec\theta\,d\theta$   $a\tan\theta$
  $a^2 + x^2$                           $x = a\tan\theta$    $a\sec^2\theta\,d\theta$           $a^2\sec^2\theta$
  rational in $\sin x,\cos x$           $u = \tan(x/2)$      $2\,du/(1 + u^2)$                  $P(u)/Q(u)$
  (e.g. $\frac{1+\sin x}{1+\cos x}$)

We start off with this observation: it is not hard to evaluate the following indefinite integral
$$\int\left(\frac{1}{x - 2} + \frac{3}{x + 2} - \frac{4}{x}\right)dx = \ln|x - 2| + 3\ln|x + 2| - 4\ln|x| + C$$
However, it is not obvious how to do the integral $\int \frac{-4x + 16}{x^3 - 4x}\,dx$. The basic idea is that we can always transform $\frac{-4x+16}{x^3-4x}$ into a sum of simpler fractions (called partial fractions):
$$\frac{-4x + 16}{x^3 - 4x} = \frac{-4x + 16}{x(x - 2)(x + 2)} = \frac{A}{x} + \frac{B}{x - 2} + \frac{C}{x + 2}$$
where each partial fraction is of the form $p(x)/q(x)$ with the degree of the numerator one less than that of the denominator. This is called the method of Partial Fraction Decomposition†. To find the constants $A, B, C$, we just convert the RHS into the form of the LHS:
$$\frac{A}{x} + \frac{B}{x - 2} + \frac{C}{x + 2} = \frac{(A + B + C)x^2 + 2(B - C)x - 4A}{x^3 - 4x}$$
As this fraction is equal to $\frac{-4x+16}{x^3-4x}$, the two numerators must be the same: $(A + B + C)x^2 + 2(B - C)x - 4A \equiv -4x + 16$, which leads to
$$A + B + C = 0, \quad 2(B - C) = -4, \quad -4A = 16 \implies A = -4, \ B = 1, \ C = 3$$
Now $\int \frac{-4x+16}{x^3-4x}\,dx$ can be computed with ease:
$$\int \frac{-4x + 16}{x^3 - 4x}\,dx = \int\left(\frac{1}{x - 2} + \frac{3}{x + 2} - \frac{4}{x}\right)dx \quad (4.7.31)$$
With this new tool we can finish the integral $\int \frac{dx}{3 - 5\sin x}$, see Eq. (4.7.30):
$$\begin{aligned}
\int \frac{dx}{3 - 5\sin x} &= 2\int \frac{du}{3u^2 - 10u + 3} = \frac{1}{4}\int\left(\frac{du}{u - 3} - \frac{du}{u - 1/3}\right)\\
&= \frac{1}{4}\left(\ln|u - 3| - \ln|u - 1/3|\right) = \frac{1}{4}\left(\ln|\tan(x/2) - 3| - \ln|\tan(x/2) - 1/3|\right)
\end{aligned}$$

†The concept was discovered independently in 1702 by both Johann Bernoulli and Gottfried Leibniz.

Figure 4.47: Symbolic evaluation of integrals using the library SymPy in Julia. SymPy is actually a Python library, so we can also use it directly, not necessarily via Julia.

And we can check our result using a CAS (Fig. 4.47).
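For readers who want to reproduce that check, the sketch below shows the kind of SymPy-in-Julia session behind Fig. 4.47. It assumes the SymPy.jl package is installed; `integrate` and `apart` are the standard SymPy routines (if `apart` is not exported in your version, calling `sympy.apart` should work), and the exact printed form of the results may differ from the figure.

```julia
# Assumed sketch of the SymPy-in-Julia check behind Fig. 4.47
# (requires the SymPy.jl package).
using SymPy
@syms x
println(apart((-4*x + 16) / (x^3 - 4*x)))    # partial fractions, cf. Eq. (4.7.31)
println(integrate(1 / (3 - 5*sin(x)), x))    # the integral behind Eq. (4.7.30)
```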


If you were attentive you would observe that the two integrals we have just considered are of the form $P(x)/Q(x)$ where the degree of the denominator is larger than that of the numerator. Such rational functions are called proper rationals. And we just need to pay attention to them, as the other case can always be rewritten in this form, for example†:
$$\frac{2x^2 - 5x - 1}{x - 3} = 2x + 1 + \frac{2}{x - 3}$$
You should also have noticed that in the considered rationals, $Q(x)$ has distinct roots, i.e. it can be factored as $Q(x) = (a_1 x + b_1)(a_2 x + b_2)\cdots(a_n x + b_n)$, where $n$ is the degree of $Q(x)$. In this case, the partial fraction decomposition is:
$$\frac{P(x)}{Q(x)} = \frac{P(x)}{(a_1 x + b_1)(a_2 x + b_2)\cdots(a_n x + b_n)} = \frac{A_1}{a_1 x + b_1} + \frac{A_2}{a_2 x + b_2} + \cdots + \frac{A_n}{a_n x + b_n} \quad (4.7.32)$$
And it is always possible to find the $A_i$ when $P(x)$ is a polynomial of degree less than $n$, which is the case for proper rationals. Why is it possible? Note that $P(x)$ is a polynomial of degree at most $n - 1$, so it can be written as
$$P(x) = a_{n-1}x^{n-1} + \cdots + a_2 x^2 + a_1 x + a_0$$
Matching coefficients therefore gives $n$ equations relating the $n$ unknowns $A_i$ to the $n$ known coefficients $a_i$. So, it is possible to find $A_1, A_2, \ldots$

†If you do not like polynomials, think of $5/3 = (3+2)/3 = 1 + 2/3$.

The case where $Q(x)$ has repeated roots. Now we consider the case where $Q(x)$ has repeated roots, for example the following integral
$$\int \frac{x^2 + 15}{(x + 3)^2(x^2 + 3)}\,dx$$
A polynomial is an irreducible polynomial if it cannot be factored. As the blue term $x^2 + 3$ does not have real roots, we cannot factor it: it is irreducible. So, we leave it as it is and focus on the red term, which is obviously $(x + 3)(x + 3)$, a reducible form. Note that $Q(x) = 0$ has a repeated root of $-3$.

The decomposition in this case is a little bit special:
$$\frac{x^2 + 15}{(x + 3)^2(x^2 + 3)} = \frac{Ax + B}{x^2 + 3} + \frac{C}{(x + 3)^1} + \frac{D}{(x + 3)^2} \quad (4.7.33)$$
where the last two (red) terms follow this rule: for the term $(ax + b)^k$ in the denominator we need a partial fraction for each exponent from 1 up to $k$. As I am lazy, I have used SymPy to do this decomposition for me:
$$\frac{x^2 + 15}{(x + 3)^2(x^2 + 3)} = \frac{1}{2(x^2 + 3)} - \frac{x}{2(x^2 + 3)} + \frac{1}{2(x + 3)} + \frac{2}{(x + 3)^2} \quad (4.7.34)$$
Now, it is possible to compute the integral:
$$\begin{aligned}
\int \frac{x^2 + 15}{(x + 3)^2(x^2 + 3)}\,dx &= \int\left(\frac{1}{2(x^2 + 3)} - \frac{x}{2(x^2 + 3)} + \frac{1}{2(x + 3)} + \frac{2}{(x + 3)^2}\right)dx\\
&= \frac{1}{2\sqrt{3}}\arctan\frac{x}{\sqrt{3}} - \frac{1}{4}\ln\left(x^2 + 3\right) + \frac{1}{2}\ln(x + 3) - \frac{2}{x + 3} + E
\end{aligned}$$
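Since decompositions like Eq. (4.7.34) are easy to get wrong by hand, here is a cheap numerical sanity check (a Julia sketch; the sample points are arbitrary, chosen away from the pole at $x = -3$):

```julia
# Numerical sanity check of the decomposition in Eq. (4.7.34): both sides
# should agree at arbitrary sample points (away from the pole x = -3).
lhs(x) = (x^2 + 15) / ((x + 3)^2 * (x^2 + 3))
rhs(x) = 1/(2*(x^2 + 3)) - x/(2*(x^2 + 3)) + 1/(2*(x + 3)) + 2/(x + 3)^2
for x in (-1.3, 0.0, 2.7, 10.0)
    println((x, lhs(x) - rhs(x)))   # differences ≈ 0
end
```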

Explanation of the rule for $(ax + b)^k$.

To understand the decomposition when $Q(x)$ has repeated roots, consider the rational $1/(x+3)^2$. Our first attempt to decompose it is:
$$\frac{1}{(x + 3)^2} = \frac{A}{x + 3} + \frac{Bx + C}{(x + 3)^2}$$
Obviously, this would not work, as we have three unknowns $A, B, C$ but only two equations! Surprisingly, if we use a new variable $u = x + 3$, it works:
$$\frac{1}{u^2} = \frac{A}{u} + \frac{Bu + C}{u^2} = \frac{A + B}{u} + \frac{C}{u^2} = \frac{A_1}{u} + \frac{A_2}{u^2}$$
Now, there are only two unknowns $A_1, A_2$.

Example 4.16
To wrap up this section, let's compute the following integral
$$I = \int \frac{dx}{1 + x^4}$$
We first need to factor $1 + x^4$ as $1 + x^4 + 2x^2 - 2x^2 = (1 + x^2)^2 - (\sqrt{2}x)^2$; then we have
$$\frac{1}{1 + x^4} = \frac{1}{(1 + x^2)^2 - (\sqrt{2}x)^2} = \frac{1}{(x^2 + \sqrt{2}x + 1)(x^2 - \sqrt{2}x + 1)}$$
The next step is to do a partial fraction decomposition of this, and we are done. See Fig. 4.47 for the result, done by a CAS.

4.7.9 Tricks
This section presents a few tricks to compute some interesting integrals. If you are fascinated by difficult integrals, you can consult YouTube channels by searching for 'MIT integration bee' and the like†. Or you can read the book Inside Interesting Integrals by Paul Nahin [50].

The first example is the following integral
$$\int_{-1}^{1}\frac{\cos x}{1 + e^{1/x}}\,dx$$
You should ask why the integration limits are $-1$ and $1$, not $1$ and $2$. Note that $\int_{-a}^{a}f(x)\,dx = 0$ if $f(x)$ is an odd function. So, we decompose the integrand into an even and an odd part (see Eq. (4.2.1) if not clear):
$$\frac{\cos x}{1 + e^{1/x}} = \frac{1}{2}\left(\frac{\cos x}{1 + e^{1/x}} + \frac{\cos x}{1 + e^{-1/x}}\right) + \frac{1}{2}\left(\frac{\cos x}{1 + e^{1/x}} - \frac{\cos x}{1 + e^{-1/x}}\right) = \frac{1}{2}\cos x + \frac{1}{2}\left(\frac{\cos x}{1 + e^{1/x}} - \frac{\cos x}{1 + e^{-1/x}}\right)$$
And we do not care about the odd part, because its integral is zero anyway. So,
$$\int_{-1}^{1}\frac{\cos x}{1 + e^{1/x}}\,dx = \int_0^1 \cos x\,dx = \sin(1)$$

†One example from the MIT integration bee: $\int\sqrt{x\sqrt{x\sqrt{x\cdots}}}\,dx$.

Feynman's trick. This trick is based on the Leibniz rule, which basically says:
$$I(t) = \int_a^b f(x, t)\,dx \implies \frac{dI(t)}{dt} = \int_a^b \frac{\partial f(x, t)}{\partial t}\,dx \quad (4.7.35)$$
We refer to Section 7.8.7 for a discussion leading to this rule. The symbol $\frac{\partial f(x,t)}{\partial t}$ is the partial derivative of $f(x, t)$ with respect to $t$ while holding $x$ constant.

As a first application of this rule, we can generate new integrals from old ones. For example, we know the following integral (integrals in which one limit goes to infinity are called improper integrals; they are discussed in Section 4.8):
$$I = \int_0^\infty \frac{dx}{x^2 + a^2} = \left[\frac{1}{a}\tan^{-1}\frac{x}{a}\right]_0^\infty = \frac{\pi}{2a} \quad (4.7.36)$$
And by considering $a$ as a variable playing the role of $t$ in Eq. (4.7.35), we can write:
$$I(a) = \int_0^\infty \frac{dx}{x^2 + a^2} \implies \frac{dI}{da} = \int_0^\infty \frac{-2a}{(x^2 + a^2)^2}\,dx \quad (4.7.37)$$
And from Eq. (4.7.36), which says $I = \pi/2a$, we easily get $dI/da = -\pi/2a^2$, and thus we obtain the following new integral:
$$\int_0^\infty \frac{-2a}{(x^2 + a^2)^2}\,dx = -\frac{\pi}{2a^2} \implies \int_0^\infty \frac{dx}{(x^2 + a^2)^2} = \frac{\pi}{4a^3}$$
Of course, we can go further by computing $d^2I/da^2$ and get more new integrals. But we stop here to do something else.
do something else.
Suppose we need to evaluate this integral (whose antiderivative cannot be expressed in elementary functions):
$$\int_0^1 \frac{x^2 - 1}{\ln x}\,dx \quad (4.7.38)$$
So, we introduce a parameter $b$ to get
$$I(b) = \int_0^1 \frac{x^b - 1}{\ln x}\,dx \implies \frac{dI}{db} = \int_0^1 \frac{d}{db}\left(\frac{x^b - 1}{\ln x}\right)dx = \int_0^1 x^b\,dx = \frac{1}{1 + b} \quad (4.7.39)$$
So, we were able to compute $dI/db$ because the integral became simpler! Another integration gives us $I(b)$:
$$\frac{dI}{db} = \frac{1}{1 + b} \implies I(b) = \ln|1 + b| + C \quad (4.7.40)$$
To find $C$, we just look for a special value of $b$ such that $I(b)$ can be easily evaluated. It can be seen that $I(0) = 0 = \ln 1 + C$, so $C = 0$. And now we come back to the original integral in Eq. (4.7.38), which is nothing but $I(2)$; and $I(2) = \ln 3$. This trick is very cool. I did not know it in high school, and only became aware of it by reading Nahin's book [50].
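A numerical check of this result (a Julia sketch; the midpoint rule is an illustration choice, and it conveniently avoids the end points, where the integrand needs a limit):

```julia
# Numerical check of Eq. (4.7.38): ∫_0^1 (x^2 - 1)/ln(x) dx should equal ln(3).
# The integrand tends to 0 as x -> 0 and to 2 as x -> 1, so a midpoint rule
# that never evaluates x = 0 or x = 1 works fine.
f(x) = (x^2 - 1) / log(x)
n  = 1_000_000
dx = 1.0 / n
println(sum(f((i - 0.5) * dx) * dx for i in 1:n))   # ≈ 1.0986...
println(log(3))                                      # 1.0986122886681098
```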
Let's consider another integral: $\int_0^\infty e^{-x^2}\cos(5x)\,dx$. We consider the following parameterized integral, and apply the now familiar procedure:
$$I(b) = \int_0^\infty e^{-x^2}\cos(bx)\,dx \implies \frac{dI}{db} = -\int_0^\infty x e^{-x^2}\sin(bx)\,dx = -\frac{b}{2}\int_0^\infty e^{-x^2}\cos(bx)\,dx = -\frac{b}{2}I(b) \quad (4.7.41)$$
in which we have used integration by parts to arrive at the final equality. Now, we get an equation determining $I(b)$; this is in fact an ordinary differential equation:
$$\frac{dI}{db} = -\frac{b}{2}I(b) \quad (4.7.42)$$
Following a separation of variables (that is, isolating the two variables $I$ and $b$ on the two sides of the equation), we can get $I(b)$ by integration:
$$\frac{dI}{I} = -\frac{b}{2}\,db \implies \ln|I| = -\frac{b^2}{4} + D \implies I = Ce^{-b^2/4} \quad (C = e^D) \quad (4.7.43)$$
Again, we need to find $C$: with $b = 0$, we have $I(0) = C = \int_0^\infty e^{-x^2}\,dx = \sqrt{\pi}/2$†. So, we get a nice result for our original integral, and many more corresponding to different values of $b$:
$$I(5) = \int_0^\infty e^{-x^2}\cos(5x)\,dx = \frac{\sqrt{\pi}}{2}e^{-25/4}, \qquad I(2) = \int_0^\infty e^{-x^2}\cos(2x)\,dx = \frac{\sqrt{\pi}}{2e} \quad (4.7.44)$$

†How to compute the integral $\int_0^\infty e^{-x^2}\,dx$ is another story; see Section 5.11.3.
R1
Dirichlet integral. Another interesting integral is $\int_0^\infty \frac{\sin x}{x}\,dx$. Let us first try to introduce a parameter $b$ in such a way that differentiating the integrand gives a simpler integral:
$$I(b) = \int_0^\infty \frac{\sin bx}{x}\,dx \implies \frac{dI}{db} = \int_0^\infty \cos(bx)\,dx = \left[\frac{\sin bx}{b}\right]_0^\infty$$
Unfortunately, we got an integral that does not converge. So, we need another way. We need a factor whose derivative with respect to $b$ brings down an $x$ (to cancel the $x$ in the denominator); that would be $e^{bx}$, but because of the infinite upper limit we have to use $e^{-bx}$ with $b \ge 0$. Thus, we consider the following integral
$$I(b) = \int_0^\infty \frac{\sin x}{x}e^{-bx}\,dx$$
from which $\int_0^\infty \frac{\sin x}{x}\,dx = I(0)$. Let's differentiate this integral w.r.t. $b$:
$$\frac{dI}{db} = -\int_0^\infty \sin x\,e^{-bx}\,dx = -A$$
We can evaluate the integral $A$ using integration by parts (twice):
$$A = \int_0^\infty \sin x\,e^{-bx}\,dx = \frac{1}{b}\int_0^\infty \cos x\,e^{-bx}\,dx, \qquad \int_0^\infty \cos x\,e^{-bx}\,dx = \frac{1}{b} - \frac{A}{b} \implies A = \frac{1}{1 + b^2}$$
Now we have $I'(b)$, and a simple integration gives us $I(b)$:
$$I'(b) = -\frac{1}{1 + b^2} \implies I(b) = -\int\frac{db}{1 + b^2} = -\tan^{-1}b + C$$
We have to find $C$: with $b \to \infty$, we can compute both $I(b)$ and $\tan^{-1}b$, and thus we can get $C$:
$$I(\infty) = \int_0^\infty \frac{\sin x}{x}e^{-\infty\cdot x}\,dx = 0 = -\tan^{-1}(\infty) + C \implies C = \frac{\pi}{2}$$
So,
$$\int_0^\infty \frac{\sin x}{x}\,dx = \frac{\pi}{2} \quad (4.7.45)$$
Another amazing result involving $\pi$. The graph of $\sin x/x$ is given in Fig. 4.48a, and the area of the shaded region is nothing else but exactly half of $\pi$.
Mathematicians defined the following function‡:
$$\mathrm{Si}(x) := \int_0^x \frac{\sin t}{t}\,dt \quad (4.7.46)$$
And our task is to derive an expression for $\mathrm{Si}(x)$. We have just seen that we cannot compute this integral directly, and the Feynman technique only works for definite integrals in which the limits are numbers, not variables. But we have another way, from Newton: we can replace $\sin t$ by its Taylor series, and then we can integrate $\sin t/t$ easily:
$$\sin t = t - \frac{1}{3!}t^3 + \frac{1}{5!}t^5 - \cdots \implies \frac{\sin t}{t} = 1 - \frac{t^2}{6} + \frac{t^4}{5!} - \cdots$$
Thus, we can write
$$\int_0^x \frac{\sin t}{t}\,dt = \int_0^x\left(1 - \frac{t^2}{3!} + \frac{t^4}{5!} - \cdots\right)dt = \left[t - \frac{t^3}{3\cdot 3!} + \frac{t^5}{5\cdot 5!} - \cdots\right]_0^x = \frac{x}{1\cdot 1!} - \frac{x^3}{3\cdot 3!} + \frac{x^5}{5\cdot 5!} - \cdots$$
And voilà, the $\mathrm{Si}(x)$ function is written as:
$$\mathrm{Si}(x) := \int_0^x \frac{\sin t}{t}\,dt = \sum_{i=0}^{\infty}(-1)^i\frac{x^{2i+1}}{(2i+1)(2i+1)!} \quad (4.7.47)$$
With this result we can visualize the function; see Fig. 4.48, where the graph of $\sin x/x$ is also shown. Given the graph, we can now understand Eq. (4.7.45): when $x \to \infty$, the function approaches $1.57079\ldots$, which is $\pi/2$.

‡Why did they do so? There is no elementary function whose derivative is $\sin x/x$. However, antiderivatives of this function come up moderately frequently in applications, for example in signal processing. So it has been convenient to give one of its antiderivatives, $\int_0^x \frac{\sin t}{t}\,dt$, a name.
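The series (4.7.47) is also a practical way to evaluate $\mathrm{Si}(x)$. The sketch below (in Julia; the 40-term truncation and the sample points are arbitrary choices) shows the values creeping towards $\pi/2$ as $x$ grows:

```julia
# Evaluate Si(x) from the series of Eq. (4.7.47), truncated at `nterms` terms;
# for large x, Si(x) oscillates towards pi/2 ≈ 1.5708 (cf. Eq. (4.7.45)).
Si(x; nterms=40) = sum((-1)^i * x^(2i + 1) / ((2i + 1) * factorial(big(2i + 1)))
                       for i in 0:nterms-1)

for x in (1.0, pi, 10.0)
    println("Si($x) ≈ ", Float64(Si(x)))
end
```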


Figure 4.48: Graph of $\sin x/x$ (a) and graph of $\mathrm{Si}(x) = \int_0^x \frac{\sin t}{t}\,dt$ (b).

4.8 Improper integrals


Improper integrals refer to those integrals where one or both of the integration limits are infinite, e.g. $\int_1^\infty dx/x^2$, or where the integrand goes to infinity, e.g. $\int_0^1 dx/\sqrt{x}$.

To see how we compute improper integrals, let's consider one simple integral:
$$I = \int_1^\infty \frac{dx}{x^2}$$
We do not know how to evaluate this integral directly, but we know how to compute $I(b) = \int_1^b dx/x^2$: it is $I(b) = 1 - 1/b$. By considering different values of $b$ (larger than 1, of course), we get a sequence of integrals, see Fig. 4.49. Let's denote this sequence by $(I_1, I_2, \ldots, I_n)$. It is obvious that this sequence converges to 1 when $n$ approaches infinity. In other words, the area under the curve $y = 1/x^2$ from 1 to infinity is one. Therefore, we define
$$I = \int_1^\infty \frac{dx}{x^2} := \lim_{b\to\infty}\int_1^b \frac{dx}{x^2} = 1$$
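To see the limit emerging numerically, here is a Julia sketch (midpoint rule; the values of $b$ mirror those of Fig. 4.49 plus two larger ones):

```julia
# The limit definition in action: I(b) = ∫_1^b dx/x^2, computed with a midpoint
# rule, approaches 1 as b grows (cf. Fig. 4.49).
function midpoint(f, a, b; n=100_000)
    dx = (b - a) / n
    sum(f(a + (i - 0.5) * dx) * dx for i in 1:n)
end

for b in (3, 5, 9, 100, 10_000)
    println("b = $b: ", midpoint(x -> 1 / x^2, 1, b))
end
```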

In the same manner, if the lower integration limit is minus infinity, we have this definition:
$$I = \int_{-\infty}^b f(x)\,dx := \lim_{a\to-\infty}\int_a^b f(x)\,dx$$
The next improper integral to be discussed is certainly the one with both integration limits infinite, like the following:
$$I = \int_{-\infty}^\infty \frac{dx}{1 + x^2}$$
The strategy is to split this into two improper integrals of the forms we already know how to compute:
$$I = \int_{-\infty}^a \frac{dx}{1 + x^2} + \int_a^\infty \frac{dx}{1 + x^2}$$


Figure 4.49: Integral $\int_1^b dx/x^2$ for different values of $b$ ($b = 3, 5, 9$ in panels (a), (b), (c)). Source code for the figures: plot-calculus-area.jl.

To ease the computation we will select $a = 0$, just because 0 is an easy number to work with. The split does not, however, depend on $a$ (as I will show shortly). With the substitution $x = \tan\theta$, see Table 4.14, we can compute the two integrals and thus $I$ as
$$I = \left[0 - \left(-\frac{\pi}{2}\right)\right] + \left[\frac{\pi}{2} - 0\right] = \frac{\pi}{2} + \frac{\pi}{2} = \pi$$
Now, to show that any value of $a$ is fine, we just use $a$, and compute $I$ as:
$$I = \left[\arctan a - \left(-\frac{\pi}{2}\right)\right] + \left[\frac{\pi}{2} - \arctan a\right] = \left(\arctan a + \frac{\pi}{2}\right) + \left(\frac{\pi}{2} - \arctan a\right) = \pi$$
And what we have done for this particular integral applies to $\int_{-\infty}^\infty f(x)\,dx$ in general.
R1
And what we have done for this particular integral applies for 1 f .x/dx.

4.9 Applications of integration


Some elementary applications of integration are given herein. We start with the computation of the length of planar curves in Section 4.9.1, and of areas and volumes in Section 4.9.2. An application in physics, the gravitation of distributed masses, is presented in Section 4.9.4. Finally, the use of integrals to compute limits of sums is the topic of Section 4.9.5.

4.9.1 Length of plane curves


As the first application of integration, we consider the problem of calculating the length (or arc length) of a plane curve expressed by the equation $y = f(x)$. Determining the length of a curve is also called rectification of a curve. This is because when rectified, the curve gives a straight line segment with the same length as the curve. For much of the history of mathematics, even the greatest thinkers (e.g. Descartes) considered it impossible to compute the length of a curve. The advent of infinitesimal calculus led to a general formula that provides closed-form solutions in some cases.

The idea is simple: take a very small segment $ds$, of which the length is certainly $ds = \sqrt{dx^2 + dy^2}$; then integrating/summing the lengths of all these segments, we get the total length (Fig. 4.50). In symbols, we write (as $dy = f'(x)\,dx$)
$$\int_a^b ds = \int_a^b \sqrt{1 + [f'(x)]^2}\,dx \quad (4.9.1)$$
What does this equation mean? It tells us that the length of a plane curve $C$ is the area under another curve $C'$ with the function $\sqrt{1 + [f'(x)]^2}$! And this arc-length problem exposes the essence of calculus in the simplest way: differential calculus allows us to compute the length of a small part of the curve ($ds$), and then integral calculus gives us a tool to integrate all these small lengths ($\int ds$). And this is used again and again in science and engineering.

Figure 4.50: Arc length of a curve $y = f(x)$ for $x \in [a, b]$.

Perhaps you are not convinced by this physicists' approach; for example, where is the Riemann sum that appears in the definition of a definite integral? For those readers, look at the right picture in Fig. 4.50. You see that we divide the interval $[a, b]$ into $n$ sub-intervals. For a sub-interval of length $\Delta x_i$, $\Delta s_i = \sqrt{1 + [f'(x_i)]^2}\,\Delta x_i$, and then
$$L = \lim_{n\to\infty}\sum_{i=1}^n \sqrt{1 + [f'(x_i)]^2}\,\Delta x_i$$
The Riemann sum appears, and that is why we have the integral in Eq. (4.9.1).

Perimeter of a circle. We only have separate functions for each of the four quarters of a circle, so we compute the length of the first quarter. We write the circle's equation as $y = \sqrt{1 - x^2}$; then a direct application of Eq. (4.9.1) gives
$$\int_0^1\sqrt{1 + \frac{x^2}{1 - x^2}}\,dx = \int_0^1\frac{dx}{\sqrt{1 - x^2}} = \int_0^{\pi/2}d\theta = \frac{\pi}{2} \quad (x = \sin\theta)$$

Perimeter of an ellipse. Compute the perimeter of 1/4 of the ellipse given by $y^2 + 2x^2 = 2$. We write the ellipse as $y = \sqrt{2 - 2x^2}$; then an application of Eq. (4.9.1) results in
$$\int_0^1\sqrt{\frac{1 + x^2}{1 - x^2}}\,dx$$
Unfortunately, we cannot compute this integral unless we use numerical integration (Section 12.4). Be careful: the integrand is infinite at $x = 1$, and thus not every numerical integration method can be used. There is no simple exact closed formula for the perimeter of an ellipse! We will come back to this problem of determining the ellipse perimeter shortly.
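Here is one way to do that numerically (a Julia sketch; the substitution $x = \sin t$ used below is my choice to remove the singular end point, after which a plain midpoint rule applies without trouble):

```julia
# Numerical estimate of the quarter-perimeter ∫_0^1 sqrt((1+x^2)/(1-x^2)) dx.
# With x = sin(t), dx = cos(t) dt, the integral becomes
# ∫_0^{pi/2} sqrt(1 + sin(t)^2) dt, which is smooth and easy to integrate.
g(t) = sqrt(1 + sin(t)^2)
n  = 1_000_000
dt = (pi / 2) / n
quarter = sum(g((i - 0.5) * dt) * dt for i in 1:n)
println("quarter perimeter ≈ ", quarter)       # ≈ 1.9101
println("full perimeter    ≈ ", 4 * quarter)
```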

Arc length of a parabola. We find the arc length of the parabola $y = x^2$ for $0 \le x \le a$:
$$s = \int_0^a\sqrt{1 + 4x^2}\,dx$$
You know how to compute this integral (Section 4.7.7). Herein we are interested in finding $C'$, which is given by $y = \sqrt{1 + 4x^2}$, or $y^2 = 1 + 4x^2$. And this is a hyperbola. So the length of a parabola is nothing new: it is the area under its cousin, a hyperbola.

Elliptic integrals. Consider an ellipse given by $x^2/a^2 + y^2/b^2 = 1$, with $a > b$; its length is given by
$$C = 4\int_0^{\pi/2}\sqrt{a^2\cos^2 t + b^2\sin^2 t}\,dt$$
With $k = \sqrt{a^2 - b^2}/a$, we can rewrite the above integral as
$$C = 4aE(k), \qquad E(k) = \int_0^{\pi/2}\sqrt{1 - k^2\sin^2 t}\,dt$$
The integral $E(k)$ is known as an elliptic integral. The name comes from the integration of the arc length of an ellipse. As there are other kinds of elliptic integrals, the precise name is the elliptic integral of the second kind. What, then, is the elliptic integral of the first kind? It is defined as
$$K(k) = \int_0^{\pi/2}\frac{dt}{\sqrt{1 - k^2\sin^2 t}}$$
It is super interesting to realize that this integral appears again and again in physics. We will see it in the calculation of the period of a simple pendulum (Section 9.8.6).

4.9.2 Areas and volumes


Herein I present the application of integration to computing the area and volume of some geometrical shapes. First, let's consider a circle of radius $r$ (centered at the origin of the coordinate system). We can write the equation for one quarter of the circle as $y = \sqrt{r^2 - x^2}$, and thus the area of one quarter of the circle is
$$\int_0^r\sqrt{r^2 - x^2}\,dx = \int_0^{\pi/2}r^2\cos^2\theta\,d\theta = \frac{\pi r^2}{4}$$
And voilà, the circle area is $\pi r^2$, a result that once required the genius of Archimedes and the likes. This corresponds to the traditional way of slicing a region into thin rectangular strips.


For circles, which possess rotational symmetry, a better way is to divide the circle into many wedges, see Fig. 4.51:
$$\int_0^{2\pi}\frac{r^2}{2}\,d\theta = \pi r^2$$
And this gives directly the area of the full circle; no trigonometric substitution is required.

Figure 4.51: Area of a circle: two ways of integration. In the second way, the area of one wedge is $\frac{1}{2}r(r\,d\theta)$, as a wedge is (approximately) a triangle of base $r\,d\theta$ and height $r$.

Next, we compute the volume of a cone with base radius $r$ and height $h$. We approximate the cone as a series of thin slices of thickness $dy$ parallel to the base, see Fig. 4.52. The volume of each slice is $\pi R^2\,dy$ with $R = r(1 - y/h)$, and thus the volume of the cone is:
$$\int_0^h \pi R^2\,dy = \pi r^2\int_0^h\left(1 - \frac{y}{h}\right)^2 dy = \frac{\pi r^2 h}{3} = \frac{1}{3}Ah, \qquad A = \pi r^2$$
Therefore, the volume of a cone is one third of the volume of a cylinder of the same height and radius, in agreement with the findings of the Greek mathematicians.

Figure 4.52: Volume of a cone by integration. The thin slice has a thickness $dy$ and a radius $R = r(1 - y/h)$, where $y$ is the vertical coordinate of its center.


In the same manner, we compute the volume of a sphere of radius $R$ as follows (see the accompanying figure, in which a slice has $r^2 = R^2 - y^2$ and $dV = \pi r^2\,dy$). We consider a slice of thickness $dy$, which is nothing but a cylinder, of which the volume is $\pi r^2\,dy$, with $r^2 = R^2 - y^2$, where $R$ is the sphere's radius and $y$ is the distance from the origin to the slice. Thus, the total volume, which is the sum of the volumes of all these slices, is:
$$2\int_0^R \pi(R^2 - y^2)\,dy = \frac{4\pi R^3}{3}$$
We needed a factor of 2 because we consider only the slices of the top half of the sphere. Can you see the similarity with Eq. (4.3.1)? In that equation I presented a pre-calculus method to determine the volume of a sphere.

4.9.3 Area and volume of a solid of revolution


Starting with a curve $y = f(x)$ for $a \le x \le b$, we revolve it around an axis. That produces a solid of revolution (Fig. 4.53). This section presents how to use integration to compute the volume of such solids and also the area of their surfaces.

Figure 4.53: Solid of revolution: revolving the red curve $y = f(x)$ around an axis (the red axis) one full turn ($360^\circ$) gives a solid of revolution. Generated using the GeoGebra software.

Volume of a solid of revolution. Assume we have a solid of revolution that is symmetrical around the $x$ axis. This solid is divided into many slices; each slice is similar to a pizza with area $\pi y^2$, $y = f(x)$, and thickness $dx$. Thus the volume of the solid is given by
$$\text{volume of solid of revolution around the } x\text{-axis} = \int_a^b \pi[f(x)]^2\,dx \quad (4.9.2)$$

Area of the surface of a solid of revolution. Using the idea of calculus, to find the area of a surface of revolution we divide the surface into many tiny pieces, compute the area of each piece, and then sum these areas up. When the number of pieces approaches infinity we get the surface area. We divide the surface into many thin bands, as shown in Fig. 4.54. As a band is thin, it is effectively a truncated cone.


Figure 4.54: A surface of revolution obtained by revolving a curve $y = f(x)$, $x \in [a, b]$, around the $x$-axis through $360^\circ$. To find the surface area, we divide the surface into many tiny bands (one is shown highlighted in red). The band was intentionally magnified for visualization purposes. We then find the surface area of one band, and integrate it to get the total surface area.

To find the area of a truncated cone, we start from a cone of radius $r$ and slant height $S$. We find its surface area, then we cut the cone by a plane and we get a truncated cone and a smaller cone. We know the surface area of the original cone and that of the smaller cone, thus we are done. That is the plan. To find the surface area of a cone, we flatten the cone out and get a fraction of a circle, see Fig. 4.55a. The area of this flattened cone is $\pi r S$. The area of a truncated cone is therefore $\pi r_1 S_1 - \pi r_2 S_2$, where $r_1$ is the radius of the original cone and $r_2$ is the radius of the circle at the cutting plane. It can be seen that this area also equals $2\pi\bar r\,\Delta s$, where $\bar r = 0.5(r_1 + r_2)$ (Fig. 4.55b) and $\Delta s$ is the slant width of the band shown in Fig. 4.54†.

Of course, now we let $\Delta s$ approach zero; then $\Delta s \approx ds$ and $\bar r = y = f(x)$. The surface area of the solid of revolution is then the integral of those little areas $2\pi f(x)\,ds$:
$$\text{area of surface of revolution (}x\text{-axis)} = \int_a^b 2\pi y\,ds = \int_a^b 2\pi f(x)\sqrt{1 + [f'(x)]^2}\,dx \quad (4.9.3)$$
where we have used the formula for the arc length $ds$ (Section 4.9.1).

How do we know that Eq. (4.9.3) is correct? Simple: use it to compute the surface area of a sphere of radius $r$. If the result is $4\pi r^2$, then the formula is correct. To this end, consider $y = \sqrt{r^2 - x^2}$ and $x \in [0, r]$; then a direct application of Eq. (4.9.3)‡ indeed gives us $4\pi r^2$.

†We have $\pi(r_1 + r_2)\Delta s = \pi(r_1 + r_2)(S_1 - S_2) = \pi r_1 S_1 - \pi r_2 S_2$. This is so because $r_1 S_2 = r_2 S_1$.
‡But do not forget to multiply the result by two.

Gabriel's Horn is the surface of revolution obtained by revolving the function $y = 1/x$, for $x \ge 1$, around the $x$ axis. This surface has a name because it has a special property: the volume inside Gabriel's Horn is finite but the surface area is infinite. We are first going to prove these results. The volume is given by Eq. (4.9.2):
$$V = \pi\int_1^\infty \frac{dx}{x^2} = \pi$$
And the surface area follows from Eq. (4.9.3):
$$A = 2\pi\int_1^\infty \frac{1}{x}\sqrt{1 + \frac{1}{x^4}}\,dx = \ ???$$

Figure 4.55: Surface area of a truncated cone is $2\pi\bar r\,\Delta s$, where $\bar r$ is the average radius and $\Delta s$ is the slant width. The dashed line indicates where the cut is. The angle of the flattened cone in (a) is $\theta = 2\pi r/S$; from that we can determine its area, $A = \pi r S$, as shown in the picture.

After many unsuccessful attempts we realize that it is not easy to compute this integral directly. How about an indirect way? That is, we compare this integral with another, easier integral. The easier integral is
$$\int_1^\infty \frac{dx}{x}$$
which is infinite. Now, we need to find a relation between the two integrals, or between these two functions:
$$f(x) := \frac{1}{x}\sqrt{1 + \frac{1}{x^4}}, \qquad g(x) := \frac{1}{x}$$
We can see that $f(x) > g(x)$, thus:
$$\int_1^\infty f(x)\,dx > \int_1^\infty g(x)\,dx \implies 2\pi\int_1^\infty f(x)\,dx > 2\pi\int_1^\infty g(x)\,dx$$
And since $2\pi\int_1^\infty \frac{dx}{x} = \infty$, we also have $2\pi\int_1^\infty f(x)\,dx = \infty$. In other words, the area of the surface of Gabriel's Horn is infinite. Ok, enough with the maths (which is actually nothing particularly interesting).

Associated with Gabriel's Horn is the painter's paradox. Here it is: one needs an infinite amount of paint to cover the interior (or exterior) surface of the horn, but only a finite amount of paint is needed to fill up the interior space of the horn. So, either the math is wrong or this paradox is wrong. Of course the paradox is wrong! Can you see why?


Area of an ellipsoid. Let's consider an ellipse $x^2/a^2 + y^2/b^2 = 1$; if we revolve it around the $x$ axis or the $y$ axis we get an ellipsoid, see Fig. 4.56. In this section we are interested in the area of such an ellipsoid.

Figure 4.56: Ellipsoid: revolving an ellipse around an axis ((a) revolution around the $x$ axis, (b) revolution around the $y$ axis).

We just need to consider one quarter of the ellipse, in the first quadrant, and revolve it around the $x$ axis through $360^\circ$. We parameterize the ellipse by
$$x = a\cos\theta,\quad y = b\sin\theta \implies dx = -a\sin\theta\,d\theta,\quad dy = b\cos\theta\,d\theta \implies ds = \sqrt{dx^2 + dy^2} = \sqrt{a^2\sin^2\theta + b^2\cos^2\theta}\,d\theta$$
Now, we just apply Eq. (4.9.3) to get:
$$A = 2\int_0^{\pi/2} 2\pi b\sin\theta\sqrt{a^2\sin^2\theta + b^2\cos^2\theta}\,d\theta$$
Using the substitution $u = \cos\theta$, the above integral becomes:
$$A = 4\pi b\int_0^1\sqrt{a^2 - (a^2 - b^2)u^2}\,du \quad (4.9.4)$$
Now, we assume that $a > b$ (we have to assume this or $a < b$ to pick the appropriate trigonometric substitution), and thus we use the following substitution:
$$u = \frac{a}{\sqrt{a^2 - b^2}}\sin\alpha$$
which leads to
$$A = \frac{4\pi b a^2}{\sqrt{a^2 - b^2}}\int_0^{\arcsin(\sqrt{a^2 - b^2}/a)}\frac{1 + \cos 2\alpha}{2}\,d\alpha = 2\pi\left[b^2 + \frac{a^2 b}{\sqrt{a^2 - b^2}}\arcsin\frac{\sqrt{a^2 - b^2}}{a}\right]$$
Ok, if we now apply this result to concrete cases $a = \ldots$ and $b = \ldots$, then it's fine. But we would miss interesting things. Let's consider the case $a < b$ to see what happens.

Now, consider the case $a < b$; we write $A$ in a slightly different form (starting from Eq. (4.9.4)):
$$A = 2\int_0^1 2\pi b\sqrt{a^2 + (b^2 - a^2)u^2}\,du = 4\pi b\sqrt{b^2 - a^2}\int_0^1\sqrt{u^2 + c^2}\,du, \qquad c^2 = \frac{a^2}{b^2 - a^2}$$
The integral $\int_0^1\sqrt{u^2 + c^2}\,du$ is exactly the one we met in Eq. (4.7.29), so we get
$$A = 2\pi\left[b^2 + \frac{a^2 b}{\sqrt{b^2 - a^2}}\ln\left(\frac{b + \sqrt{b^2 - a^2}}{a}\right)\right]$$
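As a sanity check of both closed-form expressions, here is a short Julia sketch (helper names and test values are mine; the midpoint rule is just one simple quadrature choice) comparing them against a direct numerical evaluation of Eq. (4.9.4) written as $A = 4\pi b\int_0^1\sqrt{a^2 + (b^2 - a^2)u^2}\,du$, which is valid for any $a$, $b$.

```julia
# Simple midpoint-rule quadrature
midpoint(f, a, b; n = 200_000) = (h = (b - a) / n; sum(f(a + (i - 0.5) * h) for i in 1:n) * h)

# Direct numerical evaluation of Eq. (4.9.4)
area_num(a, b) = 4pi * b * midpoint(u -> sqrt(a^2 + (b^2 - a^2) * u^2), 0.0, 1.0)

# Closed forms derived in the text
area_prolate(a, b) = 2pi * (b^2 + a^2 * b / sqrt(a^2 - b^2) * asin(sqrt(a^2 - b^2) / a))      # a > b
area_oblate(a, b)  = 2pi * (b^2 + a^2 * b / sqrt(b^2 - a^2) * log((b + sqrt(b^2 - a^2)) / a)) # a < b

println(area_num(2.0, 1.0), "  ", area_prolate(2.0, 1.0))  # a > b: both ≈ 21.48
println(area_num(1.0, 2.0), "  ", area_oblate(1.0, 2.0))   # a < b: both ≈ 34.69
```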

Now comes a nice observation. The area of an ellipsoid should not care about the relative magnitude of $a$ and $b$. But then why do we have two different expressions for the same thing? This is because we do not allow square roots of negative numbers. But hey, we know imaginary numbers. Why don't we use them to get a unified expression? Let's do it.

First, define the following:
$$\sin\phi = \frac{\sqrt{a^2 - b^2}}{a}, \qquad \cos\phi = \frac{b}{a}$$
Then, we can write:
$$\frac{\sqrt{b^2 - a^2}}{a} = \frac{\sqrt{(a^2 - b^2)(-1)}}{a} = \frac{\sqrt{(a^2 - b^2)i^2}}{a} = \frac{i\sqrt{a^2 - b^2}}{a} = i\sin\phi$$
With this, we have two expressions for $A$:
$$A = 2\pi\left[b^2 + \frac{a^2 b}{\sqrt{a^2 - b^2}}\,\phi\right] \quad (a > b), \qquad A = 2\pi\left[b^2 + \frac{a^2 b}{i\sqrt{a^2 - b^2}}\ln(\cos\phi + i\sin\phi)\right] \quad (4.9.5)$$
And of course the second terms in the above should be the same:
$$\ln(\cos\phi + i\sin\phi) = i\phi$$
And this is obviously related to Euler's identity $e^{i\phi} = \cos\phi + i\sin\phi$. This logarithmic version of Euler's identity was discovered by the English mathematician Roger Cotes (1682 – 1716), who was known for working closely with Isaac Newton by proofreading the second edition of the Principia. He was the first Plumian Professor at Cambridge University from 1707 until his early death. About Cotes' death, Newton once said "If he had lived, we might have known something". The above analysis was inspired by [47].
4.9.4 Gravitation of distributed masses

In 1687 Newton published his work on gravity in his classic Mathematical Principles of Natural Philosophy. He stated that every object pulls every other object, and that two objects attract each other with a force proportional to the product of their masses and inversely proportional to the square of the distance between them. In mathematical symbols, his law is expressed as:
$$F = \frac{GMm}{r^2} \quad (4.9.6)$$
where $G$ is a constant, called the universal gravitational constant, that was experimentally measured by Cavendish about 100 years after Newton's death. It is $G = 6.673\times 10^{-11}\ \text{N}\,\text{m}^2/\text{kg}^2$. This section presents how to use integrals to compute $F$ for distributed masses, e.g. a rod.

Gravitational pull of a thin rod 1. Consider a thin rod of length $L$, its mass $M$ uniformly distributed along the length. Ahead of one end of the rod, along its axis, a small mass $m$ is placed at a distance $a$ (see Fig. 4.57). Calculate the gravitational pull of the rod on $m$.

Figure 4.57

Let's consider a small segment $dx$; its mass is $dm = (M/L)dx$. This small mass $dm$ pulls $m$ with a force $dF$ given by Newton's gravitational law. The pull of the entire rod is then simply the sum of all these small $dF$:
$$dF = \frac{GMm}{L}\frac{dx}{(L + a - x)^2} \implies F = \frac{GMm}{L}\int_0^L\frac{dx}{(L + a - x)^2} = \frac{GMm}{a(L + a)} \quad (4.9.7)$$
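A quick numerical sanity check of Eq. (4.9.7), as a Julia sketch (the values of $M$, $m$, $L$, $a$ are arbitrary test numbers and the function name is mine): summing the pulls of many small segments reproduces the closed form.

```julia
G, M, m = 6.673e-11, 5.0, 2.0      # arbitrary test values
L, a = 3.0, 1.5

# Riemann sum of dF over n small segments of the rod
function rod_pull(n)
    dx = L / n
    sum(G * M * m / L * dx / (L + a - (i - 0.5) * dx)^2 for i in 1:n)
end

exact = G * M * m / (a * (L + a))
println(rod_pull(10_000), "  vs  ", exact)   # the two numbers should agree closely
```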
Gravitational pull of a thin rod 2. Consider a thin rod of length $2L$, its mass $M$ uniformly distributed along the length. Above the center of the rod, at a distance $h$, a small mass $m$ is placed (see Fig. 4.58). Calculate the gravitational pull of the rod on $m$.

Figure 4.58

Due to symmetry the horizontal component of the gravitational pull is zero. Thus, only the vertical component counts. This component for a small segment $dx$ can be computed, and the total force is just the sum of all these tiny forces, which is of course an integral:
$$dF = \frac{GMm}{2L}\frac{dx}{h^2 + x^2}\cos\theta \implies F = \frac{GMm}{L}\,h\int_0^L\frac{dx}{(h^2 + x^2)^{3/2}} \quad (4.9.8)$$
To evaluate the integral $\int_0^L\frac{dx}{(h^2 + x^2)^{3/2}}$, we use the trigonometric substitution:
$$x = h\tan\theta \implies dx = h\sec^2\theta\,d\theta, \quad h^2 + x^2 = h^2\sec^2\theta, \quad 0 \le \theta \le \tan^{-1}(L/h) \quad (4.9.9)$$
Thus, the integral becomes
$$\int_0^L\frac{dx}{(h^2 + x^2)^{3/2}} = \int_0^{\tan^{-1}(L/h)}\frac{h\sec^2\theta\,d\theta}{h^3\sec^3\theta} = \frac{1}{h^2}\int_0^{\tan^{-1}(L/h)}\cos\theta\,d\theta = \frac{1}{h^2}\left[\sin\theta\right]_0^{\tan^{-1}(L/h)} = \frac{1}{h^2}\sin\left(\tan^{-1}\frac{L}{h}\right) = \frac{L}{h^2\sqrt{h^2 + L^2}}$$
And finally, the gravitational force is
$$F = \frac{GMm}{h\sqrt{h^2 + L^2}}$$
Gravitational pull of a thin disk. Consider a thin disk of radius a, its mass M is uniformly
distributed. Above the center of the disk at a distance h is placed a small mass m (see Fig. 4.59).
Calculate the gravitational pull of this disk on m.

Figure 4.59

We consider a ring located at distance $r$ from the center; this ring has a thickness $dr$. We first compute the gravitational pull of this ring on $m$. Then, we integrate this to get the total pull of the whole disk on $m$. Again, due to symmetry, only a downward pull exists. Considering a small $dm$ on this ring, we have
$$dF = \frac{G\,dm\,m}{R^2}\cos\theta \implies F_{\text{ring}} = \int dF = \frac{Gm\cos\theta}{R^2}\,m_{\text{ring}} \quad (4.9.10)$$
This is because $R$ and $\cos\theta$ are constant along the ring. The mass of the ring is $m_{\text{ring}} = tM\pi\left[(r + dr)^2 - r^2\right] = 2\pi r tM\,dr$ (here $t$ can be thought of as the thickness of the disk, so that $tM$ plays the role of mass per unit area). So, the pull of the ring on $m$ is
$$F_{\text{ring}} = \frac{Gm\cos\theta}{R^2}\,2\pi r tM\,dr = 2\pi GMmt\,h\frac{r\,dr}{(h^2 + r^2)^{3/2}} \quad (4.9.11)$$
And thus, the total pull of the disk on $m$ is
$$F_{\text{disk}} = \int F_{\text{ring}} = 2\pi GMmt\,h\int_0^a\frac{r\,dr}{(h^2 + r^2)^{3/2}} = 2\pi GMmt\left(1 - \frac{h}{\sqrt{h^2 + a^2}}\right) \quad (4.9.12)$$
4.9.5 Using integrals to compute limits of sums

We know that we can use the Riemann sum to approximate an integral. We have
$$\lim_{n\to\infty}\sum_{i=1}^n f(x_i)\,\Delta x_i = \int_a^b f(x)\,dx$$
When the interval is $[0, 1]$ and the sub-intervals are equal (i.e., $\Delta x_i = 1/n$), the above becomes
$$\lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^n f\left(\frac{i}{n}\right) = \int_0^1 f(x)\,dx \quad (4.9.13)$$
Now that we know all the techniques to compute definite integrals, we can use integrals to compute limits of sums. For example, compute the following limit:
$$\lim_{n\to\infty}\sum_{i=1}^n\frac{n}{n^2 + i^2}$$
The plan is to rewrite the above sum in the form of a Riemann sum; then Eq. (4.9.13) allows us to equate it to an integral. Compute that integral and we're done. So, we write $\frac{n}{n^2 + i^2} = \frac{1}{n}\frac{1}{1 + (i/n)^2}$. Thus,
$$\lim_{n\to\infty}\sum_{i=1}^n\frac{n}{n^2 + i^2} = \lim_{n\to\infty}\frac{1}{n}\sum_{i=1}^n\frac{1}{1 + (i/n)^2} = \int_0^1\frac{dx}{1 + x^2} = \cdots = \frac{\pi}{4}$$
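The convergence can be watched numerically; here is a small Julia sketch (the function name is mine):

```julia
# Partial sums of Σ n/(n² + i²): they approach π/4 ≈ 0.7853981...
partial(n) = sum(n / (n^2 + i^2) for i in 1:n)

for n in (10, 100, 1_000, 10_000)
    println("n = $n:  ", partial(n))
end
println("π/4 = ", pi / 4)
```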

Example 4.17
Evaluate the following limit:
$$\lim_{x\to 0^+}\sum_{k=1}^{\infty}\frac{2x}{1 + k^2 x^2}$$

4.10 Limits

The calculus was invented in the 17th century and it is based on the limit–a concept that was made precise only much later, in the 19th century. That's why I have intentionally presented the calculus without precisely defining what a limit is. This is mainly to show how mathematics actually evolved. But we cannot avoid working with limits, so we finally discuss this concept in this section.

Let's consider the quadratic function $y = f(x) = x^2$, and we want to define the derivative of this function at $x_0$. We consider a change $h$ in $x$ with a corresponding change in the function $\Delta f = (x_0 + h)^2 - x_0^2$. We now know that Newton, Leibniz and their fellows defined the derivative as the value that the ratio $\Delta f/h$ tends to when $h$ approaches zero. Here is what they did:
$$f'(x_0) = \frac{(x_0 + h)^2 - x_0^2}{h} = \frac{2x_0 h + h^2}{h} = 2x_0 + h = 2x_0$$
The key point is in the third equation where $h$ is not zero and in the final equation where $h$ is zero. Due to this loose foundation, such $h$'s were referred to as "the ghosts of departed quantities" by Bishop George Berkeley of the Church of England (1685-1753) in his attack on the logical foundations of Newton's calculus in a pamphlet entitled The Analyst (1734).

Leibniz realized this and solved the problem by saying that $h$ is a differential–a quantity that is non-zero but smaller than any positive number. Because it's non-zero, the third equation in the above is fine, and because it is a super small number, it's nothing compared with $2x_0$, thus we can ignore it.

Was Leibniz correct? Yes, Table 4.15 confirms that. This table is purely numerical: we computed $\Delta f/h$ for many values of $h$ getting smaller and smaller (and we considered $x_0 = 2$ as we have to give $x_0$ a value).
Table 4.15: $\lim_{h\to 0}\Delta f/h$ (computed with $x_0 = 2$).

    h          Δf/h
    10^{-1}    4.100000000000
    10^{-2}    4.010000000000
    10^{-3}    4.001000000000
    10^{-4}    4.000100000008
    10^{-5}    4.000010000027
    10^{-6}    4.000001000648

Figure 4.60: $\lim_{h\to 0}(2x_0 + h)$: the graph of $g(h) = 2x_0 + h$ approaches $2x_0$ as $h$ shrinks.
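Table 4.15 is easy to reproduce yourself; a few lines of Julia (a sketch, variable names mine) print the same numbers:

```julia
f(x) = x^2
x0 = 2.0
for k in 1:6
    h = 10.0^(-k)
    println("h = 1e-$k:   Δf/h = ", (f(x0 + h) - f(x0)) / h)   # approaches 2x0 = 4
end
```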

Now we're ready for the presentation of the limit of a function. The key point here is to see $\Delta f/h$ as a function of $h$; thus the derivative of $y = f(x)$ at $x_0$ is the limit of the function $g(h) := \Delta f/h$ when $h$ approaches zero:
$$f'(x_0) = \lim_{h\to 0}\frac{(x_0 + h)^2 - x_0^2}{h} = \lim_{h\to 0}(2x_0 + h)$$
And what is this limit? As can be seen from Fig. 4.60, as $h$ tends to zero, $2x_0 + h$ gets closer and closer to $2x_0$. And that's what we call the limit of $2x_0 + h$.
In the preceding discussion we have used the symbol $h$ to denote the change in $x$ when defining the derivative of $y = f(x)$. This led to the limit of another function $g(h)$ with $h$ being the independent variable. It's possible to restate the problem so that the independent variable is always $x$. We choose a fixed point $x_0$, and we consider another point $x$; then we have
$$f'(x_0) = \lim_{x\to x_0}\frac{x^2 - x_0^2}{x - x_0} = \lim_{x\to x_0}(x + x_0) = 2x_0$$

4.10.1 Definition of the limit of a function

Now we can forget the derivative and focus on the limit of a function. Let's denote by $y = f(x)$ any function, and we're interested in the limit of this function when $x$ approaches $a$. Intuitively, we know that $\lim_{x\to a} f(x) = L$ means that we can make $x$ get closer and closer to $a$ so that $f(x)$ gets as close to $L$ as we want. See Table 4.15 again: we stopped at $h = 10^{-6}$, but we can get $\Delta f/h$ much closer to four by using a smaller $h$.

How do we mathematically describe '$x$ gets closer and closer to $a$'? Given a small positive number $\delta$, we say that $x$ is close to $a$ when $a - \delta < x < a + \delta$, i.e., $x \in (a - \delta, a + \delta)$ (in plain English, $x$ is in a neighborhood of $a$). We can write this more compactly as $|x - a| < \delta$. Similarly, '$f(x)$ gets close to $L$' means $|f(x) - L| < \epsilon$, where $\epsilon$ is yet another small positive number. Cauchy and Bernard Bolzano (1781–1848) were the first to use these $\epsilon$ and $\delta$.

There is a little detail here before we present the definition of the limit of a function. We always talk about the limit of a function when $x$ approaches $a$. This implies that we do not care what happens when $x = a$. For example, the function $y = \frac{x - 1}{x^2 - 1}$ is not defined at $x = 1$, but it is obvious that $\lim_{x\to 1} f(x) = 0.5$†. The classic example is a circle and an $n$-polygon inscribed in it. When we say that the limit of this $n$-polygon when $n$ approaches infinity is the circle, we mean that $n$ is a very large number. It would be meaningless if $n$ were actually infinity, because in that case we would have a polygon each side of which has vanishing length.

Thus, '$x$ close to $a$ and not equal to $a$' is written mathematically as:
$$0 < |x - a| < \delta$$

Figure 4.61: Visualization of the $\epsilon$-$\delta$ definition of the limit of a function: for a given $\epsilon$, all $x$ in $(a - \delta, a + \delta)$ are mapped into $(L - \epsilon, L + \epsilon)$; a smaller $\epsilon$ requires a smaller $\delta$.

† You can try plotting this function or making a table similar to Table 4.15 to confirm this.

Definition 4.10.1
We denote the limit of $f(x)$ when $x$ approaches $a$ by $\lim_{x\to a} f(x)$, and this limit is $L$, i.e.,
$$\lim_{x\to a} f(x) = L$$
when, for any $\epsilon > 0$, there exists a $\delta > 0$ such that
$$\text{if } 0 < |x - a| < \delta \text{ then } |f(x) - L| < \epsilon$$

This definition was given by the German mathematician Karl Theodor Wilhelm Weierstrass (1815–1897), who is often cited as the "father of modern analysis".

The key point here is that  is the input that indicates the level of accuracy we need for f .x/
to approach L and ı is the output (thus depends on ). Fig. 4.61 illustrates this; for a smaller ,
we have to make x closer to a and thus a smaller ı.
What is analysis by the way? Analysis is the branch of mathematics dealing with limits
and related theories, such as differentiation, integration, measure, infinite series, and analytic
functions. These theories are usually studied in the context of real and complex numbers
and functions. Analysis evolved from calculus, which involves the elementary concepts and
techniques of analysis.
One-sided limits. If we want to find the limit of the function $\sqrt{x - 1}$ when $x$ approaches 1, we'll see that we need to consider only $x \ge 1$, and this leads to the notion of a one-sided limit:
$$\lim_{x\to 1^+}\sqrt{x - 1} = \lim_{x\downarrow 1}\sqrt{x - 1}$$
which is a right-hand limit: we approach 1 from above, as indicated by the notation $\downarrow 1$, even though that notation is not popular. And of course, if we have right-hand limits, we also have left-hand limits, e.g. $\lim_{x\to 1^-}\sqrt{1 - x}$.

If the limit of $f(x)$ when $x$ approaches $a$ exists, it means that the left-hand and right-hand one-sided limits exist and are equal:
$$\text{if } \lim_{x\to a} f(x) = L \text{ then } \lim_{x\to a^+} f(x) = \lim_{x\to a^-} f(x) = L$$

Infinite limits. If we consider the function $y = 1/x^2$ we realize that $y$ is very large for $x$ near 0. Thus, we say that:
$$\lim_{x\to 0}\frac{1}{x^2} = \infty$$
And this is called an infinite limit, which concerns the limit of a function that becomes arbitrarily large near $x = a$. We can generalize this to have
$$\lim_{x\to a} f(x) = \infty, \qquad \lim_{x\to a^-} f(x) = \infty, \qquad \lim_{x\to a^+} f(x) = \infty$$
(and similarly with $-\infty$).

Figure 4.62: Infinite limits and vertical asymptotes: (a) $y = 1/(x - 1)$ with vertical asymptote $x = 1$; (b) $y = 1/x^2$ with vertical asymptote $x = 0$.

Fig. 4.62 illustrates some infinite limits, and we can see that the lines $x = a$ are the vertical asymptotes of the graphs. This figure suggests the following definition of infinite limits.

Definition 4.10.2
The limit of $y = f(x)$ when $x$ approaches $a$ is infinity, written as
$$\lim_{x\to a} f(x) = \infty$$
when, for any large number $M$, there exists a $\delta > 0$ such that
$$\text{if } 0 < |x - a| < \delta \text{ then } f(x) > M$$

Limits when $x$ approaches infinity. Again consider the function $y = 1/x^2$, but now focus on what happens when $x$ approaches infinity, i.e., when $x$ gets bigger and bigger (or, for $-\infty$, smaller and smaller). It's clear that $1/x^2$ then gets smaller and smaller. We write $\lim_{x\to+\infty} 1/x^2 = \lim_{x\to-\infty} 1/x^2 = 0$.

Definition 4.10.3
The limit of $y = f(x)$ when $x$ approaches $\infty$ is finite, written as
$$\lim_{x\to\infty} f(x) = a$$
when, for any $\epsilon > 0$, there exists a number $M > 0$ such that
$$\text{if } x > M \text{ then } |f(x) - a| < \epsilon$$

We can use this definition to prove that $\lim_{x\to+\infty} 1/x^2 = 0$: for a given $\epsilon$, select $M = 1/\sqrt{\epsilon}$; then for $x > M$ we have $1/x^2 < \epsilon$.
We soon realize that the definition of the limit of a function is not as powerful as it seems to be. For example, with the definition of limit alone, we're still not able to compute the following limit
$$\lim_{t\to 0}\frac{\sqrt{t^2 + 9} - 3}{t^2}$$
The situation is similar to differentiation. We should now try to find out the rules that limits obey; using them will enable us to evaluate limits of complicated functions.

4.10.2 Rules of limits

Consider two functions $y = f(x)$ and $y = g(x)$ and assume that $\lim_{x\to a} f(x)$ and $\lim_{x\to a} g(x)$ exist; then we have the following rules:

(a: constant function rule)  $\lim_{x\to a} c = c$
(b: sum/diff rule)           $\lim_{x\to a}(f(x) \pm g(x)) = \lim_{x\to a} f(x) \pm \lim_{x\to a} g(x)$
(c: linearity rule)          $\lim_{x\to a}(c f(x)) = c\,\lim_{x\to a} f(x)$
(d: product rule)            $\lim_{x\to a}(f(x)g(x)) = (\lim_{x\to a} f(x))(\lim_{x\to a} g(x))$          (4.10.1)
(e: quotient rule)           $\lim_{x\to a}(f(x)/g(x)) = (\lim_{x\to a} f(x))/(\lim_{x\to a} g(x))$, provided $\lim_{x\to a} g(x) \ne 0$
(f: power rule)              $\lim_{x\to a}[f(x)]^n = [\lim_{x\to a} f(x)]^n$

The sum rule basically states that the limit of the sum of two functions is the sum of the limits. And this is plausible: near $x = a$ the first function is close to $L_1$ and the second function to $L_2$, thus $f(x) + g(x)$ is close to $L_1 + L_2$. And of course when we have this rule for two functions, we also have it for any number of functions! Need a proof? Here it is:
$$\lim_{x\to a}(f + g + h) = \lim_{x\to a}[(f + g) + h] = \lim_{x\to a}(f + g) + \lim_{x\to a} h = \lim_{x\to a} f + \lim_{x\to a} g + \lim_{x\to a} h$$
Similar to how repeated multiplication leads to exponents, e.g. $2\cdot 2\cdot 2 = 2^3$, the product rule $\lim_{x\to a} fg = (\lim_{x\to a} f)(\lim_{x\to a} g)$ leads to $\lim_{x\to a}[f(x)]^n = [\lim_{x\to a} f]^n$ for $n$ a positive integer. This result also holds for negative integers $n$ by combining it with the quotient rule.

Proof of the sum rule of limits. We assume that
$$\lim_{x\to a} f(x) = L_1, \qquad \lim_{x\to a} g(x) = L_2$$
And we need to prove that
$$\lim_{x\to a}[f(x) + g(x)] = L_1 + L_2$$
This is equivalent to proving the following (using the definition of limit):
$$|f(x) + g(x) - L_1 - L_2| < \epsilon \quad\text{when}\quad |x - a| < \delta$$
Now we use our assumption about the limits of $f$ and $g$ to have ($\delta_1, \delta_2, \epsilon$ are positive real numbers):
$$|x - a| < \delta_1 \text{ then } |f(x) - L_1| < \frac{\epsilon}{2}, \qquad |x - a| < \delta_2 \text{ then } |g(x) - L_2| < \frac{\epsilon}{2} \quad (4.10.2)$$
Then, defining $\delta = \min(\delta_1, \delta_2)$, we have the two above inequalities simultaneously:
$$|x - a| < \delta \implies |f(x) - L_1| < \frac{\epsilon}{2}, \quad |g(x) - L_2| < \frac{\epsilon}{2}$$
Now using the triangle inequality $|a + b| \le |a| + |b|$:
$$|f(x) + g(x) - L_1 - L_2| \le |f(x) - L_1| + |g(x) - L_2| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$$
Now you know why we used $\epsilon/2$ as the accuracy in Eq. (4.10.2). To summarize, the whole proof uses (1) the triangle inequality $|a + b| \le |a| + |b|$ and (2) a correct choice of accuracy (e.g. $\epsilon/2$ here). Do we need another proof for the difference rule? No! This is because $a - b$ is simply $a + (-b)$. If you're still not convinced, we can do this:
$$\lim_{x\to a}(f - g) = \lim_{x\to a}[f + (-1)g] = \lim_{x\to a} f + \lim_{x\to a}(-1)g = \lim_{x\to a} f - \lim_{x\to a} g$$
where we used the rule $\lim_{x\to a} cf = c\lim_{x\to a} f$ with $c = -1$.


Proof of the product rule of limits. We assume that
$$\lim_{x\to a} f(x) = L, \qquad \lim_{x\to a} g(x) = M$$
It's possible to prove the product rule in the same way as the sum rule, but it's hard. We follow an easier path. First we massage $fg$ a bit††:
$$fg = (f - L)(g - M) - LM + Mf + Lg$$
†† This is the crux of the whole proof. It transforms the original problem into this one: prove $\lim_{x\to a}(f - L)(g - M) = 0$, which is much easier.

Thus, the limit of $fg$ is
$$\lim_{x\to a} fg = \lim_{x\to a}(f - L)(g - M) - LM + \lim_{x\to a} Mf + \lim_{x\to a} Lg = \lim_{x\to a}(f - L)(g - M) - LM + LM + LM = \lim_{x\to a}(f - L)(g - M) + LM$$
Now if we can prove that $\lim_{x\to a}(f - L)(g - M) = 0$ then we're done. Indeed, we have
$$0 < |x - a| < \delta_1 \implies |f - L| < \sqrt{\epsilon}, \qquad 0 < |x - a| < \delta_2 \implies |g - M| < \sqrt{\epsilon}$$
With $\delta = \min(\delta_1, \delta_2)$, we then have
$$0 < |x - a| < \delta \implies |f - L|\,|g - M| < \epsilon, \quad\text{or}\quad |(f - L)(g - M) - 0| < \epsilon$$

Proof of the quotient rule. First, we prove a simpler version:
$$\lim_{x\to a}\frac{1}{g(x)} = \frac{1}{\lim_{x\to a} g(x)} \quad (4.10.3)$$
Then, it is simple to prove the original rule:
$$\lim_{x\to a}\frac{f(x)}{g(x)} = \lim_{x\to a}\left[f(x)\,\frac{1}{g(x)}\right] = \left[\lim_{x\to a} f(x)\right]\left[\lim_{x\to a}\frac{1}{g(x)}\right] = \frac{\lim_{x\to a} f(x)}{\lim_{x\to a} g(x)}$$
using the product rule and Eq. (4.10.3).

To prove Eq. (4.10.3), let's denote $M = \lim_{x\to a} g(x)$. Then, what we have to prove is that
$$\left|\frac{1}{g(x)} - \frac{1}{M}\right| < \epsilon \quad\text{when}\quad 0 < |x - a| < \delta$$
Or, equivalently,
$$\frac{1}{|M|}\frac{1}{|g(x)|}\,|g(x) - M| < \epsilon \quad\text{when}\quad |x - a| < \delta \quad (4.10.4)$$
Now we need to bound $\frac{1}{|g(x)|}$ and $|g(x) - M|$. Because $\lim_{x\to a} g(x) = M$, when $0 < |x - a| < \delta_1$ we have
$$|g - M| < |M|/2$$
We can always select $\delta_1$ so that the above inequality holds. You can draw a picture, similar to Fig. 4.61, to convince yourself of this. Thus,
$$|M| = |M - g(x) + g(x)| \le |M - g(x)| + |g(x)| \quad\text{(triangle inequality)}$$
$$\le |g(x) - M| + |g(x)| \le \frac{|M|}{2} + |g(x)| \implies \frac{1}{|g(x)|} < \frac{2}{|M|}$$
Now, based on Eq. (4.10.4), we need $|g(x) - M| < \epsilon M^2/2$. And of course we have that at our disposal because the limit of $g$ is $M$; this holds when $0 < |x - a| < \delta_2$. Now, with $\delta = \min(\delta_1, \delta_2)$, we have
$$\frac{1}{|g(x)|} < \frac{2}{|M|}, \quad |g(x) - M| < \frac{\epsilon M^2}{2} \implies \frac{1}{|M|}\frac{1}{|g(x)|}\,|g(x) - M| < \frac{1}{|M|}\frac{2}{|M|}\frac{\epsilon M^2}{2} = \epsilon$$
Sadly, in many textbooks the proof is written in the reverse order, which makes students believe that they are stupid. We emphasize again that finding a proof is hard and involves many setbacks. When a proof has been found, the author often presents it not in the way the proof was found. ∎

Using the definition of limit, we can see that:
$$\lim_{x\to a} x = a \quad (4.10.5)$$
Combined with the power rule in Eq. (4.10.1), we have
$$\lim_{x\to a} x^n = a^n \quad (4.10.6)$$
If we look again at these two results, we see that the function $y = x^n$ has this nice property: $\lim_{x\to a} f(x) = f(a)$, that is, the limit when $x$ approaches $a$ equals the function value at $a$. We now turn our discussion to the functions that have this special property: continuous functions.

4.10.3 Continuous functions

I mentioned in the introduction of this chapter that the essence of calculus is quite simple: calculus is often seen as the mathematics of change. But calculus does not work with all kinds of change. It only works with change of continuous quantities (or continuous functions). Now, we can finally define precisely what a continuous function is.

A function is continuous if we can draw its graph without lifting our pencil off the paper. The functions $\sin x$, $\cos x$, $e^x$ and so on are such functions (Fig. 4.63a). However, if you draw the graph of $1/x$ you have to lift your pencil off the paper as the graph has two branches, one for $x \in (-\infty, 0)$ and one for $x \in (0, \infty)$, see Fig. 4.63b. At $x = 0$, where we have to lift the pencil, we say that there is a discontinuity. This discontinuity is called an infinite discontinuity.

Figure 4.63: Graphs of continuous and discontinuous functions: (a) $\sin x$, continuous everywhere; (b) $1/x$, discontinuous at $x = 0$; (c) the piecewise function $y = |x|$; (d) a piecewise function with jump discontinuities.

To see what else we have as far as discontinuous functions are concerned, we now consider functions that are defined by more than one expression. We break the function domain apart into two or more disjoint pieces, and we then use a different expression to compute the output for each $x$, the expression used depending on the piece into which that particular $x$ falls. Let's start with the function $y = |x|$, which is explicitly written as
$$y = |x| = \begin{cases} x, & \text{if } x \ge 0 \\ -x, & \text{if } x < 0 \end{cases}$$
The graph of this function has two pieces, one defined over $(-\infty, 0)$ with the expression $y = -x$ and one over $(0, \infty)$ with the expression $y = x$ (Fig. 4.63c). But the function is definitely continuous everywhere. Understandably, mathematicians call such functions piecewise functions.

The piecewise function in Fig. 4.63d is more interesting. At $x = -3$, the function is defined: $f(-3) = -2$ as indicated by the dot (or closed circle). Still, the function is discontinuous at

x D 3. At x D 1 and x D 2, the function is continuous. However, x D 0 is special: at this


point we can go with either of the two expressions and of course only one of them is chosen.
Herein, x 2 2 is selected and to indicate that a dot is drawn at .0; f .0// or .0; 2/ and a hollow
(open) circle is drawn at .0; 3/:
(
2:5x C 3; if 1  x < 0
f .x/ D 2
x 2; if 0  x  2

As we had to lift off the pencil at x D 0, the function is discontinuous there (even though the
function is well defined here). As we have a jump from 3 to 2 at x D 0, x D 0 is called a jump
discontinuity. Similarly, the function is also discontinuous at x D 3.
A function is continuous if we can draw its graph without lifting our pencil off the paper.
That’s correct, but it is not a definition that mathematicians want. They need a definition that they
can use, one with symbols, so that they can manipulate them. From Fig. 4.63 and the surrounding
discussion we see that for a function to be continuous at x D a it must be defined at that point;
that is, f .a/ must exist. But this condition is not enough (jump discontinuity is one example).
We also need that limx!a f .x/ exists.

Definition 4.10.4
A function $y = f(x)$ is continuous at a point $x = a$ when the limit of $f(x)$ as $x$ approaches $a$ equals the function value at $a$:
$$\lim_{x\to a} f(x) = f(a) \quad (4.10.7)$$

With that definition of the continuity of a function at a single point, we have another definition: a function is continuous over an interval if it is continuous everywhere in that interval.
It is not hard to discover these rules for the continuity of functions:

(a: sum/diff rule)  if $f(x)$ and $g(x)$ are continuous then $f \pm g$ is continuous
(b: linearity rule) if $f(x)$ is continuous then $cf$ is continuous
(c: product rule)   if $f(x)$ and $g(x)$ are continuous then $fg$ is continuous            (4.10.8)
(d: quotient rule)  if $f(x)$ and $g(x)$ are continuous then $f/g$ is continuous (where $g \ne 0$)

We skip the proof: it's a combination of the definition of continuity and the limit rules in Eq. (4.10.1). Now we're in a position to establish the continuity of many functions we know of. We start with polynomials, those of the form
$$P(x) = a_n x^n + a_{n-1}x^{n-1} + \cdots + a_2 x^2 + a_1 x + a_0 = \sum_{i=0}^n a_i x^i \quad (4.10.9)$$
They are continuous everywhere. This is because each term $a_i x^i$ is continuous (this in turn is because $y = x^n$ is continuous and so is $cx^n$).

Next are rational functions $y = P(x)/Q(x)$; they are continuous due to the quotient rule in Eq. (4.10.8). Of course they're only continuous where $Q(x) \ne 0$. Then, trigonometric functions, logarithm functions and exponential functions are all continuous.

How about composite functions, e.g. $\sin x^2$ or $e^{-x^2}$? Our intuition tells us that they are continuous. We can confirm that by drawing them and seeing that their graphs are continuous (Fig. 4.64). Therefore, we have
$$\lim_{x\to 1}\sin x^2 = \sin(1), \qquad \lim_{x\to 1} e^{-x^2} = e^{-1}$$

Figure 4.64: Two composite functions $y = f(g(x))$: (a) $\sin x^2$; (b) $e^{-x^2}$. For both, $\lim_{x\to 1} f(g(x)) = f(g(1))$.

Theorem 4.10.1: Limit of a composite function
Considering a composite function $y = f(g(x))$ with $\lim_{x\to a} g(x) = b$ and $f(x)$ continuous at $b$, then:
$$\lim_{x\to a} f(g(x)) = f(b) \quad (4.10.10)$$

We're now finally in a position to compute some interesting limits. For example,
$$\lim_{t\to 0}\frac{\sqrt{t^2 + 9} - 3}{t^2} = \lim_{t\to 0}\frac{t^2}{t^2(\sqrt{t^2 + 9} + 3)} = \lim_{t\to 0}\frac{1}{\sqrt{t^2 + 9} + 3} \quad\text{(algebra)}$$
$$= \frac{1}{\lim_{t\to 0}(\sqrt{t^2 + 9} + 3)} \quad\text{(quotient rule with } f(x) = 1\text{)}$$
$$= \frac{1}{\lim_{t\to 0}\sqrt{t^2 + 9} + 3} \quad\text{(sum rule)}$$
$$= \frac{1}{\sqrt{\lim_{t\to 0}(t^2 + 9)} + 3} = \frac{1}{\sqrt{9} + 3} = 1/6 \quad\text{(Eq. (4.10.10))} \quad (4.10.11)$$
where the first step is to convert the form $0/0$ into something better.
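A quick numerical check of Eq. (4.10.11), as a Julia sketch (names mine): evaluating the quotient for smaller and smaller $t$ shows it settling near $1/6 \approx 0.1667$.

```julia
g(t) = (sqrt(t^2 + 9) - 3) / t^2
for t in (1e-1, 1e-2, 1e-3, 1e-4)
    println("t = $t:  ", g(t))
end
println("1/6  = ", 1 / 6)
# (For much smaller t, floating-point cancellation in sqrt(t^2+9) - 3 spoils the quotient.)
```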

4.10.4 Indeterminate forms

Let's recall that the derivative of a function $y = f(x)$ at $x = a$ is:
$$f'(a) = \lim_{h\to 0}\frac{f(a + h) - f(a)}{h}$$
Even though we now know the quotient rule of limits, we cannot compute $f'(a)$ as
$$f'(a) = \lim_{h\to 0}\frac{f(a + h) - f(a)}{h} = \frac{\lim_{h\to 0}[f(a + h) - f(a)]}{\lim_{h\to 0} h}$$
because it is of the form $0/0$, which is not defined. A limit of the form $0/0$ is called an indeterminate form, and we list other indeterminate forms in Table 4.16. How do we compute indeterminate forms

Table 4.16: Indeterminate forms.

    indeterminate form      conditions
    0/0                     lim_{x→a} f(x) = 0,   lim_{x→a} g(x) = 0
    ∞/∞                     lim_{x→a} f(x) = ∞,   lim_{x→a} g(x) = ∞
    0·∞                     lim_{x→a} f(x) = 0,   lim_{x→a} g(x) = ∞
    ∞ − ∞                   lim_{x→a} f(x) = ∞,   lim_{x→a} g(x) = ∞

then? The first method is to do algebraic manipulations to convert an indeterminate form into a normal form. We actually used this method in Eq. (4.10.11). We give another example, of the form $\infty/\infty$: $\lim_{x\to\infty}\frac{4x^2 + x}{2x^2 + x}$. We convert the indeterminate form by dividing both the numerator and the denominator by $x^2$–the highest power:
$$\lim_{x\to\infty}\frac{4x^2 + x}{2x^2 + x} = \lim_{x\to\infty}\frac{4 + 1/x}{2 + 1/x} = \frac{\lim_{x\to\infty}(4 + 1/x)}{\lim_{x\to\infty}(2 + 1/x)} = \frac{4}{2} = 2$$
Why did we divide both the numerator and the denominator by $x^2$? Because we know that for very large $x$, the term $x$ is negligible compared with $4x^2$ and $2x^2$, so we can write (not mathematically precise but correct):
$$\lim_{x\to\infty}\frac{4x^2 + x}{2x^2 + x} = \lim_{x\to\infty}\frac{4x^2}{2x^2} = 2$$
So to say that $x$ is negligible is equivalent to converting it into the form $1/x$, and that's why we divided by $x^2$. And there is no value in doing more limits of this form, as we can guess (note that generalization is a good thing to do) the following result for the ratio of any two polynomials:
$$\lim_{x\to\infty}\frac{P_n(x)}{Q_m(x)} = \begin{cases} 0 & \text{if } n < m \\ \infty & \text{if } n > m \\ a_n/b_n & \text{if } n = m \end{cases}$$
which is nothing but the fact that this limit depends on whether the numerator or the denominator overtakes the other. If the denominator overtakes the numerator, the limit is zero.

L'Hopital's rule. The method of using algebra does not apply to this limit: $\lim_{x\to 0}\frac{\sin x}{x}$. To deal with this one we had to use geometry, but isn't that against the spirit of calculus? We need a mechanical way so that everyone is able to compute this limit and similar limits without resorting to geometry (which always requires some genius idea).

What do you think if you see someone doing this?
$$\lim_{x\to 0}\frac{\sin x}{x} = \lim_{x\to 0}\frac{\cos x}{1} = \frac{\lim_{x\to 0}\cos x}{1} = \frac{1}{1} = 1$$
First, the result is correct, and second, it is purely algebraic. What magic! Could you guess the formula behind this? It is L'Hopital's rule, which states: if $f(a) = g(a) = 0$, then
$$\lim_{x\to a}\frac{f(x)}{g(x)} = \frac{f'(a)}{g'(a)} \quad (4.10.12)$$
Actually it is not hard to guess this rule. Recall that for $x$ near $a$, we have the following approximations for $f(x)$ and $g(x)$ (remember $f(a) = g(a) = 0$):
$$f(x) \approx f'(a)(x - a), \qquad g(x) \approx g'(a)(x - a)$$
Thus,
$$\lim_{x\to a}\frac{f(x)}{g(x)} = \lim_{x\to a}\frac{f'(a)(x - a)}{g'(a)(x - a)} = \frac{f'(a)}{g'(a)}$$
What is the limit of $x^n/n!$ when $n \to \infty$? Why bother with this? Because it is involved in Taylor's theorem (Section 4.15.10), which is a big thing. Let's start simple and concrete with $x = 2$:
$$\lim_{n\to\infty}\frac{2^n}{n!} = ?$$
A bit of algebraic manipulation goes a long way (of course we assume $n > 2$ as we're interested in the case where $n$ goes to infinity):
$$\frac{2^n}{n!} = \frac{2\cdot 2\cdot 2\cdots 2}{1\cdot 2\cdot 3\cdots n} = \frac{2}{1}\cdot\frac{2}{2}\cdot\frac{2}{3}\cdot\frac{2}{4}\cdots\frac{2}{n}$$
As the factors $\frac{2}{3}, \frac{2}{4}, \ldots, \frac{2}{n}$ are all smaller than one, we're multiplying a constant (the first two factors) repeatedly by factors smaller than one, so we guess that as $n$ approaches infinity the limit is zero. But, to be
precise, we polish our expression a bit more:
$$\frac{2^n}{n!} = \underbrace{\frac{2}{1}\cdot\frac{2}{2}\cdot\frac{2}{3}\cdot\frac{2}{4}}_{4\text{ terms}}\cdot\underbrace{\frac{2}{5}\cdot\frac{2}{6}\cdots\frac{2}{n}}_{n-4\text{ terms}}$$
What is nice with this new form is that all the terms in the second group are smaller than $1/2$, thus we immediately have
$$\frac{2^n}{n!} < \underbrace{\frac{2}{1}\cdot\frac{2}{2}\cdot\frac{2}{3}\cdot\frac{2}{4}}_{k}\left(\frac{1}{2}\right)^{n-4} = 2^4 k\,\frac{1}{2^n}$$
Now, it's obvious that the limit is zero:
$$\lim_{n\to\infty}\frac{2^n}{n!} < \lim_{n\to\infty} 2^4 k\,\frac{1}{2^n} = 2^4 k\lim_{n\to\infty}\frac{1}{2^n} = 2^4 k\,\frac{1}{\lim_{n\to\infty} 2^n} = 0$$
This proof holds for $x = 3, 4, \ldots$ or even negative integers whose absolute value is larger than one. But how about $x = 3.123$? We just note that
$$\frac{3.123^n}{n!} < \frac{4^n}{n!}$$
And then, we know that for all $x \in \mathbb{R}$, we have $\lim_{n\to\infty} x^n/n! = 0$.
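You can also watch this decay numerically; here is a Julia sketch (function name and cutoffs mine) that builds $x^n/n!$ term by term, multiplying by $x/n$ at each step, so that neither $x^n$ nor $n!$ is ever formed on its own:

```julia
# Print x^n/n! for increasing n; the recursion term_{n} = term_{n-1} * x/n avoids overflow
function show_decay(x; nmax = 60)
    term = 1.0                 # x^0 / 0!
    for n in 1:nmax
        term *= x / n
        n % 10 == 0 && println("n = $n:  x^n/n! = $term")
    end
end

show_decay(3.123)   # the printed values shrink rapidly towards zero
```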

4.10.5 Differentiable functions

This section is about the differentiability of a function. Usually, we're given a function and asked to compute its derivative. We have done that a lot. And because of that we have the misunderstanding that any continuous function can be differentiated at any point of its domain. That's not true, and thus we have to define the concept of differentiability.

Definition 4.10.5
A function $y = f(x)$ defined on an interval $I$ is differentiable at a point $x = a \in I$ if the derivative
$$f'(a) = \lim_{h\to 0}\frac{f(a + h) - f(a)}{h}$$
exists. If $a$ is an end point of $I$ then the limit in this definition is replaced by an appropriate one-sided limit. The function $f(x)$ is differentiable on $I$ if it is differentiable at each point of $I$.

If a function is differentiable at a point, it is continuous at that point:
$$\lim_{h\to 0}\left[f(a + h) - f(a)\right] = \lim_{h\to 0}\left[\frac{f(a + h) - f(a)}{h}\right]\lim_{h\to 0} h = f'(a)\cdot 0 = 0$$

However, if a function is continuous at a point, it might still be non-differentiable at that point. The simplest example is the function $y = |x|$, which is continuous everywhere but non-differentiable at $x = 0$. We can try to compute the derivative of $y = |x|$ at $x = 0$ to see this. Or we can think geometrically: at the corner $(0, 0)$ there does not exist a single well defined tangent to the curve.

The most famous example of a function that is continuous but not differentiable everywhere is the Weierstrass function, defined as
$$f(x) = \sum_{n=0}^{\infty} a^n\cos(b^n\pi x), \quad a \in (0, 1) \quad (4.10.13)$$
Fig. 4.65 gives the plots of two cases: (i) $a = 0.2$, $b = 0.1$, $n = 3$ and (ii) $a = 0.2$, $b = 7$, $n = 3$.

Figure 4.65: Weierstrass function (partial sums): (a) $a = 0.2$, $b = 0.1$, $n = 3$; (b) $a = 0.2$, $b = 7$, $n = 3$.
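The curves in Fig. 4.65 are just truncated sums, so they are easy to evaluate yourself. A Julia sketch (function name, sampling grid and the truncation at $N = 3$ are mine, following the figure; the sum is as written in Eq. (4.10.13)):

```julia
# Truncated Weierstrass sum  W(x) = Σ_{n=0}^{N} aⁿ cos(bⁿ π x)
W(x; a = 0.2, b = 7.0, N = 3) = sum(a^n * cos(b^n * pi * x) for n in 0:N)

# Sample it on [-2, 2]; with b = 7 the graph is already very wiggly
for x in range(-2.0, 2.0; length = 9)
    println("x = $(round(x, digits = 2)):  W(x) = $(round(W(x), digits = 4))")
end
```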

Definition 4.10.6
A function $f: (a, b) \to \mathbb{R}$ is continuously differentiable on $(a, b)$, written $f \in C^1(a, b)$, if it is differentiable on $(a, b)$ and $f': (a, b) \to \mathbb{R}$ is continuous.

For example, the function $y = x^2$ is a $C^1$ function for all $x$ because it is differentiable everywhere and its first derivative is $2x$, a continuous function.

Definition 4.10.7
A function $f: (a, b) \to \mathbb{R}$ is said to be $k$-times continuously differentiable on $(a, b)$, written $f \in C^k(a, b)$, if its derivatives of order $j$, where $0 \le j \le k$, exist and are continuous functions.

4.11 Some theorems on differentiable functions


This section presents some commonly used theorems regarding continuous functions. They
are: (1) Extreme value theorem, Fig. 4.66a; (2) Intermediate value theorem, Fig. 4.66b; (3)
Rolle’s theorem, Fig. 4.67a; (4) Mean value theorem, Fig. 4.67b; and (5) Mean value theorem
for integrals.
y y
f (d)
B
f (b)
C
y=M

f (c) f (a)
A
a d c x a c x
b b
a) b)

Figure 4.66: Extreme value theorem and Intermediate value theorem.

y f ′ (c) = 0 y
C B
f (b)

A B
f (a) = f (b)

α
f (a)
A
a c x a c x
a) b b) b

Figure 4.67: Rolle’s theorem and Mean value theorem.

4.11.1 Extreme value and intermediate value theorems

As can be seen, the first two theorems stem from the property of continuity. The extreme value theorem states that on a closed interval $[a, b]$ a continuous function attains a maximum value at some point $c$ and a minimum value at some point $d$. Fig. 4.66a illustrates only the case where $c, d \in (a, b)$. The intermediate value theorem states that if we draw a horizontal line $y = M$ where $f(a) \le M \le f(b)$, then it always intersects the continuous curve described by the function $y = f(x)$ at some $c$ with $a < c < b$. Note that the theorem does not tell us what $c$ is; it only tells us that such a point exists.

Applications. As an application of the intermediate value theorem, let's consider this problem: 'prove that the equation $x^3 + x - 1 = 0$ has a solution.' Let's denote $f(x) = x^3 + x - 1$; we then have $f(0) = -1$ and $f(1) = 1$. According to the intermediate value theorem, there exists a point $c \in (0, 1)$ such that $f(c) = 0$, because 0 is an intermediate value between $f(0) = -1$ and $f(1) = 1$.
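The intermediate value theorem is also the engine behind the bisection method for actually locating such a root; a minimal Julia sketch (my own implementation, not from the text):

```julia
# Bisection: repeatedly halve an interval [lo, hi] on which f changes sign
function bisect(f, lo, hi; tol = 1e-12)
    @assert f(lo) * f(hi) < 0        # a sign change guarantees a root by the IVT
    while hi - lo > tol
        mid = (lo + hi) / 2
        f(lo) * f(mid) <= 0 ? (hi = mid) : (lo = mid)
    end
    (lo + hi) / 2
end

root = bisect(x -> x^3 + x - 1, 0.0, 1.0)
println(root, "   f(root) = ", root^3 + root - 1)   # root ≈ 0.6823
```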

4.11.2 Rolle's theorem and the mean value theorem

For Rolle's theorem (Fig. 4.67a), we consider a differentiable function $y = f(x)$ defined on a closed interval $[a, b]$ with $f(a) = f(b)$. In this figure, starting from $f(a)$ the function increases to a maximum then decreases to $f(b) = f(a)$. Of course it attains a maximum at a certain point $c \in (a, b)$, and at $c$ we have $f'(c) = 0$ (from maxima/minima problems). There is another case, where $f(x)$ first decreases to a minimum and then increases back to the starting level. And finally we also have functions that increase/decrease multiple times within $[a, b]$; again the theorem is still true.

We can use Rolle's theorem for this kind of problem: 'prove that the equation $x^3 + x - 1 = 0$ has exactly one real solution.' We use proof by contradiction, assuming that this equation has two roots $a$ and $b$. That means we have $f(a) = f(b) = 0$. And since $f(x)$ is differentiable, according to Rolle's theorem there exists $c$ such that $f'(c) = 0$. But this is impossible because $f'(x) = 3x^2 + 1 > 0$ for all $x$††.
The most important application of Rolle's theorem is to prove the mean value theorem. Rolle's theorem has the restriction that $f(a) = f(b)$. But if we rotate Fig. 4.67a a bit counter-clockwise we get Fig. 4.67b, which illustrates the mean value theorem:
$$\exists c \in (a, b) \text{ s.t. } f'(c) = \frac{f(b) - f(a)}{b - a} \quad\text{or}\quad f(b) - f(a) = f'(c)(b - a) \quad (4.11.1)$$
This theorem was formulated by the Italian mathematician and astronomer Joseph-Louis Lagrange (1736 – 1813).

From a geometric point of view, Fig. 4.67b shows that there exists a point $c \in (a, b)$ such that the tangent at $c$ has the same slope as the secant $AB$. Let's turn our attention to motion, and consider a continuous motion $s = f(t)$; then the average speed within an interval $[a, b]$ is $\frac{f(b) - f(a)}{b - a}$. The mean value theorem then indicates that there is a time instant during the interval at which the instantaneous speed equals the average (or mean) speed.
Proof of the mean value theorem. We need to construct a function $y = g(x)$ such that $g(a) = g(b)$; then Rolle's theorem tells us that there exists a point $c \in (a, b)$ such that $g'(c) = 0$, and that should lead to the mean value theorem. So, we must have
$$g'(c) = 0 \implies f'(c)(b - a) - (f(b) - f(a)) = 0$$
From that, we know $g'(x)$, and then $g(x)$:
$$g'(x) = f'(x)(b - a) - (f(b) - f(a)) \implies g(x) = f(x)(b - a) - (f(b) - f(a))x$$
The proof is then as follows. Build the following $g(x)$:
$$g(x) = f(x) - \frac{f(b) - f(a)}{b - a}x$$
which is differentiable and satisfies $g(a) = g(b)$; thus there exists $c \in (a, b)$ such that $g'(c) = 0$. And that leads to the mean value theorem.

†† Can you think of a function with $f(a) = f(b)$ but for which there is no $c$ with $f'(c) = 0$?

Michel Rolle (1652 – 1719) was a French mathematician. He is best known for Rolle’s theorem.
Rolle, the son of a shopkeeper, received only an elementary education. In spite of his minimal
education, Rolle studied algebra and Diophantine analysis (a branch of number theory) on his
own. Rolle’s fortune changed dramatically in 1682 when he published an elegant solution of
a difficult, unsolved problem in Diophantine analysis. In 1685 he joined the Académie des
Sciences. Rolle was against calculus and ironically the theorem bearing his name is essential for
basic proofs in calculus. Among his several achievements, Rolle helped advance the currently
accepted size order for negative numbers. Descartes, for example, viewed 2 as smaller than
5. Rolle preceded most of his contemporaries by adopting the current convention in 1691.
Rolle’s 1691 proof covered only the case of polynomial functions. His proof did not use the
methods of differential calculus, which at that point in his life he considered to be fallacious.
The theorem was first proved by Cauchy in 1823 as a corollary of a proof of the mean value
theorem. The name "Rolle’s theorem" was first used by Moritz Wilhelm Drobisch of Germany
in 1834 and by Giusto Bellavitis of Italy in 1846.

Analysis of fixed point iterations. In Section 2.11 we have seen the fixed point iteration method as a means to solve equations written in the form $x = f(x)$. In the method, we generate a sequence starting from $x_0$: $(x_n) = \{x_1, x_2, \ldots, x_n\}$ using the formula $x_{n+1} = f(x_n)$. We have demonstrated that these numbers converge to $x^*$, which is the solution of the equation. Now, we're going to explain this using the mean value theorem. The whole point of the argument is that, if the method works, the distance from the points $x_1, x_2, \ldots$ to $x^*$ must decrease. So, we compute one such distance $x_n - x^*$:
$$x_n - x^* = f(x_{n-1}) - f(x^*) = f'(\xi)(x_{n-1} - x^*), \qquad \xi \in [x_{n-1}, x^*]$$
Now there are two cases. First, if $|f'(\xi)| < 1$, then $|x_n - x^*| < |x_{n-1} - x^*|$; that is, the distance between $x_n$ and $x^*$ is smaller than that between $x_{n-1}$ and $x^*$. And that tells us that $x_n$ converges to $x^*$. Thus, if we start close to $x^*$, i.e., $x_0 \in I = [x^* - \alpha, x^* + \alpha]$, and the absolute value of the derivative of the function is smaller than 1 in that interval $I$, the method works. Section 12.5.1 discusses more on this topic.
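A tiny Julia sketch of the iteration (the example $x = \cos x$, for which $|f'(x)| = |\sin x| < 1$ near the fixed point, and the names are mine):

```julia
# Fixed point iteration x_{n+1} = f(x_n)
function fixed_point(f, x0; iters = 30)
    x = x0
    for _ in 1:iters
        x = f(x)
    end
    x
end

xstar = fixed_point(cos, 1.0)
println(xstar, "   cos(xstar) = ", cos(xstar))   # ≈ 0.7390851 (the Dottie number)
```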

4.11.3 Average of a function and the mean value theorem of integrals

Let us recall that for $n$ numbers $a_1, a_2, \ldots, a_n$, the ordinary average is defined as $(a_1 + a_2 + \cdots + a_n)/n$. But what is the average of all real numbers within $[0, 1]$? There are infinitely many numbers living in that interval! Don't worry, integral calculus is capable of handling just that. Finding an answer to that question leads to the concept of the average of a function.

The idea is to use integration. Assume we want to find the average of a function $f(x)$ for $a \le x \le b$. We divide the interval $[a, b]$ into $n$ equal sub-intervals of spacing $\Delta x = (b - a)/n$.
In each sub-interval we locate a point $x_i$, so we have
$$f_{\text{average}} = \frac{f(x_1) + f(x_2) + \cdots + f(x_n)}{n} = \frac{f(x_1)\Delta x + f(x_2)\Delta x + \cdots + f(x_n)\Delta x}{n\Delta x} \quad\text{(multiplying numerator and denominator by } \Delta x\text{)}$$
$$= \frac{\sum f(x_i)\Delta x}{b - a} = \frac{1}{b - a}\int_a^b f(x)\,dx \quad (4.11.2)$$
In the final step, we get an integral when $n$ goes to infinity. So, the average of a continuous function is its area divided by $b - a$, which is the average height of the function.

Example. Let's compute the average of these functions: $y = x$ on $[0, 1]$, $y = x^2$ on $[-1, 1]$ and $y = \sin^2 x$ on $[0, \pi]$. They are given by
$$f_{\text{average}} = \int_0^1 x\,dx = \frac{1}{2}, \qquad f_{\text{average}} = \frac{1}{2}\int_{-1}^1 x^2\,dx = \frac{1}{3}, \qquad f_{\text{average}} = \frac{1}{\pi}\int_0^{\pi}\sin^2 x\,dx = \frac{1}{2}$$
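Eq. (4.11.2) is easy to test numerically with exactly the Riemann-sum idea used to derive it; a Julia sketch (sample points and names are mine):

```julia
# Average of f over [a, b] via a Riemann sum with n equal sub-intervals (midpoints)
favg(f, a, b; n = 1_000_000) = sum(f(a + (i - 0.5) * (b - a) / n) for i in 1:n) / n

println(favg(x -> x, 0.0, 1.0))          # ≈ 1/2
println(favg(x -> x^2, -1.0, 1.0))       # ≈ 1/3
println(favg(x -> sin(x)^2, 0.0, pi))    # ≈ 1/2
```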

Figure 4.68: Average of functions: $y = x$ on $[0, 1]$, $y = x^2$ on $[-1, 1]$ and $y = \sin^2 x$ on $[0, \pi]$.

Looking at Fig. 4.68, it is obvious that there always exists a point $c$ in $[a, b]$ such that $f(c)$ is the average height of the function (the horizontal line $y = f_{\text{average}}$ always intersects the curve $y = f(x)$). And this is the mean value theorem for integrals:
$$\exists c \in (a, b) \text{ s.t. } f(c) = \frac{1}{b - a}\int_a^b f(x)\,dx \quad (4.11.3)$$
But some examples do not make a proof. Think about whether we have $f(a) \le f(c) \le f(b)$ or not. And what is $f(b)(b - a)$? How does it compare with $\int_a^b f(x)\,dx$? That's the proof.

For $y = x^2$ we have $c = \pm 1/\sqrt{3}$. These are the Gauss points in the Gauss quadrature method used to numerically evaluate integrals; see Section 12.4 for details.

4.12 Parametric curves

In Section 4.2.6 we have met parametric curves–those described by $(x(t), y(t))$ where $t$ is called a parameter. We can think of a curve lying in the plane and a particle moving on this curve; at time $t$ the position of the particle is given by $(x(t), y(t))$. An equation $y = f(x)$ tells us the shape of the path, but not the speed at which it is traversed.

This section discusses the calculus of parametric curves: the tangent to a parametric curve (Section 4.12.1), and how to calculate its length and area (Section 4.12.2). Finally, a presentation of the cycloid–a fascinating curve with many interesting properties–is given (Section 4.12.3).

4.12.1 Tangents

The slope of the tangent to the curve $y = f(x)$ at the point $(x_0, f(x_0))$ is the first derivative of $f$ at $x_0$, i.e., $f'(x_0)$. We now want to find the slope of the tangent to a parametric curve without having to eliminate the parameter (i.e., without converting to the familiar $y = f(x)$). There are several ways to get this slope. First, we can use Leibniz notation to write $dy/dx$ as
$$\frac{dy}{dx} = \frac{dy/dt}{dx/dt} \quad (4.12.1)$$

4.12.2 Length and area of parametric curves

I present two methods to calculate the arc length of a parametric curve. In the first method we start with the arc length formula for $y = f(x)$ that we know:
$$L = \int_a^b\sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx \quad (4.12.2)$$
Now, we replace $dy/dx$ and $dx$ by
$$\frac{dy}{dx} = \frac{y'(t)\,dt}{x'(t)\,dt} = \frac{y'(t)}{x'(t)}, \qquad dx = x'(t)\,dt$$
Therefore, Eq. (4.12.2) becomes
$$L = \int_{t_1}^{t_2}\sqrt{1 + \left(\frac{y'(t)}{x'(t)}\right)^2}\,x'(t)\,dt = \int_{t_1}^{t_2}\sqrt{[x'(t)]^2 + [y'(t)]^2}\,dt \quad (4.12.3)$$
That's the formula for the arc length of a parametric curve.

In the second way, we start with $\Delta s = \sqrt{\Delta x^2 + \Delta y^2}$, and $\Delta x \approx x'(t)\Delta t$ and $\Delta y \approx y'(t)\Delta t$, hence
$$\Delta s = \sqrt{[x'(t)]^2 + [y'(t)]^2}\,\Delta t$$
To calculate the area under a parametric curve, we start from the area under $y = f(x)$:
$$A = \int_a^b y(x)\,dx$$
But $x = x(t)$, hence $dx = x'(t)\,dt$. So, the area formula is:
$$A = \int_{t_1}^{t_2} y(t)\,x'(t)\,dt \quad (4.12.4)$$

If y.t / > 0 for t 2 Œt1 ; t2  (i.e., the curve is above the x axis), the area must be positive. But that
depends on the sign of x 0 .t/. If x 0 .t/ > 0, we traverse the curve from t1 to t2 , and if x 0 .t/ < 0,
the curve is traversed from t2 to t1 . This is the case of an ellipse x D a cos t; y D b sin t. In that
case, we need to put a minus sign before the integral in Eq. (4.12.4).

Arc length of an ellipse. We consider again the perimeter of one quarter of the ellipse $y^2 + 2x^2 = 2$. Using Eq. (4.12.3), we write
$$x = \cos t,\quad y = \sqrt{2}\sin t \implies \frac{dx}{dt} = -\sin t,\quad \frac{dy}{dt} = \sqrt{2}\cos t \implies L = \int_0^{\pi/2}\sqrt{\sin^2 t + 2\cos^2 t}\,dt$$
Of course we cannot find an anti-derivative for this integral. Compared to Section 4.9.1, this form is better as the integrand does not blow up at the integration limits. Using any numerical quadrature method, we can evaluate this integral easily. This is how an applied mathematician, engineer or scientist would approach the problem: if they cannot find the answer exactly, they adopt numerical methods. But pure mathematicians do not do that. They will invent new mathematics to deal with integrals that cannot be solved using existing (elementary) functions. Recall that they invented negative integers so that we can solve $5 + x = 2$, and $i^2 = -1$, and so on.
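Taking the applied route here is a couple of lines of Julia (a midpoint-rule sketch; function name and the value of $n$ are mine):

```julia
# Quarter-perimeter of the ellipse y² + 2x² = 2 via the midpoint rule
function quarter_perimeter(n)
    h = (pi / 2) / n
    sum(sqrt(sin((i - 0.5) * h)^2 + 2 * cos((i - 0.5) * h)^2) for i in 1:n) * h
end

println(quarter_perimeter(1_000))    # ≈ 1.9101
```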

4.12.3 Cycloid
In geometry, a cycloid is the curve traced by a point on a circle as it rolls along a straight line
without slipping. The cycloid has been called "The Helen of Geometers" as it caused frequent
quarrels among 17th-century mathematicians. Galileo, Descartes, Pascal, Fermat, Roberval, New-
ton, Leibniz and the Bernoullis, as well as the architect, Christopher Wren, all wrote on various
aspects of the cycloid. This section discusses this fascinating curve.
It is possible to mark a point on a bicycle wheel and observe this point closely when someone rides the bikeŽ. We cannot do this on paper, so we have to follow a different route. We draw a horizontal line and a circle of radius $a$. We mark three points: the center $O$, the point $A$ that touches the line and the point $B$ such that the arc $AB$ is half the circle (see the left most picture in Fig. 4.69). After the circle has rolled for a while, point $A$ will be at the highest point on the circle (middle picture); the distance $AB$ measured along the line is then $\pi a$, because it is exactly the length of the arc $AB$ (think of straightening the arc $AB$ into a segment). Finally, point $A$ will be at the lowest point on the circle again (right picture). So, we know at least three points on the cycloid (red dots) for one period.
Referring to Fig. 4.70, the coordinates of $A$ on the cycloid are (computed from the coordinates of the center)
$$x = a\theta - a\sin\theta, \qquad y = a - a\cos\theta \quad (4.12.5)$$
Ž
Blaise Pascal thought that the curve that he saw most in daily life was the cycloid, second only to the circle.
Perhaps the large and slowly moving carriage wheels of the seventeenth century were more easily observed than
those of our modern automobile.
Figure 4.69: A circle of radius $a$ rolls on a horizontal line. The marked point $A$ traces out a curve called a cycloid. Note that the circle is traveling on a horizontal line.

where the derivation was for $\theta \in [0, \pi/2]$, but you can check that it works for any angle. Referring to Fig. 4.69 we can see that one arch of the cycloid comes from one full rotation of the circle, i.e., $\theta \in [0, 2\pi]$.

Figure 4.70: Derivation of the parametric equation of the cycloid. The dashed circle is the starting circle. A Cartesian coordinate system is chosen with the origin at the lowest point $A$. The coordinates of the center $O$ are $(a\theta, a)$.

Now, we turn to the tangent to the cycloid. The slope of the tangent is, from Eq. (4.12.1),
$$\frac{dy}{dx} = \frac{\sin\theta}{1 - \cos\theta}$$
From that we can determine the points at which the tangent is horizontal: $\sin\theta = 0$ and $1 - \cos\theta \ne 0$; the solutions are $\theta = (2n - 1)\pi$ (that is $\pi, 3\pi, \ldots$). A plot of the cycloid and its tangents is given in Fig. 4.71. How about the tangent at $\theta = 0$? As $dy/dx$ has the form $0/0$ we use L'Hopital's rule:
$$\lim_{\theta\to 0^+}\frac{dy}{dx} = \lim_{\theta\to 0^+}\frac{\sin\theta}{1 - \cos\theta} = \lim_{\theta\to 0^+}\frac{\cos\theta}{\sin\theta} = \infty$$
In the same manner, $\lim_{\theta\to 0^-}\frac{dy}{dx} = -\infty$. Thus, there is a vertical tangent at $\theta = 0$ or $\theta = 2n\pi$, where the cycloid touches the $x$ axis, as we have seen from its graph. These points are called the cusps of the cycloid.

Figure 4.71: Cycloid with $a = 1$: tangents at $\theta = 3\pi/4$ and $\theta = \pi$ (slope = 0).

Length and area of the cycloid. Using Eq. (4.12.3) we can compute the length of one arch of the cycloid:
$$x'(\theta) = a(1 - \cos\theta),\quad y'(\theta) = a\sin\theta \implies ds = a\sqrt{2(1 - \cos\theta)}\,d\theta$$
Therefore, that length is
$$L = \int_0^{2\pi} a\sqrt{2(1 - \cos\theta)}\,d\theta = \int_0^{2\pi} 2a\sin\frac{\theta}{2}\,d\theta = 8a$$
So, the length of one arch of the cycloid is eight times the radius of the circle that generates the cycloid.

Using Eq. (4.12.4) we can determine the area under one arch of the cycloid: it is $3\pi a^2$, which is three times the area of the circle that generates the cycloidŽ.
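Both results are easy to confirm numerically from Eqs. (4.12.3) and (4.12.4); a Julia sketch with $a = 1$ (midpoint sampling and names are mine):

```julia
a = 1.0
n = 1_000_000
h = 2pi / n
ts = ((i - 0.5) * h for i in 1:n)          # midpoints of [0, 2π]

len  = sum(a * sqrt(2 * (1 - cos(t))) for t in ts) * h             # arc length → 8a
area = sum((a - a * cos(t)) * a * (1 - cos(t)) for t in ts) * h    # ∫ y(t) x'(t) dt → 3πa²

println(len,  "   (8a   = ", 8a, ")")
println(area, "   (3πa² = ", 3pi * a^2, ")")
```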

The cycloid is the solution to the brachistochrone problem. Suppose a particle is allowed to slide freely and frictionlessly along a wire under gravity from a point $A$ to a point $B$. Furthermore, assume that the bead starts with zero velocity. Find the curve $y = y(x)$ that minimizes the travel time. Such a curve is called a brachistochrone curve (from Ancient Greek brákhistos khrónos, 'shortest time'). And the cycloid is that curve, as shown in Section 10.4.2. Herein, we compute that minimum time.

I refer to Section 10.2 for details:
$$T = \int dt = \int\frac{ds}{v} = \int\frac{ds}{\sqrt{2gy}} = \int_0^{\pi}\frac{a\sqrt{2(1 - \cos\theta)}\,d\theta}{\sqrt{2ga(1 - \cos\theta)}} = \pi\sqrt{\frac{a}{g}}$$
Thus, the time is equal to $\pi$ times the square root of the radius (of the circle that generates the cycloid) over the acceleration of gravity.

The cycloid is a tautochrone curve. A tautochrone or isochrone curve (from Greek prefixes
tauto- meaning same or iso- equal, and chrono time) is the curve for which the time taken by an
Ž
Galileo approached this area problem empirically by cutting the shape out of a uniform sheet of material and
weighing it. He found that the shape weighed the same as three circular plates of the same material cut with the
radius of the wheel used to draw the curve. Galileo gave the name "cycloid" to the curve, although it has also been
known as a "roulette" and a "trochoid". Historically, an equation for the cycloid was not written down until after its
length and area has been discovered by geometry, engineering and mechanics.

object sliding without friction in uniform gravity to its lowest point is independent of its starting point on the curve. Interestingly, the cycloid is such a curve. And we need to prove this.

The time is (similar to the case where the particle starts at the beginning of the cycloid, except that now it begins at a point with ordinate $y_0$ and corresponding parameter $\theta_0$):
$$T = \int\frac{ds}{\sqrt{2g(y - y_0)}} = \int_{\theta_0}^{\pi}\frac{a\sqrt{2(1 - \cos\theta)}\,d\theta}{\sqrt{2ga(\cos\theta_0 - \cos\theta)}} = \sqrt{\frac{a}{g}}\int_{\theta_0}^{\pi}\frac{\sqrt{1 - \cos\theta}\,d\theta}{\sqrt{\cos\theta_0 - \cos\theta}}$$
The task is now to compute the above definite integral; we use the following trigonometric identities:
$$\sqrt{1 - \cos\theta} = \sqrt{2}\sin\frac{\theta}{2}, \qquad \cos\theta = 2\cos^2\frac{\theta}{2} - 1$$
Hence, $T$ becomes
$$T = \sqrt{\frac{a}{g}}\int_{\theta_0}^{\pi}\frac{\sin\frac{\theta}{2}\,d\theta}{\sqrt{\cos^2\frac{\theta_0}{2} - \cos^2\frac{\theta}{2}}}$$
With $u = \cos(\theta/2)$, we have
$$T = 2\sqrt{\frac{a}{g}}\int_0^{\cos\frac{\theta_0}{2}}\frac{du}{\sqrt{\cos^2\frac{\theta_0}{2} - u^2}} = \pi\sqrt{\frac{a}{g}}$$
It is clear that $T$ is independent of $\theta_0$ (or $y_0$): the time taken by an object sliding without friction in uniform gravity to its lowest point is independent of its starting point on the cycloid.

4.13 Polar coordinates

Cartesian coordinates are not the only way to specify points on a plane. This section presents polar coordinates (Section 4.13.1). Also discussed are conic sections written in terms of polar coordinates (Section 4.13.2), and the length and area of polar curves (Section 4.13.4).

4.13.1 Polar coordinates and polar graphs

The polar coordinate system is a two-dimensional coordinate system in which points are given by an angle and a distance from a central point known as the pole (equivalent to the origin in the more familiar Cartesian coordinate system), cf. Fig. 4.72. The polar coordinate system is used in many fields, including mathematics, physics, engineering, navigation and robotics. It is especially useful in situations where the relationship between two points is most easily expressed in terms of angles and distance. For instance, let's consider a unit circle centered at the origin: it is described by two equations $y = \pm\sqrt{1 - x^2}$ in Cartesian coordinates (plus sign for the upper half), but simply by $r = 1$ in polar coordinates.
The full history of polar coordinates is described in Origin of Polar Coordinates by the American mathematician and historian Julian Lowell Coolidge (1873 – 1954). A brief history is given here. The Flemish mathematician Grégoire de Saint-Vincent (1584 – 1667) and the Italian
mathematician Bonaventura Cavalieri (1598 – 1647) independently introduced the concepts at
about the same time. In Acta eruditorum (1691), Jacob Bernoulli used a system with a point on a
line, called the pole and polar axis, respectively. Coordinates were specified by the distance from
the pole and the angle from the polar axis. The actual term polar coordinates has been attributed
to the Italian mathematician Gregorio Fontana (1735 – 1803). The term appeared in English in
George Peacock’s 1816 translation§ of Lacroix’s Differential and Integral Calculus‘ .
In the Cartesian coordinate system we lay down a grid consisting of horizontal and vertical lines that meet at right angles. Two lines are special, as their intersection marks the origin from which other points are located. In a polar coordinate system, we also have two axes and an origin. Concentric circles centered at the origin are used to mark constant distances $r$ from the origin. Also, lines starting from the origin are drawn; every point on such a line has a constant angle $\theta$. So, a point is marked by $(r, \theta)$ (in Fig. 4.72b, the marked point has coordinates $(2, \pi/4)$). And of course, there is a relation between Cartesian coordinates $(x, y)$ and polar coordinates $(r, \theta)$, see Fig. 4.72c.

Figure 4.72: Cartesian and polar coordinates: $x = r\cos\theta$ and $y = r\sin\theta$. (a) Cartesian coordinates; (b) polar coordinates; (c) the relation between them.

Curves are described by equations of the form $y = f(x)$ in the Cartesian coordinate system. Similarly, polar curves are written as $r = f(\theta)$. Let's start with the unit circle. Using Cartesian coordinates, it is written as $x^2 + y^2 = 1$. Using polar coordinates, it is simply $r = 1$! Fig. 4.73 presents a nice polar curve–a polar rose with as many petals as we want, and a more realistic rose.

What do you think of Fig. 4.74? It is a spiral, made from prime numbers! It was created by plotting the points $(r, \theta) = (p, p)$, where $p$ runs over the prime numbers below 20 000. That is, the radius and angle
§
George Peacock (1791 – 1858) was an English mathematician and Anglican cleric. He founded what has been
called the British algebra of logic.

Sylvestre François Lacroix (1765 – 1843) was a French mathematician. Lacroix was the writer of important
textbooks in mathematics and through these he made a major contribution to the teaching of mathematics throughout
France and also in other countries. He published a two volume text Traité de calcul differéntiel et du calcul intégral
(1797-1798) which is perhaps his most famous work.

(in radians) are both prime numbers.

Figure 4.73: Polar rose $r(\theta) = a\cos k\theta$ with $a = 1$: (a) $k = 4$; (b) $k = 7$; (c) a more realistic rose with $r = \theta + 2\sin(2\theta)$. It is a $k$-petaled rose if $k$ is odd, or a $2k$-petaled rose if $k$ is even. The variable $a$ represents the length of the petals of the rose.

Figure 4.74: Prime numbers from 1 to 20 000 plotted on a polar plane. Generated using the Julia package Primes: the function primes(n) returns all primes from 1 to n. Then for every number $p$ in that list, I computed the coordinates $(p\cos p, p\sin p)$. Finally, I plotted all these points. Source code: prime-spiral.jl.

4.13.2 Conic sections in polar coordinates


Herein we derive the equation for conic sections in polar coordinates where the origin is one
of the two foci. When using Cartesian coordinates, we define the parabola using the focus and
the directrix whereas the ellipse and hyperbola are defined using the distance to the two foci.
With polar coordinates, we can define conic sections in a unified way using the focus and the
directrix.
Considering Fig. 4.75 where F –the focus–is at the origin, the directrix is the line parallel to
the y axis and at a distance d from F . Let’s denote by e the eccentricity and by P a point in

Phu Nguyen, Monash University © Draft version


Chapter 4. Calculus 471

y
d directrix

P (x, y)
E
r
A2 center θ A1
x
F
a c a−c

Figure 4.75: A conic section is defined as PF =PE D e. Illustrated with an ellipse in mind.

the conic section with coordinates .x; y/ D .r cos ; r sin / , then a conic section is defined as
PF=PE D e, which leads to the equation:

ed
r D e.d r cos / H) r D (4.13.1)
1 C e cos 
You might be not convinced that this equation is a conic section. We can check that by either
using a software to draw this equation and see what we get or we can transform this back to a
Cartesian form (which we already know the result). We do the latter now. Why bother doing all
of this? This is because, for certain problems, polar coordinates are more convenient to work
with than the Cartesian coordinates. Later on we shall use the result in this section to prove
Kepler’s 1st law that the orbit of a planet around the Sun is an ellipse (Section 7.10.9).
From Eq. (4.13.1), we have r D e.d r cos / D e.d x/ for x D r cos , now we square
this equation and use r 2 D x 2 C y 2 , we get

r D e.d x/ H) x 2 C y 2 D e 2 .d x/2 D e 2 .d 2 2dx C x 2 /

And a bit of massage to it, we obtain


2e 2 d y2 e2d 2
x2 C x C D
1 e2 1 e2 1 e2
Knowing already the Cartesian form of an ellipse (.x=a/2 C .y=b/2 D 1), we now complete the
square for x ŽŽ :
 2
e2d y2 e2d 2 e4d 2
xC C D C (complete the square)
1 e2 1 e2 1 e2 .1 e 2 /2
  2
e2d y2 e2d 2
xC C D (algebra)
1 e2 1 e2 .1 e 2 /2
ŽŽ
If we just learnt by heart the quadratic equation we would forget how to complete a square!

Phu Nguyen, Monash University © Draft version


Chapter 4. Calculus 472

The next step is, of course, to introduce a and b, and h (now we need e < 1)

2 e2d 2 2 e2d 2 e2d


a D ; b D ; hD (4.13.2)
.1 e 2 /2 1 e2 1 e2
With these new symbols, our equation becomes, which is the familiar ellipse:

.x C h/2 y2
C D1
a2 b2
But what is h? You might be guessing correctly that it should be related to c. Indeed, we know
that, from Section 4.1, the distance from the center of an ellipse to one focus is c and it is defined
by c 2 C b 2 D a2 , thus

e2d 2 e2d 2 e4d 2


c 2 D a2 b2 D D D h2 [use Eq. (4.13.2)]
.1 e 2 /2 1 e2 .1 e 2 /2

The fact that c D h should not be a surprise as we just move the origin from the center of the
ellipse to its focus, these two points are separate of a distance c.
Theorem 4.13.1
A polar equation of the form

ed ed
rD or rD
1 ˙ e cos  1 ˙ e sin 
represents conic section with eccentricity e. The conic is an ellipse if e < 1, a parabola if
e D 1 and a hyperbola if e > 1.

Ellipse in terms of major axis and eccentricity. It y

is possible to write the ellipse in terms of a–the semi


major axis and e. Using Fig. 4.75, and considering the √
a 1 − e2
points A1 and A2 . Applying the definition of an ellipse
perihelion Sun aphelion
to these two points we have x
(−a, 0) (a, 0)
a(1 − e) c = ea a(1 − e2 )
r=
1 + e cos θ
a c D ed ea C ec; a C c D ed C ea C ec

Adding these two equations we get ed D a ec, and


substituting this into the first equation, we obtain: c D
ea. Thus, we can specify an ellipse using a and e. We can compute b from b 2 D a2 c 2 D
a2 .1 e 2 /. As planetary orbits around the Sun are ellipses, we pay attention to ellipses and
compute the distance from the center of the ellipse to one focus: c D ea. Then, we can determine
the distance from the Sun to the perihelion–the point nearest to the Sun in the planet orbit, it is
a.1 e/. Similarly a.1 C e/ is the distance from the Sun to the aphelion–the point in the orbit
of a planet most distant from the Sun. We shall use these in Section 12.6.4 when we simulate
the orbit of planets around the Sun.

Phu Nguyen, Monash University © Draft version


Chapter 4. Calculus 473

4.13.3 Cardioid
Let’s roll a circle of radius with a marked point around another circle of the same radius, this
point traces a curve. What is the shape of this curve? Surprisingly it is of a heart shape, and it is
called a cardioid, and a plot of this curve was given in Fig. 1.1. Now, we’re going to derive the
parametric equation of this fascinating curve.
The plan is to use analytic geometry, put two circles on a Cartesian plane, and consider a
point on the rolling circle. Finding the coordinates of this point in terms of the rolling velocity
gives us the parametric equation of the cardioid.

y a) y b)

O3 α
α O3
Q′ Q
ω
a
α P ′α α
O1 P O2 x O1 x

Figure 4.76: Derivation of the cardioid parametric equation: the center of the fixed circle is O1 .0; 0/ and
it has a radius of a. At the beginning, the center of the rolling (dashed) circle O2 is at .2a; 0/. The point
we pay attention to is P . After t, the dashed circle has moved to a new location with O3 as the center.
The segment O1 O3 makes an angle ˛ D !t with the x axis, if ! is the rolling angular speed. The
point P moves to P 0 . The key point: PQ0 D P 0 Q0 D PQ.

Referring to Fig. 4.76, one can determine the location of P 0 relative to O3 .2a cos ˛; 2a sin ˛/,
and then to O1 :
        
x.˛/ 2a cos ˛ cos ˛ sin ˛ a cos ˛ a.2 cos ˛ cos 2˛/
D C D (4.13.3)
y.˛/ 2a sin ˛ sin ˛ cos ˛ a sin ˛ a.2 sin ˛ sin 2˛/

The matrix in the above is the rotation matrix from the red coordinate system to the blue one
(see b). In Section 4.1.5 we have discussed rotation of axes (noting that the rotation direction
herein is different from that in Section 4.1.5). Now, we just replace ˛ by t, and we obtain the
parametric equation of a cardioid.

Polar equation. It is more convenient to put the origin at P (see Fig. 4.76a), then just subtract
from Eq. (4.13.3) for x an amount of a:
   
x.˛/ cos ˛
D 2a.1 cos ˛/ H) r 2 D x 2 Cy 2 D 4a2 .1 cos ˛ 2 / H) r D 2a.1 cos ˛/
y.˛/ sin ˛
(4.13.4)

Phu Nguyen, Monash University © Draft version


Chapter 4. Calculus 474

4.13.4 Length and area of polar curves


Herein we revisit the old problem of calculation of arc length and area of planar curves, but the
curves are now written in terms of polar coordinates. Certainly we shall have new formula. But
do not worry, the principle is the same. For example, to compute the arc length of a curve, we
compute a small portion of that length ds, and integrate that to get the total length.

The arc length problem is: given a polar curve r D f ./ with  2 Œp
1 ; 2 , compute its length.
We start with the familiar Cartesian coordinates where we write ds D dx 2 C dy 2 , and the arc
length of a curve y D f .x/ is then given by, see Section 4.9.1
Z 2 p
LD ds; ds D dx 2 C dy 2
1

As we now work with polar coordinates, we need to convert .x; y/ to .r; /:
x D r cos ; y D r sin ; r D f ./ (4.13.5)
And that allows us to compute dx; dy in terms of dr; d and noting that dr D f 0 d:
dx D cos dr r sin d D .cos f 0 f ./ sin /d
(4.13.6)
dy D sin dr C r cos d D .sin f 0 C f ./ cos /d
And with that, we can now determine ds Ž and the arclength:
p Z 2 p
0
ds D .f / C f d H) L D
2 2 Œf ./2 C Œf 0 ./2 d (4.13.7)
1

That derivation is purely algebraic. Many people prefer geometry. With Cartesian coordinates,
p we
consider a point A.x; y/ and a nearby point B.x Cdx; y Cdy/, then ds D AB D dx C dy 2 2

from the Pythagorean theorem. In the same manner, we consider a point A.r; / and a nearby
point B.r C dr;  C d/. Fig. 4.77 shows that ds 2 D .rd/2 C .dr/2 , which is exactly what
we have obtained using algebra.
The area problem is: let R be the region bounded by the polar curve r D f ./ and by the rays
 D 1 and  D 2 , and 0  2 1  2, see the left picture in Fig. 4.78. Compute the area
of R. The idea is the same as that was used to determine the area of a circle: we chopped the
circle into many many wedges (Fig. 3.33).
We divide the angular interval 2 1 into n sub intervals of equal angle  D .2 1 /=n.
For sub interval i , we consider a point i inside it. In the left picture of Fig. 4.78, i was chosen
such as to bisect the angle  . Now, we build a sector of radius r D f .i / and angle , the
area of this sector is easy to compute: 1=2Œf .i /2 . Therefore, the area that we’re seeking for
is the sum of the area of all the sectors when n approaches infinity (see the middle and right
pictures of Fig. 4.78):
X n Z 2
1  2 1
A D lim Œf .i /  D Œf ./2 d
n!1
i D1
2 1 2
Ž
Some details are omitted: using Eq. (4.13.6) in ds 2 D dx 2 C dy 2 and the identity sin2  C cos2  D 1.

Phu Nguyen, Monash University © Draft version


Chapter 4. Calculus 475

y y
r D f . /
f .x/ ds
ds
rd
ds dr
dy d
dx r

a b x x

Figure 4.77: Arc length of polar curves: the idea is exactly the same as for Cartesian curves. We consider
two nearby points: one point at .r;  / and the other at .r C dr;  C d /. Then, using the Pythagorean
theorem noting that as dr and d are infinitesimals, we have a right triangle with sides dr; rd and ds.

y r D f . / y nD8 y n D 15

.f .i /; i /

2 
1
x x x

Figure 4.78: Area of the region bounded by the polar curve r D f . / and by the rays  D 1 and  D 2
(the area bounded by the three red curves in the left figure).

4.14 Bézier curves: fascinating parametric curves


Parametric curves are briefly mentioned in Section 4.2.6. Recall that a planar parametric curve
is given by
C.t/ W x.t/; y.t/; a  t  b (4.14.1)

In high school and in university, usually students are given a parametric curve in which the
expressions for x.t/ and y.t/ are thrown at their faces e.g. the spiral .t cos t; t sin t/, and asked
to do something with that: draw the curve for example. A boring task! The students thus missed
a fascinating type of curves called Bézier curves.
Bézier curves are ubiquitous in computer graphics. For instance, one of the most common
uses of Bézier curves is in the design of fonts. Cubic Bézier curves are used in Type 1 fonts,
and quadratic Bézier curves are used in True Type fonts. Cubic Bézier curves are also used in
the TEX fonts designed by Donald KnuthŽ , and one of the clearest explanations is in his book
MetaFont: the Program. This section presents a brief introduction to these curves.
Ž
Donald Ervin Knuth (born January 10, 1938) is an American computer scientist, mathematician, and professor
emeritus at Stanford University. He is the 1974 recipient of the ACM Turing Award, informally considered the
Nobel Prize of computer science. He has been called the "father of the analysis of algorithms". He contributed
to the development of the rigorous analysis of the computational complexity of algorithms. In addition to funda-
mental contributions in several branches of theoretical computer science, Knuth is the creator of the TEX computer
typesetting system by which this book was typeset.

Phu Nguyen, Monash University © Draft version


Chapter 4. Calculus 476

Starting with two points P 1 and P 2 , the segment P 1 P 2 is written as§ (this is a vector
equation, the bold symbols are used for the points)
P 1 P 2 D .1 t/P 1 C t P 2 ; t 2 Œ0; 1 (4.14.2)
This is neither interesting nor new. Do not worry it is just the beginning. If we now have three
points P 1 , P 2 and P 3 , we get a quadratic curve. de Casteljau developed a recursive algorithm
to get that curveŽŽ . For a given t fixed, using Eq. (4.14.2) to determine two new points P 12
and P 23 , then using Eq. (4.14.2) again with the two new points to get Q (Fig. 4.79a). When t
varies from 0 to 1, this point Q traces a quadratic curve passing P 1 and P 2 (Fig. 4.79c). The
points P k ; k D 1; 2; 3 are called the control points. They are so called because the control points
control the shape of the curve.

y y y

P2 P2 P2

P12
P23
P12
P23
Q
Q
P3 P3 P3
x x x
P1 P1 P1

(a) t D 0:4 (b) t D 0:8 (c) 0  t  1

Figure 4.79: A quadratic Bézier curve (red curve) determined by three control points P1 ; P2 ; P3 .

Indeed, the maths gives us:


Q D .1 t/P 12 C tP 23
D .1 t/Œ.1 t/P 1 C t P 2  C tŒ.1 t/P 2 C tP 3  (Eq. (4.14.2)) (4.14.3)
2 2
D .1 t/ P 1 C 2t.1 t/P 2 C t P 3
What we see here is that the last equation is a linear combination of some polynomials (the red
terms) and some constant coefficients being the control points.
Moving on to a cubic curve with four control points (Fig. 4.80). The procedure is the same,
and the result is
Q D .1 t/3 P 1 C 3t .1 t/2 P 2 C 3t 2 .1 t/P 3 C t 3 P 4 (4.14.4)
Animation of the construction of Bézier curves helps the understanding. A coding exercise for
people who likes coding is to write a small program to create Fig. 4.80. If you do not like coding,
check out geogbra where you can drag and move the control points to see how the curve changes.
And this allows free form geometric modeling.
§
If this is not clear, check Section 11.1.3.
Paul de Casteljau (born 19 November 1930) is a French physicist and mathematician. In 1959, while working
ŽŽ

at Citroën, he developed an algorithm for evaluating calculations on a certain family of curves, which would later
be formalized and popularized by engineer Pierre Bézier, leading to the curves widely known as Bézier curves.

Phu Nguyen, Monash University © Draft version


Chapter 4. Calculus 477

(a) (b)

Figure 4.80: A cubic Bézier curve determined by four control points. Created using the file bezier.jl.

To see the pattern (for the generalization to curves of higher orders), let’s put the quadratic
and cubic Bézier curves together:

quadratic Bézier curve: D 1.1 t/2 P 1 C 2t.1 t/P 2 C 1t 2 .1 t/0 P 3


cubic Bézier curve: D 1.1 t/3 P 1 C 3t.1 t/2 P 2 C 3t 2 .1 t/P 3 C 1t 3 .1 t/0 P 4
And what are we seeing here? Pascal’s triangle! And with that we can guess (correctly) that
the expression for an n degree Bézier curve determined by n C 1 control points P k (k D
0; 1; 2; : : : ; n) is !
X
n
n Xn
n k k
B.t/ D .1 t/ t P k D Bk;n .t/P k (4.14.5)
k
kD0 kD0

where Bk;n is the Bernstein basis polynomial , given by Ž

!
n
Bk;n .t/ D .1 t/n k t k ; 0t 1 (4.14.6)
k

The Bernstein
P basis polynomial possess some nice properties: they are non-negative, their sum
is one i.e., nkD1 Bk;n .t/ D 1ŽŽ , see Fig. 4.81a. Because of these two properties, the point B.t/,
which is a weighted average of the control points, hence lies inside the convex hull of those
points (Fig. 4.81b).
Ž
Sergei Natanovich Bernstein (5 March 1880 – 26 October 1968) was a Soviet and Russian mathematician of
Jewish origin known for contributions to partial differential equations, differential geometry, probability theory, and
approximation theory.
ŽŽ
Why? The binomial theorem is the answer.

Phu Nguyen, Monash University © Draft version


Chapter 4. Calculus 478

1.0 B0, 3 B2, 3 Bk, 3 P2 P3


k
B1, 3 B3, 3
0.8
0.6
P4
0.4 convex hull

0.2
0.0
0.0 0.2 0.4 0.6 0.8 1.0 P1
(a) Bernstein cubic polynomials (b) Convex hull property

Figure 4.81: Bernstein cubic polynomials and convex hull property of Bézier curves.

You might be asking: where are calculus stuff? Ok, let’s differentiate the cubic curve B.t/
to see what we get:

B 0 .0/ D 3.P 1 P 0 /; B 0 .1/ D 3.P 3 P 2/

What is this equation telling us? It indicates that the tangent to the curve at P 1 (or t D 0) is
proportional to the line P 1 P 0 . And the tangent to the curve at P 2 is proportional to the line
P 3 P 2 . This should be not a surprise as we have actually seen this in Fig. 4.80b. Because of
this, and the fact that the curve goes through the starting and ending points i.e., B.0/ D P 0 and
B.1/ D P 3 , we say that a cubic Bézier curve is completely determined by four numbers: the
values of the curve at the two end points and the slopes of the curve at these points. And this is
where Bézier curves look similar to Hermite interpolation (??).
The vectors extending from P 0 to P 1 and from P 3 to P 2 are called handles and can be
manipulated in graphics programs like Adobe Photoshop and Illustrator to change the shape of
the curve. That explains the term free form modeling.

Bézier curves, CAD, and cars. The mathematical origin of Bézier curves comes from a 1912
mathematical discovery: Bernstein discovered (or invented) the now so-called Bernstein basis
polynomial, and used it to define the Bernstein polynomial. What was his purpose? Only to prove
Weierstrass’s approximation theorem (Section 12.3.1). We can say that Bernstein polynomials
had no practical applications until ... 50 years later. In 1960s, through the work of BézierŽŽ and
de Castelijau, Bernstein basis polynomials come to life under the form of Bézier curves.

ŽŽ
Pierre Étienne Bézier (1 September 1910 – 25 November 1999) was a French engineer at Renault. Bezier
came from a family of engineers. He followed in their footsteps and earned degrees in mechanical engineering from
École nationale supérieure d’arts et métiers and electrical engineering from École supérieure d’électricité. At the
age of 67 he earned a doctorate in mathematics from Pierre-and-Marie-Curie University.

Phu Nguyen, Monash University © Draft version


Chapter 4. Calculus 479

de Casteljau’s idea of using mathematics to design car bod-


ies met with resistance from Citroën. The reaction was: Was it
some kind of joke? It was considered nonsense to represent a car
body mathematically. It was enough to please the eye, the word
accuracy had no meaning .... Eventually de Casteljau’s insane
persistence led to an increased adoption of computer-aided de-
sign methods in Citroën from 1963 onward. About his time at
Citroën in his autobiography de Casteljau wrote

My stay at Citroën was not only an adventure for me, but also an adventure for
Citroën!

Thanks to people like de Casteljau that now we have a field called computer aided design (CAD)
in which mathematics and computers are used to help the design of all things that you can
imagine of: cars, buildings, airplanes, phones and so on.

4.15 Infinite series


This section presents infinite series, such as

1 3 1 1 7
sin.x/ D x x C x5 x C 
3Š 5Š 7Š
1 2 1 1 6
cos.x/ D 1 x C x4 x C 
2Š 4Š 6Š
This is how computers compute trigonometric functions, exponential functions, logarithms etc.
It is amazing that to compute something finite we have to use infinity. Moreover, the expressions
have a nice pattern. That’s why maths is beautiful. Another theme here is function approximation:
a complex function (e.g. sin x) is replaced by a simpler function, e.g. a polynomial x 1=3Šx 3 C
1=5Šx 5 , which is easier to work with (easier to differentiate and integrate).

Regarding the organization, first, ingenious ways to obtain such infinite series are presented
and second, a systematic method, called Taylor’s series, is given.

4.15.1 The generalized binomial theorem


We all know that .1 C x/2 D 1 C 2x C x 2 , but what about .1 C x/1=2 ? Newton’s discovery of
the binomial series gave answer to negative and fractional powers of binomials. Newton were
working on the area of curves of which equations are of the form .1 x 2 /n=2 . For n D 1, this is
the problem of calculating the area of a circle segment.
He considered calculating the following integrals
Z x
fn .x/ D .1 u2 /n=2 du (4.15.1)
0

Phu Nguyen, Monash University © Draft version


Chapter 4. Calculus 480

When n is even fn .x/ can be found explicitly since he knows from WallisŽ that
Z x
x pC1
up du D
0 pC1
Hence,
Z x  
x
f0 .x/ D du D 1
1
Z
0
x    
2 x x3
f2 .x/ D .1 u /du D 1 C1
1 3
Z
0
x      5 (4.15.2)
2 2 x x3 x
f4 .x/ D .1 u / du D 1 C2 C1
1 3 5
Z
0
x    3
  5  
2 3 x x x x7
f6 .x/ D .1 u / du D 1 C3 C3 C1
0 1 3 5 7
You can see that the red numbers follow the Pascal’s triangle (Section 2.27). These results for
even n can be generalized to have the following
X1  2mC1

m x
fn .x/ D amn . 1/ (4.15.3)
mD0
2m C 1

where amn denotes the red coefficients in Eq. (4.15.2), they are called Integral binomial coef-
ficientsŽŽ and . 1/m is either +1 or -1 and is used to indicate the alternating plus/minus signs
appearing in Eq. (4.15.2). And Newton believed that this formula also works for odd integers
n D 1; 3; 5; : : : So he collected the red coefficients in Eq. (4.15.2) in a table (Table 4.17). And
his goal was to find the coefficients for n D 1; 3; 5; : : : i.e., the boxes in this table. With those
coefficients, we know the integrals in Eq. (4.15.1) and by term-wise differentiation we would
get the series for .1 x 2 /n for n D 1=2; 3=2 etc.
A complete table for integral binomial coefficients is given in Table 4.18. And we determine
a; b; c; d; : : : by equating the m-th row in Table 4.18 with the corresponding row in Table 4.17,
but only for columns of even n.
For example, considering the third row (the red numbers in Table 4.18), we have the following
equations

1
9 ˆa21 D b C c D
ˆ
c D 0> ˆ
ˆ 8
= 1 1 <
3
a C 2b C c D 0 H) c D 0; a D ; b D H) a23 D 3a C 3b C c D
>
; 4 8 ˆ
ˆ 8
6a C 4b C c D 1 ˆ
ˆ
:̂a D 10a C 5b C c D 15
25
8
Ž
John Wallis (1616 – 1703) was an English clergyman and mathematician who is given partial credit for the
development of infinitesimal calculus.
ŽŽ
For example, if n D 0 and m D 0, then amn D 1 by looking at the first in Eq. (4.15.2).

Phu Nguyen, Monash University © Draft version


Chapter 4. Calculus 481

n
m 0 1 2 3 4 5 6

0 1 1 1 1 1 1 1
1 0 1/2 1 3/2 2 5/2 3
2 0  0  1 3 3
3 0  0  0 1 1
4 0  0  0 0 0
5 0  0  0 0 0

Table 4.17: Integral binomial coefficients. The row of m D 0 is all 1, follow Eq. (4.15.2) (coefficient
of x term is always 1). The rule of this table is (because amn follows the Pascal’s triangle): am;nC2 D
am;n C am 1;n for m  1 (see the three circled numbers for one example). Note that a1n D n=2 for even
ns, and Newton believed it is also the case for odd ns. That’s why he put 1=2, 3=2 and 5=2 in the row of
m D 1 for odd ns.
n
m 0 1 2 3 4 5
0 a a a a a a
1 b aCb 2a C b 3a C b 4a C b 5a C b
2 c bCc a C 2b C c 3a C 3b C c 6a C 4b C c 10a C 5b C c
3 d cCd b C 2c C d a C 3b C 3c C d 4a C 6b C 4c C d 10a C 10b C 5c C d

Table 4.18: Integral binomial coefficients.

Similarly, considering now the fourth row, we have



9 8̂ 1
d D 0> a D 1= ˆ
ˆ a31 D c C d D
>
> ˆ
ˆ
8
ˆ
ˆ 16
b C 2c C d D 0= < b D 1=8 <
1
H) H) a33 D a C 3b C 3c C d D
4a C 6b C 4c C d D 0> > ˆ
ˆ c D 1=16 ˆ
ˆ 16
>
; :̂ ˆ
ˆ
20a C 15b C 6c C d D 1 d D0 :̂a D 10a C 10b C 5c C d D 5
35
16
So, we can write f1 .x/ and f2 .x/ as
Z x      7
2 1=2 1 x3 1 x5 1 x
f1 .x/ D .1 u / du D x C C C 
2 3 8 5 16 7
Z x
0
     7
2 3=2 3 x3 3 x5 1 x
f3 .x/ D .1 u / du D x C C 
0 2 3 8 5 16 7
Now, we differentiate the two sides of the above equations; for the LHS the fundamental theorem
of calculus is used to obtain directly the result, and for the RHS, a term-wise differentiation is
used:
1 2 1 4 1 6
.1 x 2 /1=2 D 1 x x x C 
2 8 16 (4.15.4)
3 2 3 4 1
.1 x 2 /3=2 D 1 x C x C x6 C   
2 8 16
Phu Nguyen, Monash University © Draft version
Chapter 4. Calculus 482

Verification. To test his result, Newton squared the series for .1 x 2 /1=2 and observed that it
became 1 x 2 plus some remaining terms which will vanish. Precisely, Newton squared the
quantity 1 1=2x 2 1=8x 4 1=16x 6 5=128x 8 C R.x/ and obtained 1 x 2 C Q.x/ where
Q.x/ contains the lowest order of 10 i.e., very small. Today, we can do this verification easily
using Sympy.
Now comes Pnthe surprising
 k part. We all know the binomial theorem which says, for n 2 N,
n
.1 C x/ D kD0 k x . The LHS of Eq. (4.15.4) are of the same form only with rational
n

exponents. The question is: can Eq. (4.15.4) still be written in the same form of the binomial
theorem? That is
!
X1
m
.1 x 2 /m D . 1/k x 2k (4.15.5)
k
kD0
The answer is yes. The only difference compared with integral exponent case is that the binomial
expansion is now an infinite series when m is a rational number.

Newton computed . He considered the first quarter of a unit circle and calculated its area
(even though he knew that it is =4; thus he wanted to compete with Archimedes on who would
get more digits of . Actually he was testing
p his generalized binomial theorem). The function
of the first quarter of a unit circle is y D 1 x 2 , and thus its area is
Z 1p
AD 1 x 2 dx
0
p
Now comes the power of Eq. (4.15.4): Newton replaced 1 x 2 by its power series, and with
A D =4, he obtained:
Z 1 
 1 2 1 4 1 6 5 8
D 1 x x x x    dx
4 0 2 8 16 128
 1
 1 x3 1 x5 1 x7 5 x9
D x 
4 2 3 8 5 16 7 128 9
  0
11 11 1 1 5 1
 D4 1 
2 3 8 5 16 7 128 9
However, he realized that this series converged quite slowlyŽŽ .
Why this series converge slowly? Because in the terms x n =n, we
substituted x D 1. If 1 was replaced by a number smaller than 1,
then x n =n would be much smaller, and the series would converge
faster. And that exactly what Newton did: he only integrated to
0.5, and obtained this series (see next figure)
p
 3 1 11 1 1 1 1 5 1
C D 
12 8 2 6 8 40 32 112 128 1152 512
ŽŽ
That is, the series needs a lots of terms to get accurate .

Phu Nguyen, Monash University © Draft version


Chapter 4. Calculus 483

with which he managed to compute at least 15 digits. He admitted as much in 1666 (at the age of
23) when he wrote, "I am ashamed to tell you to how many figures I carried these computations,
having no other business at the time."
As you can see, having the right tool, the calculation of  became much easier than the
polygonal method of Archimedes.

4.15.2 Series of 1=.1 C x/ or Mercator’s series


This section presents Newton’s work on the function y D .1 C x/ 1 . He wanted to compute the
area under this curve. The idea is the same: first computing the following integrals
Z x
fn .x/ D .1 C u/n du (4.15.6)
0

for n D 0; 1; 2; 3; 4; : : :, then finding a pattern and finally interpolating it to n D 1. First thing


first, here are fn .x/ for non-negative integers from 0 to 4:

f0 .x/ D 1.x/
 
x2
f1 .x/ D 1.x/ C 1
2
 2  3
x x
f2 .x/ D 1.x/ C 2 C1
2 3 (4.15.7)
 2  3  4
x x x
f3 .x/ D 1.x/ C 3 C3 C1
2 3 4
 2  3  4  5
x x x x
f4 .x/ D 1.x/ C 4 C6 C4 C1
2 3 4 5
Pascal’s triangle shows up again! Now we put all the coefficients from Eq. (4.15.7) in Table 4.19
(left) and want to find the coefficients for column n D 1 assuming that the rules work for
n D 1 as well. The rule of this table is: am;nC1 D am;n C am 1;n . It follows that the coefficient
for n D 1 given in the right table ensures this rule.

n n
m -1 0 1 2 3 4 5 m -1 0 1 2 3 4 5
0  1 1 1 1 1 1 0 C1 1 1 1 1 1 1
1  0 1 2 3 4 5 1 1 0 1 2 3 4 5
2  0 0 1 3 6 10 2 C1 0 0 1 3 6 10
3  0 0 0 1 4 10 3 1 0 0 0 1 4 10
4  0 0 0 0 1 5 4 C1 0 0 0 0 1 5
5  0 0 0 0 0 1 5 1 0 0 0 0 0 1

Table 4.19: Integral binomial coefficients. The row of m D 0 is all 1, follow Eq. (4.15.7) (coefficient of x
term is always 1). The rule of this table is: am;nC1 D am;n C am 1;n .

Phu Nguyen, Monash University © Draft version


Chapter 4. Calculus 484

Therefore, we can get the integral, and term-wise differentiation gives the series:
Z x
du x2 x3 x4 1
Dx C C    H) D1 x C x2 x3 C x4 
0 1Cu 2 3 4 1Cx
And we obtain the geometric series! And that confirms Newton was correct.

4.15.3 Geometric series and logarithm


From a geometric series and integration we can obtain interesting series for logarithm. And these
series are practical way to compute logarithm of any real positive number.
Consider the following geometric series

1
1 C x C x2 C x3 C    D ; jxj < 1 (4.15.8)
1 x
And by integrating both sides, we get the logarithm series:
Z Z
2 3 dx x2 x3
.1 C x C x C x C    /dx D H) x C C C  D ln.1 x/ (4.15.9)
1 x 2 3

Similarly, this geometric series 1 x C x2 x 3 C    gives us ln.1 C x/

1 x2 x3 x4
1 x C x2 x3 C    D H) ln.1 C x/ D x C C  (4.15.10)
1Cx 2 3 4
With this, it is the first time that we are able to compute ln 2 directly using only simple arithmetic
operations: ln 2 D ln.1 C 1/ D 1 1=2 C 1=3 1=4 C    . Using a calculator we know that
ln 2 D 0:6931471805599453. Let’s see how the series in Eq. (4.15.10) performs. The calculation
in Table 4.20 (of course done by a Julia code) indicates that this series is practically not useful
as it converges too slow. See column 2 of the table, with 1000 terms and still the value is not yet
close to ln 2.

Table 4.20: Convergence rate of two series for ln 2.

n ln 2 with Eq. (4.15.10) ln 2 with Eq. (4.15.11)

1 1.0 0.666667
2 0.5 0.666667
:: :: ::
: : :
11 0.736544 0.693147
:: :: ::
: : :
1000 0.692647 0.693147

Phu Nguyen, Monash University © Draft version


Chapter 4. Calculus 485

How can we get a series with a better convergence? The issue might be in the alternating
C= sign in the series. By combining the series for ln.1 C x/ and ln.1 x/, we can get rid
of the terms with negative sign:


ˆ x2 x3 x4  
< ln.1 C x/ D x C C  1Cx x3 x5
2 3 4 H) ln D2 xC C C :::
ˆ x2 x3 x4 1 x 3 5
:̂ ln.1 x/ D x C C C C 
2 3 4
(4.15.11)
Using x D 1=3, we have ln 2 D 2.1=3 C .1=3/ =3 C : : :/ The data in column 3 in Table 4.20
3

confirms that this series converge much better: only 11 terms give us 0.693147. What is more
is that while Eq. (4.15.10) cannot be used to compute ln e (because of the requirement jxj < 1),
Eq. (4.15.11) can. For any positive number y, x D y 1=yC1 satisfies jxj < 1.

4.15.4 Geometric series and inverse tangent


Let’s consider the following geometric series

1
1 C x2 C x4 C x6 C    D
1 x2
1
1 x2 C x4 x6 C    D
1 C x2
From the second series we can get the series of the inverse tangent:
Z Z
2 4 6 dx
.1 x C x x C    /dx D
1 C x2
3 5 7
(4.15.12)
x x x
x C    C D tan 1 x
3 5 7
With this, we can derive the magical formula for  discovered by Gregory and Leibniz (actually
re-discovered as 200 years before Leibniz some Indian mathematician found it). The angle =4
has tangent of 1, thus tan 1 1 D =4, with x D 1 we have:

 1 1 1
D1 C  (4.15.13)
4 3 5 7
As  is involved, is there any hidden circle in Eq. (4.15.13)? The answer is yes, and to see that,
check https://www.youtube.com/watch?v=NaL_Cb42WyY&feature=youtu.be. This series
converges slowly i.e., we need to use lots of terms (and thus lots of calculations) to get a more
accurate result for . However this series is theoretically interesting as it provided a new way
of calculating  (compared with Archimedes’ method). To have a betterp series for , wepshould
use x < 1 in Eq. (4.15.12). For example, note that tan =6 D 1= 3, so with x D 1= 3, we
can compute a better approximation for .

Phu Nguyen, Monash University © Draft version


Chapter 4. Calculus 486

4.15.5 Euler’s work on exponential functions


From Brigg’s work on logarithm, see Section 2.24.4, we know the following approximation for
a where  is a small number:
a D 1 C k (4.15.14)
Now, Euler introduced ax (this is what we need) with x D N where N is a very big number so
that  is a very small number. Now, we can write ax D aN D .a /N and use Eq. (4.15.14):

ax D .1 C k/N
N.N 1/ N.N 1/.N 2/
D 1 C N.k/ C .k/2 C .k/3 C    (binomial theorem)
2Š 3Š
x N.N 1/ 2 x 2 N.N 1/.N 2/ x3
D 1 C Nk C k 2C k3 C 
N 2Š N 3Š N3
1 1 1
D 1 C kx C .kx/2 C .kx/3 C   
1Š 2Š 3Š
(4.15.15)

The last equality is due to the fact that N D N 1 D N 2 as N is very large. Now, we
evaluate Eq. (4.15.15) at x D 1 to get an equation between a and k:
k 1 1
a D1C C k2 C k3 C   
1Š 2Š 3Š
Euler defined e as the number for which k D 1:
1 1 1
e D1C C C C  (4.15.16)
1Š 2Š 3Š
The series on the RHS indeed converges because nŠ gets bigger and bigger and 1=nŠ becomes
close to zero. A small code computing this series gives us e D 2:718281828459045. With
k D 1, Eq. (4.15.15) allows us to write e x as
 N
x 1 1 1
x
e D 1C D1C x C x2 C x3 C    (4.15.17)
N 1Š 2Š 3Š

4.15.6 Euler’s trigonometry functions


This section presents Euler’s derivation of the power series of the sine and cosine functions. He
started with the formula .cos ˛ ˙ i sin ˛/n D cos.n˛/ ˙ i sin.n˛/ to get cos.n˛/ in terms of
.cos ˛ C i sin ˛/n and .cos ˛ i sin ˛/n . Then, he used the binomial theorem to expand these
two terms. Finally, he replaced n by N a very large positive number and ˛n D ˛N D x so that
˛ is small and cos ˛ D 1.
Let’s start with
.cos ˛ C i sin ˛/n D cos.n˛/ C i sin.n˛/
.cos ˛ i sin ˛/n D cos.n˛/ i sin.n˛/

Phu Nguyen, Monash University © Draft version


Chapter 4. Calculus 487

to get
1
cos.n˛/ D Œ.cos ˛ C i sin ˛/n C .cos ˛ i sin ˛/n 
2
1
i sin.n˛/ D Œ.cos ˛ C i sin ˛/n .cos ˛ i sin ˛/n 
2
P 
Using the binomial theorem: .a C b/n D nkD0 kn an k b k , we can expand the terms .cos ˛ C
i sin ˛/n as
n.n 1/
.cos ˛ C i sin ˛/n D cosn ˛ C i n cosn 1
˛ sin ˛ cosn 2 ˛ sin2 ˛

n.n 1/.n 2/ n.n 1/.n 2/.n 3/
i cosn 3 ˛ sin3 ˛ C cosn 4
˛ sin4 ˛
3Š 4Š
n.n 1/.n 2/.n 3/.n 4/
Ci cosn 5 sin5 ˛ C   

and similarly for .cos ˛ i sin ˛/n as
n.n 1/
.cos ˛ i sin ˛/n D cosn ˛ i n cosn 1
˛ sin ˛ cosn 2 sin2 ˛

n.n 1/.n 2/ n.n 1/.n 2/.n 3/
Ci cosn 3 ˛ sin3 ˛ C cosn 4
˛ sin4 ˛
3Š 4Š
n.n 1/.n 2/.n 3/.n 4/
i cosn 5 ˛ sin5 ˛ C   

Therefore,
n.n 1/ n.n 1/.n 2/.n 3/
cos.n˛/ D cosn ˛ cosn 2 ˛ sin2 ˛ C cosn 4
˛ sin4 ˛ C   
2Š 4Š
n.n 1/.n 2/
sin.n˛/ D n cosn 1 ˛ sin ˛ cosn 3 ˛ sin3 ˛

n.n 1/.n 2/.n 3/.n 4/
C cosn 5 ˛ sin5 ˛ C   

(4.15.18)

Now comes the magic of Euler. In Eq. (4.15.18), he used N for n where N is a very large
positive integer. He introduced a new variable x such that ˛N D x (or ˛ D x=N ). Obviously ˛
is very small, hence we have cos ˛  1, and sin ˛  ˛. Hence, cos.n˛/ becomes
N.N 1/ 1 2 N.N 1/.N 2/.N 3/ 1 4
cos.x/ D 1 x C x 
N2 2Š N4 4Š
1 2 1
D1 x C x4   
2Š 4Š
where we have used the fact that for a very large integer N D N 1 D N 2 D N 3 D : : :
and get rid of all coefficients involving this arbitrary number N . In the same manner, the series

Phu Nguyen, Monash University © Draft version


Chapter 4. Calculus 488

for sin x can be obtained. Putting them together, we have


1 3 1 1 7 X 1
1
sin.x/ D x x C x5 x C  D . 1/i 1
x 2i 1
3Š 5Š 7Š i D1
.2i 1/Š
(4.15.19)
1 2 1 1 6 X 1
1 2i
cos.x/ D 1 x C x4 x C  D . 1/i x
2Š 4Š 6Š i D0
.2i/Š
I have included the formula using the sigma notation. It is not for beauty, that formula is trans-
lated directly to our Julia code, see Listing A.3. Even though this was done by the great
mathematician Euler, we have to verify them for ourselves. p Let’s compute sin =4 using the
series. With only 5 terms, we got 0.707106781 (same as 2=2 computed using trigonometry
from high school maths)! Why so fast convergence?
With Eq. (4.15.19) we can see that the derivative of sine is cosine: just differentiating the first
series and you will obtain the second. Can we also obtain the identity sin2 x C cos2 x D 1 from
these series? Of course, otherwise it was not called sine/cosine series. Some people is skillful
enough to use Eq. (4.15.19) to prove this identity. It is quite messy. We can go the other way:
g.x/ D sin2 x C cos2 x H) g 0 .x/ D 2 sin x cos x 2 cos x sin x D 0 H) g.x/ D constant
But, we know g.0/ D sin2 0 C cos2 0 D 1 (using Eq. (4.15.19) of course). So g D 1. What a
nice proof. We still have to relate the sine/cosine series to the traditional definition of sine/cosine
based on a right triangle. And finally, the identity sin.x C y/ D sin x cos y C sin y cos x and so
on (all of this can be done, but that’s enough to demonstrate the idea).
You might ask why bothering with all of this? This is because if we can do so, then you can
see that trigonometric functions can be defined completely without geometry! Why that useful?
Because it means that trigonometric functions are more powerful than we once thought. Indeed
later on we shall see how these functions play an important role in many physical problems that
have nothing to do with triangles!

4.15.7 Euler’s solution of the Basel problem


This section presents Euler’s solution to the Basel problem. Recall that the Basel problem
involves the sum of the reciprocals of the squares of natural numbers: 1 C 14 C 19 C 16
1
C .
This problem has defied all mathematicians including Leibniz who once declared that he could
sum any infinite series that converge whose terms follow some rule. And Euler found the sum is
 2=6.

Euler’s proof was based on the power series of sin x (see the previous section), and the
fact that if f .x/ D 0 has solutions x1 D a, x2 D b, etc. then we can factor it as f .x/ D
.a x/.b x/.c x/    D .1 x=a/.1 x=b/.1 x=c/    if all of the solutions are different
from zero.
Euler considered the function f .x/ D sin x=x . From the power series of sin.x/ in
Eq. (4.15.19), we can write f .x/ as
sin x x2 x4
f .x/ D D1 C C  (4.15.20)
x 3Š 5Š
Phu Nguyen, Monash University © Draft version
Chapter 4. Calculus 489

As the non-zero solutions of f .x/ D 0 are ˙, ˙2, ˙3, etc, we can also write it as
    
x x x x
f .x/ D 1 1C 1 1C 
  2 2
  (4.15.21)
1 1 1 2
D1 C C C  x C 
2 4 2 9 2

By equating the coefficient for x 2 in Eqs. (4.15.20) and (4.15.21), we obtain

1 1 1 1 1 1 2
C C C    D H) 1 C C C    D (4.15.22)
2 4 2 9 2 3Š 4 9 6
P
It is easy to verify this result by writing a small code to calculate the sum of niD1 1=i 2 , for
example with n D 1000 and see that the sum is indeed equal to  2 =6. And with this new toy,
Euler continued and calculated the following sums (note that all involve even powers)

1 1 2
1C C C  D .power 2/
4 9 6
1 1 4
1C C C  D .power 4/
16 81 90
But Euler and no mathematicians after him is able to crack down the sum with odd powers. For
example, what is 1 C 213 C 313 C 413    ? Can it be  3=n? No one knows.

Wallis’ infinite product Euler’s method simultaneously leads us to Wallis’ infinite product
regarding . The derivation is as follows
        
sin x x x x x x2 x2 x2
D 1 1C 1 1C  D 1 1 1 
x   2 2 2 4 2 9 2
Evaluating the above at x D =2 results in Wallis’ infinite product
       
2 1 1 1 3 15 35
D 1 1 1  D 
 4 16 36 4 16 36
       
13 35 57  22 44 66
D    H) D 
22 44 66 2 13 35 57

Harmonic series and Euler’s constant. Up to now we have met the three famous numbers in
mathematics: , e and i . Now is the time to meet the fourth number: D 0:577215 : : : While
Euler did not discover , e and i he gave the names to two of them ( and e). Now that he
discovered but he did not name it.
Recall that S.n/–the n-th harmonic number–is the sum of the reciprocals of the first n natural
numbers:
1 1 1 X n
1
S.n/ WD 1 C C C    C D (4.15.23)
2 3 n i D1
i

Phu Nguyen, Monash University © Draft version


Chapter 4. Calculus 490

Now, define the following quantity

A.n/ WD S.n/ ln.n/ (4.15.24)

The sequence A.1/; A.2/; : : : converges because it is a decreasing sequence i.e.,


A.n C 1/ < A.n/ and it is bounded below because A.n/ > 0. And the limit of this sequence is
called the Euler–Mascheroni constant or Euler’s constant :
!
X n
1
WD lim ln n (4.15.25)
n!1
i D1
i

Using a computer, with n D 107 , I got D 0:577215, correct to six decimals. In 1734, Euler
computed to five decimals. Few years later he computed up to 16 digits.
But hey! How did Euler think of Eq. (4.15.24)? If someone told you to consider this sequence,
you could write a code to compute A.n/ and see it for yourself that it converges to a value of
0.577215. And you would discover . Now you see the problems with how mathematics is
currently taught and written. For detail on the discovery of , I recommend the book Gamma:
exploring Euler’s constant by Julian HavilŽŽ [30] for an interesting story about . There are many
books about the great incomparable Euler e.g. Euler: The master of us all by Dunham William
[19] or Paul Nahin’s Dr. Euler’s Fabulous Formula: Cures Many Mathematical Ills [48].

Question 12. Is Euler’s constant irrational? If so, is it transcendental? No one knows. This is
one of unsolved problems in mathematics.

History note 4.3: Euler (1707-1783)


Euler was a Swiss mathematician and physicist. He worked
in almost all areas of mathematics: geometry, algebra, calcu-
lus, trigonometry, number theory and graph theory. He was
the first to write f .x/ to denote the function of a sinle vari-
able x. He introduced the modern notation for the trigono-
metric functions,
P the letter e for the base of the natural
logarithm, for summations, i for the imaginary unit i.e.,
i D 1. Euler was one of the most eminent mathematicians
2

of the 18th century and is held to be one of the greatest in


history. A statement attributed to Pierre-Simon Laplace ex-
presses Euler’s influence on mathematics: "Read Euler, read
Euler, he is the master of us all." He is also widely consid-
ered to be the most prolific, as his collected works fill 92 volumes, more than anyone else
ŽŽ
Julian Havil (born 1952) is an educator and author working at Winchester College, Winchester, England. The
famous English-American theoretical physicist and mathematician Freeman Dyson was one student of Havil.

William Wade Dunham (born 1947) is an American writer who was originally trained in topology but became
interested in the history of mathematics and specializes in Leonhard Euler. He has received several awards for
writing and teaching on this subject.

Phu Nguyen, Monash University © Draft version


Chapter 4. Calculus 491

in the field. He spent most of his adult life in Saint Petersburg, Russia, and in Berlin, then
the capital of Prussia. Euler’s eyesight worsened throughout his mathematical career. He
becaome almost totally blind at the age of 59. Euler remarked on his loss of vision, "Now
I will have fewer distractions." Indeed, his condition appeared to have little effect on his
productivity, as he compensated for it with his mental calculation skills and exceptional
memory. Many of those pages were written while he was blind, and for that reason, Eu-
ler has been called the Beethoven of mathematics. Beethoven could not hear his music.
Likewise, Euler could not see his calculations.

4.15.8 Taylor’s series


In previous sections, we have seen the series representation of various functions: exponential
function e x , trigonometric functions sin x and cos x, logarithm functions and so on. In all cases,
a function f .x/ is written as a power series in the following form
X
1
f .x/ D a0 C a1 x C a2 x 2 C a3 x 3 C    D an x n (4.15.26)
nD0

where an are the coefficients that vary from function to function.


Brook Taylor found a systematic way to find these coefficients an for any differentiable
functions. His idea is to match the function value and all of its derivatives (first derivative,
second derivative, etc.) at x D 0. Thus, we have the following equations to solve for an :
f .0/ D a0
f 0 .0/ D a1
f 00 .0/ D 2Ša2
f 000 .0/ D 3Ša3 (4.15.27)
::
:
f .n/ .0/ D nŠan
And putting these coefficients into Eq. (4.15.26), we obtain the Taylor’s series of any function
f .x/|| :

f 0 .0/ f 00 .0/ 2 f 000 .0/ 3 X1


f .n/ .0/ n
f .x/ D f .0/ C xC x C x C  D x (4.15.28)
1Š 2Š 3Š nD0

where the notation f .n/ .x/ denotes the n-order derivative of f .x/; for n D 0 we have f .0/ .x/ D
f .x/ (i.e., the 0th derivative is the function itself). See Fig. 4.82 for a demonstration of the
||
Actually not all functions but smooth functions that have derivatives

Phu Nguyen, Monash University © Draft version


Chapter 4. Calculus 492

Taylor series of cos x. The more terms we include a better approximation of cos x we get. What
is interesting is that we use information of f .x/ only at x D 0, yet the Taylor series (with
enough terms) match the original function for many more points. Taylor series expanded around
0 is sometimes known as the Maclaurin series, named after the Scottish mathematician Colin
Maclaurin (1698 – 1746).
y

cos x
1 − x2 /2
1 − x2 /2 + x4 /4!
1 − x2 /2 + x4 /4! − x6 /6!

Figure 4.82: The graph of cos x and some of its Taylor expansions: 1 x 2 =2, 1 x 2 =2 C x 4 =4Š and
1 x 2 =2 C x 4 =4Š x 6 =6Š.

There is nothing special about x D 0. And we can expand the function at the point x D a:

X
1
f .n/ .a/
f .x/ D .x a/n (4.15.29)
nD0

History note 4.4: Taylor (1685-1731)


Brook Taylor was an English mathematician who added to mathe-
matics a new branch now called the ’calculus of finite differences’,
invented integration by parts, and discovered the celebrated formula
known as Taylor’s expansion. Brook Taylor grew up not only to be an
accomplished musician and painter, but he applied his mathematical
skills to both these areas later in his life. As Taylor’s family were well
off they could afford to have private tutors for their son and in fact this
home education was all that Brook enjoyed before entering St John’s
College Cambridge on 3 April 1703. By this time he had a good grounding in classics and

Phu Nguyen, Monash University © Draft version


Chapter 4. Calculus 493

mathematics. The year 1714 marks the year in which Taylor was elected Secretary to the
Royal Society. The period during which Taylor was Secretary to the Royal Society marks
what must be considered his most mathematically productive time. Two books which
appeared in 1715, Methodus incrementorum directa et inversa and Linear Perspective are
extremely important in the history of mathematics. The first of these books contains what
is now known as the Taylor series.

4.15.9 Common Taylor series


Equipped with Eq. (4.15.28) it is now an easy job to develop power series for trigonometric
functions, exponential functions, logarithm functions etc. We put commonly used Taylor series
in the following equation:

1 1 1 X
1
xn
x
e D 1 C x C x2 C x3 C    D x2R
1Š 2Š 3Š nD0

1 3 1 1 7 X 1
x 2nC1
sin x Dx x C x5 x C  D . 1/n x2R
3Š 5Š 7Š nD0
.2n C 1/Š
1 2 1 1 6 X1
x 2n
cos x D1 x C x4 x C  D . 1/n x2R
2Š 4Š 6Š nD0
.2n/Š
x3 x5 x7 X1
x 2nC1
arctan x D x C C  D . 1/n x 2 Œ 1; 1
3 5 7 nD0
.2n C 1/
x 2
x 3
x 4 X1 n
nC1 x
ln.1 C x/ D x C C  D . 1/ x 2 . 1; 1/
2 3 4 nD1
n
1 X1
D 1 C x C x2 C x3 C    D xn x 2 . 1; 1/
1 x nD0

If we look at the Taylor series of cos x we do not see odd powers. Why? This is because
cos. x/ D cos.x/ or cosine is an even function. Similarly, in the series of the sine, we do not
see even powers. In the above equation, for each series a condition e.g. x 2 Œ 1; 1 was included.
This is to show for which values of x that we can use the Taylor series to represent the origin
functions. For example, we cannot use x x 3=3 C x 5=5 x 7=7 C   to replace arctan x for jxj > 1.
In Fig. 4.83 we plot e x and ln.1 C x/ and their Taylor series of different number of terms
n. We see that the more terms used the more accurate the Taylor series are. But how accurate
exactly? You might guess the next thing mathematicians will do is to find the error associated
with a truncated Taylor series (we cannot afford to use large n so we can only use a small
number of terms, and thus we introduce error and we have to be able to quantify this error).
Section 4.15.10 is devoted to this topic.

Taylor’s series of other functions. For functions made of elementary functions, using the
definition of Taylor’s series is difficult. We can find Taylor’s series for these functions indirectly.

Phu Nguyen, Monash University © Draft version


Chapter 4. Calculus 494

4.0
2

3.5
1
3.0

2.5 0
1.5 1.0 0.5 0.0 0.5 1.0 1.5

2.0
1

1.5
2
1.0
exp(x)
T1(x) ln(1 + x)
0.5 3 T4(x)
T2(x)
T3(x) T7(x)
T4(x) T11(x)
0.0 T16(x)
1.5 1.0 0.5 0.0 0.5 1.0 1.5 4

(a) exp.x/ (b) ln.1 C x/

Figure 4.83: Truncated Taylor’s series for e x and ln.1 C x/.

For example, to find the Taylor’s series of the following function


  
f .x/ D ln.cos x/; x 2 ;
2 2
we first re-write f .x/ in the form ln.1 C t / so that Taylor’s series is available:
f .x/ D ln.1 C .cos x 1//
.cos x 1/2 .cos x 1/3 .cos x 1/4 (4.15.30)
D .cos x 1/ C C 
2 3 4
Now we use Taylor’s series for cos x:
1 2 1 1 6
cos x 1D x C x4 x C  (4.15.31)
2Š 4Š 6Š
Next, we substitute Eq. (4.15.31) into Eq. (4.15.30),
   2
1 2 1 4 1 6 1 1 2 1 1 6
f .x/ D x C x x C  x C x4 x C 
2Š 4Š 6Š 2 2Š 4Š 6Š
 3
1 1 2 1 1 6
C x C x4 x C  C 
3 2Š 4Š 6Š
Assume that we ignore terms of order 8 and above, we can compute f .x/ as:
     
1 2 1 4 1 6 1 1 2 1 4 2 1 1 2 3
ln.cos x/ D x C x x C  x C x C x
2Š 4Š 6Š 2 2Š 4Š 3 2Š
x2 x4 x6
D C O.x 8 /
2 12 45
Phu Nguyen, Monash University © Draft version
Chapter 4. Calculus 495

Big O notation. In the above equation I have introduced the big O notation (O.x 8 /). In that
equation, because we neglected terms of order of magnitude equal and greater than eight, the
notation O.x 8 / is used. Let’s see one example: the sum of the first n positive integers is

n.n C 1/ n2 n
1 C 2 C 3 C  C n D D C
2 2 2
When n is large, the second term is much smaller relatively than the first term; so the order of
magnitude of 1 C 2 C    C n is n2 ; the factor 1=2 is not important. So we write

1 C 2 C 3 C    C n D O.n2 /

To get familiar with this notation, we write, in below, the full Taylor’s series for e x , and two
truncated series
1 1 2 1 3 1
ex D 1 C xC x C x C  D 1 C x C O.x 2 /
1Š 2Š 3Š 1Š
1 1 2 1 3 1 1
ex D1C xC x C x C  D 1 C x C x 2 C O.x 3 /
1Š 2Š 3Š 1Š 2Š
The notation O.x 2 / allows us to express the fact that the error in e x D 1 C x is smaller in
absolute value than some constant times x 2 if x is close enough to 0ŽŽ . The big O notation
is also called Landau’s symbol named after the German number theoretician Edmund Landau
(1877–1938) who invented the notation. The letter O is for order.

4.15.10 Taylor’s theorem


We recall that it is possible to write any function f .x/ as a power series, see Eq. (4.15.29). In
Fig. 4.83, we have examined how a power series can approximate a given function. To that end,
we varied the number of terms in the series, and we have seen that the more terms used the more
accurately the series approximates the function. To quantify the error of this approximation,
mathematicians introduce the concept of the remainder of a Taylor series. That is they divide the
series into two sums:
X
1
f .n/ .a/ X
n
f .i / .a/ X1
f .i / .a/
n i
f .x/ D .x a/ D .x a/ C .x a/i (4.15.32)
nŠ i D0
iŠ i DnC1

„ ƒ‚ … „
nD0
ƒ‚ …
Tn .x/ Rn .x/

The first sum (has a finite term) is a polynomial of degree n and thus called a Taylor polynomial,
denoted by Tn .x/. The remaining term is called, understandably, the remainder, Rn .x/.
It is often that scientists/engineers do this approximation: f .x/  Tn .x/. This is because it’s
easy to work with a polynomial (e.g. differentiation/integration, root finding of a polynomial is
ŽŽ
You can play with some values of x close to zero, compute e x using the exponential function in a calculator,
and compute its approximation 1 C x, the difference between the two values is proportional to x 2 .

Phu Nguyen, Monash University © Draft version


Chapter 4. Calculus 496

straightforward). In this case Rn .x/ becomes the error of this approximation. If only two terms
in the Taylor series are used, we get:
T2 .x/ D f .a/ C f 0 .a/.x a/
which is the linear approximation we have discussed in Section 4.5.3.
How to quantify Rn .x/? From Fig. 4.83 we observe that close to x D a the approximation
is very good, but far from a the approximation is bad. The following theorem helps to quantify
Rn .x/Ž .
Theorem 4.15.1
f .nC1/ .c/
Rn .x/ D .x a/nC1 (4.15.33)
.n C 1/Š

Example 4.18
The Taylor series for y D e x at a D 0 with the remainder is given by

x x x2 xn ec
e D1C C C  C C Rn .x/; Rn .x/ D x nC1
1Š 2Š nŠ .n C 1/Š

where 0 < c < x. The nice thing with e x is that Rn .x/ approaches zero as n goes large. Note
that we have jcj < jxj and e x is an increasing function, thus

e jxj jx nC1 j
jRn .x/j  jx nC1 j H) lim jRn .x/j < e jxj lim D0
.n C 1/Š n!1 n!1 .n C 1/Š

See Section 4.10.4 if you’re not clear why the final limit is zero.

4.16 Applications of Taylor’ series


Herein we present some applications of Taylor series: (1) Evaluate integrals, (2) Evaluate limits,
and (3) Evaluate series.

4.16.1 Integral evaluation


Z
x2
Let’s compute the following integral which we could not do before: e dx. The idea is to
replace the integrand by its Taylor’s series:
1 1 1
ex D 1 C x C x2 C x3 C   
1Š 2Š 3Š
2 1 1 1
e x D 1 C . x 2 / C . x 2 /2 C . x 2 /3 C   
1Š 2Š 3Š
Ž
I could not find an easy motivating way to come up with this theorem, so I accepted it.

Phu Nguyen, Monash University © Draft version


Chapter 4. Calculus 497

Then, term-wise integration gives us


Z Z 1 
x2 1 2 1 2 2 1 2 3
e dx D 1 C . x / C . x / C . x / C    dx
0 1Š 2Š 3Š
1 3 1 5 1 7 X1
1 x 2nC1
Dx x C x x C  D . 1/n
1Š3 2Š5 3Š7 nD0
nŠ 2n C 1

Now I present another interesting formula for . This is the formula:


p 1  
3 3 X . 1/n 2 1
D C (4.16.1)
4 nD0 8n 3n C 1 3n C 2

First, write a small Julia code to verify this formula (using n D 100 and compute the RHS
to see if it matches  D 3:1415 : : :). How on earth mathematicians discovered this kind of
equation? They started with a definite integral of which the integral involves :
Z 1=2
dx 
D p
0 x 2 xC1 3 3
If you cannot evaluate this integral: using a completing a square for x 2 x C 1, then using a
trigonometry substitution (tan ). That’s not interesting. Here is the great stuff:

1 C x 3 D .1 C x/.x 2 x C 1/

Thus,
Z 1=2 Z 1=2 Z 1=2 Z 1=2
dx xC1 xdx dx
I D D dx D C
0 x2 xC1 0 1 C x3 0 1 C x3 0 1 C x3
Of course, now we replace the integrands by corresponding power series. Starting with the
geometric series:
1
D 1 C x C x2 C x3 C   
1 x
We then have:
1
D1 x3 C x6 x9 C   
1 C x3
x
Dx x4 C x7 x 10 C    .obtained from the above times x/
1 C x3
Now, the integral I can be evaluated using these series:
Z 1=2 Z 1=2
4 7 10
I D .x x C x x C    /dx C .1 x3 C x6
x 9 C    C/dx
0
  
0

1 1 1 1 1 1 1 1 1 1 1 1 1
D  0  C  2  C 1 0  C  
4 2 8 5 8 8 8 2 8 4 8 7 82

Phu Nguyen, Monash University © Draft version


Chapter 4. Calculus 498

Now we can understand Eq. (4.16.1).

Mysterious function x x . What would be the graph of the mysterious function y D x x ? Can it
be defined for negative x? Is it an increasing/decreasing function? We leave that for you, instead
we focus on the integration of this function. That is we consider the following integral:
Z 1
I WD x x dx
0

Here how John Bernoulli computed it in 1697. He converted x x to e ::: :

x x D .e ln x /x D e x ln x

Then, he used the power series for e x to replace e x ln x by a power series:

1 1 1 X1
.x ln x/n
x ln x 2 3
e D 1 C x ln x C .x ln x/ C .x ln x/ C    D
1Š 2Š 3Š nD0

Then, the original integral becomes:


Z 1X
1 X 1 1 Z 1
.x ln x/n
I D dx D .x ln x/n dx
0 nD0
nŠ nD0
nŠ 0

For the red integral, integration by parts is the way to go:

Œx nC1 .ln x/n 0 D .n C 1/.x ln x/n C nx n .ln x/n 1

Therefore,
Z 1 Z 1
n 1 n
.x ln x/ dx D Œx nC1 .ln x/n 10 x n .ln x/n 1 dx
nC1 nC1
Z 1
0 0
n
D x n .ln x/n 1 dx
nC1 0

This is because limx!0 x nC1 .ln x/n D 0. Now if we repeatedly apply integration by parts to
lower the power in .ln x/n 1 , we obtain:
Z 1 Z 1 
n n
.x ln x/ dx D x n .ln x/n 1 dx
nC1 0
  Z 1
0
n n 1
D x n .ln x/n 2 dx
nC1 nC1 0
   Z 1
n n 1 n 2
D x n .ln x/n 3 dx
nC1 nC1 nC1 0

Phu Nguyen, Monash University © Draft version


Chapter 4. Calculus 499

Doing this until .ln x/0 , we’re then done:


Z 1

.x ln x/n dx D . 1/n
0 .n C 1/nC1

And finally the integral is given by

X
1
. 1/n 1 1 1
I D D1 C C 
nD0
.n C 1/ nC1 22 33 44

4.16.2 Limit evaluation


Move on to the problems of evaluation of limits, let us consider the following limit

x2ex
lim
x!0 cos x 1
And again, the idea is to replace e x and cos x by its Taylor’s series, and we will find that the
limit will come easily:
1 1 2 1 3 1 3 1 4 1 5
x 2 .1 C 1Š
x C 2Š x C 3Š x C / x2 C 1Š
x C 2Š x C 3Š x C 
AD 1 2 1 4
D 1 2 1 4
H) lim A D 2

x C 4Š x  2Š
x C 4Š x  x!0

4.16.3 Series evaluation


Taylor’s series can be used to compute series. For example, what does the following series
converge to?

1 1 1
1 C C    D‹
1Š 2Š 3Š
We can recognize that the above series is e x
evaluated at x D 1, so the series converges to 1=e.

4.17 Bernoulli numbers


In Section 2.27 we have discussed the story of Jakob Bernoulli in 1713, and Seki Takakazu in
1712 independently discovered a general formula for the sum of powers of counting numbers
(e.g. 13 C 23 C 33 C    ). The formula introduces the now so-called Bernoulli numbers. This
section elaborates on these amazing numbers.

Series of 1=.e x 1/ and Bernoulli numbers. Let’s start with

1 1 1 1 1
ex 1D x C x 2 C x 3 C x 4 C x 5 C O.x 6 /
1Š 2Š 3Š 4Š 5Š
Phu Nguyen, Monash University © Draft version
Chapter 4. Calculus 500

Then, we can write 1=.e x 1/ as

1 1
D
ex 1 1 1 1 1 1
x C x 2 C x 3 C x 4 C x 5 C O.x 6 /
1Š 2Š 3Š 4Š 5Š
0 1 1

1B C
B1 C 1 x C 1 x 2 C 1 x 3 C 1 x 4 CO.x 5 /C
D
x@ 2
„ 6 24
ƒ‚ 120 … A
y

Now, using the Taylor series for 1=.1 C y/ D 1 y C y 2 y 3 C y 4 (we stop at y 4 as we skip
terms of powers higher than 4), and also using SymPy, we get

x x x2 x4 x0 1 x1 1 x2 1 x4 X1
xn
D1 C C  D 1 C C  D Bn
ex 1 2 12 720 0Š 2 1Š 6 2Š 30 4Š nD0

(4.17.1)

The second equality is to introduce nŠ into the formula as we want to follow the pattern of the
Taylor series. With that, we obtain a nice series for x=ex 1 in which the Bernoulli numbers show
up again! They are

1 1 1 1
B0 D 1; B1 D ; B2 D ; B3 D 0; B4 D ; B5 D 0; B6 D ; B7 D 0; : : :
2 6 30 42
Recurrence relation between Bernoulli numbers. Recall that we have met the Fibonacci numbers, and they are related to each other by a recurrence. So we now ask whether there exists a relation between the Bernoulli numbers. The answer is yes; that's why mathematics is super interesting. The way to derive this relation is also beautiful. From Eq. (4.17.1), we can write $x$ in terms of $e^x-1$ and $\sum_{n=0}^{\infty} B_n\frac{x^n}{n!}$:
$$x = (e^x-1)\sum_{n=0}^{\infty} B_n\frac{x^n}{n!} = \left(\frac{x}{1!} + \frac{x^2}{2!} + \cdots\right)\left(\sum_{n=0}^{\infty} B_n\frac{x^n}{n!}\right) = \left(\sum_{m=1}^{\infty}\frac{x^m}{m!}\right)\left(\sum_{n=0}^{\infty} B_n\frac{x^n}{n!}\right) = \left(\sum_{m=0}^{\infty}\frac{x^{m+1}}{(m+1)!}\right)\left(\sum_{n=0}^{\infty} B_n\frac{x^n}{n!}\right)$$
The last equality was to convert the lower limit of the summation $\sum_{m=1}^{\infty} x^m/m!$ from 1 to zero, so that we can apply the Cauchy product. Now, we use the Cauchy product of two seriesŽ to get
$$x = \sum_{n=0}^{\infty}\left(\sum_{k=0}^{n} B_k\frac{x^{n-k+1}}{(n-k+1)!}\frac{x^k}{k!}\right) = \sum_{n=0}^{\infty}\left(\sum_{k=0}^{n}\binom{n+1}{k} B_k\right)\frac{x^{n+1}}{(n+1)!}$$

Ž Refer to Eq. (7.12.2) for the derivation.


Replacing $n+1$ by $n$, with $n$ now starting from 1 instead of 0, we have
$$x = \sum_{n=1}^{\infty}\left(\sum_{k=0}^{n-1}\binom{n}{k} B_k\right)\frac{x^n}{n!}$$
Thus, we can conclude thatŽŽ
$$B_0 = 1, \qquad \sum_{k=0}^{n-1}\binom{n}{k} B_k = 0 \;\text{ for } n > 1 \tag{4.17.2}$$

Explicitly, we have
$$\begin{aligned}
1 &= B_0\\
0 &= B_0 + 2B_1\\
0 &= B_0 + 3B_1 + 3B_2\\
0 &= B_0 + 4B_1 + 6B_2 + 4B_3\\
0 &= B_0 + 5B_1 + 10B_2 + 10B_3 + 5B_4
\end{aligned}$$
You can see the Pascal triangle here!
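Eq. (4.17.2) also gives a practical way to generate the Bernoulli numbers: solve each identity for its last unknown. A small Julia sketch (exact rational arithmetic is my choice) that also checks the series Eq. (4.17.1) numerically:

# From Eq. (4.17.2): B_{n-1} = -(1/n) * sum_{k=0}^{n-2} binomial(n,k) * B_k
function bernoulli(m)
    B = zeros(Rational{BigInt}, m + 1)      # B[k+1] stores B_k
    B[1] = 1                                # B_0 = 1
    for n in 2:m+1
        B[n] = -sum(binomial(big(n), k) * B[k+1] for k in 0:n-2) // n
    end
    return B
end

B = bernoulli(10)
println(B[1:8])     # 1, -1/2, 1/6, 0, -1/30, 0, 1/42, 0

# check Eq. (4.17.1): x/(e^x - 1) versus sum_n B_n x^n / n!, here at x = 0.3
x = 0.3
println(x / (exp(x) - 1), "  ", sum(Float64(B[n+1]) * x^n / factorial(n) for n in 0:10))

Both printed values are approximately 0.85749, as expected.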

Cotangent and Bernoulli numbers. If we consider the function $g(x) = x/(e^x-1) - B_1 x$, we get (check Section 5.17 for details)
$$g(x) := \frac{x}{e^x-1} - B_1 x = \frac{x}{2}\,\frac{e^{x/2}+e^{-x/2}}{e^{x/2}-e^{-x/2}} = \sum_{n=0}^{\infty}\frac{B_{2n}}{(2n)!}x^{2n}$$
If we know the hyperbolic trigonometric functions (see Section 3.15), then it is not hard to see that the fraction above is $\coth(x/2)$; and thus we're led to
$$\frac{x}{2}\coth\frac{x}{2} = \sum_{n=0}^{\infty}\frac{B_{2n}}{(2n)!}x^{2n}$$
And to get from $\coth$ to $\cot$, just replace $x$ by $2ix$ (so that $ix\coth(ix) = x\cot x$), and we get the series for the cotangent function
$$\cot x = \sum_{n=0}^{\infty}(-1)^n\frac{2B_{2n}}{(2n)!}(2x)^{2n-1}$$
in terms of the Bernoulli numbers!


ŽŽ This is similar to: if $x = b_0x + (b_1+b_2)x^2$ for all $x$, then we must have $b_0 = 1$ and $b_1 + b_2 = 0$.

4.18 Euler-Maclaurin summation formula


We have discussed the connection between the sums of powers of integers and the Bernoulli numbers in Section 2.27. Recall that we defined $S_m(n)$ as the sum of the $m$th powers of the first $n$ positive integers:
$$S_m(n) := \sum_{k=1}^{n} k^m$$

Now, to simplify the notation, we simply write $S_m$ for $S_m(n)$. For later use, we list the first few sumsŽŽ:
$$\begin{aligned}
S_0 &= 1^0 + 2^0 + 3^0 + \cdots + n^0 = B_0 n\\
S_1 &= 1^1 + 2^1 + 3^1 + \cdots + n^1 = \tfrac{1}{2}\left(B_0 n^2 - 2B_1 n\right)\\
S_2 &= 1^2 + 2^2 + 3^2 + \cdots + n^2 = \tfrac{1}{3}\left(B_0 n^3 - 3B_1 n^2 + 3B_2 n\right)\\
S_3 &= 1^3 + 2^3 + 3^3 + \cdots + n^3 = \tfrac{1}{4}\left(B_0 n^4 - 4B_1 n^3 + 6B_2 n^2 + 4B_3 n\right)\\
S_4 &= 1^4 + 2^4 + 3^4 + \cdots + n^4 = \tfrac{1}{5}\left(B_0 n^5 - 5B_1 n^4 + 10B_2 n^3 + 10B_3 n^2 + 5B_4 n\right)
\end{aligned} \tag{4.18.1}$$
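These identities are easy to test numerically. A quick Julia check of the last one, with n = 10 (an arbitrary choice of mine) and the Bernoulli values listed in Section 4.17:

# 1^4 + ... + n^4 versus (1/5)(B0 n^5 - 5 B1 n^4 + 10 B2 n^3 + 10 B3 n^2 + 5 B4 n),
# using B0 = 1, B1 = -1/2, B2 = 1/6, B3 = 0, B4 = -1/30 (this book's convention).
n = 10
brute   = sum(k^4 for k in 1:n)
formula = (n^5 - 5*(-1//2)*n^4 + 10*(1//6)*n^3 + 10*0*n^2 + 5*(-1//30)*n) // 5
println(brute, "  ", formula)     # 25333  25333//1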
The Euler-Maclaurin summation formula involves the sum of a function $y = f(x)$ evaluated at integer values of $x$ from 1 to $n$. For example, consider $y = x^2$ and the sum
$$S := f(1) + f(2) + \cdots + f(n) = 1^2 + 2^2 + 3^2 + \cdots + n^2$$
which is nothing but the $S_2$ we're familiar with. Consider another function $y = x^2 + 3x + 2$ and the sum $S := f(1) + f(2) + \cdots + f(n)$, which is nothing but $S_2 + 3S_1 + 2S_0$. To conclude: for polynomials, $S$ can be written in terms of $S_0, S_1, \ldots$, and we know how to compute $S_0, S_1, \ldots$ using Eq. (4.18.1).
Moving on now to non-polynomial functions such as $\sin x$ or $e^x$. Thanks to Taylor, we can express these functions as power series, and we are back in the business of dealing with polynomials. For an arbitrary function $f(x)$--which is assumed to have a Taylor expansion--we can write
$$f(x) = c_0 + c_1 x + c_2 x^2 + \cdots$$
Thus, we can compute $S = \sum_{i=1}^{n} f(i)$ in the same manner as we did for polynomials, only this time we have an infinite sum:
$$S := \sum_{i=1}^{n} f(i) = c_0 S_0 + c_1 S_1 + c_2 S_2 + c_3 S_3 + \cdots$$

ŽŽ Two conventions are used in the literature regarding the Bernoulli numbers: in one convention $B_1 = 1/2$ and in the other $B_1 = -1/2$. In this book, I use $B_1 = -1/2$.

Substituting $S_0, S_1, \ldots$ from Eq. (4.18.1) into $S$, we obtain
$$\begin{aligned}
S &= c_0 B_0 n + \frac{c_1}{2}\left(B_0 n^2 - 2B_1 n\right) + \frac{c_2}{3}\left(B_0 n^3 - 3B_1 n^2 + 3B_2 n\right)\\
&\quad + \frac{c_3}{4}\left(B_0 n^4 - 4B_1 n^3 + 6B_2 n^2 + 4B_3 n\right) + \frac{c_4}{5}\left(B_0 n^5 - 5B_1 n^4 + 10B_2 n^3 + 10B_3 n^2 + 5B_4 n\right) + \cdots
\end{aligned}$$
Now, we need to massage $S$ a bit so that it tells us the hidden truth; we group the terms with $B_0, B_1, \ldots$:
$$S = B_0\left(c_0 n + \frac{c_1 n^2}{2} + \frac{c_2 n^3}{3} + \frac{c_3 n^4}{4} + \cdots\right) - B_1\left(c_1 n + c_2 n^2 + c_3 n^3 + c_4 n^4 + \cdots\right) + B_2\left(c_2 n + \frac{3}{2}c_3 n^2 + 2c_4 n^3 + \cdots\right) + B_3(\cdots) + \cdots$$
Now comes the magic: the $B_0$ group is the integral of $f(x)$||, the $B_1$ group is $f(n) - f(0)$, the $B_2$ group is $\frac{1}{2!}\left(f'(n) - f'(0)\right)$, the $B_3$ group is $\frac{1}{3!}\left(f''(n) - f''(0)\right)$, and so on. So we have
$$S = \int_0^n f(x)\,dx - B_1\left(f(n)-f(0)\right) + \frac{B_2}{2!}\left(f'(n)-f'(0)\right) + \frac{B_3}{3!}\left(f''(n)-f''(0)\right) + \cdots$$
Noting that the $B_{2n+1}$ are all zero except $B_1$, and $B_1 = -1/2$, we can rewrite the above equation as
$$\sum_{i=1}^{n} f(i) = \int_0^n f(x)\,dx + \frac{f(n)-f(0)}{2} + \sum_{k=1}^{\infty}\frac{B_{2k}}{(2k)!}\left(f^{(2k-1)}(n) - f^{(2k-1)}(0)\right)$$
Why can this formula be useful, when it replaces a finite sum by a definite integral (which we can do) plus an infinite sum? You will see that it is a powerful formula to compute sums, both infinite and finite. It was the powerful weapon that Euler used to compute $\sum_{k=1}^{\infty} 1/k^2$ in the Basel problem. But first, we need to polish our formula, because there is an asymmetry in it: on the LHS we start from 1, but on the RHS we start from 0. If we add $f(0)$ to both sides, we get a nicer formula:
$$\sum_{i=0}^{n} f(i) = \int_0^n f(x)\,dx + \frac{f(n)+f(0)}{2} + \sum_{k=1}^{\infty}\frac{B_{2k}}{(2k)!}\left(f^{(2k-1)}(n) - f^{(2k-1)}(0)\right)$$
Now, why start from 0? What if $f(0)$ is undefined (e.g. for $f(x) = 1/x^2$)? We can start from any value smaller than $n$. Let's consider $m < n$ and write the formula for two sums:
$$\begin{aligned}
\sum_{i=0}^{n} f(i) &= \int_0^n f(x)\,dx + \frac{f(n)+f(0)}{2} + \sum_{k=1}^{\infty}\frac{B_{2k}}{(2k)!}\left(f^{(2k-1)}(n) - f^{(2k-1)}(0)\right)\\
\sum_{i=0}^{m} f(i) &= \int_0^m f(x)\,dx + \frac{f(m)+f(0)}{2} + \sum_{k=1}^{\infty}\frac{B_{2k}}{(2k)!}\left(f^{(2k-1)}(m) - f^{(2k-1)}(0)\right)
\end{aligned}$$

|| $\int_0^n (c_0 + c_1 x + c_2 x^2 + \cdots)\,dx = \left(c_0 x + c_1 x^2/2 + c_2 x^3/3 + \cdots\right)\big|_0^n$.


Now, we subtract the second formula from the first one; we then have a formula which nearly starts from $m$ (note that on the LHS we start from $m+1$ because $f(m)$ was removed):
$$\sum_{i=m+1}^{n} f(i) = \int_m^n f(x)\,dx + \frac{f(n)-f(m)}{2} + \sum_{k=1}^{\infty}\frac{B_{2k}}{(2k)!}\left(f^{(2k-1)}(n) - f^{(2k-1)}(m)\right)$$
Using the same trick of adding $f(m)$ to both sides, we finally arrive at
$$\sum_{i=m}^{n} f(i) = \int_m^n f(x)\,dx + \frac{f(n)+f(m)}{2} + \sum_{k=1}^{\infty}\frac{B_{2k}}{(2k)!}\left(f^{(2k-1)}(n) - f^{(2k-1)}(m)\right) \tag{4.18.2}$$

And this is the Euler-Maclaurin summation formula, usually abbreviated as EMSF, about which D. Pengelley wrote of "the formula that dances between continuous and discrete". This is the form without the remainder term: in this form we do not say when to truncate the infinite seriesŽŽ.

Basel sum. Now we use the EMSF to compute the Basel sum, tracing the footsteps of the great Euler. We write the sum of the second powers of the reciprocals of the positive integers as
$$\sum_{k=1}^{\infty}\frac{1}{k^2} = \sum_{k=1}^{N-1}\frac{1}{k^2} + \sum_{k=N}^{\infty}\frac{1}{k^2} \tag{4.18.3}$$
The first sum has only a few terms, so we compute it explicitly (i.e., add it up term by term); for the second sum, we use the EMSF in Eq. (4.18.2). With $f(x) = 1/x^2$, the EMSF gives
$$\sum_{k=N}^{\infty}\frac{1}{k^2} = \frac{1}{N} + \frac{1}{2N^2} + \frac{1}{6N^3} - \frac{1}{30N^5} + \frac{1}{42N^7} - \cdots$$
For example, with $N = 10$ we have (keeping only four terms of the above series)
$$\sum_{k=1}^{\infty}\frac{1}{k^2} \approx \sum_{k=1}^{9}\frac{1}{k^2} + \frac{1}{N} + \frac{1}{2N^2} + \frac{1}{6N^3} - \frac{1}{30N^5}$$
An infinite sum was computed using only a sum of 13 terms! How about the accuracy? The exact value is $\pi^2/6 = 1.6449340668482264$, and the one based on the EMSF is $1.644934064499874$: an accuracy of eight decimals. If we did not know the EMSF, we would have had to add about one billion terms to get an accuracy of 8 decimals! Note that $\sum_{k=1}^{9} 1/k^2$ alone is only $1.539767731166540$.

A professor Emeritus in Mathematical Sciences at New Mexico State University.
ŽŽ
This derivation was based on Youtuber mathologer.
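Euler's trick is easy to replay on a computer. A minimal Julia sketch, with N = 10 and the four tail terms used in the text above:

# Head of the Basel sum computed directly, tail approximated with the EMSF.
N    = 10
head = sum(1 / k^2 for k in 1:N-1)
tail = 1/N + 1/(2N^2) + 1/(6N^3) - 1/(30N^5)
println(head + tail)     # 1.6449340644998742
println(pi^2 / 6)        # 1.6449340668482264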


4.19 Fourier series


We will discuss the origin of Fourier series in Sections 9.9 and 9.11, where differential equations are discussed. Herein, we briefly present what Fourier series are and how to determine the coefficients of such a series.

4.19.1 Periodic functions with period 2π

Before delving into Fourier series, we need some preparatory results (whose proofs are not discussed here):
$$\begin{aligned}
\int_{-\pi}^{\pi}\cos mx\,dx &= 0\\
\int_{-\pi}^{\pi}\sin nx\cos mx\,dx &= 0\\
\int_{-\pi}^{\pi}\cos nx\cos mx\,dx &= \begin{cases} 0 & m \neq n\\ \pi & m = n\end{cases}
\end{aligned} \tag{4.19.1}$$

We once wondered, in a boring calculus class, why we spent a significant amount of our youth computing seemingly useless integrals like the ones above. It is interesting to realize that these integrals play an important role in mathematics and hence in our lives.
Now, Fourier believed that it is possible to expand any periodic function $f(x)$ with period $2\pi$ as a trigonometric infinite series (as mentioned, refer to Sections 9.9 and 9.11 to see why Fourier came up with this idea; once the idea is there, the remaining steps are not hard, as even I can understand them):
$$f(x) = a_0 + (a_1\cos x + a_2\cos 2x + \cdots) + (b_1\sin x + b_2\sin 2x + \cdots) = a_0 + \sum_{n=1}^{\infty}\left(a_n\cos nx + b_n\sin nx\right) \tag{4.19.2}$$
We do not have $b_0$ because $\sin 0x = 0$. This trigonometric infinite series is called a Fourier series and the coefficients $a_n$, $b_n$ are called the Fourier coefficients. Our goal now is to determine these coefficients.
For $a_0$, we just integrate both sides of Eq. (4.19.2) from $-\pi$ to $\pi$ŽŽ, and we get:
$$\int_{-\pi}^{\pi} f(x)\,dx = \int_{-\pi}^{\pi} a_0\,dx + \sum_{n=1}^{\infty}\left(a_n\int_{-\pi}^{\pi}\cos nx\,dx + b_n\int_{-\pi}^{\pi}\sin nx\,dx\right) \tag{4.19.3}$$

ŽŽ The results do not change if we integrate from 0 to $2\pi$. In fact, if a function $y = f(x)$ is $T$-periodic, then
$$\int_a^{a+T} f(x)\,dx = \int_b^{b+T} f(x)\,dx$$
Draw a picture of this periodic function, note that an integral is an area, and you will see why this equation holds.


Now the "seemingly useless" integrals in Eq. (4.19.1) come into play: the integrals of $\cos nx$ and $\sin nx$ over a full period are all zero, so
$$\int_{-\pi}^{\pi} f(x)\,dx = 2\pi a_0 \Longrightarrow a_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\,dx \tag{4.19.4}$$
For $a_n$ with $n \geq 1$, we multiply Eq. (4.19.2) by $\cos mx$ and integrate both sides of the resulting equation. Doing so gives us:
$$\int_{-\pi}^{\pi} f(x)\cos mx\,dx = a_0\int_{-\pi}^{\pi}\cos mx\,dx + \sum_{n=1}^{\infty}\left(a_n\int_{-\pi}^{\pi}\cos mx\cos nx\,dx + b_n\int_{-\pi}^{\pi}\cos mx\sin nx\,dx\right)$$
Again, the integrals in Eq. (4.19.1) help us a lot here: the first and last integrals vanish. We're left with the term
$$\sum_{n=1}^{\infty} a_n\int_{-\pi}^{\pi}\cos mx\cos nx\,dx$$
As this integral is zero when $n \neq m$ and equal to $\pi$ when $n = m$, the above term equals $\pi a_m$. Thus,
$$\int_{-\pi}^{\pi} f(x)\cos mx\,dx = \pi a_m \Longrightarrow a_m = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos mx\,dx \tag{4.19.5}$$
Similarly, for $b_n$ we multiply Eq. (4.19.2) by $\sin mx$ and integrate both sides of the resulting equation. Doing so gives us:
$$b_m = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin mx\,dx \tag{4.19.6}$$
 
Example 1. As the first application of Fourier series, let's try the square wave function given by
$$f(x) = \begin{cases} 0 & \text{if } -\pi \leq x < 0\\ 1 & \text{if } 0 \leq x < \pi\end{cases}, \qquad f(x+2\pi) = f(x) \tag{4.19.7}$$
Square waves are often encountered in electronics and signal processing, particularly digital electronics and digital signal processing. Mathematicians call the function in Eq. (4.19.7) a piecewise continuous function. This is because the function consists of several pieces, each piece defined on a sub-interval. Within a sub-interval the function is continuous, but at the points between two neighboring sub-intervals there is a jump.
The determination of the Fourier coefficients for this function is quite straightforward:
$$\begin{aligned}
a_0 &= \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\,dx = \frac{1}{2\pi}\int_{0}^{\pi} dx = \frac{1}{2}\\
a_n &= \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos nx\,dx = \frac{1}{\pi}\int_{0}^{\pi}\cos nx\,dx = 0\\
b_n &= \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin nx\,dx = \frac{1}{\pi}\int_{0}^{\pi}\sin nx\,dx = \frac{1-\cos n\pi}{n\pi}
\end{aligned}$$

Note that $b_n$ is non-zero only for odd $n$; in that case $\cos n\pi = -1$ and $b_n = 2/(n\pi)$. Thus, the Fourier series of this square wave is:
$$f(x) = \frac{1}{2} + \frac{2}{\pi}\sin x + \frac{2}{3\pi}\sin 3x + \cdots = \frac{1}{2} + \sum_{n=1}^{\infty}\frac{2}{(2n-1)\pi}\sin(2n-1)x \tag{4.19.8}$$

Fig. 4.84 plots the square wave along with some of its partial Fourier sums with 1, 3, 5, 7, 11 and 15 terms. With more than 7 terms, a good approximation is obtained. Note that a Taylor series cannot do this!

[Figure 4.84 shows six panels comparing $f(x)$ with the partial sums $S_1$, $S_3$, $S_5$, $S_7$, $S_{11}$ and $S_{15}$ on $-\pi \leq x \leq \pi$.]

Figure 4.84: Representing a square wave function by a finite Fourier series $S_n = \frac{1}{2} + \frac{2}{\pi}\sin x + \cdots + \frac{2}{n\pi}\sin nx$ for $n = 2k-1$. Source: fourier-square-wave.jl.
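The partial sums plotted in Fig. 4.84 are easy to reproduce. A minimal sketch of what a script like fourier-square-wave.jl might compute (plotting omitted; the actual script may differ):

# S_n(x) = 1/2 + sum over odd m <= n of (2/(m*pi)) * sin(m*x), cf. Eq. (4.19.8)
S(x, n) = 0.5 + sum(2 / (m * pi) * sin(m * x) for m in 1:2:n)

for n in (1, 3, 7, 15)
    println("n = ", n, "   S(pi/2) = ", S(pi/2, n))   # approaches f(pi/2) = 1
end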

Let's have some fun with this new toy, and we will rediscover an old series. For $0 \leq x < \pi$, $f(x) = 1$, so we can write $1 = \frac{1}{2} + \frac{2}{\pi}\sin x + \frac{2}{3\pi}\sin 3x + \cdots$. Then, after a bit of algebra and finally choosing $x = \pi/2$, we see again the well known series for $\pi/4$:
$$\begin{aligned}
\frac{1}{2} &= \frac{2}{\pi}\sin x + \frac{2}{3\pi}\sin 3x + \cdots\\
\frac{\pi}{4} &= \sin x + \frac{1}{3}\sin 3x + \frac{1}{5}\sin 5x + \cdots\\
\frac{\pi}{4} &= 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots \quad\text{(evaluating the above equation at } x = \pi/2\text{)}
\end{aligned}$$

4.19.2 Functions with period 2L


Suppose we have a periodic function $f(x)$ with period $2L$ rather than $2\pi$. Our goal is to derive its Fourier series. We could do the same thing that we did in the previous section, but we do not want to repeat all that; instead we use a change of variable $t = \pi x/L$:
$$f(x) = f\!\left(\frac{Lt}{\pi}\right) = g(t)$$
As $g(t)$ is a $2\pi$-periodic function, we have $g(t) = a_0 + \sum_{n=1}^{\infty}(a_n\cos nt + b_n\sin nt)$ with the Fourier coefficients given by
$$a_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi} g(t)\,dt, \qquad a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} g(t)\cos nt\,dt, \qquad b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} g(t)\sin nt\,dt$$
With $t = \pi x/L$, we have $dt = (\pi/L)\,dx$ and thus:
$$\begin{aligned}
f(x) &= a_0 + \sum_{n=1}^{\infty}\left(a_n\cos\frac{n\pi x}{L} + b_n\sin\frac{n\pi x}{L}\right)\\
a_0 &= \frac{1}{2L}\int_{-L}^{L} f(x)\,dx\\
a_n &= \frac{1}{L}\int_{-L}^{L} f(x)\cos\frac{n\pi x}{L}\,dx\\
b_n &= \frac{1}{L}\int_{-L}^{L} f(x)\sin\frac{n\pi x}{L}\,dx
\end{aligned} \tag{4.19.9}$$

Example 2. In this example, we consider a triangular wave defined by
$$f(x) = |x|, \quad -1 \leq x \leq 1, \qquad f(x+2) = f(x) \tag{4.19.10}$$
The determination of the Fourier coefficients for this function (here $L = 1$) is also straightforward:
$$\begin{aligned}
a_0 &= \frac{1}{2}\int_{-1}^{1}|x|\,dx = \int_{0}^{1} x\,dx = \frac{1}{2}\\
a_n &= \int_{-1}^{1}|x|\cos n\pi x\,dx = 2\int_{0}^{1} x\cos n\pi x\,dx = \frac{2}{n^2\pi^2}\left(\cos n\pi - 1\right)\\
b_n &= \int_{-1}^{1}|x|\sin n\pi x\,dx = 0 \quad (|x|\sin n\pi x \text{ is an odd function})
\end{aligned}$$

Of course, we have used integration by parts to compute $a_n$. Note that $a_n$ is non-zero only for odd $n$; in that case $\cos n\pi = -1$ and $a_n = -4/(n^2\pi^2)$. Thus, the Fourier series of this triangular wave is:
$$f(x) = \frac{1}{2} - \frac{4}{\pi^2}\cos\pi x - \frac{4}{9\pi^2}\cos 3\pi x - \cdots = \frac{1}{2} - \sum_{n=1}^{\infty}\frac{4}{(2n-1)^2\pi^2}\cos(2n-1)\pi x \tag{4.19.11}$$
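The coefficients can be double checked numerically with Eq. (4.19.9) (here L = 1). A rough Julia sketch, with a Riemann sum standing in for the exact integral (the number of sample points is an arbitrary choice of mine):

# a_n = integral of |x| cos(n*pi*x) over [-1,1], compared with 2(cos(n*pi)-1)/(n^2*pi^2)
function a(n; N = 200_000)
    dx = 2.0 / N
    xs = range(-1 + dx/2, 1 - dx/2, length = N)
    return sum(abs(x) * cos(n * pi * x) for x in xs) * dx
end

println(a(1), "  ", -4 / pi^2)          # both ≈ -0.4053
println(a(2))                           # ≈ 0
println(a(3), "  ", -4 / (9 * pi^2))    # both ≈ -0.0450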

A plot of some Fourier series of this function is given in Fig. 4.85. Only four terms and we obtain a very good approximation.

[Figure 4.85 shows three panels comparing $f(x)$ with the partial sums $S_1$, $S_3$ and $S_5$ on $-4 \leq x \leq 4$.]

Figure 4.85: Representing a triangular wave function by a finite Fourier series $S_n = \frac{1}{2} - \frac{4}{\pi^2}\cos\pi x - \cdots - \frac{4}{n^2\pi^2}\cos n\pi x$ for $n = 2k-1$.

Similarly to Example 1, we can also get a nice series related to $\pi$ by considering $f(x)$ and its Fourier series at $x = 0$:
$$f(0) = 0 = \frac{1}{2} - \frac{4}{\pi^2} - \frac{4}{9\pi^2} - \frac{4}{25\pi^2} - \cdots \Longrightarrow \frac{\pi^2}{8} = \frac{1}{1} + \frac{1}{9} + \frac{1}{25} + \cdots$$
Now, what is important to consider is the difference between the Fourier series for the square wave and for the triangular wave. I put the two series side by side now:
$$\begin{aligned}
\text{square wave:}\quad & f(x) = \frac{1}{2} + \sum_{n=1}^{\infty}\frac{2}{(2n-1)\pi}\sin(2n-1)x\\
\text{triangular wave:}\quad & f(x) = \frac{1}{2} - \sum_{n=1}^{\infty}\frac{4}{(2n-1)^2\pi^2}\cos(2n-1)\pi x
\end{aligned}$$


Now we can see why we need less terms in the Fourier series to represent the triangular wave
than the square wave. The difference lies in the red number. The terms in the triangular series
approach zero faster than the terms in the square series. And by looking at the shape of these
waves, it is obvious that smoother waves (the square wave has discontinuities) are easier for
Fourier series to converge.

4.19.3 Complex form of Fourier series


Herein we derive the complex form of the Fourier series. The idea is to take Eq. (4.19.2) and replace $\cos nx$ and $\sin nx$ with complex exponentials using Euler's formula:
$$\cos nx = \frac{e^{inx} + e^{-inx}}{2}, \qquad \sin nx = \frac{e^{inx} - e^{-inx}}{2i}$$
Doing so gives us ($d$ is simply $a_0$):
$$f(x) = d + \sum_{n=1}^{\infty}\left(a_n\frac{e^{inx}+e^{-inx}}{2} + b_n\frac{e^{inx}-e^{-inx}}{2i}\right) = d + \sum_{n=1}^{\infty}\left(\frac{a_n - ib_n}{2}e^{inx} + \frac{a_n + ib_n}{2}e^{-inx}\right) \tag{4.19.12}$$
which can be written as
$$f(x) = \sum_{n=-\infty}^{\infty} c_n e^{inx}$$
where the coefficients $c_n$ are given by
$$c_n = \begin{cases} d, & \text{if } n = 0\\ \dfrac{a_n - ib_n}{2}, & \text{if } n = 1, 2, 3, \ldots\\ \dfrac{a_{-n} + ib_{-n}}{2}, & \text{if } n = -1, -2, -3, \ldots\end{cases} \tag{4.19.13}$$
We need a formula for $c_n$ and we're done. Recall that
$$d = a_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\,dx, \qquad a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos nx\,dx, \qquad b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin nx\,dx$$
Thus, for $n > 0$, $c_n$ is written as
$$c_n := \frac{a_n - ib_n}{2} = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\left(\cos nx - i\sin nx\right)dx = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)e^{-inx}\,dx$$
And this formula holds for $n \leq 0$ as well.

We now have the complex form of the Fourier series, written here for any periodic function of period $2L$:
$$f(x) = \sum_{n=-\infty}^{\infty} c_n e^{in\pi x/L}, \qquad c_n = \frac{1}{2L}\int_{-L}^{L} f(x)e^{-in\pi x/L}\,dx \tag{4.19.14}$$


Having another way to look at Fourier series is itself something significant. Still, we can see the benefits of the complex form: instead of having $a_0$, $a_n$ and $b_n$ and the sines and cosines, now we just have $c_n$ and the complex exponential.
We have more, a lot more, to say about Fourier series, e.g. Fourier transforms, the discrete Fourier transform, fast Fourier transforms, etc. (Section 9.12). We still do not know the meanings of the $a$'s and $b$'s (or the $c_n$). We do not know which functions can have a Fourier series. To answer these questions, we need more maths, such as linear algebra. I have introduced Fourier series this early for two reasons. First, we learned about Taylor series (which allow us to represent a function by a power series); now we have something similar: Fourier series, where a function is represented by a trigonometric series. Second, something like the identity $\int_{-\pi}^{\pi}\sin nx\cos mx\,dx = 0$ looks useless, but it is not.
About Fourier's idea of expressing a function as a trigonometric series, the German mathematician Bernhard Riemann once said:

Nearly fifty years has passed without any progress on the question of analytic repre-
sentation of an arbitrary function, when an assertion of Fourier threw new light on
the subject. Thus a new era began for the development of this part of Mathematics
and this was heralded in a stunning way by major developments in mathematical
physics.

4.20 Special functions


4.20.1 Elementary functions
Before presenting special functions, we need to define the non-special or elementary functions.
Those are the familiar functions that we know how to differentiate and integrate:

(a) Powers of $x$: $x, x^2, x^3$, etc.

(b) Roots of $x$: $\sqrt{x}, \sqrt[3]{x}, x^{1/5}$, etc.

(c) Exponentials: $e^x$.

(d) Logarithms: $\log x$.

(e) Trigonometric functions: $\sin x, \cos x, \tan x$, etc.

(f) Inverse trigonometric functions: $\arcsin x, \arccos x, \arctan x$, etc.

(g) Composite functions of the previous six types: $\log(\sin x)$, $\cos^2 x$, etc.

(h) All functions obtained by adding, subtracting, multiplying or dividing any of the above seven types a finite number of times. An example is:
$$\frac{x^2 + \sin x}{3\log x}$$


4.20.2 Factorial of 1/2 and the Gamma function


In Section 2.6.1, we considered the sum of the first $n$ counting numbers, e.g. $1+2+3$, and encountered the so-called triangular numbers $1, 3, 6, 10, 15, \ldots$ It is possible to have a formula for the sequence of triangular numbers: $T(n) = n(n+1)/2$. Because we have a formula, we can compute $T(5/2)$, whatever it means. In other words, we can interpolate in between the triangular numbers. However, for the factorials $1, 2, 6, 24, 120, \ldots$ there is no such formula, and thus we cannot interpolate in between the factorials of the natural numbers. This was the problem proposed by Goldbach in 1720. In the quest for such a formula for $n!$, mathematicians like Euler, Daniel Bernoulli and Gauss invented the gamma function. As can be seen from the figure [a small margin figure shows a smooth curve through the points $(n, n!)$ for $1 \leq n \leq 4$], it is possible to draw a curve passing through $(n, n!)$ for $n = 0, 1, 2, \ldots$ Thus, theoretically there should exist at least one function $f(x)$ such that $f(n) = n!$ for $n \in \mathbb{N}$.
Recall that in Eq. (4.7.19) we obtained the following result (as an exercise of integral calculus, if we bypass the motivation of the function $x^4 e^{-x}$):
$$\int_0^{\infty} x^4 e^{-x}\,dx = 4!$$

And from that we get (changing the dummy variable from $x$ to $t$)
$$n! = \int_0^{\infty} t^n e^{-t}\,dt$$
The Gamma function (the notation $\Gamma$ is due to Legendre) is defined as
$$\Gamma(x) := \int_0^{\infty} t^{x-1} e^{-t}\,dt \tag{4.20.1}$$

Therefore,
$$\Gamma(x) = (x-1)! \tag{4.20.2}$$
And with this integral definition of the factorial, we are no longer limited to factorials of natural numbers. Indeed, we can compute $(0.5)!$ asŽŽ
$$\left(\frac{1}{2}\right)! = \Gamma\!\left(\frac{3}{2}\right) = \int_0^{\infty} t^{1/2} e^{-t}\,dt = \frac{\sqrt{\pi}}{2} \tag{4.20.3}$$
$$\left(-\frac{1}{2}\right)! = \Gamma\!\left(\frac{1}{2}\right) = \int_0^{\infty} t^{-1/2} e^{-t}\,dt = \sqrt{\pi} \tag{4.20.4}$$
ŽŽ For the final integral, use the change of variable $u = t^{1/2}$ and we get a new integral $2\int u^2 e^{-u^2}\,du$.
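The integral definition is easy to probe numerically. A rough Julia sketch (the substitution $t = u^2$, the cutoff U and the step count are my choices, made to tame the singular integrand at $t = 0$):

# Gamma(x) = integral of t^{x-1} e^{-t} over (0, inf) = integral of 2 u^{2x-1} e^{-u^2} du
function gamma_int(x; U = 40.0, N = 400_000)
    du = U / N
    us = range(du/2, U - du/2, length = N)
    return sum(2 * u^(2x - 1) * exp(-u^2) for u in us) * du
end

println(gamma_int(0.5), "  ", sqrt(pi))   # both ≈ 1.7724539  (Gamma(1/2) = sqrt(pi))
println(gamma_int(5.0))                   # ≈ 24.0            (Gamma(5) = 4!)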


4.20.3 Zeta function


Recall that in Section 2.21 we met the harmonic series and the Basel problem. Both are given here:
$$\begin{aligned}
S &= 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots = \sum_{k=1}^{\infty}\frac{1}{k^1} = \infty\\
S &= 1 + \frac{1}{4} + \frac{1}{9} + \frac{1}{16} + \cdots = \sum_{k=1}^{\infty}\frac{1}{k^2} = \frac{\pi^2}{6}
\end{aligned} \tag{4.20.5}$$
Obviously, these sums are special cases of the following
$$S(p) = \sum_{k=1}^{\infty}\frac{1}{k^p}, \quad p \in \mathbb{N}$$
which can be seen as the sum of the integral powers of the reciprocals of the natural numbers. Mathematicians go further and allow the exponent to be any complex number, which leads to the famous zeta function:
$$\zeta(z) := \sum_{k=1}^{\infty}\frac{1}{k^z}, \quad z \in \mathbb{C}$$

4.21 Review
It was a long chapter. This is no surprise, for we have covered mathematics developed during a time span of about 200 years. But, as is always the case, try not to lose the forest for the trees. The core of calculus is simple, and I am trying to summarize that core now. Understand that and the rest will follow quite naturally (except the rigorous foundation--that's super hard).

• The calculus is the mathematics of change: it provides us notions and symbols and methods to talk about changes precisely;
• What is better than motion as an example of change? For motion, we need three notions: (1) position $x(t)$, to quantify where an object is at a particular time; (2) velocity $v(t)$, to quantify how fast the object is moving; and (3) acceleration $a(t)$, to quantify how fast the object changes its speed.
• Going from (1) to (2) to (3) is called "taking the derivative": the derivative gives us the way to quantify a time rate of change. For the velocity, it is the rate of change of the position per unit time. That's why we have the symbols $dx$, $dt$ and $dx/dt$;
• Going from (3) to (2) to (1) is called "taking the integral": $x(t) = \int_0^t v\,dt$. Knowing the speed $v(t)$, consider a very small time interval $dt$ during which the distance the object has traveled is $v(t)\,dt$; finally, adding up all those tiny distances, we get the total distance $x(t)$;


• So, the calculus is the study of the derivative and the integral. But they are not two independent things; they are the inverse of each other, like negative/positive numbers, men/women, war/peace and so on;
• When we studied the counting numbers we discovered many rules (e.g. odd + odd = even). The same pattern is observed here: the new toys of mathematicians--the derivative and the integral--have their own rules. For example, the derivative of a sum is the sum of the derivatives. Thanks to this rule, we know how to determine the derivative of $x^{10} + x^5 + 23x^3$, for example, for we know how to differentiate each term.
• Calculus does to algebra what algebra does to arithmetic. Arithmetic is about manipulating numbers (addition, multiplication, etc.). Algebra finds patterns between numbers, e.g. $a^2 - b^2 = (a-b)(a+b)$. Calculus finds patterns between varying quantities;
• Historically, Fermat used the derivative in his calculations without knowing it. Later, Newton and Leibniz discovered it. Many other mathematicians, such as Brook Taylor, Euler and Lagrange, developed and characterized it. And only at the end of this long period of development, which spans about two hundred years, did Cauchy and Weierstrass define it.
• Confined to the real numbers, the foundation of the calculus is the concept of limit. This is so because with limits mathematicians can prove all the theorems in calculus rigorously. That branch of mathematics is called analysis. This branch focuses not on the computational aspects of the calculus (e.g. how to evaluate an integral or how to differentiate a function); instead it focuses on why calculus works.

In the beginning of this chapter, I quoted Richard Feynman saying that "Calculus is the language God talks", and Steven Strogatz writing 'Without calculus, we wouldn't have cell phones, computers, or microwave ovens. We wouldn't have radio. Or television. Or ultrasound for expectant mothers, or GPS for lost travelers. We wouldn't have split the atom, unraveled the human genome, or put astronauts on the moon.' But for that we need to learn multivariable calculus and vector calculus (Chapter 7)--the generalizations of the calculus discussed in this chapter--and differential equations (Chapter 9). This is obvious: our world is three dimensional and the things we want to understand depend on many other things; thus $f(x)$ is not sufficient. But the idea of multivariable calculus and vector calculus is still the mathematics of changes: a small change in one thing leads to a small change in another thing.
Consider a particle of mass $m$ moving under the influence of a force $F$; Newton gave us the equation $m\,d^2x/dt^2 = F$, which, in conjunction with data about the position of the particle at $t = 0$, can pinpoint exactly the position of the particle at any time $t$. This is probably the first differential equation--those equations that involve derivatives--ever. This is the equation that put men on the Moon.
Leaving behind the little bits $dx$, $dy$ and the sum $\int$, our next destination in the mathematical world is a place called probability. Let's go there to see dice, roulette, lotteries--games of chance--to see how mathematicians develop mathematics to describe random events, and how they can see through the randomness to reveal its secrets.



Chapter 5
Probability

Contents
5.1 A brief history of probability . . . . . . . . . . . . . . . . . . . . . . . . . 517
5.2 Classical probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 518
5.3 Empirical probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 520
5.4 Buffon’s needle problem and Monte Carlo simulations . . . . . . . . . . 521
5.5 A review of set theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 524
5.6 Random experiments, sample space and event . . . . . . . . . . . . . . . 529
5.7 Probability and its axioms . . . . . . . . . . . . . . . . . . . . . . . . . . 530
5.8 Conditional probabilities . . . . . . . . . . . . . . . . . . . . . . . . . . . 534
5.9 The secretary problem or dating mathematically . . . . . . . . . . . . . 550
5.10 Discrete probability models . . . . . . . . . . . . . . . . . . . . . . . . . 553
5.11 Continuous probability models . . . . . . . . . . . . . . . . . . . . . . . 582
5.12 Joint discrete distributions . . . . . . . . . . . . . . . . . . . . . . . . . . 588
5.13 Joint continuous variables . . . . . . . . . . . . . . . . . . . . . . . . . . 598
5.14 Transforming density functions . . . . . . . . . . . . . . . . . . . . . . . 598
5.15 Inequalities in the theory of probability . . . . . . . . . . . . . . . . . . . 599
5.16 Limit theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 601
5.17 Generating functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 605
5.18 Multivariate normal distribution . . . . . . . . . . . . . . . . . . . . . . 614
5.19 Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 622

Games of chance are common in our world–including lotteries, roulette, slot machines and
card games. Thus, it is important to know a bit about the mathematics behind them which is
known as probability theory.


Gambling led Cardano–our Italian friend whom we met in the discussion on cubic equations–
to the study of probability, and he was the first writer to recognize that random events are
governed by mathematical laws. Published posthumously in 1663, Cardano’s Liber de ludo
aleae (Book on Games of Chance) is often considered the major starting point of the study of
mathematical probability.
Since then the theory of probability has become a useful tool in many problems. For ex-
ample, meteorologists use weather patterns to predict the probability of rain. In epidemiology,
probability theory is used to understand the relationship between exposures and the risk of health
effects. Another application of probability is with car insurance. Companies base your insurance
premiums on your probability of having a car accident. To do this, they use information on the
frequency of having a car accident by gender, age, type of car and number of kilometres driven
each year to estimate an individual person’s probability (or risk) of a motor vehicle accident.
Indeed probability is so useful that the famous French mathematician and astronomer (known
as the “Newton of France”) Pierre-Simon Marquis de Laplace once wrote:

We see that the theory of probability is at bottom only common sense reduced to
calculation; it makes us appreciate with exactitude what reasonable minds feel
by a sort of instinct, often without being able to account for it....It is remarkable
that this science, which originated in the consideration of games of chance, should
have become the most important object of human knowledge... The most important
questions of life are, for the most part, really only problems of probability.

This chapter is an introduction to probability and statistics. It was written based on the
following excellent books:

 The Unfinished game: Pascal, Fermat and the letters by Keith Devlin‘ [17]

 Introduction to Probability, Statistics, and Random Processes by Hossein Pishro-Nik

 A first course in Probability by Sheldon Ross|| [59]

The history of statistics: the measurement of uncertainty before 1900 by Stephen StiglerŽŽ
[64].

Keith J. Devlin (born 16 March 1947) is a British mathematician and popular science writer. His current
research is mainly focused on the use of different media to teach mathematics to different audiences.

The book is freely available at https://www.probabilitycourse.com/.
||
Sheldon Ross (April 30, 1943) is the Daniel J. Epstein Chair and Professor at the USC Viterbi School of
Engineering. He is the author of several books in the field of probability. In 1978, he formulated what became
known as Ross’s conjecture in queuing theory, which was solved three years later by Tomasz Rolski at Poland’s
Wroclaw University.
ŽŽ
Stephen Mack Stigler (born August 10, 1941) is Ernest DeWitt Burton Distinguished Service Professor at the
Department of Statistics of the University of Chicago. He has authored several books on the history of statistics.
Stigler is also known for Stigler’s law of eponymy which states that no scientific discovery is named after its original
discoverer (whose first formulation he credits to sociologist Robert K. Merton).


 A History of Probability and Statistics and Their Applications before 1750, by Anders
Hald [27]

I did not like gambling and did not pay attention to probability. I performed badly in high school and university when it came to classes on probability; I actually failed the unit. But I do have company. In 2012, 97 Members of Parliament in London were asked: 'If you spin a coin twice, what is the probability of getting two heads?' The majority, 60 out of 97, could not give the correct answer.
I did not plan to re-learn probability, but then the Covid pandemic came. People were talking about the probability of getting Covid and so on, and I wanted to understand what they meant. Furthermore, probability is used in machine learning, data science and many other science and engineering fields. Therefore, I decided to study the theory of probability again. I do not have to be scared, as this time I will not have to take any exam on probability! I can just have fun.

5.1 A brief history of probability


This brief historical account is taken from Calculus, Volume II
by Tom Apostol (2nd edition, John Wiley & Sons 1969).
A gambler’s dispute in 1654 led to the creation of a mathemat-
ical theory of probability by two famous French mathematicians,
Blaise Pascal and Pierre de Fermat. Antoine Gombaud, Cheva-
lier de Méré, a French nobleman with an interest in gaming and
gambling questions, called Pascal’s attention to an apparent con-
tradiction concerning a popular dice game. The game consisted
in throwing a pair of dice 24 times; the problem was to decide
whether or not to bet even money on the occurrence of at least one "double six" during the 24
throws. A seemingly well-established gambling rule led de Méré to believe that betting on a
double six in 24 throws would be profitable, but his own calculations indicated just the opposite|| .
This problem and others posed by de Méré led to an exchange of letters between Pascal
and Fermat in which the fundamental principles of probability theory were formulated for the
first time. Although a few special problems on games of chance had been solved by some
Italian mathematicians in the 15th and 16th centuries, no general theory was developed before
this famous correspondence. The correspondence between Pascal and Fermat was told in the
interesting book The Unfinished game: Pascal, Fermat and the letters by Keith Devlin.
The Dutch scientist Christian Huygens, a teacher of Leibniz who co-invented calculus with
Newton, learned of this correspondence and shortly thereafter published the first book on proba-
bility in 1657; entitled De Ratiociniis in Ludo Aleae or The Value of all Chances in Games of
Fortune, it was a treatise on problems associated with gambling. Because of the inherent appeal

Anders Hjorth Hald (1913 – 2007) was a Danish statistician. He was a professor at the University of Copen-
hagen from 1960 to 1982. While a professor, he did research in industrial quality control and other areas, and also
authored textbooks. After retirement, he made important contributions to the history of statistics.
|| The probability of getting at least one double six in 24 throws is $1 - (35/36)^{24} = 0.4914$, which is smaller than 0.5.


of games of chance, probability theory soon became popular, and the subject developed rapidly
during the 18th century. The major contributors during this period were Jakob Bernoulli with
Ars Conjectandi in 1713, and Abraham de Moivre with his classic The Doctrine of Chances in
1718.
In 1812 Pierre de Laplace (1749-1827) introduced a host of new ideas and mathematical
techniques in his book, Théorie Analytique des Probabilités or Analytical Theory of Probabil-
ity. Before Laplace, probability theory was solely concerned with developing a mathematical
analysis of games of chance. Laplace applied probabilistic ideas to many scientific and practical
problems. The theory of errors, actuarial mathematics, and statistical mechanics are examples
of some of the important applications of probability theory developed in the l9th century.
Like so many other branches of mathematics, the development of probability theory has been
stimulated by the variety of its applications. Conversely, each advance in the theory has enlarged
the scope of its influence. Mathematical statistics is one important branch of applied probability;
other applications occur in such widely different fields as genetics, psychology, economics, and
engineering. Many workers have contributed to the theory since Laplace’s time; among the most
important are Chebyshev, Markov, von Mises, and Kolmogorov.
One of the difficulties in developing a mathematical theory of probability has been to arrive
at a definition of probability that is precise enough for use in mathematics, yet comprehensive
enough to be applicable to a wide range of phenomena. The search for a widely acceptable
definition took nearly three centuries and was marked by much controversy. The matter was
finally resolved in the 20th century by treating probability theory on an axiomatic basis. In 1933
the Russian mathematician A. Kolmogorov outlined an axiomatic approach that forms the basis
for the modern theoryŽŽ . Since then the ideas have been refined somewhat and probability theory
is now part of a more general discipline known as measure theory.

5.2 Classical probability


Probability is a mathematical theory that helps us to quantify random events or experiments.
A random event or experiment is the one whose outcomes we can’t predict with certainty. For
instance, if we flip a coin, we’re not sure whether we get a head or a tail.
Although we cannot tell in advance whether a head or a tail will show up, we are, however, able to tell how likely a head (or a tail) is to occur. Here is how. The possible outcomes of this coin tossing experiment are either a head (H) or a tail (T): two possible outcomes. And out of these two outcomes, we have one chance to get H; thus the probability that we get an H is $1/2 = 0.5$. With equal probabilities, we say that getting an H and getting a T are equally likely to occur.
Now suppose that we throw two coins simultaneously. What are the possible outcomes? They are $\{(H,H), (H,T), (T,H), (T,T)\}$. In the first case we get heads twice, in the last case tails twice, whereas the two intermediate cases lead to the same result since it does not matter to us on which coin heads or tails appears. Thus we say that the chance of getting heads twice is
ŽŽ Kolmogorov's monograph is available in English translation as Foundations of Probability Theory, Chelsea, New York, 1950.


1 out of 4 or 1=4, the chance of getting tails twice is also 1=4, whereas the chance of one head
and one tail is 2 out of 4 or 2=4 D 1=2.
Similar to other mathematical objects, there exist rules that probability obeys. It is interesting,
isn’t it? There are regularities behind random events! Here are some. Tossing a fair coin and
the probability of getting a head is 0.5, for brevity we write P .H / D 0:5; where P stands for
probability and the notation P .H / borrows the concept of functions f .x/ that we’re familiar
with from calculus. And we also have P .T / D 0:5. Obviously, P .H / C P .T / D 1. What
does that mean? It indicates that it is 100% sure that either we get a head or a tail. Thus, unity
in the theory of probability means a certainty. For the experiment of tossing two coins, again
1=4 C 1=2 C 1=4 D 1, meaning that we are certain to get one of the 3 possible combinations
(two Hs, two Ts, one H/one T).
Let’s do a more interesting experiment of tossing a coin three times. The possible outcomes
and the probability of some scenarios (called events in the theory of probability) are shown in
Table 5.1.

Table 5.1: Tossing a coin three times: all possible outcomes.

1st toss:   H   H   H   H   T   T   T   T
2nd toss:   H   H   T   T   H   H   T   T
3rd toss:   H   T   H   T   H   T   H   T
category:   I   II  II  III II  III III IV

P(I) = 1/8,  P(II) = 3/8,  P(III) = 3/8,  P(IV) = 1/8

From the three experiments we have discussed, we can see that:

toss 1 coin 1 time:  $P(H) = 1/2$
toss 1 coin 2 times: $P(2H) = 1/4$
toss 1 coin 3 times: $P(3H) = 1/8$

which indicates that the probability of getting heads twice is equal to the product of the probabilities of getting it separately in the first and in the second tossing; in fact $1/4 = (1/2)(1/2)$. Similarly, the probability of getting three heads in succession is the product of the probabilities of getting it separately in each tossing ($1/8 = (1/2)(1/2)(1/2)$). Thus if somebody asks you what is the chance of getting a head each time in ten tossings, you can easily give the answer: $(1/2)^{10}$. The result is 0.00098, indicating that the chance is very low: about one chance out of a thousand! Here we have the rule of multiplication of probabilities, which states that if you want several different things, you may determine the probability of getting them all by multiplying the probabilities of getting the individual ones. And that chance is usually low, as we're asking too much! I am joking; the chance is low because we multiply numbers smaller than one.
If there is a rule of multiplication, there should be another rule: the rule of addition of
probabilities. This rule states that if we want only one of several things (no matter which one),


the probability of getting it is the sum of the probabilities of getting the individual items on our list. For example, when flipping a coin twice, the outcomes are $\{(H,H), (H,T), (T,H), (T,T)\}$. Thus, the chance to get at least one head is $3/4$, which is equal to $1/4 + 1/4 + 1/4$. Note that to get at least one head, we need either $(H,H)$ or $(H,T)$ or $(T,H)$, each of which has a probability of 1/4.
What we have discussed is known as classical probability or theoretical probability. It started
with the work of Cardano. The classical theory of probability has the advantage that it is con-
ceptually simple for many situations. However, it is limited, since many situations do not have
finitely many equally likely outcomes. Tossing a weighted die is an example where we have
finitely many outcomes, but they are not equally likely. Studying people’s incomes over time
would be a situation where we need to consider infinitely many possible outcomes, since there
is no way to say what a maximum possible income would be, especially if we are interested in
the future.

5.3 Empirical probability


What do we actually mean when we say that the chance of getting a head when we toss a coin is 1/2? What we mean is that if we toss a coin 1000 times we would expect to get a head about 500 times. We could do the real experiment of tossing a coin 1000 times, but instead we ask our computer to do it for us. See Listing A.15 for the Julia code for this virtual experiment, in which we flip a coin $n$ times and count the number of times that a head shows up, labeled $n(H)$. Then we define the probability of getting a head as its relative frequency of occurrence, mathematically $n(H)/n$. The result, given in Table 5.2, tells us many things. One thing is that when $n$ is large the probability is indeed about 0.5, and this leads to the following definition of the probability of an event $E$:
$$P(E) := \lim_{n\to\infty}\frac{n(E)}{n} \tag{5.3.1}$$
As we have to carry out experiments, this theory of probability is called empirical probability. Moreover, as the probability is defined based on the relative frequency of the event, this theory of probability is also referred to as frequentist probability.

Table 5.2: Virtual experiment of tossing a coin n times.

n        n(H)     P = n(H)/n

10       6        0.6
100      48       0.48
1 000    492      0.492
2 000    984      0.492
10 000   5041     0.5041
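Listing A.15 itself is not reproduced here, but a minimal sketch of such a virtual experiment could look like this (the function name and the sample sizes are my choices; the book's listing may differ):

# Flip a fair coin n times and report the relative frequency of heads, n(H)/n.
flip(n) = count(_ -> rand() < 0.5, 1:n) / n

for n in (10, 100, 1_000, 10_000, 100_000)
    println(n, "   ", flip(n))     # tends towards 0.5 as n grows
end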


Limitations of the empirical probability. The limitation of this theory of probability lies in Eq. (5.3.1). How do we know that $n(E)/n$ will converge to some constant limiting value that will be the same for each possible sequence of repetitions of the experiment? Table 5.2 clearly indicates that the quantity $n(E)/n$ is actually oscillating.
There is a need for another theory of probability. Axiomatic probability is such a theory; it unifies the different probability theories. Similar to Euclidean geometry, axiomatic probability starts with three axioms called Kolmogorov's Three Axioms, named after the Soviet mathematician Andrey Nikolaevich Kolmogorov (1903 – 1987).

5.4 Buffon’s needle problem and Monte Carlo simulations


5.4.1 Buffon’s needle problem
In 1777 Buffon posed (and solved) this problem: Let a needle of
length l be thrown at random onto a horizontal plane ruled with
parallel straight lines separated by a distance t which is greater
than l. What is the probability that the needle will intersect one
of these lines?
We use trigonometry and calculus to solve this problem. First, it is sufficient to consider just two linesŽ. To locate the needle we just need two variables: its center $O$ and its orientation $\theta$ (Fig. 5.1). To specify $O$ we use $d$--the distance from $O$ to the nearest line--so $0 \leq d \leq t/2$. For the orientation, $0 \leq \theta \leq \pi/2$. Now, we can specify when the needle cuts a line:
$$d \leq \frac{l}{2}\sin\theta$$

Figure 5.1: Buffon’s needle problem.

Now, we plot the function $d = \frac{l}{2}\sin\theta$ on the $\theta$-$d$ plane. The cut condition corresponds to the shaded region in Fig. 5.1. The probability that the needle will intersect one of these lines
Georges-Louis Leclerc, Comte de Buffon (1707 – 1788) was a French naturalist, mathematician, cosmologist,
and encyclopédiste.
Ž
This is the most important step; without it we cannot proceed further. Why 2? Because 1 line is not enough
and there are infinitely many lines in the problem, then 2 is sufficient.


is then:
$$P = \frac{\displaystyle\int_0^{\pi/2}\frac{l}{2}\sin\theta\,d\theta}{\dfrac{t}{2}\cdot\dfrac{\pi}{2}} = \frac{2l}{\pi t}$$
It is expected that $P$ is proportional to $l$ (the longer the needle, the more chance it hits the lines) and inversely proportional to $t$--the distance between the lines. However, it is unexpected that $\pi$ shows up in this problem. No circles are involved! We discuss this shortly.
The French scholar and polymath Marquis Pierre-Simon de Laplace (1749 – 1827) later showed that the number $\pi$ can be approximated by repeatedly throwing a needle onto a lined sheet of paper $N$ times and counting the number of intersections ($n$):
$$\frac{2l}{\pi t} = \frac{n}{N} \Longrightarrow \pi = \frac{2l}{t}\frac{N}{n}$$

In 1901, the Italian mathematician Mario Lazzarini performed Buffon's needle experiment. Tossing a needle 3 408 times with $t = 3$ cm, $l = 2.5$ cm, he got 1 808 intersections. Thus, he obtained the well-known approximation $355/113$ for $\pi$, accurate to six significant digits. However, Lazzarini's "experiment" is an example of confirmation bias, as it was set up to replicate the already well-known approximation $355/113$ of $\pi$. Here are the details:
$$\pi \approx \frac{2l}{t}\frac{N}{n} = \frac{2\times 2.5}{3}\cdot\frac{3408}{1808} = \frac{5}{3}\cdot\frac{71\times 3\times 16}{113\times 16} = \frac{355}{113} \approx 3.14159292$$
Guessing Buffon's formula. Herein, we try to guess the solution without actually solving the problem. This is a very important skill. (However, I admit that we're doing it only after we have seen the result.) As the problem has only two parameters, the needle length $l$ and the distance $t$ between the lines, the result must be of the form $P = c\,(l/t)$, where $c$ is a dimensionless number (refer to Section 9.7.1 for details on dimensions and units). To find $c$, we reason that the result should not depend on the shape of the needle. If so, we can consider a needle in the form of a circle of radius $r$. The length of this circular needle is $2\pi r$ and it must be equal to $l$; thus its diameter is $d = l/\pi$. The probability is therefore $2l/\pi t$, noting that a circular needle cuts a line twice.
5.4.2 Monte Carlo method


The calculation of $\pi$ based on Buffon's needle experiment can be considered the first instance of the so-called Monte Carlo method. We can replicate the experiment on a computer; that's the essence of the Monte Carlo method. The underlying idea of the MC method is to use randomness to solve problems that might be deterministic in principle. Throwing $n$ needles randomly translates to generating $n$ random numbers $d \in [0, t/2]$ and $n$ random numbers $\theta \in [0, \pi/2]$. Then we test whether each needle cuts a line, and if it does we record it. The code is given in Listing 5.1.


Listing 5.1: Julia code for Buffon's needle experiment to compute π.


1 function buffon_needle(t,l,n)
2 cut = 0
3 for i=1:n
4 d = (t/2) * rand()
5 theta = 0.5*pi* rand()
6 if ( 0.5*l*sin(theta) >= d ) cut += 1 end
7 end
8 return (2*l/t)*(n/cut) # this is pi
9 end
10 t = 2.0; l = 1.0
11 data = zeros(10,2)
12 data[:,1] = [500 3408 5000 6000 8000 10000 12000 14000 15000 20000]
13 for i=1:size(data,1)
14 data[i,2] = buffon_needle(t,l,data[i,1])
15 end

This was probably the simplest introduction to Monte Carlo methods. But if you look at line 5 of the code, you see that we have used $\pi = 3.14159\ldots$ in a program that is supposed to determine $\pi$. Thus, this program is circular.
Another Monte Carlo way of calculating $\pi$ is presented here. The area of a quarter of a unit circle is $\pi/4$. We can compute this area by generating $N$ points $(x,y)$ such that $0 \leq x \leq 1$ and $0 \leq y \leq 1$. A point is within the quarter circle if $x^2 + y^2 \leq 1$ (see Fig. 5.2, blue points). If we denote the total number of hits by $n$, then the area is approximately $n/N$, and thus
$$\pi \approx 4\frac{n}{N}$$
A Julia code (Listing A.13) was written and the results are given in Table 5.3 for various $N$. These Monte Carlo methods for approximating $\pi$ are very slow compared to other methods (e.g. those presented in ??), and do not provide any information on the exact number of digits that are obtained. Thus they are never used to approximate $\pi$ when speed or accuracy is desired.

N        π ≈ 4n/N
100      3.40000000
200      3.14000000
400      3.16000000
800      3.14000000
5 600    3.16642857

Table 5.3: Monte-Carlo calculation of π.

[Figure 5.2 shows the sampled points in the unit square, with the hits inside the quarter circle marked in blue.]

Figure 5.2: Monte-Carlo calculation of π.

 Mathematicians like to write this briefly as $\mathbf{x} \in [0,1]^2$, which indicates a point inside the square of side 1. This notation is general enough to cover points inside a unit cube: simply change it to $\mathbf{x} \in [0,1]^3$.


And this MC method can be used to compute any integral numerically.
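A minimal sketch of the hit-counting estimate just described (what a listing such as A.13 presumably does; the exact code in the appendix may differ):

# Sample N points in the unit square and count those inside the quarter circle.
mc_pi(N) = 4 * count(_ -> rand()^2 + rand()^2 <= 1, 1:N) / N

for N in (100, 10_000, 1_000_000)
    println(N, "   ", mc_pi(N))    # slowly approaches 3.14159...
end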

5.5 A review of set theory


Probability theory uses the language of sets. Thus, here we briefly review some basic concepts
from set theory that are used in this chapter. We discuss set notations, definitions, and operations
such as intersections and unions. This section may seem somewhat theoretical and thus less
interesting than the rest of the chapter, but it lays the foundation for what is to come.
A set is a collection of things (called elements)Ž . We can either explicitly write out the
elements of a set as in the set of natural numbers

N D f1; 2; 3; : : :g

or, we can also define a set by stating the properties satisfied by its elements. For example, we
may write
A D fx 2 Njx  4g; or A D fx 2 N W x  4g
The symbols j and W are read "such that". Thus, the above set contains all counting numbers
equal to or greater than four. Because the order of the elements in a set is irrelevant, f2; 1; 5g
is the same set as f1; 2; 5g. Furthermore, an element cannot appear more than once in a set; so
f1; 1; 2; 5g is equivalent to f1; 2; 5g.

Ordered sets. Let $A$ be a set. An order on $A$ is a relation, denoted by $<$, with the following two properties:
• If $x \in A$ and $y \in A$, then one and only one of the following is true: $x < y$, $x = y$, $x > y$.
• If $x, y, z \in A$ and $x < y$, $y < z$, then $x < z$.
An ordered set is a set on which an order is defined. For instance, the set of natural numbers is an ordered set.

5.5.1 Subset, superset and empty set


Set $A$ is a subset of set $B$ if every element of $A$ is also an element of $B$. We write $A \subset B$, where the symbol $\subset$ indicates "subset". Conversely, $B$ is a superset of $A$, which we write as $B \supset A$.
A set with no elements is called an empty set. How many empty sets are there? To answer that question we need to define when two sets are equal. Two sets are equal if they have the
Ž Why then did mathematicians not use the word collection? I am not sure, but a collection usually contains a quite large number of things. In maths, a set can be empty or hold only one element.


same elements. For example, f1; 2; 3g and f3; 2; 1g are equal. Now, assume there are two empty
sets. If they are not equal, then one set must contain a member that the other does not (otherwise
they would be equal). But both sets contain nothing. Thus, they must be equal. There is only one
empty set: the empty (or null) set is designated by ;. This null set is similar to number zero in
number theory.
A universal set is the collection of all objects in a particular context. We use the notation S
to label the universal set. Its role is similar to the number line in number theory. When we refer
to a number we visualize it as a point on the number line. In the same manner, we can visualize
a set on the background of the universal set.
The Cartesian product of two sets A and B, denoted by A  B, is defined as the set consisting
of all ordered pairs .a; b/ for which a 2 A and b 2 B. For example, if A D x; y and B D
f3; 6; 9g, then A  B D f.x; 3/; .x; 6/; .x; 9/; .y; 3/; .y; 6/; .y; 9/g. Note that because the pairs
are ordered so A  B ¤ B  A. An important example of sets obtained using a Cartesian product
is Rn , where n 2 N. For n D 2, we have

R2 D R  R D f.x; y/jx 2 R; y 2 Rg

Thus, R2 is the set consisting of all points in the two-dimensional plane. Similarly, R3 is the set
of all points in the three dimensional space that we’re living in.

Lower bound and upper bound. Given a set $X \subset \mathbb{R}$ (e.g. $X = [0,5]$), then
• $u$ is an upper bound for $X$ if $u \geq x$ for all $x \in X$;
• $l$ is a lower bound for $X$ if $l \leq x$ for all $x \in X$;
• $X$ is bounded above if there exists an upper bound for $X$;
• $X$ is bounded below if there exists a lower bound for $X$.

Sups and Infs. Suppose that $X$ is bounded above; then there exist infinitely many upper boundsŽ. One can define the smallest among the upper bounds. The supremum of $X$, denoted by $\sup X$, is the smallest upper bound for $X$; that is
• $\sup X \geq x$ for all $x \in X$ ($\sup X$ is an upper bound);
• for all $\epsilon > 0$, there exists $x \in X$ such that $x > \sup X - \epsilon$ ($\sup X$ is the smallest upper bound).
Suppose that $X$ is bounded below; then there exist infinitely many lower bounds. One can define the largest among the lower bounds. The infimum of $X$, denoted by $\inf X$, is the largest lower bound for $X$; that is
• $\inf X \leq x$ for all $x \in X$ ($\inf X$ is a lower bound);
• for all $\epsilon > 0$, there exists $x \in X$ such that $x < \inf X + \epsilon$ ($\inf X$ is the largest lower bound).
Ž For example, if $X = [0,5]$, then $6, 7, \ldots$ are all upper bounds of $X$.

Maximum vs supremum. Are the maximum and the supremum of an ordered set the same? Examples give the answer. Example 1: consider the set $A = \{x \in \mathbb{R} \,|\, x < 2\}$. The maximum of $A$ is not 2, as 2 is not a member of the set; in fact, the maximum is not well defined. The supremum, though, is well defined: 2 is clearly the smallest upper bound for the set. Example 2: $B = \{1,2,3,4\}$. The maximum is 4, as that is the largest element. The supremum is also 4, as 4 is the smallest upper bound.

Venn diagrams. Venn diagrams are useful in visualizing relation between sets. Venn diagrams
were popularized by the English mathematician, logician and philosopher John Venn (1834 –
1923) in the 1880s. See Fig. 5.3 for one example of Venn diagrams. In a Venn diagram a big
rectangle is used to label the universal set, whereas a circle is used to denote a set.

5.5.2 Set operations


Sets can be combined (if we can combine numbers via arithmetic operations, we can do something similar for sets) via set operations (Fig. 5.3). We can combine two sets in many different ways. First, the union of two sets $A$ and $B$ is a set, written $A \cup B$, containing all elements that are in $A$ or in $B$. For example, $\{1,3,4\}\cup\{3,4,5\} = \{1,3,4,5\}$. If we have many sets $A_1, A_2, \ldots, A_n$, their union is written as|| $\bigcup_{i=1}^{n} A_i$.
Second, the intersection of two sets $A$ and $B$, denoted by $A \cap B$, consists of all elements that are in both $A$ and $B$. For instance, $\{1,2\}\cap\{2,3\} = \{2\}$. When the intersection of two sets is empty, i.e. $A \cap B = \varnothing$, the two sets are called mutually exclusive or disjoint. We now extend this to more than two sets. If we have $n$ sets $A_1, A_2, \ldots, A_n$, these sets are disjoint if they are pairwise disjoint:
$$A_i \cap A_j = \varnothing \quad\text{for all } i \neq j$$
Third, the difference of two sets $A$ and $B$ is denoted by $A - B$ and is the set consisting of the elements that are in $A$ but not in $B$.
Finally, we have another operation on sets, but this operation applies to one single set. The complement of a set $A$, denoted by $A^c$, is the set of all elements that are in the universal set $S$ but not in $A$. The Venn diagrams for the presented set operations are shown in Fig. 5.3.

Cardinality. The cardinality of a set is basically the size of the set; it is denoted by $|A|$. For finite sets (e.g. the set $\{1,3,5\}$), the cardinality is simply the number of elements in $A$. Again, once a new object is introduced (or discovered) in the mathematical world, there are rules which it (here the cardinality of a set) obeys. For instance, we can ask: given two sets $A, B$ with cardinalities $|A|$ and $|B|$, what is the cardinality of their union, i.e. $|A\cup B|$? For two sets $A$ and $B$, we have this rule, called the inclusion-exclusion principle or PIE:
$$|A\cup B| = |A| + |B| - |A\cap B| \tag{5.5.1}$$
When $A$ and $B$ are disjoint, the cardinality of the union is simply the sum of the cardinalities of $A$ and $B$. When they are not disjoint, when we add $|A|$ and $|B|$ we're counting the elements in
|| Note the similarity to the sigma notation $\sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n$.


Figure 5.3: Venn diagrams for set operations (panels: $A \cup B$, $A \cap B$, $A - B$, $A^c$).

$A \cap B$ twice (a Venn diagram would help here), thus we need to subtract it to get the correct
cardinality. The name (of the principle) comes from the idea that the principle is based on
over-generous inclusion (red term), followed by compensating exclusion (blue term).
Then, mathematicians certainly generalize this result to the union of $n$ sets. For simplicity,
we just extend this principle to the case of three sets:
$$|A \cup B \cup C| = |A| + |B| + |C| - |A \cap B| - |B \cap C| - |C \cap A| + |A \cap B \cap C| \tag{5.5.2}$$

Example 5.1
How many integers from 1 to 100 are multiples of 2 or 3? Let $A$ be the set of integers from
1 to 100 that are multiples of 2; then $|A| = 50$ (why?). Let $B$ be the set of integers from 1 to
100 that are multiples of 3; then $|B| = 33$ᵃ. Our question amounts to computing $|A \cup B|$.
Certainly, we use the PIE:

$$|A \cup B| = |A| + |B| - |A \cap B|$$

We then need $A \cap B$, which is the set of integers from 1 to 100 that are multiples of both 2 and
3, i.e., multiples of 6; we have $|A \cap B| = 16$. Thus, $|A \cup B| = 50 + 33 - 16 = 67$.
ᵃ A number is a multiple of 3 if it can be written as $3m$; then $1 \le 3m \le 100$, thus there are $\lfloor 100/3 \rfloor = 33$ such numbers.
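A quick brute-force check of this count, in the same Julia used elsewhere in this chapter, confirms the PIE answer (the function name below is just for illustration):

# Count multiples of 2 or 3 between 1 and 100 by brute force,
# and compare with the inclusion-exclusion computation |A| + |B| - |A ∩ B|.
count_multiples(n) = count(k -> k % 2 == 0 || k % 3 == 0, 1:n)

println(count_multiples(100))   # 67, by direct enumeration
println(50 + 33 - 16)           # 67, by the PIE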

Generalized principle of inclusion-exclusion. Now we extend the PIE to the case of $n$ sets for
arbitrary $n$. First, we put the two identities for $n = 2$ and $n = 3$ together to see the pattern:
$$|A \cup B| = |A| + |B| - |A \cap B|$$
$$|A \cup B \cup C| = |A| + |B| + |C| - |A \cap B| - |B \cap C| - |C \cap A| + |A \cap B \cap C|$$
To see the pattern, let $x$ belong to all three sets $A, B, C$. It is then counted in every term on the
RHS of the second equation: added 4 times (the red terms) and subtracted 3 times, adding up to 1.


As a preparation for the move to $n$ sets, we no longer use $A, B, C$; instead we adopt $A_1, A_2, \ldots$†

Now we write the second equation with the new symbols:
$$\left|\bigcup_{i=1}^{3} A_i\right| = |A_1 \cup A_2 \cup A_3| = \sum_{i=1}^{3} |A_i| - \sum_{1 \le i < j \le 3} |A_i \cap A_j| + \left|\bigcap_{i=1}^{3} A_i\right|$$

And with that it is not hard to get the general formula:

$$\left|\bigcup_{i=1}^{n} A_i\right| = \sum_{i=1}^{n} |A_i| - \sum_{1 \le i < j \le n} |A_i \cap A_j| + \sum_{1 \le i < j < k \le n} |A_i \cap A_j \cap A_k| - \cdots + (-1)^{n-1} \left|\bigcap_{i=1}^{n} A_i\right| \tag{5.5.3}$$

Note that the first term adds up the elements in each set, the second term deals with pairs, the
third term deals with triplets and so on. The $l$th term has $\binom{n}{l}$ summands (to see this just return
to the case $n = 3$: the second term on the RHS of the formula in this case is

$$-|A \cap B| - |B \cap C| - |C \cap A|$$

which has $\binom{3}{2} = 3$ summands). Did mathematicians stop with Eq. (5.5.3)? No, that equation is not
in its best form yet. Note that the RHS of that equation involves $n$ terms and each term in turn
involves a sum of terms. Mathematicians want to write it as $\sum_i \left(\sum_j \cdots\right)$. The key to this step
is to discard the subscripts $i, j, k$ and replace them by subscripts with subscripts: $i_1, i_2, \ldots$
$$\left|\bigcup_{i=1}^{n} A_i\right| = \sum_{k=1}^{n} (-1)^{k+1} \left( \sum_{1 \le i_1 < \cdots < i_k \le n} |A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}| \right) \tag{5.5.4}$$

Still they are not happy with this (the blue term, particularly). So they went a further step,
defining $A_I = \bigcap_{i \in I} A_i$ for $I \subseteq \{1, 2, \ldots, n\}$:

$$\left|\bigcup_{i=1}^{n} A_i\right| = \sum_{k=1}^{n} (-1)^{k+1} \left( \sum_{\substack{I \subseteq \{1, 2, \ldots, n\} \\ |I| = k}} |A_I| \right), \qquad A_I = \bigcap_{i \in I} A_i \tag{5.5.5}$$
The second sum runs over all subsets $I$ of the indices $1, 2, \ldots, n$‡ which contain exactly $k$
elements (i.e., $|I| = k$). At this point, mathematicians stop because that form is compact.
If you play with the Venn diagrams you will definitely discover many more identities on sets
similar to Eq. (5.5.2). For example, $A = (A \cap B^c) \cup (A \cap B)$. As is always the case in mathematics, this
seemingly pointless identity will be useful in other contexts.
† Obviously we will run out of letters, and moreover subscripts allow for compact notation: we can write
$A_1 + A_2 + \cdots = \sum_i A_i$. With $A, B, \ldots$ we simply cannot.
‡ One example clarifies everything: assume $n = 3$, $k = 2$; then $I = \{1, 2\}$, $I = \{1, 3\}$, $I = \{2, 3\}$.


Definition 5.5.1
A set $A$ is called countable if one of the following is true:

(a) it is a finite set, i.e., $|A| < \infty$, or

(b) it can be put in a one-to-one correspondence with the natural numbers. In this case the set
is said to be countably infinite.

A set is called uncountable if it is not countable. One example is the set of real numbers $\mathbb{R}$.

You can check Section 2.31 on Georg Cantor and infinity again if anything mentioned in
this definition is not clear.

de Morgan's laws state that the complement of the union of two sets is equal to the intersec-
tion of their complements, and the complement of the intersection of two sets is equal to the
union of their complements. The laws are named after Augustus De Morgan (1806 – 1871)–a
British mathematician and logician. He formulated De Morgan's laws and introduced the term
mathematical induction, making its idea rigorous. For any two sets $A$ and $B$, the laws are
$$(A \cup B)^c = A^c \cap B^c, \qquad (A \cap B)^c = A^c \cup B^c$$
We can draw some Venn diagrams to see that the laws are valid, but that's not enough, as the
laws might hold true for $n > 2$ sets, and in that case no one can use a Venn diagram for
a check. The generalized version of de Morgan's first law is
$$(A_1 \cup A_2 \cup \cdots \cup A_n)^c = A_1^c \cap A_2^c \cap \cdots \cap A_n^c, \quad \text{or} \quad \left(\bigcup_{i=1}^{n} A_i\right)^c = \bigcap_{i=1}^{n} A_i^c$$

Proof of de Morgan's 1st law for two sets. The plan is to pick an element $x$ in $(A \cup B)^c$ and
prove it is also an element of $A^c \cap B^c$, and vice versa. Let $P = (A \cup B)^c$ and $Q = A^c \cap B^c$. Now,
consider $x \in P$; we're going to prove that $x \in Q$, which means that $P \subseteq Q$. As $x \in (A \cup B)^c$,
it is not in $A \cup B$:
$$\Longrightarrow x \notin (A \cup B)$$
$$\Longrightarrow (x \notin A) \text{ and } (x \notin B)$$
$$\Longrightarrow (x \in A^c) \text{ and } (x \in B^c)$$
$$\Longrightarrow x \in (A^c \cap B^c): \quad x \in Q \Longrightarrow P \subseteq Q$$
Doing something similar with $y \in Q$ and then showing $y \in P$, we get $Q \subseteq P$. Now we have
$P \subseteq Q$ and $Q \subseteq P$. What does it mean? It means $P = Q$. You can use proof by induction to
prove the generalized version. ∎
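A proof is what counts, but a numerical spot-check never hurts. The following minimal Julia sketch (variable names are mine) verifies both laws on randomly generated subsets of a small universal set:

using Random

S = Set(1:20)                     # a small universal set
complement(A) = setdiff(S, A)

for trial in 1:1000
    A = Set(rand(1:20, rand(0:20)))   # a random subset of S
    B = Set(rand(1:20, rand(0:20)))
    @assert complement(union(A, B)) == intersect(complement(A), complement(B))
    @assert complement(intersect(A, B)) == union(complement(A), complement(B))
end
println("de Morgan's laws hold in 1000 random trials")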

5.6 Random experiments, sample space and event


Before rolling a die we do not know the result. This is an example of a random experiment.
Usually we carry out a random experiment multiple times; each time (called a trial) we get a


result which we call an outcome. The set of all possible outcomes of a random experiment is
called the sample space. Since this sample space is the biggest space as far as the experiment
is concerned, it is our universal set $S$. An event is a subset of the sample space. Some examples
are:

• Random experiment: toss a coin; the sample space is $S = \{H, T\}$ ($H$ for head and $T$ for tail),
and one event is $E = \{H\}$ or $E = \{T\}$;

• Random experiment: roll a six-sided die; the sample space is $S = \{1, 2, 3, 4, 5, 6\}$, and one
event can be $E = \{2, 4, 6\}$ if we're interested in the chance of getting an even number;

• Random experiment: toss a coin two times and observe the sequence of heads/tails; the
sample space is
$$S = \{(H, H), (H, T), (T, H), (T, T)\}$$
One event can be $E_1 = \{(H, H), (T, T)\}$.

Note that events are subsets of the sample space $S$.

5.7 Probability and its axioms


Now comes the moment when we can talk about probability in a mathematically precise way. The
probability of an event $A$, denoted by $P(A)$, is a number that is assigned to $A$. Axiomatic
probability theory is based on the following three axioms:

Box 5.1: Three axioms of the theory of probability.

• Axiom 1: The probability of every event is at least zero. For any event $A$, $P(A) \ge 0$.

• Axiom 2: The probability of the sample space is 100%: $P(S) = 1$.

• Axiom 3: If two events are disjoint, the probability that either of the two events
happens is the sum of the probabilities that each happens:

$$P(A \cup B) = P(A) + P(B) \quad \text{if } A \cap B = \emptyset \tag{5.7.1}$$

It is important to take a few moments to understand each axiom thoroughly, because we
cannot prove them; we can only accept them and proceed from there. The first axiom states that
probability cannot be negative and the smallest probability is zero. When $P(A) = 0$, the event
$A$ will never occur. As the axiomatic theory of probability is based on the theory of measure
(which I do not know much about!), we can think of $P(A)$ as the area of a domain $A$, and thus the
third axiom is something like: the area of two disconnected domains is the sum of the areas
of each domain. Axiom 2 just provides a scale. It says that the area of the sample space


$S$–which is the maximum area–is one. All other formulas, results, theorems, whatever, are
derived from these three axioms!

Union and intersection of events. As events are sets, we can apply set operations to events.
When working with events, intersection means "and", and union means "or". The probability of
the intersection of $A$ and $B$, $P(A \cap B)$, is sometimes written as $P(AB)$ or $P(A, B)$.

• Probability of intersection:
$$P(A \cap B) = P(AB) = P(A \text{ and } B)$$

• Probability of union:
$$P(A \cup B) = P(A \text{ or } B)$$

Example 5.2
We roll a fair six-sided die; what is the probability of getting 1 or 5? The event is $E = \{1, 5\}$
and the sample space is $S = \{1, 2, 3, 4, 5, 6\}$. We use the three axioms to compute $P(E)$. First,
as the die is fair, the chance of getting any number from 1 to 6 is equal:

$$P(1) = P(2) = P(3) = P(4) = P(5) = P(6)$$

where $P(1)$ is short for $P(\{1\})$. Note that probability is defined only for sets, not for numbers.
Now, we use axioms 2 and 3 together to writeᵈ

$$1 \overset{(2)}{=} P(S) \overset{(3)}{=} P(1) + P(2) + \cdots + P(6)$$

which results in the probability of getting any number from 1 to 6 being $1/6$. Then, using
axiom 3 again for $E$, we have
$$P(\{1, 5\}) = P(1) + P(5) = \frac{1}{6} + \frac{1}{6} = \frac{1}{3}$$
Noting that $1/3 = 2/6$, we can deduce an important formula:

$$P(\{1, 5\}) = \frac{1}{3} = \frac{2}{6} = \frac{|\{1, 5\}|}{|S|}$$
Therefore, for a finite sample space $S$ with equally likely outcomes, the probability of an event
$A$ is the ratio of the cardinality of $A$ to that of $S$:

$$P(A) = \frac{|A|}{|S|}$$
ᵈ The symbol (2) above the equal sign indicates that axiom 2 is being used.
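As a sanity check, a short Monte Carlo simulation in Julia (a sketch; the function name is mine) reproduces $|A|/|S| = 1/3$ for this die example:

using Random

# Estimate P({1,5}) for a fair six-sided die by simulation.
function estimate_prob_1_or_5(ntrials)
    hits = count(_ -> rand(1:6) in (1, 5), 1:ntrials)
    return hits / ntrials
end

println(estimate_prob_1_or_5(10^6))   # ≈ 0.333 = 1/3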


Example 5.3
Using the axioms of probability, prove the following:

(a) For any event $A$, $P(A^c) = 1 - P(A)$.

(b) The probability of the empty set is zero, i.e., $P(\emptyset) = 0$.

(c) For any event $A$, $P(A) \le 1$.

(d) $P(A - B) = P(A) - P(A \cap B)$

(e) $P(A \cup B) = P(A) + P(B) - P(A \cap B)$

Proof of $P(A^c) = 1 - P(A)$. Referring back to Fig. 5.3, we know that $A \cup A^c = S$ and that $A$
and $A^c$ are disjoint, thus

$$P(S) = P(A \cup A^c)$$
$$1 = P(A) + P(A^c) \Longrightarrow P(A) = 1 - P(A^c)$$

where use was made of axiom 2 ($P(S) = 1$) and axiom 3 ($P(A \cup A^c) = P(A) + P(A^c)$).
We also get $P(A) \le 1$ from $P(A) = 1 - P(A^c)$, since $P(A^c) \ge 0$ due to axiom 1. ∎

Proof of $P(A - B) = P(A) - P(A \cap B)$. The proof (illustrated by a Venn diagram of $A$, $B$ and $A \cap B$) is based on set
properties and axiom 3:
$$(A - B) \cup (A \cap B) = A$$
$$(A - B) \cap (A \cap B) = \emptyset$$
$$P(A - B) + P(A \cap B) = P(A)$$

Proof of $P(A \cup B) = P(A) + P(B) - P(A \cap B)$. Recall the inclusion-exclusion principle,
$|A \cup B| = |A| + |B| - |A \cap B|$; the identity $P(A \cup B) = P(A) + P(B) - P(A \cap B)$ is the version of
that principle for probability. Writing $A \cup B$ as the disjoint union of $A$ and $B - A$ (a Venn diagram shows this), we have $P(A \cup B) = P(A) + P(B - A)$.
Now we replace $P(B - A)$ by $P(B) - P(A \cap B)$–a result from (d):
$$A \cap (B - A) = \emptyset$$
$$P(A \cup (B - A)) = P(A) + P(B - A)$$
$$P(A \cup B) = P(A) + P(B - A)$$


The rule (a) can be referred to as the rule of complementary probability. It is very simple and
yet powerful for problems in which finding P .A/ is hard and finding P .Ac / is much easier. We
will use this rule quite often.
Corresponding to the principle of inclusion-exclusion in Eq. (5.5.3), we have the probability
version:
$$P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} P(A_i) - \sum_{i<j} P(A_i \cap A_j) + \sum_{i<j<k} P(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n-1} P\left(\bigcap_{i=1}^{n} A_i\right)$$

Or, in the compact form of Eq. (5.5.5),

$$P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{k=1}^{n} (-1)^{k+1} \left( \sum_{\substack{I \subseteq \{1, 2, \ldots, n\} \\ |I| = k}} P(A_I) \right), \qquad A_I = \bigcap_{i \in I} A_i \tag{5.7.2}$$

Example 5.4
Now we consider a classic example that uses the inclusion-exclusion principle. Assume that
a secretary has an equal number of pre-labelled envelopes and business cards (denoted by n).
Suppose that she is in such a rush to go home that she puts each business card in an envelope
at random without checking if it matches the envelope. What is the probability that each of
the business cards will go to a wrong envelope?
Always start simple, so we now assume that $n = 3$, and we define the following events:
$$A_1: \text{1st card in correct envelope}$$
$$A_2: \text{2nd card in correct envelope}$$
$$A_3: \text{3rd card in correct envelope}$$
Now let $E$ be the event that each of the business cards goes into a wrong envelope. We want
to compute $P(E)$. $E$ occurs only when none of $A_1, A_2, A_3$ has happened. Thus,
$$E = A_1^c \cap A_2^c \cap A_3^c = (A_1 \cup A_2 \cup A_3)^c \quad \text{(de Morgan's laws)}$$
Now, we can compute $P(E)$ as
$$P(E) = 1 - P(E^c) = 1 - P(A_1 \cup A_2 \cup A_3)$$
The next step is to use the PIE to get the red term, and thus $P(E)$ is given by
$$P(E) = 1 - \left( \sum_{i}^{3} P(A_i) - \sum_{i<j} P(A_i \cap A_j) + P(A_1 \cap A_2 \cap A_3) \right)$$

Now we compute all the probabilities, and we get something special:

$$P(A_i) = \frac{(3-1)!}{3!} = \frac{1}{3}, \quad P(A_i \cap A_j) = \frac{(3-2)!}{3!} = \frac{1}{6}, \quad P(A_1 \cap A_2 \cap A_3) = \frac{(3-3)!}{3!} = \frac{1}{6}$$


How do we get the above probabilities? Note that for $n = 3$ there are a total of $3!$ outcomes, and
to have 2 cards in correct envelopes we only need to care about the remaining cards (of which there are
$3 - 2$), and for them there are of course $(3-2)!$ arrangements.
What is special about this problem is that every $P(A_I)$ in Eq. (5.7.2) with $|I| = k$ is equal, namely $p_k =
(n-k)!/n!$, and the $k$th term in Eq. (5.7.2) has $\binom{n}{k}$ summands. Therefore, we're able to go to the
general case $n$ with Eq. (5.7.2), and $P(E)$ is given by
$$P(E) = 1 - \sum_{k=1}^{n} (-1)^{k+1} \binom{n}{k} p_k = 1 - \sum_{k=1}^{n} \frac{(-1)^{k+1}}{k!} = \sum_{k=0}^{n} \frac{(-1)^k}{k!}, \qquad p_k = \frac{(n-k)!}{n!}$$

To check this theoretical result, we can perform an experiment using the Monte Carlo method.
To practice Monte Carlo methods, you're encouraged to implement it for this problem. If you
need help, check the code monte-carlo-pi.jl, function MC_secretary_prob, on my github
account. You'll see that the theoretical result matches the MC result.
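For reference, one possible Monte Carlo sketch is given below. It is not the author's MC_secretary_prob (names and details here are my own); it simply estimates the probability that a random assignment of $n$ cards to $n$ envelopes has no match and compares it with the formula above:

using Random

# Estimate the probability that no card lands in its matching envelope,
# i.e. that a random permutation of 1:n has no fixed point (a derangement).
function estimate_no_match(n, ntrials)
    hits = 0
    for _ in 1:ntrials
        perm = randperm(n)                   # random card-to-envelope assignment
        hits += all(i -> perm[i] != i, 1:n)  # true when every card is misplaced
    end
    return hits / ntrials
end

theory(n) = sum((-1)^k / factorial(k) for k in 0:n)

println(estimate_no_match(10, 10^6))   # ≈ 0.3679
println(theory(10))                    # ≈ 0.3679, close to 1/e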

5.8 Conditional probabilities


A conditional probability is the likelihood of an event occurring given that another event has
already happened. Conditional probabilities allow us to evaluate how prior information affects
probabilities. When we incorporate existing facts into the calculations, it can change the proba-
bility of an outcome. Probability is generally counter-intuitive, but conditional probability is the
worst! Conditioning can subtly alter probabilities and produce unexpected results.
This section will introduce the famous Bayes’s theorem. But, we first start with a simple
example for motivation.

5.8.1 What is a conditional probability?

Example 5.5
Consider a family that has two children. We are interested in the children's genders. Our
sample space is $S = \{(G, G), (G, B), (B, G), (B, B)\}$. Also assume that all four possible
outcomes are equally likely; that is, $P(G, G) = P(G, B) = \cdots = 1/4$.

• What is the probability that both children are girls?

• What is the probability that both children are girls, given that the first child is a girl?

• What is the probability that both children are girls, given that we know at least one of
them is a girl?

Of course the probability that both children are girls is $1/4$. The two remaining probabilities
are more interesting and new, and most of us would say the answer is $1/2$ for both. Let's
denote by $A$ the event that both children are girls and by $B$ the event that the first child is a
girl; that is, $B = \{(G, G), (G, B)\}$. Given $B$, the chance of having two girls is therefore $1/2$. Let's


denote by $C$ the event that at least one of the children is a girl; that is, $C = \{(G, G), (G, B), (B, G)\}$.
Given $C$, the chance of having two girls is $1/3$.

The probability that both children are girls (event $A$) given that the first child is a girl (event
$B$) is called a conditional probability. It is written as $P(A|B)$; the vertical line | is read
"given that". This example clearly demonstrates that when we incorporate existing facts into the
calculations, it can change the probability of an outcome. The sample space is changed!
The next thing we need to do is to find a formula for $P(A|B)$.
Because $B$ has occurred it becomes the sample space, and the only way that $A$ can happen
is when the outcome belongs to the set $A \cap B$; we thus have $P(A|B)$ as
$$P(A|B) = \frac{|A \cap B|}{|B|}$$
Now we can divide the denominator and the numerator by $|S|$, the cardinality of the original
sample space, to get
$$P(A|B) = \frac{|A \cap B|/|S|}{|B|/|S|} = \frac{P(A \cap B)}{P(B)} \tag{5.8.1}$$
Of course, as $B$ has occurred, $P(B) > 0$, so there is no danger in dividing by it. Note
that Eq. (5.8.1) was derived only for sample spaces with equally likely outcomes. For other cases,
take it as the definition of conditional probability.
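The two answers of Example 5.5 can also be obtained by direct enumeration, which is exactly what Eq. (5.8.1) encodes: restrict the sample space to the conditioning event and count. A minimal Julia sketch (my own names):

# Enumerate the sample space of two children and condition by restricting
# to the outcomes where the given event holds.
S = vec([(c1, c2) for c1 in ("G", "B"), c2 in ("G", "B")])   # 4 equally likely outcomes

both_girls(ω)        = ω == ("G", "G")
first_is_girl(ω)     = ω[1] == "G"
at_least_one_girl(ω) = "G" in ω

B = filter(first_is_girl, S)
C = filter(at_least_one_girl, S)

println(count(both_girls, B) / length(B))   # P(A|B) = 1/2
println(count(both_girls, C) / length(C))   # P(A|C) = 1/3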

5.8.2 $P(A|B)$ is also a probability

As $P(A|B)$ is a probability, it must satisfy the three axioms of probability (it can have other
properties as well). I list them first, and then we shall prove them:

• Axiom 1: The conditional probability of every event is at least zero: $P(A|F) \ge 0$.

• Axiom 2: The conditional probability of the sample space is 100%: $P(S|F) = 1$.

• Axiom 3: If two events are disjoint, the conditional probability that either of the two events
happens is the sum of the conditional probabilities that each happens: $P(A \cup B|F) = P(A|F) + P(B|F)$ if $A \cap B = \emptyset$.

Proof. The proof of axiom 2 is as simple as (based on the fact that $S \cap F = F$)
$$P(S|F) = \frac{P(SF)}{P(F)} = \frac{P(F)}{P(F)} = 1 \qquad (SF = S \cap F = F)$$
The proof of axiom 3 goes like this (I go from the LHS to the RHS; it's just a personal taste):
$$P(A \cup B|F) = \frac{P((A \cup B)F)}{P(F)} = \frac{P(AF \cup BF)}{P(F)}$$
$$= \frac{P(AF) + P(BF)}{P(F)} \qquad (AF \cap BF = \emptyset)$$
$$= \frac{P(AF)}{P(F)} + \frac{P(BF)}{P(F)} = P(A|F) + P(B|F)$$


So, the proof used the given information that $A$ and $B$ are disjoint, thus $AF$ and $BF$ are also
disjoint (why?). ∎

The generalized version of axiom 3 is

$$P\left(\bigcup_{i=1}^{\infty} A_i \Big| F\right) = \sum_{i=1}^{\infty} P(A_i|F)$$

You should prove it. The proof is exactly the same as the one I presented for two events $A_1$ and
$A_2$!
If we define $Q(E) = P(E|F)$, then $Q(E)$ may be regarded as a probability function on the
events of $S$ because it satisfies the three axioms. Hence, all of the propositions previously proved
for probabilities apply to $Q(E)$. For example, all results from Example 5.3 hold for conditional
probabilities:

(a) For any event $A$, $P(A^c|F) = 1 - P(A|F)$.

(b) $P(A \cup B|F) = P(A|F) + P(B|F) - P(AB|F)$

Proof of $P(A^c|F) = 1 - P(A|F)$. The proof is based on the fact that $F = AF \cup A^c F$, thus
$P(F) = P(AF) + P(A^c F)$, noting that $AF$ and $A^c F$ are disjoint (a Venn diagram will show
all this). The rest is to use the definition of conditional probability and some algebraic
manipulations:

$$1 - P(A|F) = 1 - \frac{P(AF)}{P(F)} = \frac{P(F) - P(AF)}{P(F)} = \frac{P(AF) + P(A^c F) - P(AF)}{P(F)}$$
$$= \frac{P(A^c F)}{P(F)} = P(A^c|F)$$


5.8.3 Multiplication rule for conditional probability


A simple rearrangement of Eq. (5.8.1) results in (note that $P(A \cap B) = P(B \cap A)$ because $A \cap B =
B \cap A$)

$$P(A|B) = \frac{P(A \cap B)}{P(B)} \Longrightarrow P(A \cap B) = P(B)P(A|B), \quad \text{or} \quad P(A \cap B) = P(A)P(B|A) \tag{5.8.2}$$

This formula, known as the multiplication rule of probability, is particularly useful in situations
where we know the conditional probability but are interested in the probability of the inter-
section. In words, this formula states that the probability that both $A$ and $B$ occur is equal to the
probability that $B$ (or $A$) occurs multiplied by the conditional probability of $A$ (or $B$) given
that $B$ (or $A$) occurred. Note that we're not talking about causality (with a direction), only about
statistical dependency.


We can generalize the multiplication rule to more than two events. Let's start with three
events $A, B$ and $C$. We can write $A \cap B \cap C = (A \cap B) \cap C$, thus

$$P(A \cap B \cap C) = P((A \cap B) \cap C) = P(A \cap B|C)P(C)$$

To find $P(A \cap B|C)$, we use $P(A \cap B) = P(B)P(A|B)$ and condition on $C$:

$$P(A \cap B) = P(B)P(A|B) \Longrightarrow P(A \cap B|C) = P(B|C)P(A|B, C) \tag{5.8.3}$$

Thus,

$$P(A \cap B \cap C) = P(C)P(B|C)P(A|B, C)$$

Nothing can stop mathematicians from extending this rule to $n$ events. How should they name the
events now? No longer $A, B, C, \ldots$, as there are fewer than 30 symbols! They now use subscripts
for that: $E_1, E_2, \ldots, E_n$. The generalized multiplication rule is:

$$P(E_1 E_2 E_3 \ldots E_n) = P(E_1)P(E_2|E_1)P(E_3|E_1 E_2) \cdots P(E_n|E_1 E_2 \ldots E_{n-1}) \tag{5.8.4}$$

You want a proof? It is simple: apply the definition of conditional probability to all
terms except the first one, $P(E_1)$, on the RHS of Eq. (5.8.4):

$$P(E_1)\,\frac{P(E_1 E_2)}{P(E_1)}\,\frac{P(E_1 E_2 E_3)}{P(E_1 E_2)} \cdots \frac{P(E_1 E_2 E_3 \ldots E_n)}{P(E_1 E_2 E_3 \ldots E_{n-1})}$$

where all the terms cancel each other except the final numerator, which is the LHS of Eq. (5.8.4).

5.8.4 Bayes’ formula


We start with this fact for two events $E$ and $F$ (you may need to draw a Venn diagram to convince
yourself):
$$E = EF \cup EF^c$$
Then, we can express $P(E)$ in terms of $P(E|F)$, $P(E|F^c)$, $P(F)$ and so on:

$$P(E) = P(EF) + P(EF^c) \qquad (EF \cap EF^c = \emptyset)$$
$$= P(E|F)P(F) + P(E|F^c)P(F^c) \qquad \text{(conditional prob.)} \tag{5.8.5}$$
$$= P(E|F)P(F) + P(E|F^c)[1 - P(F)] \qquad \text{(complementary prob.)}$$

which simply states that the probability of event $E$ is the sum of the conditional probabilities of
event $E$ given that event $F$ has (red term) or has not occurred, each weighted by the corresponding probability of $F$. This formula is extremely useful
when it is difficult to compute the probability of an event ($E$) directly, but it is straightforward
to compute it once we know whether or not some second event ($F$) has occurred. The following
example demonstrates how to use this formula.


Example 5.6
An insurance company believes that people can be divided into two classes: those who are
accident prone and those who are not. The company’s statistics show that an accident-prone
person will have an accident at some time within a fixed 1-year period with probability
0.4, whereas this probability decreases to 0.2 for a person who is not accident prone. If we
assume that 30 percent of the population is accident prone, what is the probability that a new
policyholder will have an accident within a year of purchasing a policy?

Solution. Let's denote by $E$ the event that a new policyholder will have an accident within a year
of purchasing a policy. We need to find $P(E)$. This person is either accident-prone or not.
Let's call $F$ the event that a new policyholder is accident-prone; then $F^c$ is the event that this
person is not accident-prone. We have $P(F) = 0.3$, $P(F^c) = 0.7$, $P(E|F) = 0.4$
and $P(E|F^c) = 0.2$; then Eq. (5.8.5) gives:

$$P(E) = (0.4)(0.3) + (0.2)(0.7) = 0.26$$

Figure 5.4: The sample space $S$ is partitioned into three disjoint events $B_1, B_2, B_3$ (so $S = B_1 \cup B_2 \cup B_3$). Then, we have
$A = (A \cap B_1) \cup (A \cap B_2) \cup (A \cap B_3)$.

Now we generalize Eq. (5.8.5). How? Note that in that formula we have two events, $F$ and
$F^c$, which are two disjoint events that together fill the sample space completely. What we
have to do is simply generalize this to $n$ events. First, for simplicity, assume that we can partition
the sample space $S$ into three disjoint sets $B_1$, $B_2$ and $B_3$. Then we have (see Fig. 5.4)

$$A = (A \cap B_1) \cup (A \cap B_2) \cup (A \cap B_3)$$

and $A \cap B_1$, $A \cap B_2$ and $A \cap B_3$ are mutually disjoint. Thus, we can write $P(A)$ as

$$P(A) = P((A \cap B_1) \cup (A \cap B_2) \cup (A \cap B_3))$$
$$= P(A \cap B_1) + P(A \cap B_2) + P(A \cap B_3) \qquad \text{(axiom 3)}$$
$$= P(A|B_1)P(B_1) + P(A|B_2)P(B_2) + P(A|B_3)P(B_3) \qquad \text{(Eq. (5.8.2))}$$

With that, we have this general result, with $B_i$, $i = 1, 2, \ldots, n$ partitioning $S$:

$$P(A) = \sum_{i=1}^{n} P(A|B_i)P(B_i) \tag{5.8.6}$$


which is referred to as the law of total probability. This formula states that $P(A)$ is equal to
a weighted average of $P(A|B_i)$, each term being weighted by the probability of the event on
which it is conditioned.
We now derive Bayes' formula, or Bayes' rule, which relates $P(A|B)$ to $P(B|A)$. We
start with the multiplication rule of conditional probability:
$$P(A|B)P(B) = P(B|A)P(A) \qquad (= P(A \cap B) = P(B \cap A))$$
Dividing this equation by $P(A) > 0$, we get Bayes' formula:
$$P(B|A) = \frac{P(A|B)P(B)}{P(A)} \tag{5.8.7}$$
This formula is referred to as Bayes' theorem or Bayes' rule or Bayes' law and is the foundation
of the field of Bayesian statistics. Bayes' theorem is also widely used in the field of machine
learning. For sure, it is one of the most useful results in conditional probability. The rule is
named after the 18th-century British mathematician Thomas Bayes. The term $P(B|A)$ is referred
to as the posterior probability and $P(B)$ is referred to as the prior probability.
We can use Eq. (5.8.6) to compute $P(A)$, and thus obtain the extended form of Bayes'
formula:
$$P(B_j|A) = \frac{P(A|B_j)P(B_j)}{\sum_{i=1}^{n} P(A|B_i)P(B_i)} \tag{5.8.8}$$

Example 5.7
A certain disease affects about 1 out of 10 000 people. There is a test to check whether a
person has the disease. The test is quite accurate. In particular, we know that the probability
that the test result is positive (i.e., the test indicates the person has the disease), given that the person does not
have the disease, is only 2 percent; the probability that the test result is negative (i.e., the test indicates the
person does not have the disease), given that the person has the disease, is only 1 percent.
A random person gets tested for the disease and the result comes back positive. What is the
probability that the person has the disease?

Solution. A person either has the disease or not. So the sample space is partitioned into two
sets: $D$ for having the disease and $D^c$ for not. We have $P(D) = 0.0001$ and $P(D^c) = 1 - 0.0001$.
Let's denote by $A$ the event that the test result is positive. The problem is asking us to
compute $P(D|A)$. From the problem description, we have $P(A|D^c) = 0.02$ and $P(A^c|D) = 0.01$,
which also yields $P(A|D) = 1 - 0.01$ (complementary probability). Now, it is just an
application of Bayes' formula, i.e., Eq. (5.8.8)ᵇ:

$$P(D|A) = \frac{P(A|D)P(D)}{P(A|D)P(D) + P(A|D^c)P(D^c)} = \frac{(1 - 0.01)(0.0001)}{(1 - 0.01)(0.0001) + (0.02)(1 - 0.0001)}$$

which is about $0.0049$, indicating that there is less than a half percent chance that the person


has the disease. This might seem counter-intuitive because the test is quite accurate. The
point is that the disease is very rare. Thus, there are two competing forces here, and since the
rareness of the disease (1 out of 10 000) is stronger than the accuracy of the test (98 or 99
percent), there is still a good chance that the person does not have the disease.
ᵇ Herein, $B_1 = D$ and $B_2 = D^c$.
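The arithmetic above is easy to mistype, so here is a minimal Julia sketch (my own variable names) that evaluates Bayes' formula for this test and cross-checks it with a brute-force simulation:

using Random

p_disease           = 0.0001   # P(D)
p_pos_given_healthy = 0.02     # P(A | D^c), false positive rate
p_neg_given_sick    = 0.01     # P(A^c | D), false negative rate
p_pos_given_sick    = 1 - p_neg_given_sick

posterior = p_pos_given_sick * p_disease /
            (p_pos_given_sick * p_disease + p_pos_given_healthy * (1 - p_disease))
println(posterior)   # ≈ 0.0049

# Simulate many people and keep only those who test positive.
n = 10^7
sick     = rand(n) .< p_disease
positive = ifelse.(sick, rand(n) .< p_pos_given_sick, rand(n) .< p_pos_given_healthy)
println(sum(sick .& positive) / sum(positive))   # ≈ 0.005, matching Bayes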

Example 5.8
The Monty Hall problem is a probability puzzle, loosely based on the American television
game show Let’s Make a Deal and named after its original host, Monty Hall. The problem
was originally posed and solved in a letter by Steve Selvin to the American Statistician in
1975. In the problem, you are on a game show, being asked to choose between three doors.
A car is behind one door and two goats behind the other doors. You choose a door. The host,
Monty Hall, picks one of the other doors, which he knows has a goat behind it, and opens it,
showing you the goat. (You know, by the rules of the game, that Monty will always reveal
a goat.) Monty then asks whether you would like to switch your choice of door to the other
remaining door. Assuming you prefer having a car more than having a goat, do you choose to
switch or not to switch?
The problem became widely known when Marilyn vos Savant answered it in her "Ask Marilyn"
column in Parade magazine in 1990. Vos Savant's response was that the contestant should switch to the other door. Many
readers of vos Savant's column refused to believe switching is beneficial and rejected her
explanation. After the problem appeared in Parade, approximately 10 000 readers, including
nearly 1 000 with PhDs, wrote to the magazine, most of them calling vos Savant wrong. Even
when given explanations, simulations, and formal mathematical proofs, many people still did
not accept that switching is the best strategy. Paul Erdősᵃ remained unconvinced until he was
shown a computer simulation demonstrating vos Savant's predicted result.
ᵃ Paul Erdős (1913 – 1996) was a renowned Hungarian mathematician. He was one of the most prolific
mathematicians and producers of mathematical conjectures of the 20th century. He devoted his waking hours to
mathematics, even into his later years—indeed, his death came only hours after he solved a geometry problem
at a conference in Warsaw. Erdős published around 1 500 mathematical papers during his lifetime, a figure that
remains unsurpassed. He firmly believed mathematics to be a social activity, living an itinerant lifestyle with the
sole purpose of writing mathematical papers with other mathematicians.

First, we solve this problem using a computer simulation. The code is given in Listing 5.2. The result shows that the probability of winning when not switching is
$1/3$, which makes sense, and the probability of winning when switching is $2/3$, twice as high. The
code assumes, without loss of generality, that the car is behind door 1. Note that the host will
choose a door that the player did not select and that does not contain the car, and reveal it to us.
Another way to see the solution is to explicitly list out all the possible outcomes, and count
how often we get the car if we stay versus switch. Without loss of generality, suppose our
selection was door 1. Then the possible outcomes can be seen in Table 5.4. In two out of three
cases, we win the car by changing our selection after one of the doors is revealed.


Listing 5.2: Monte Carlo simulation of the Monty Hall problem. Source: monty_hall.jl

using Random
function monty_hall_one_trial(changed)
    # assume that door 1 has the car; changed=1: switching, changed=0: no switching
    # the contestant selects one door, which can be any of 1, 2, 3
    number_of_doors = 3
    chosen_num = rand(1:number_of_doors)
    # if the contestant decided to change
    if changed == 1
        if chosen_num == 1 revealed_num = rand(2:3) end  # host picks either goat door
        if chosen_num == 2 revealed_num = 3 end          # revealed_num: door opened by the host
        if chosen_num == 3 revealed_num = 2 end
        # switch to the remaining door
        avai_doors = setdiff(1:number_of_doors, (chosen_num, revealed_num))
        chosen_num = rand(avai_doors)
    end
    return chosen_num == 1   # true when the contestant wins the car
end
N = 10000 # Monte Carlo trials
prob_changed    = sum([monty_hall_one_trial(1) for _ in 1:N])/N # => ~2/3
prob_no_changed = sum([monty_hall_one_trial(0) for _ in 1:N])/N # => ~1/3

Table 5.4: The Monty Hall problem: listing all possible outcomes. We chose door 1.

Door 1   Door 2   Door 3   Stay at door 1   Switch to offered door
Car      Goat     Goat     WIN              LOSS
Goat     Car      Goat     LOSS             WIN
Goat     Goat     Car      LOSS             WIN

5.8.5 The odds form of Bayes' rule

The odds of an event $A$, denoted by $O(A)$, are defined by

$$O(A) := \frac{P(A)}{P(A^c)} = \frac{P(A)}{1 - P(A)} \tag{5.8.9}$$

That is, the odds of an event $A$ tell how much more likely it is that $A$ occurs than that it does
not occur. For instance, if $P(A) = 2/3$, then $P(A) = 2P(A^c)$, so the odds are 2. If the odds are
equal to $\alpha$, then it is common to say that the odds are $\alpha$ to 1, or $\alpha : 1$, in favor of the hypothesis.
Having defined the odds of an event, we now write Bayes' formula in odds form.
To this end, consider a hypothesis $H$ that is true with probability $P(H)$, and suppose that
new evidence $E$ is introduced (or equivalently, new data are introduced). Then the conditional
probabilities, given the evidence $E$, that $H$ is true and that $H$ is not true are respectively given


by (from Eq. (5.8.7))

$$P(H|E) = \frac{P(E|H)P(H)}{P(E)}, \qquad P(H^c|E) = \frac{P(E|H^c)P(H^c)}{P(E)} \tag{5.8.10}$$

Therefore, the new odds after the evidence $E$ has been introduced are obtained by taking the
ratio of $P(H|E)$ and $P(H^c|E)$:

$$\frac{P(H|E)}{P(H^c|E)} = \frac{P(H)}{P(H^c)} \frac{P(E|H)}{P(E|H^c)} \tag{5.8.11}$$

That is, the new value of the odds of $H$ is the old value multiplied by the ratio of the conditional
probability of the new evidence given that $H$ is true to the conditional probability given that $H$
is not true.

Example 5.9
Suppose there are two bowls of cookies. Bowl 1 contains 30 vanilla cookies and 10 chocolate
cookies. Bowl 2 contains 20 of each. Now suppose you choose one of the bowls at random
and, without looking, select a cookie at random. The cookie is vanilla. What is the probability
that it came from Bowl 1?

Solution. Let's denote by $H$ the event that the cookie comes from Bowl 1, and by $E$ the event
that the cookie is vanilla. We have $P(H) = P(H^c) = 1/2$ (without the information that
the chosen cookie was vanilla, the probability of it coming from either of the two bowls is
50%). We also have $P(E|H)$, the probability that the cookie is vanilla given that it comes
from Bowl 1, which is $30/40 = 3/4$. Similarly, we have $P(E|H^c) = 20/40 = 1/2$. Then,
using the odds form of Bayes' rule, we have

$$\frac{P(H|E)}{P(H^c|E)} = \frac{P(H)}{P(H^c)} \frac{P(E|H)}{P(E|H^c)} = \frac{1/2}{1/2} \cdot \frac{3/4}{1/2} = \frac{3}{2}$$

Therefore, $P(H|E)$ is $3/5$. Of course, we can also find this probability without using the odds
form of Bayes' rule: Eq. (5.8.8) gives us

$$P(H|E) = \frac{P(E|H)P(H)}{P(E|H)P(H) + P(E|H^c)P(H^c)} = \cdots = \frac{3}{5}$$

And this is not unexpected, as the two formulas are equivalent. The odds form is still useful, as
demonstrated in the next example, for cases where we do not know how to compute the prior
odds.
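The cookie example is also easy to simulate. The Julia sketch below (my own names) draws a bowl and a cookie many times and estimates $P(H|E)$ by counting Bowl 1 among the vanilla draws; the answer hovers around $3/5$:

using Random

function cookie_posterior(ntrials)
    bowl1_and_vanilla = 0
    vanilla = 0
    for _ in 1:ntrials
        bowl = rand(1:2)
        p_vanilla = bowl == 1 ? 30/40 : 20/40
        if rand() < p_vanilla
            vanilla += 1
            bowl1_and_vanilla += (bowl == 1)
        end
    end
    return bowl1_and_vanilla / vanilla   # estimate of P(Bowl 1 | vanilla)
end

println(cookie_posterior(10^6))   # ≈ 0.6 = 3/5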

Now, we introduce a new term–the Bayes factor–to express Eq. (5.8.11) in a simpler form, easier
to memorize. For a hypothesis $H$ and evidence (or data) $E$, the Bayes factor is the ratio of the
likelihoods:

$$\text{Bayes factor} := \frac{P(E|H)}{P(E|H^c)} \tag{5.8.12}$$


With this definition, Eq. (5.8.11) can be succinctly written as

$$\text{posterior odds} = \text{prior odds} \times \text{Bayes factor} \tag{5.8.13}$$

From this formula, we see that the Bayes factor (BF) tells us whether the data provide
evidence for or against the hypothesis:

• If BF > 1 then the posterior odds are greater than the prior odds. So the data provide
evidence for the hypothesis.

• If BF < 1 then the posterior odds are less than the prior odds. So the data provide
evidence against the hypothesis.

• If BF = 1 then the prior and posterior odds are equal. So the data provide no evidence
either way.

The two forms are summarized in Box 5.2.

Box 5.2: Summary of important formulae of conditional probability.

• conditional probability $P(A|B)$:
$$P(A|B) = \frac{P(AB)}{P(B)}$$

• the law of total probability:
$$P(A) = \sum_{i=1}^{n} P(A|B_i)P(B_i)$$

• for two events $A$ and $B$ with $P(A) \ne 0$, we have
$$P(B|A) = \frac{P(A|B)P(B)}{P(A)}$$

• if $B_1, B_2, \ldots, B_n$ form a partition of the sample space $S$, and $P(A) \ne 0$, then we have
$$P(B_j|A) = \frac{P(A|B_j)P(B_j)}{\sum_{i=1}^{n} P(A|B_i)P(B_i)}$$

• the odds form:
$$\frac{P(H|E)}{P(H^c|E)} = \frac{P(H)}{P(H^c)} \frac{P(E|H)}{P(E|H^c)} \tag{5.8.14}$$


Example 5.10
Here is another problem from MacKay's Information Theory, Inference, and Learning
Algorithms: Two people have left traces of their own blood at the scene of a crime. A suspect,
Oliver, is tested and found to have type 'O' blood. The blood groups of the two traces are
found to be of type 'O' (a common type in the local population, having frequency 60%) and
of type 'AB' (a rare type, with frequency 1%). Do these data [blood of type 'O' and 'AB'
found at the scene] give evidence in favor of the proposition that Oliver was one of the people
who left blood at the scene?

Solution. Let's call $H$ the hypothesis (or proposition) that Oliver was one of the people who
left blood at the scene. And let $E$ be the evidence that blood of type 'O' and of type 'AB' was
found at the scene. The only formula we have is the odds form of Bayes' rule:

$$\frac{P(H|E)}{P(H^c|E)} = \frac{P(H)}{P(H^c)} \frac{P(E|H)}{P(E|H^c)}$$

It is obvious that we cannot compute $P(H)/P(H^c)$. In fact, we do not need it, because the
question is not about the actual probability that Oliver was one of the people who left blood
at the scene! If we can compute the Bayes factor, then based on whether it is larger than or
smaller than one, we can draw a conclusion. What is $P(E|H)$? When $H$ happens, Oliver
left his blood of type 'O' at the scene, so the other person has to have type 'AB' blood, with
probability 0.01. Thus, $P(E|H) = 0.01$. For $P(E|H^c)$, we then have two random people
at the scene, and we want the probability that they have type 'O' and 'AB' blood. Thus,
$P(E|H^c) = 0.6 \times 0.01 \times 2$ᵇ.
So, the Bayes factor is:

$$\frac{P(E|H)}{P(E|H^c)} = \frac{0.01}{0.6 \times 0.01 \times 2} \approx 0.83$$

Since the Bayes factor is smaller than 1, the evidence does not support the proposition that
Oliver was one of the people who left blood at the scene.
Another suspect, Alberto, is found to have type 'AB' blood. Do the same data give evidence
in favor of the proposition that Alberto was one of the two people at the scene?

$$\frac{P(E|H)}{P(E|H^c)} = \frac{0.6}{0.6 \times 0.01 \times 2} = 50$$

Since the Bayes factor is a lot larger than 1, the data provide strong evidence in favor of
Alberto being at the crime scene.
ᵇ Note that we have assumed that the blood types of the two people are independent (so that we can just multiply
the probabilities). And why the factor 2?


History note 5.1: Thomas Bayes (1701-1761)


Thomas Bayes was an English statistician, philosopher and Presbyterian
minister who is known for formulating a specific case of the theorem
that bears his name: Bayes’ theorem. Bayes never published what would
become his most famous accomplishment; his notes were edited and
published posthumously by Richard Price as Essay towards solving a
problem in the doctrine of chances published in the Philosophical Transactions of the
Royal Society of London in 1764.

5.8.6 Independent events


It is obvious that usually P .AjB/ is different from P .A/, but when they are equal i.e., the
probability of event A is not changed by the occurrence of event B, we say that event A is
independent of event B. Using the conditional probability definition, Eq. (5.8.1), we can show
that P .AjB/ D P .A/ leads to P .AB/ D P .A/P .B/:
P .AB/
P .AjB/ WD D P .A/ H) P .AB/ D P .A/P .B/
P .B/
Note that this equation is symmetric with respect to A and B, thus if A is independent of B, B
is also independent of A. Thus, we have this definition of two independent events A; B:
A and B are independent when P .AB/ D P .A/P .B/ (5.8.15)
What this formula says is that for two independent events A and B, the chance that both of them
happen at the same time is equal to the product of the chance that A happens and the chance
that B happens. And this is the multiplication rule of probability that Cardano discovered, check
Section 5.2.

Example 5.11
Suppose that we toss 2 fair six-sided dice. Let $E_1$ denote the event that the sum of the dice
is 6, $E_2$ the event that the sum of the dice equals 7, and $F$ the event that the
first die equals 4. The questions are: are $E_1$ and $F$ independent, and are $E_2$ and $F$ independent?

Solution. We just need to check whether the definition of independence of two events, i.e.,
$P(AB) = P(A)P(B)$, holds. We have
$$P(E_1)P(F) = \frac{5}{36} \cdot \frac{6}{36} = \frac{5}{216}$$
and
$$P(E_1 F) = P(\{(4, 2)\}) = \frac{1}{36}$$
Thus, $P(E_1 F) \ne P(E_1)P(F)$: the two events $E_1$ and $F$ are not independent; we call them
dependent events.


In the same manner, we compute

$$P(E_2)P(F) = \frac{6}{36} \cdot \frac{6}{36} = \frac{1}{36}$$
and
$$P(E_2 F) = P(\{(4, 3)\}) = \frac{1}{36}$$
Thus, $P(E_2 F) = P(E_2)P(F)$: the two events $E_2$ and $F$ are independent. Shall we move
on to other problems? No: we had to compute many probabilities to get the answers. Can we
just use intuitive reasoning? Let's try. To get a sum of six (event $E_1$), the first die must be one
of $\{1, 2, 3, 4, 5\}$; the first die cannot be six. Thus, $E_1$ depends on the outcome of the
first die. On the other hand, to get a sum of seven (event $E_2$), the first die can be any
of $\{1, 2, 3, 4, 5, 6\}$; all the possible outcomes of a die. Therefore, $E_2$ does not depend on the
outcome of the first die.
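Both answers can be confirmed by enumerating all 36 equally likely outcomes. The Julia sketch below (my own names) uses exact rational arithmetic so the independence checks are not affected by floating-point round-off:

# Check independence of the dice events in Example 5.11 by enumerating
# all 36 equally likely outcomes of two dice.
outcomes = [(d1, d2) for d1 in 1:6, d2 in 1:6]
prob(event) = count(event, outcomes) // length(outcomes)   # exact Rational

E1(ω) = sum(ω) == 6       # sum is 6
E2(ω) = sum(ω) == 7       # sum is 7
F(ω)  = ω[1] == 4         # first die is 4

println(prob(ω -> E1(ω) && F(ω)) == prob(E1) * prob(F))   # false: E1 and F are dependent
println(prob(ω -> E2(ω) && F(ω)) == prob(E2) * prob(F))   # true:  E2 and F are independent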

Independent events vs disjoint events. Are disjoint events independent or not? If $A$ and
$B$ are two disjoint events, then $AB = \emptyset$, thus $P(AB) = 0$, whereas $P(A)P(B) \ne 0$ as long as both events have nonzero probability. So,
$P(AB) \ne P(A)P(B)$: two disjoint events (with nonzero probabilities) are dependent.

Some rules of independent events. Given that $A$ and $B$ are two independent events, what can
we say about their complements or unions? Regarding the complementary events, we have this
result: if $A$ and $B$ are independent, then

• $A$ and $B^c$ are independent;

• $A^c$ and $B$ are independent;

• $A^c$ and $B^c$ are independent.

Thus, if $A$ and $B$ are independent events, then the probability of $A$'s occurrence is unchanged
by information as to whether or not $B$ has happened.
Now we are going to generalize the definition of independence of two events to more than
two events. Let’s start simple with three events, and with one concrete example. It motivates our
definition of the independence of three events.

Example 5.12
Two fair 6-sided dice are rolled, one red and one blue. Let A be the event that the red die’s
result is 3. Let B be the event that the blue die’s result is 4. Let C be the event that the sum of
the rolls is 7. Are A; B; C mutually independent?

Solution. It's clear that $A$ and $B$ are independent. From Example 5.11, we also know that $A, C$
are independent and that $B, C$ are independent. We now check whether $P(ABC) =
P(A)P(B)P(C)$; this is our guess, based on generalizing the case of two independent


events. First,
$$P(A)P(B)P(C) = \frac{1}{6} \cdot \frac{1}{6} \cdot \frac{6}{36} = \frac{1}{216}$$
Second, using Eq. (5.8.4) we can compute $P(ABC)$:
$$P(ABC) = P(A)P(B|A)P(C|AB) = \frac{1}{6} \cdot \frac{1}{6} \cdot 1 = \frac{1}{36}$$
Since $P(ABC) = 1/36 \ne 1/216 = P(A)P(B)P(C)$, the three events are not mutually independent, even
though they are pairwise independent. Pairwise independence alone is therefore not enough.

Three events $A$, $B$, and $C$ are independent if all of the following conditions hold:

$$P(AB) = P(A)P(B)$$
$$P(BC) = P(B)P(C)$$
$$P(CA) = P(C)P(A) \tag{5.8.16}$$
$$P(ABC) = P(A)P(B)P(C)$$

And going from three to $n$ events is a breeze, even when $n$ is infinite.

5.8.7 The gambler’s ruin problem


Two gamblers, A and B, are betting on the tosses of a fair coin. At the beginning of the game,
player A has 1 coin and player B has 3 coins. So there are 4 coins between them. In each play
of the game, a fair coin is tossed. If the result of the coin toss is head, player A collects 1 coin
from player B. If the result of the coin toss is tail, player A pays player B 1 coin. The game
continues until one of the players has all the coins (or one of the players loses all his/her coins).
What is the probability that player A ends up with all the coins?
This is a simple version of the classic gambler’s ruin problem: two players begin with fixed
stakes, transferring points until one or the other is "ruined" by getting to zero points. The earliest
known mention of the gambler’s ruin problem is a letter from Blaise Pascal to Pierre Fermat in
1656 (two years after the more famous correspondence on the problem of points).
A solution to the general version of the gambler's ruin problem is presented now. Let's
denote by $E$ the event that player A ends up with all the coins when he starts with $i$ coins,
$i = 0, 1, 2, \ldots, N$, where $N$ is the total number of coins of both players. And to show the dependence on $i$,
we use the notation $P_i = P(E)$. Now, we compute $P(E)$ by conditioning on whether the
first toss lands on head or tail. Using the law of total probability we can write:

$$P(E) = P(H)P(E|H) + P(H^c)P(E|H^c)$$

where $P(H) = p$ is the probability that the coin lands on head and $P(H^c) = q$ is the probability
that the coin lands on tail. What are $P(E|H)$ and $P(E|H^c)$? As the first toss lands on head
(i.e., $H$), A has $i + 1$ coins. Since successive flips are assumed to be independent, we just have
the same game in which A starts with $i + 1$ coins. Therefore, $P(E|H) = P_{i+1}$. Similarly, if
the first toss shows tail, $P(E|H^c) = P_{i-1}$. With that, we can write

$$P_i = pP_{i+1} + qP_{i-1} \tag{5.8.17}$$


Now, using the fact that $p + q = 1$, we have $P_i = (p + q)P_i = pP_i + qP_i$. Replacing $P_i$ in the
above equation with this, we obtain
$$pP_i + qP_i = pP_{i+1} + qP_{i-1} \Longrightarrow P_{i+1} - P_i = \frac{q}{p}(P_i - P_{i-1}), \quad i = 1, 2, 3, \ldots, N-1$$
Now, we will explicitly write out this equation for $i = 1, 2, 3, \ldots$, with the so-called boundary
condition $P_0 = 0$, i.e., we assume that if player A starts with zero coins, he has lost:
$$i = 1: \quad P_2 - P_1 = \frac{q}{p}(P_1 - P_0) = \frac{q}{p} P_1$$
$$i = 2: \quad P_3 - P_2 = \frac{q}{p}(P_2 - P_1) = \left(\frac{q}{p}\right)^2 P_1 \quad \text{(use the result from the row above)}$$
$$i = 3: \quad P_4 - P_3 = \frac{q}{p}(P_3 - P_2) = \left(\frac{q}{p}\right)^3 P_1$$
$$\vdots$$
$$i = k - 1: \quad P_k - P_{k-1} = \frac{q}{p}(P_{k-1} - P_{k-2}) = \left(\frac{q}{p}\right)^{k-1} P_1$$
What do we do next? We sum up all the above equations, because we see a telescoping sum
$(P_2 - P_1) + (P_3 - P_2) + \cdots$; for the first three rows, only $P_4$ and $P_1$ are left without being
canceled out. For $k = 1, 2, 3, \ldots, N$‡:
$$P_k = P_1 \left( 1 + \frac{q}{p} + \left(\frac{q}{p}\right)^2 + \left(\frac{q}{p}\right)^3 + \cdots + \left(\frac{q}{p}\right)^{k-1} \right)$$

And what is the sum on the RHS? It is a geometric series; recall from Eq. (2.21.5) that
$$a + ar + ar^2 + ar^3 + \cdots + ar^{n-1} = \frac{a}{1-r}(1 - r^n)$$
Thus, we have a geometric series with $a = 1$, $r = q/p$; thus for $i = 1, 2, 3, \ldots, N$ (I switched
back to $i$ instead of $k$)
$$P_i = \begin{cases} i P_1, & \text{if } q/p = 1 \\ P_1 \dfrac{1 - (q/p)^i}{1 - q/p}, & \text{if } q/p \ne 1 \end{cases}$$
All is good, but we still do not know $P_1$. Now, we use another boundary condition, $P_N = 1$,
and then we're able to determine $P_1$ and then $P_i$. Plugging $i = N$ into the above equation, we can
determine $P_1$§:
$$P_1 = \begin{cases} \dfrac{1}{N}, & \text{if } p = 1/2 \\ \dfrac{1 - q/p}{1 - (q/p)^N}, & \text{if } p \ne 1/2 \end{cases}$$
‡ I have moved $P_1$ to the RHS.
§ Note that $q/p = 1$ is equivalent to $p = 1/2$.


And with that, $P_i$ is given by

$$P_i = \begin{cases} \dfrac{i}{N}, & \text{if } p = 1/2 \\ \dfrac{1 - (q/p)^i}{1 - (q/p)^N}, & \text{if } p \ne 1/2 \end{cases} \tag{5.8.18}$$

What are the possible outcomes of this gambler's ruin game? The first outcome is that player A wins, the
second is that player B wins. Is that all? Is it possible that the game never ends? To check that, we
need to compute the probability that player B wins when A starts with $i$ coins; this probability
is denoted by $Q_i$. If $P_i + Q_i = 1$, then the game will definitely end with either A winning
or B winning.
By symmetry, we can get the formula for $Q_i$ from $P_i$ by replacing $i$ with $N - i$, which is
the number of coins that player B starts with, and $p$ with $q$:

$$Q_i = \begin{cases} \dfrac{N - i}{N}, & \text{if } p = 1/2 \\ \dfrac{1 - (p/q)^{N-i}}{1 - (p/q)^N}, & \text{if } p \ne 1/2 \end{cases}$$

The sum $P_i + Q_i$ is 1 for $p = 1/2$. For $p \ne 1/2$, we have

$$P_i + Q_i = \frac{1 - (q/p)^i}{1 - (q/p)^N} + \frac{1 - (p/q)^{N-i}}{1 - (p/q)^N} = \cdots = 1$$

Some details were skipped for the sake of brevity. Thus, the game will end with either A winning or
B winning. Let's pause a bit and see what we have seen: a telescoping sum and a
geometric series in a game of coin tossing! Isn't mathematics cool?
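A minimal Julia sketch (my own function names) evaluates Eq. (5.8.18) and checks it against a direct simulation of the game; the printed case is the opening example (A starts with 1 of 4 coins, fair coin):

using Random

# Probability that player A, starting with i of the N coins, ends with all of them,
# for a head probability p (Eq. (5.8.18)).
function ruin_prob(i, N, p)
    p == 0.5 && return i / N
    r = (1 - p) / p               # q/p
    return (1 - r^i) / (1 - r^N)
end

# Monte Carlo check: play the game many times and count A's wins.
function ruin_mc(i, N, p, ntrials)
    wins = 0
    for _ in 1:ntrials
        coins = i
        while 0 < coins < N
            coins += rand() < p ? 1 : -1
        end
        wins += (coins == N)
    end
    return wins / ntrials
end

println(ruin_prob(1, 4, 0.5), "  ", ruin_mc(1, 4, 0.5, 10^5))   # 0.25 vs ≈ 0.25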

Solution using difference equations. Eq. (5.8.17) is a (linear) difference equation (or recur-
rence equation), which involves the differences between successive values of a function of a
discrete variable. In that equation we have the values $P_i$, $P_{i+1}$ and $P_{i-1}$, all
values of a function of $i$–a discrete variable. (A discrete variable is a variable whose values
can only be integers.) Note that a difference equation is the discrete analog of a differential
equation, discussed in Chapter 9.
To solve Eq. (5.8.17), we rewrite it as follows:

$$pP_{i+1} - P_i + qP_{i-1} = 0 \tag{5.8.19}$$

Now, we guess that the solution of this equation is of the form $Ar^i$‡:

$$P_i = Ar^i \Longrightarrow P_{i+1} = Ar^{i+1}, \quad P_{i-1} = Ar^{i-1}$$

‡ Why this form? If we start with the simpler equation $P_i = qP_{i-1}$, then we have
$$P_i = q^2 P_{i-2} = q^3 P_{i-3} = \cdots = q^i P_0$$
Thus, the solution is of exponential form.


Substituting these into Eq. (5.8.19) results in

$$Ar^{i-1}(pr^2 - r + q) = 0 \Longrightarrow pr^2 - r + q = 0 \tag{5.8.20}$$

This is a quadratic equation; thus, for the case $p \ne 1/2$, it has two roots (note that $q = 1 - p$):
$$r_1 = 1, \qquad r_2 = \frac{q}{p}$$
Thus, $A_1 r_1^i + A_2 r_2^i$ is the general solution to Eq. (5.8.19), so we can write
$$P_i = A_1 + A_2 \left(\frac{q}{p}\right)^i \tag{5.8.21}$$
Now, we determine $A_1$ and $A_2$ using the two boundary conditions $P_0 = 0$ and $P_N = 1$:
$$A_1 + A_2 = 0, \qquad A_1 + A_2 \left(\frac{q}{p}\right)^N = 1$$
Solving these two equations, we obtain $A_1$ and $A_2$:
$$A_1 = \frac{1}{1 - (q/p)^N}, \qquad A_2 = -A_1$$
Substituting $A_1$ and $A_2$ into Eq. (5.8.21) gives the final solution:
$$P_i = A_1 \left( 1 - \left(\frac{q}{p}\right)^i \right) = \frac{1 - (q/p)^i}{1 - (q/p)^N}$$
Let's see what the odds are of playing in a casino. Assume that $N$ = 10 000 units. Using
Eq. (5.8.18), the odds are calculated for different initial wealth. The results, shown in Table 5.5,
are all bad news. As we cannot have more money than the casino, we look at the top half of the
table, and the odds are all zero (do not look at the column with $p = 0.5$; that's just for reference).
One way to improve our odds is to be bold: instead of betting 1 dollar, bet 10 dollars, for
example.
If $N = 100$ dollars and player A starts with 10 dollars, what is his chance if he bets 10
dollars per game? Think of 1 coin as 10 dollars; then we can just use Eq. (5.8.18) with $i = 1$ and
$N = 10$: $P_1 = \dfrac{1 - (q/p)^1}{1 - (q/p)^{10}}$.

5.9 The secretary problem or dating mathematically


The statement of the secretary problem goes as follows
You are the HR manager of a company and need to hire the best secretary out of
a given number N of candidates. You can interview them one by one, in random
order. However, the decision of appointing or rejecting a particular applicant must
be taken immediately after the interview. If nobody has been accepted before the
end, the last candidate is chosen. What strategy do you use to maximize the chances
to hire the best applicant?


Table 5.5: Probabilities of player A breaking the bank, with total initial wealth N = 10 000 units.

A's initial wealth i   Fair game (p = 0.5)   Craps (p = 0.493)   Roulette (p = 0.474)
100                    0.0100                0.0000              0.0000
500                    0.0500                0.0000              0.0000
1000                   0.1000                0.0000              0.0000
5000                   0.5000                0.0000              0.0000
6000                   0.6000                0.0000              0.0000
9000                   0.9000                0.0000              0.0000
9950                   0.9950                0.2466              0.0055
9990                   0.9990                0.7558              0.3531

The first thing we need to do is to translate the problem into mathematics. Let's assign a
counting number to each candidate. Thus, four candidates John, Sydney, Peter and Laura would
be translated to a list of integers, say $(1, 7, 3, 9)$; an integer can be thought of as the score of a
candidate. In general we denote this list by $(a_1, a_2, \ldots, a_N)$. The problem now is to find the
maximum of this list, denoted by $a_{\max}$.
If you're thinking "this is easy, I pick Laura as the best applicant, for 9 is the maximum of
$(1, 7, 3, 9)$": no, you cannot do this, for one simple reason: you cannot look ahead. Think of
dating: you cannot know in advance whom you will date in the future! Thus, at the time the HR
manager is interviewing Peter (3) she does not know that there is a better candidate waiting for
her. Note that she has to make a decision (rejecting or accepting) immediately after the interview.
That is the rule of this problem. It might not be realistic, but mathematicians do not care.
OK, so I pick the last applicant! But the probability of getting the best is only $1/N$; if $N$ is large
then that probability is slim. So, we cannot rely on luck; we need some strategy here. Again
think of dating: what is the strategy there? The strategy most adults adopt – insofar as they
consciously adopt a strategy – is to date around for a while, gain some experience, figure out
one's options, and then choose the next best thing that comes around.
We adopt that strategy here. Thus, we scan the first $r$ candidates, record the maximum score,
denoted by $a^*$ (i.e., $a^* = \max a_i$, $1 \le i \le r$), and then select the first candidate whose score
is larger than $a^*$ (Fig. 5.5). Now, we're going to compute the probability of success if we do this.
Obviously, that probability depends on $r$; if we have that probability, labeled $P(r)$, then we
can find the $r$ that maximizes $P(r)$; such an $r$ is called the optimal $r$. Assume that the optimal $r$ is
five; then the optimal strategy (for dating) is: date 5 people, discard all of them, and marry the
next person who is better than the best among your five old lovers.


Figure 5.5: Secretary problem: scan the first $r$ candidates, record the maximum score (i.e., $a^* =
\max a_i$, $1 \le i \le r$), and select the first candidate whose score is larger than $a^*$.

rejected r candidates). The first candidate with a score higher than a is the best candidate (i.e.,
an D amax ) only happens when the second best is in r candidates. Therefore, P .r/ is
P 
P .r/ D N nDrC1 P .1st > a and amax /
P
D N nDrC1 P .nth is the best and the second best is in r candidates/
PN
D nDrC1 P .nth is the best/  P .the second best is in r candidates out of n 1/
P PN PN 1 1
D N 1 r
nDrC1 N n 1 D N
r 1
nDrC1 n 1 D r
N nDr n (5.9.1)

The question now is: what should the value of $r$ be so that $P(r)$
is maximum? To answer that question we need to know more
about $P(r)$, and there is nothing better than a picture. So, choose
$N = 10$ and for $r = 1, 2, \ldots, 10$ compute the ten values of
$P(r)$ using Eq. (5.9.1) and plot them against $r$ (the small figure in the margin shows this plot).
This plot tells us that there is indeed a value of $r$ such
that $P(r)$ is maximum, and there is only one such $r$. Because of
that, the next $r + 1$ has a lower probability. So, we just need to find $r$ such that $P(r + 1) \le P(r)$:

$$P(r+1) \le P(r) \iff \frac{r+1}{N} \sum_{n=r+1}^{N-1} \frac{1}{n} \le \frac{r}{N} \sum_{n=r}^{N-1} \frac{1}{n} \iff \sum_{n=r+1}^{N-1} \frac{1}{n} \le 1$$

Recognizing that the red sum is related to harmonic numbers†, we rewrite the above sum as

$$\sum_{n=r+1}^{N-1} \frac{1}{n} = \sum_{n=1}^{N-1} \frac{1}{n} - \sum_{n=1}^{r} \frac{1}{n} \approx \ln(N-1) - \ln r = \ln \frac{N-1}{r}$$

where the first sum in the middle expression is the $(N-1)$th harmonic number $H_{N-1}$, the second sum
is the $r$th harmonic number, and we have used the approximation $H_n \approx \ln n + \gamma + O(1/n)$, where $\gamma$
is the Euler-Mascheroni constant defined in Eq. (4.15.25). When $N$ is very large, $N - 1 \approx N$,
and thus we need to find $r$ such that

$$\ln \frac{N}{r} \le 1 \iff \frac{N}{r} \le e \Longrightarrow r \ge \frac{N}{e} \approx 0.37N, \qquad (e = 2.718281\ldots)$$

What this formula tells us is that we should discard the first 37% of the total number of candidates, then
select the next person who comes along who is better than all of those discarded.
† If needed, check Section 4.15.7 for a refresher on harmonic numbers.
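To make the 37% rule concrete, the short Julia sketch below (my own names) evaluates Eq. (5.9.1) and locates the optimal r, both for the N = 10 case discussed above and for N = 100:

# P(r): probability of hiring the best candidate when the first r candidates are
# only observed and the next one better than all of them is selected (Eq. (5.9.1)).
P(r, N) = (r / N) * sum((1/n for n in r:N-1); init = 0.0)

best_r(N) = argmax([P(r, N) for r in 1:N])   # the optimal number of rejected candidates

println([round(P(r, 10), digits = 3) for r in 1:10])   # the ten values for N = 10
println(best_r(10))                        # 3
println(best_r(100), "  ", 100 / exp(1))   # 37 vs N/e ≈ 36.8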

Kepler and the marriage problem. In 1611, after losing his first wife, Barbara, to cholera, the
great astronomer and mathematician Johannes Kepler wanted to re-marry. His first marriage was
an arranged one and not so happy so he decided to find a suitable second wife with care. Now
we know how Kepler went about the selection process because he documented it in great detail
to Baron Strahlendorf on October 23, 1613. In his process, Kepler had considered 11 different
matches over two years. The fourth woman was nice to look at — of "tall stature and athletic
build", but Kepler wanted to check out the next one, who, he’d been told, was "modest, thrifty,
diligent and [said] to love her stepchildren," so he hesitated. He hesitated so long, that both No.
4 and No. 5 got impatient and took themselves out of the running, leaving him with No. 6, who
scared him. He eventually returned to the fifth match, 24-year-old Susanna Reuttinger, who, he
wrote, "won me over with love, humble loyalty, economy of household, diligence, and the love
she gave the stepchildren." On 30 October 1613, Kepler married Reuttinger, who was a wonderful
wife and both she and Kepler were very happy.

5.10 Discrete probability models


Consider a sample space $S$. If $S$ is a countable set, we have a discrete probability model. As $S$ is countable, we can list all its elements:

$$S = \{s_1, s_2, s_3, \ldots\}$$

Now if $A$ is an event, we have $A \subset S$, so $A$ is also countable. By the third axiom (of probability), we have

$$P(A) = P\Big(\bigcup_{s_j \in A} \{s_j\}\Big) = \sum_{s_j \in A} P(s_j) \qquad (5.10.1)$$

Thus in a countable sample space, to find the probability of an event, all we need to do is to sum
the probability of individual elements in that set. How can we find the probability of individual
elements then? We answer this question next.

Finite sample spaces with equally likely outcomes. An important special case of discrete probability models is when we have a finite sample space $S$, where each outcome is equally likely to occur, i.e.,

$$S = \{s_1, s_2, s_3, \ldots, s_N\}, \quad \text{where } P(s_i) = P(s_j) \text{ for all } i, j \in \{1, 2, \ldots, N\} \qquad (5.10.2)$$


Examples are tossing a fair coin or rolling a fair die.

From the second axiom we have $P(S) = 1$, and by denoting $P = P(s_1) = P(s_2) = \cdots = P(s_N)$, we have

$$1 = P(S) = \sum_{i=1}^{N} P(s_i) = NP$$

Therefore,

$$P(s_i) = \frac{1}{N} \quad \text{for all } i \in \{1, 2, \ldots, N\}$$

Next, we're going to calculate $P(A)$ for an event $A$ with $|A| = M$; we write

$$P(A) = P\Big(\bigcup_{s_j \in A} \{s_j\}\Big) = \sum_{s_j \in A} P(s_j) = \frac{M}{N} = \frac{|A|}{|S|} \qquad (5.10.3)$$

Thus, finding the probability of $A$ reduces to a counting problem in which we need to count how many elements are in $A$ and in $S$. We get the result that Cardano had discoveredŽŽ. And do we know how to count things...efficiently? Yes, we do (Section 2.26). If your understanding of factorials, permutations and combinations is not solid (yet), you should study them again before continuing with probability.

The birthday problem deals with the probability that in a set of n randomly selected people, at least two people share the same birthday. This problem is often referred to as the birthday paradox because the probability is counter-intuitively high: with only 23 people, the probability is 50% that at least two people share the same birthday, and with 50 people that chance is about 90%. The first publication of a version of the birthday problem was by Richard von Mises|| in 1939.
Equipped with probability theory, we're going to solve this problem. But we need a few assumptions. First, we disregard leap years, which simplifies the math and doesn't change the results by much. We also assume that all birthdays have an equal probability of occurring*. Because leap years are not considered, there are only 365 possible birthdays. And we use the formula $P(A^c) = 1 - P(A)$. That is, instead of working directly, we approach the problem indirectly by asking what is the probability that no two people share the same birthday. This is because doing so is much easier (note that in the direct problem, handling "at least" two people is not easy as there are too many possibilities).
The sample space is f1; 2; : : : ; 365gn , which has a cardinality of 365n . For the first person of
n people, there are 365 choices, for the second person, there are only 364 choices (to not have
ŽŽ
Note that Cardano could not prove this formula, and we could, starting from Kolmogorov’s three axioms.
||
Richard Edler von Mises (1883 – 1953) was an Austrian scientist and mathematician who worked on solid
mechanics, fluid mechanics, aerodynamics, aeronautics, statistics and probability theory. In solid mechanics, von
Mises made an important contribution to the theory of plasticity by formulating what has become known as the von
Mises yield criterion. If you want to become a civil/mechanical/aerospace engineer, you will encounter his name.

* The second assumption is not true. But for a first attack on this problem, do not bother too much.


the same birthday as the first person), and the third person 363 choices (to not share a birthday with the first two persons). And for the $n$th person, there are $365 - n + 1$ choices. Thus the probability that no two people share the same birthday is

$$\frac{(365)(364)\cdots(365-n+1)}{365^n}$$

Therefore, the probability we're looking for is:

$$P(n) = 1 - \frac{(365)(364)\cdots(365-n+1)}{365^n} \qquad (5.10.4)$$
Now, we compute $P(n)$ for different values of $n$ from 1 to 50. And we also carry out virtual experiments (see Appendix A.6). The idea is to check the exact solution against the solution of empirical probability. With $10^5$ trials, the empirical solutions match well with the analytical solutions. [Figure: exact $P(n)$ and empirical estimates for $n$ up to 50.] And we have $P(23) \approx 0.5$; thus with just 23 people in a room, there is a 50% chance that at least two of them have the same birthday. With about 50 people that chance is increased to about 90%.
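As a quick check of Eq. (5.10.4) and of the virtual-experiment idea, here is a small Julia sketch (mine, separate from the code in Appendix A.6; the trial count and function names are my own choices):

# Exact probability from Eq. (5.10.4) that at least two of n people share a birthday.
P_exact(n) = 1 - prod((365 - k) / 365 for k in 0:n-1)

# Empirical estimate: draw n birthdays uniformly from 1:365 and check for a repeat.
function P_empirical(n; trials = 10^5)
    hits = 0
    for _ in 1:trials
        bdays = rand(1:365, n)
        hits += length(unique(bdays)) < n     # true if some birthday repeats
    end
    return hits / trials
end

for n in (10, 23, 50)
    println("n = $n: exact = $(round(P_exact(n), digits=4)), ",
            "empirical = $(round(P_empirical(n), digits=4))")
end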

The main reason that this problem is called a paradox is that if you are in a group of 23 and you compare your birthday with the others', you think you're making only 22 comparisons. This means that there are only 22 chances of sharing a birthday with someone. However, we don't make only 22 comparisons. That number is much larger, and it is the reason that we perceive this problem as a paradox. Indeed, the comparisons of birthdays will be made between every possible pair of individuals. With 23 individuals, there are $\binom{23}{2} = (23 \times 22)/2 = 253$ pairs to consider, which is well over half the number of days in a year (182.5 or 183).
Invert, Always Invert. (Carl Gustav Jacob Jacobi)
Carl Gustav Jacob Jacobi, a 19th century mathematician, used this phrase to describe how he thought many problems in math could be solved by looking at the inverse.
Now, we consider the inverse problem of the birthday problem: how many people (i.e., $n = ?$) are needed so that at least two people will share a birthday with a probability of 0.5? It seems easy: we just need to solve the following equation for $n$:

$$1 - \frac{(365)(364)\cdots(365-n+1)}{365^n} = 0.5$$

Hmm. How to solve this equation? It is interesting to realize that a bit of massaging of $P(n)$ will be helpful. We rewrite $P(n)$ as follows, using $365^n = 365 \times 365 \times \cdots \times 365$ and pairing each 365 with one number in the numerator:

$$
\begin{aligned}
P(n) &= 1 - \left(\frac{365}{365}\right)\left(\frac{364}{365}\right)\cdots\left(\frac{365-n+1}{365}\right)\\
     &= 1 - \left(\frac{365}{365}\right)\left(1 - \frac{1}{365}\right)\left(1 - \frac{2}{365}\right)\cdots\left(1 - \frac{n-1}{365}\right)
\end{aligned}
\qquad (5.10.5)
$$


Now comes the art of approximation; recall that for small $x$ close to zero, we haveŽŽ

$$e^x \approx 1 + x \Longrightarrow e^{-x} \approx 1 - x$$

(Note that Eq. (5.10.5) has terms of the form $1 - x$.) Thus, Eq. (5.10.5) becomes

$$
\begin{aligned}
P(n) &\approx 1 - e^{-1/365}\, e^{-2/365} \cdots e^{-(n-1)/365}\\
     &= 1 - \exp\left(-\frac{1 + 2 + \cdots + (n-1)}{365}\right)\\
     &= 1 - \exp\left(-\frac{n(n-1)}{2\times 365}\right) \approx 1 - \exp\left(-\frac{n^2}{2\times 365}\right)
\end{aligned}
\qquad (5.10.6)
$$

where use was made of the formula for the sum of the first counting numbers (Section 2.6.1).
With this approximation, it is easy to find the $n$ such that $P(n) = 0.5$:

$$1 - \exp\left(-\frac{n^2}{2\times 365}\right) = 0.5 \Longrightarrow \frac{n^2}{2\times 365} = \ln 2 \Longrightarrow n = \sqrt{730\ln 2} = 22.494$$

And from that we get $n = 23$.

5.10.1 Discrete random variables


Frequently, when an experiment is performed, we are interested mainly in some function of the
outcome as opposed to the actual outcome itself. For example, in rolling two dice, we are often
interested in the sum of the two dice and are not really concerned about the separate values of
each die. That is, we may be interested in knowing that the sum is 4 and may not be concerned
over whether the actual outcome was $(1,3)$, $(2,2)$ or $(3,1)$. These quantities of interest are
known as random variables.
Usually the notation X is used to denote a random variable. And the notation x is used to
denote a value of X .
Back to the dice rolling: if we roll two six-sided dice, the sample space is shown in Fig. 5.6. This space is the Cartesian product of $\{1,2,3,4,5,6\}$ and $\{1,2,3,4,5,6\}$. Now, we can define many random variables. For instance, let's define $X$ as the total number of points on the two dice; a few values of $X$ are 2, 3, 12. The event $A$ that $X = 2$ is the one with both dice showing one on their faces, i.e., $(1,1)$. The event $B$ that $X = 3$ is the one with $(1,2)$ or $(2,1)$. Precisely, such an $X$ is called a discrete random variable because its possible values are countable.
Because the value of a random variable X is determined by the outcome of an experiment,
we want to assign probabilities to the possible values of the random variable. This is achieved
with defining a probability mass function discussed shortly in Section 5.10.2.
Mathematically, a random variable $X$ is a real-valued function that maps an outcome $s \in S$ to a real number. See Fig. 5.7 for a graphical illustration. Because of this, it is a bit confusing when we
ŽŽ
If this is not clear check Taylor series in Section 4.15.8. It is hard to live without calculus!

Now we understand why mathematicians need two notations for the exponential function: e x and exp.x/, the
latter is for lengthy inputs.


Figure 5.6: Sample space ˝ of rolling two six-sided dice.

Figure 5.7: A random variable is a real-valued function from the sample space S to R.

use the term variable. However, as we will define other functions depending on X , it is called a
variable in that sense.
There also exist so-called continuous random variables. For example, the height of a randomly selected person from a population is a continuous random variable. A continuous random variable is so called because we cannot list its values as we do for a discrete random variable. Still remember Hilbert's hotel with infinite rooms and Georg Cantor? This section is confined to a discussion of discrete random variables only.

5.10.2 Probability mass function


Let’s consider a discrete random variable (RV) X of which the rangeŽ is given by
RX D fx1 ; x2 ; x3 ; : : :g
where x1 ; x2 ; : : : are possible values of the random variable X. If we know the probability that
X gets a value xk for all xk in RX we will know its probability distribution. The probability of
the event fX D xk g is called the probability mass function (PMF) of X. Following Pishro-Nik,
the notation for it is PX .xk /; the subscript X is needed as we shall deal with more than one
random variables, each has its own PMF.

Ž
Check Section 4.2.4 if you need a refresh on what is a range of a function.


Example 5.13
Toss a coin twice and let $X$ be the number of heads observed. Find the probability mass function $P_X$. The sample space is $S = \{(H,H), (H,T), (T,H), (T,T)\}$, so the possible values of $X$ are

$$R_X = \{0, 1, 2\}$$

Now, we compute $P(X = x_k)$ for $k = 1, 2, 3$:

$$
\begin{aligned}
P_X(0) &= P(X=0) = P(\{(T,T)\}) = 1/4\\
P_X(1) &= P(X=1) = P(\{(H,T),(T,H)\}) = 1/2\\
P_X(2) &= P(X=2) = P(\{(H,H)\}) = 1/4
\end{aligned}
$$

So, the probability mass function of a random variable X is the function that takes a num-
ber x 2 R as input and returns the number P .X D x/ as output. (Note that we included
continuous random variables in this discussion).

Because a PMF is a probability, it has to satisfy the following two properties:

$$0 \le P_X(x) \le 1, \qquad \sum_{x \in R_X} P_X(x) = 1 \qquad (5.10.7)$$

To better visualize the PMF, we can plot it. Fig. 5.8 shows the PMF of the above random variable $X$; the plot on the right is known as a bar plot. As we see, the random variable can take three possible values 0, 1 and 2. The figure also clearly indicates that the event $X = 1$ is twice as likely as the other two possible values.

Figure 5.8: Visualization of the probability distribution of a discrete random variable. Source code:
probability_plots.jl.


5.10.3 Special distributions


Bernoulli distribution. A Bernoulli random variable is a discrete random variable that can only
take two possible values, usually 0 and 1. This random variable models random experiments that
have two possible outcomes: "success" and "failure." Here are some examples:
 You take a pass-fail exam. You either pass (resulting in X D 1) or fail (resulting in X D 0).

 You toss a coin. The outcome is either head or tail.

 A child is born. The gender is either male or female.

 Rolling two dice. You either get a double six (with probability of 1=36) or not a double
six (with a chance of 35=36).

Definition 5.10.1
A random variable $X$ is said to be a Bernoulli random variable with parameter $p$, denoted by $X \sim Bernoulli(p)$, if its PMF is given by

$$P_X(x) = \begin{cases} p, & \text{if } x = 1\\ 1-p, & \text{if } x = 0\\ 0, & \text{otherwise} \end{cases} \qquad (5.10.8)$$

Geometric distribution. Assume that we have an unfair coin for which $P(H) = p$, where $0 < p < 1$ and $p \ne 0.5$. We toss the coin repeatedly until we observe a head for the first time. Let $X$ be the total number of coin tosses. Find the distribution of $X$.
First, we see that $R_X = \{1, 2, 3, \ldots, k, \ldots\}$. To find the distribution of $X$ is to find $P_X(k) = P(X = k)$ for $k = 1, 2, 3,$ and so on. These probabilities are (as all tosses are independent, the probability of, let's say, $TH$ is just the product of the probabilities of getting $T$ and $H$)

$$
\begin{aligned}
P_X(1) &: \ P(H) = p\\
P_X(2) &: \ P(TH) = (1-p)p\\
P_X(3) &: \ P(TTH) = (1-p)(1-p)p = (1-p)^2 p\\
&\ \ \vdots\\
P_X(k) &: \ P(TT\ldots TH) = (1-p)^{k-1} p
\end{aligned}
$$
If we list these probabilities we obtain the sequence $p, qp, q^2p, \ldots$ with $q = 1-p$, which is a geometric sequence with $a = p$ and $r = q$ (Section 4.15). For that reason, $X$ is said to follow a geometric distribution, denoted by $X \sim Geometric(p)$. See Fig. 5.9 for a plot of this distribution (for $p = 0.3$). [Figure 5.9: Geometric distribution with $p = 0.3$.] Check the file probability_plots.jl on my github account to see how this plot was generated. Note that we can think of the experiment behind the geometric distribution as repeating independent Bernoulli trials until observing the first success.

Binomial distribution. Suppose that we have a coin for which $P(H) = p$ and thus $P(T) = 1-p$. We toss it five times. What is the probability that we observe exactly $k$ heads and $5-k$ tails?|| To solve this problem, we start with a concrete case: let $A$ be the event that we observe exactly three heads and two tails. What is $P(A)$?
Because $A$ is the event that we observe exactly three heads and two tails, we can write

$$A = \{HHHTT, TTHHH, THHHT, \ldots\}$$

It can be shown that the probability of each member of $A$ is $p^3(1-p)^2$. As there are $|A|$ such members, the probability of $A$ is

$$P(A) = |A|\, p^3 (1-p)^2$$

But from Section 2.26.5, we know that $|A| = \binom{5}{3}$, so

$$P(A) = \binom{5}{3}\, p^3 (1-p)^2$$

With this, we have the following definition of a binomial distribution.

Definition 5.10.2
A random variable $X$ is said to be a binomial random variable with parameters $n$ and $p$, shown as $X \sim Binomial(n, p)$, if its PMF is given by (d)

$$P_X(k) = \binom{n}{k} p^k (1-p)^{n-k} \quad \text{for } k = 0, 1, 2, \ldots, n \qquad (5.10.9)$$

(d) How to make sure that this is indeed a PMF? Eq. (5.10.7) is the answer.
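For readers who like to experiment, here is a minimal Julia sketch of the PMFs met so far, written directly from Eqs. (5.10.8) and (5.10.9) rather than taken from any package; the function names are mine:

# PMFs written directly from the formulas in the text.
bernoulli_pmf(x, p)   = x == 1 ? p : x == 0 ? 1 - p : 0.0          # Eq. (5.10.8)
geometric_pmf(k, p)   = k >= 1 ? (1 - p)^(k - 1) * p : 0.0         # first head on toss k
binomial_pmf(k, n, p) = binomial(n, k) * p^k * (1 - p)^(n - k)     # Eq. (5.10.9)

# Sanity check: each PMF sums to one, cf. Eq. (5.10.7).
println(sum(binomial_pmf(k, 10, 0.4) for k in 0:10))      # ≈ 1.0
println(sum(geometric_pmf(k, 0.3) for k in 1:10_000))     # ≈ 1.0 (truncated tail)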

Example 5.14
What is the probability that among five families, each with six children, at least three of the families have four or more girls? Of course, we assume that the probability of having a boy is 0.5.
To solve this problem, first note that the five families are the five trials. Each trial is a success if that family has at least four girls. And if we denote by $p_0$ the probability of a family having at least four girls, the probability that at least three of the families have four or more girls is:

$$\binom{5}{3} p_0^3 (1-p_0)^2 + \binom{5}{4} p_0^4 (1-p_0) + \binom{5}{5} p_0^5 \qquad (5.10.10)$$
|| Of course $k = 0, 1, 2, 3, 4, 5$.


To find $p_0$, we realize that to get six children, each family has to perform six Bernoulli trials with $p = 0.5$ (boy or girl), thus:

$$p_0 = \binom{6}{4}(0.5)^6 + \binom{6}{5}(0.5)^6 + \binom{6}{6}(0.5)^6 = \frac{11}{32}$$

Plugging this $p_0$ into Eq. (5.10.10) we get the answer to this problem. But that number is not as important as the solution process.
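For completeness, a two-line Julia check of this example (my own computation, not from the text) gives the numerical value that the example deliberately leaves out:

p0 = sum(binomial(6, k) * 0.5^6 for k in 4:6)                        # = 11/32 = 0.34375
answer = sum(binomial(5, k) * p0^k * (1 - p0)^(5 - k) for k in 3:5)  # Eq. (5.10.10)
println(answer)   # ≈ 0.2255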

We can generalize what we have found in the above example to obtain a formula for calculating the probability that $a \le X \le b$:

$$P(a \le X \le b) = \sum_{k=a}^{b} \binom{n}{k} p^k (1-p)^{n-k} \qquad (5.10.11)$$

[Figure 5.10: Visualization of two binomial distributions: (a) $X \sim Binomial(50, 0.4)$ and (b) $X \sim Binomial(100, 0.4)$. Observe that the curves peak at around $np$.]

To have a better understanding of the binomial distribution, we plot some of them in Fig. 5.10. The curve has an ascending branch starting from $k = 0$ up to $k_{\max}$, and a descending branch for $k \ge k_{\max}$. It is possible to determine the value of $k_{\max}$. First, let's denote $b_n(k) = P_X(k)$; we need to compute the ratio of two successive terms (with $q = 1-p$):

$$\frac{b_n(k)}{b_n(k-1)} = \frac{n!}{(n-k)!\,k!}\, p^k (1-p)^{n-k} \Big/ \frac{n!}{(n-k+1)!\,(k-1)!}\, p^{k-1} (1-p)^{n-k+1} = \frac{(n-k+1)p}{kq} \qquad (5.10.12)$$

To find the peak of the binomial distribution curve, we find $k$ such that the ratio $b_n(k)/b_n(k-1)$ is larger than or equal to one:

$$\frac{b_n(k)}{b_n(k-1)} \ge 1 \iff k \le (n+1)p \Longrightarrow k_{\max} \approx np \qquad (5.10.13)$$

Now we can understand why each plot in Fig. 5.10 has a peak near $np$. And why is $np$ at the peak? Because it is the expected value of $X$, i.e., the average value of $X$. And it should be the average value that has the highest probability.
Having the ratio between successive terms, it is possible to compute $b_n(k)$ recursively. That is, we compute the first term $b_n(0)$, then use it to compute the second term $b_n(1)$ and so on:

$$
\begin{aligned}
b_n(0) &= (1-p)^n &&\text{(from Eq. (5.10.9))}\\
b_n(1) &= \frac{np}{1-p}\, b_n(0) &&\text{(Eq. (5.10.12) with } k=1,\ q = 1-p)\\
b_n(2) &= \frac{(n-1)p}{2(1-p)}\, b_n(1)\\
\ldots &= \ldots
\end{aligned}
\qquad (5.10.14)
$$
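The recursion in Eq. (5.10.14) is easy to turn into code. Below is a small Julia sketch (mine, not the author's) that builds the whole binomial PMF this way and checks it against the direct formula of Eq. (5.10.9):

# Build b_n(0), ..., b_n(n) with the recursion of Eq. (5.10.14).
function binomial_pmf_recursive(n, p)
    b = zeros(n + 1)
    b[1] = (1 - p)^n                                        # b_n(0)
    for k in 1:n
        b[k + 1] = b[k] * (n - k + 1) * p / (k * (1 - p))   # ratio from Eq. (5.10.12)
    end
    return b
end

b = binomial_pmf_recursive(50, 0.4)
println(sum(b))            # ≈ 1.0
println(argmax(b) - 1)     # peak near np = 20 (b[1] holds k = 0)
println(maximum(abs(b[k+1] - binomial(50, k) * 0.4^k * 0.6^(50 - k)) for k in 0:50))  # tiny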

John Arbuthnot and Willem Jacob ’s Gravesande. In 1710 John Arbuthnot (1667–1735)
presented a paper titled An Argument for Divine Providence to the London Royal Society, which
is a very early example of statistical hypothesis testing in social science. The paper presents
a table containing the number of baptised children in London for the previous 82 years. One
seemingly spectacular feature of this data was that in each of these 82 years the number of boys
was higher than that of the girls. Willem Jacob ’s Gravesande (1688 – 1742)ŽŽ set out a task to
find out why.
’s Gravesande first found a representative year by taking the average number of births over
the 82 years in question, which was 11 429. For each year, he then scaled the numbers of births
per sex to that average number. In this scaled data, Gravesande found that the number of boys
had always been between 5 745 and 6 128.
Now, seeing a birth as a Bernoulli trial with $p = 0.5$, he used Eq. (5.10.11) to compute the probability of the number of male births falling within this range in a given year as

$$P = \sum_{k=5745}^{6128} \binom{11429}{k} \left(\frac{1}{2}\right)^{11429} \qquad (5.10.15)$$

How did 's Gravesande compute this $P$ in 1710? First, he rewrote it as follows (using the fact that the sum of the coefficients of the $n$th row in Pascal's triangle is $2^n$, check ??): he wrote
ŽŽ
Willem Jacob ’s Gravesande was a Dutch mathematician and natural philosopher, chiefly remembered for
developing experimental demonstrations of the laws of classical mechanics and the first experimental measurement
of kinetic energy. As professor of mathematics, astronomy, and philosophy at Leiden University, he helped to
propagate Isaac Newton’s ideas in Continental Europe.

Today we would write a few lines of code and get the result of 0.2873. But doing so would not improve our
math ability.


$2^{11429}$ as $\sum_{k=0}^{11429} \binom{11429}{k}$, so

$$P = \frac{\sum_{k=5745}^{6128} \binom{11429}{k}}{2^{11429}} = \frac{\sum_{k=5745}^{6128} \binom{11429}{k}}{\sum_{k=0}^{11429} \binom{11429}{k}} \qquad (5.10.16)$$

The problem now boils down to how to handle the coefficients (and their sum) in a row of Pascal's triangle when $n$ is large. To show how Gravesande did that, just consider the case $n = 5$ (noting that 11 429 is an odd number):

$$\binom{5}{0},\ \binom{5}{1},\ \binom{5}{2},\ \binom{5}{3},\ \binom{5}{4},\ \binom{5}{5}; \quad \text{or} \quad 1\ \ 5\ \ 10\ \ 10\ \ 5\ \ 1 \qquad (5.10.17)$$

Since we have the following identity between adjacent binomial coefficients in any row (of the Pascal triangle),

$$\binom{n}{k+1} = \binom{n}{k}\,\frac{n-k}{k+1} \qquad (5.10.18)$$

we now assign the middle term (i.e., $\binom{5}{3}$) any value, say $\binom{5}{3} = a$; we then compute the next term $\binom{5}{4}$ in terms of $a$, then the next term $\binom{5}{5}$ in terms of $a$. Adding all these three terms and multiplying the result by twoŽ, we get the sum of all the coefficients in terms of $a$. In this way, Gravesande constructed a tableŽŽ containing half of the coefficients in $(a+b)^{11429}$, starting from the middle term 5 715 up to 5 973. Note that the coefficients decrease away from the middle term, and from 5 973 on they are negligible.
Although ’s Gravesande was able to solve this computationally challenging binomial related
problem, he stopped there. Thus, ’s Gravesande was not a systematic mathematician but rather a
good problem solver and a number cruncher.
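The footnote on this page invites the reader to redo 's Gravesande's computation by program; here is one possible Julia sketch (mine), using exact big-integer arithmetic instead of his table of scaled coefficients:

# Brute-force evaluation of Eq. (5.10.16) with exact arithmetic.
numerator = sum(binomial(big(11429), k) for k in 5745:6128)   # sum of binomial coefficients
P = Float64(numerator // big(2)^11429)                        # exact rational, then rounded
println(P)   # ≈ 0.2873, the value quoted in the footnote above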

Abraham de Moivre (1667-1754) was a French-born mathematician


who pioneered the development of analytic geometry and the theory
of probability. While he was young de Moivre read mathematics texts
in his own time. In particular he read Huygens’ treatise on games of
chance De ratiociniis in ludo aleae. He moved to England at a young
age due to the religious persecution of Huguenots in France. After
arriving in London he became a private tutor of mathematics, visiting
the pupils whom he taught and also teaching in the coffee houses of
London. As he travelled from one pupil to the next he read Newton’s
Principia. In 1718 he published The Doctrine of Chance: A method of
calculating the probabilities of events in play.
Herein I present de Moivre’s solution to the binomial distribution
problem when the number of trials, n, is large. He started with a simpler version of the problem:
he considered only the symmetric binomial distribution (that is n is an even number, which is
2m), and p D 1=2. He computed the probability of getting n=2 heads during n tosses of a
Ž
Because the row is symmetric.
ŽŽ
If you like coding write a program to reconstruct Gravesande’s table and compute P .


fair coin. That is, according to Eq. (5.10.9) with $n = 2m$, $k = m$, $p = 1/2$, he computed the quantity $\binom{2m}{m}/2^{2m}$. Note that this is similar to computing the middle term of $(1+1)^{2m}$ and dividing it by the sum of all the coefficients. Let's denote $A = \binom{2m}{m}$. We can write $A$ as

$$A = \binom{2m}{m} = \frac{(2m)!}{m!\,m!} = \frac{m+1}{m-1}\cdot\frac{m+2}{m-2}\cdots\frac{m+m-1}{m-(m-1)}\cdot\frac{m+m}{m-0} \qquad (5.10.19)$$

The next step is to take the natural logarithm of Eq. (5.10.19) to have a sum instead of a product:

$$
\begin{aligned}
\ln A &= \ln\frac{m+1}{m-1} + \ln\frac{m+2}{m-2} + \cdots + \ln\frac{m+m-1}{m-(m-1)} + \ln 2\\
      &= \ln\frac{1+1/m}{1-1/m} + \ln\frac{1+2/m}{1-2/m} + \cdots + \ln\frac{1+(m-1)/m}{1-(m-1)/m} + \ln 2\\
      &= \sum_{i=1}^{m-1} \ln\frac{1+i/m}{1-i/m} + \ln 2
\end{aligned}
\qquad (5.10.20)
$$

Now, for the red term, we use the following series for $\ln\frac{1+x}{1-x}$ (check Section 4.15.3 for details),

$$\ln\frac{1+x}{1-x} = 2\left(x + \frac{x^3}{3} + \frac{x^5}{5} + \ldots\right) = 2\sum_{k=1}^{\infty} \frac{x^{2k-1}}{2k-1} \qquad (5.10.21)$$

to have

$$\ln A - \ln 2 = 2\sum_{i=1}^{m-1}\sum_{k=1}^{\infty} \frac{1}{2k-1}\left(\frac{i}{m}\right)^{2k-1} = 2\sum_{k=1}^{\infty} \frac{1}{(2k-1)\,m^{2k-1}} \sum_{i=1}^{m-1} i^{2k-1} \qquad (5.10.22)$$

What is the red term? It is the sum of powers of integers that Bernoulli computed some years ago! Using Eq. (2.27.3), we thus can compute it:

$$\sum_{i=1}^{m-1} i^{2k-1} = \frac{(m-1)^{2k}}{2k} + \frac{1}{2}(m-1)^{2k-1} + \frac{2k-1}{2}\, B_2\, (m-1)^{2k-2} + \cdots \qquad (5.10.23)$$

Setting $t = (m-1)/m$, and substituting Eq. (5.10.23) into Eq. (5.10.22), we get $\ln A - \ln 2$ as

$$2(m-1)\sum_{k=1}^{\infty} \frac{t^{2k-1}}{(2k-1)\,2k} + \sum_{k=1}^{\infty} \frac{t^{2k-1}}{2k-1} + \frac{B_2}{m}\sum_{k=1}^{\infty} t^{2k-2} + \cdots \qquad (5.10.24)$$


To see why $A$ has this form, consider one example with $m = 4$:

$$A = \frac{8!}{4!\,4!} = \frac{(8)(7)(6)(5)}{(4)(3)(2)(1)} = \frac{(4+4)(4+3)(4+2)(4+1)}{(4-0)(4-1)(4-2)(4-3)}$$


Now, we have to compute the three sums in the above expression. The second one is easy; it is just Eq. (5.10.21):

$$\sum_{k=1}^{\infty} \frac{t^{2k-1}}{2k-1} = \frac{1}{2}\ln\frac{1+t}{1-t} = \frac{1}{2}\ln(2m-1) \qquad (5.10.25)$$

The first one is very similar to Eq. (5.10.21). In fact, if we integrate both sides of that equation we will meet the first sum:

$$\int \ln\frac{1+x}{1-x}\, dx = 2\sum_{k=1}^{\infty} \int \frac{x^{2k-1}}{2k-1}\, dx = 2\sum_{k=1}^{\infty} \frac{x^{2k}}{(2k-1)(2k)} \qquad (5.10.26)$$

For the integral $\int \ln\frac{1+x}{1-x}\, dx$ I have used the Python package SymPy, and with that integral computed, the above equation becomes:

$$x\ln\frac{1+x}{1-x} + \ln\left(1-x^2\right) = 2\sum_{k=1}^{\infty} \frac{x^{2k}}{(2k-1)(2k)} \qquad (5.10.27)$$

Dividing this by $x$, we get (also replacing $x$ by $t$, and then $t$ by $(m-1)/m$)

$$
\begin{aligned}
2\sum_{k=1}^{\infty} \frac{t^{2k-1}}{(2k-1)(2k)} &= \ln\frac{1+t}{1-t} + t^{-1}\ln\left(1-t^2\right)\\
&= \ln(2m-1) + \frac{m}{m-1}\ln\frac{2m-1}{m^2}
\end{aligned}
\qquad (5.10.28)
$$
The third sum involves a geometric series, and can be shown to converge to $1/12$ when $m$ approaches infinity. Similarly, the next sum in Eq. (5.10.22) gives $1/360$ and so on. With all these results we can write $\ln A$ as

$$\ln A \approx \left(2m - \frac{1}{2}\right)\ln(2m-1) - 2m\ln(m) + \ln 2 + \frac{1}{12} - \frac{1}{360} + \frac{1}{1260} - \frac{1}{1680} + \cdots \qquad (5.10.29)$$

Then, we can compute the logarithm of $A/2^n$ with $n = 2m$ and $\ln B = \ln 2 + 1/12 - \cdots$:

$$\ln\frac{A}{2^n} \approx n\ln(n-1) - \ln\left(n^n\sqrt{n-1}\right) + \ln B \Longrightarrow \boxed{\frac{A}{2^n} \approx \left(1-\frac{1}{n}\right)^n \frac{B}{\sqrt{n-1}}} \qquad (5.10.30)$$

with the constant $B$ being computed from the following series

$$\ln\frac{B}{2} = \frac{1}{12} - \frac{1}{360} + \frac{1}{1260} - \frac{1}{1680} + \cdots \qquad (5.10.31)$$

Because de Moivre was able to compute $B$ from this series, he did not bother about what $B$ really is. But James Stirling worked out that mysterious series:

$$\ln\sqrt{2\pi} = 1 - \frac{1}{12} + \frac{1}{360} - \frac{1}{1260} + \frac{1}{1680} - \cdots \qquad (5.10.32)$$

Thus, $B = 2e/\sqrt{2\pi}$, where $e = 2.718281828459045\ldots$ is the number we have met earlier in compounding interest, Section 2.28. With $(1-1/n)^n \approx 1/e$ and $n-1 \approx n$, from the boxed equation in Eq. (5.10.30) we then get

$$\frac{A}{2^n} \sim \frac{2e}{\sqrt{2\pi}}\,\frac{1}{e}\,\frac{1}{\sqrt{n}} = \frac{2}{\sqrt{2\pi n}} \qquad (5.10.33)$$
The next step de Moivre did was to compute the probability $b_n(k_{\max}+l)$ in terms of $b_n(k_{\max})$§. First, we need to use Eq. (5.10.12) to determine the ratio (note that $k_{\max} \approx np$):

$$\frac{b_n(k_{\max}+i)}{b_n(k_{\max}+i-1)} = \frac{(n-k_{\max}-i+1)p}{(k_{\max}+i)q} \approx \frac{(nq-i)p}{(np+i)q} = \frac{1-i/(nq)}{1+i/(np)} \qquad (5.10.34)$$

The logarithm of the last ratio equals (with the approximation $\ln(1+x) \approx x$ for $x$ near 0)

$$\ln\left(1-\frac{i}{nq}\right) - \ln\left(1+\frac{i}{np}\right) \approx -\frac{i}{nq} - \frac{i}{np} = -\frac{i}{npq} \qquad (5.10.35)$$

For $l \ge 1$ and $k_{\max}+l \le n$, we can compute the term which is distant from the middle by the distance $l$, i.e., $\ln\left(b_n(k_{\max}+l)/b_n(k_{\max})\right)$, using Eq. (5.10.35), as follows

$$
\begin{aligned}
\ln\frac{b_n(k_{\max}+l)}{b_n(k_{\max})} &= \ln\left(\frac{b_n(k_{\max}+1)}{b_n(k_{\max})}\,\frac{b_n(k_{\max}+2)}{b_n(k_{\max}+1)}\cdots\frac{b_n(k_{\max}+l)}{b_n(k_{\max}+l-1)}\right)\\
&= \ln\frac{b_n(k_{\max}+1)}{b_n(k_{\max})} + \ln\frac{b_n(k_{\max}+2)}{b_n(k_{\max}+1)} + \cdots + \ln\frac{b_n(k_{\max}+l)}{b_n(k_{\max}+l-1)}\\
&\approx -\frac{1+2+\cdots+l}{npq} \approx -\frac{l^2}{2npq} \quad (\text{sum of first } l \text{ integers} = l(l+1)/2)
\end{aligned}
\qquad (5.10.36)
$$
Thus, $b_n(k_{\max}+l)$ is exponentially proportional to $b_n(k_{\max})$:

$$b_n(k_{\max}+l) \approx b_n(k_{\max}) \exp\left(-\frac{l^2}{2npq}\right) \qquad (5.10.37)$$

where $\exp(x) = e^x$ is the exponential functionŽŽ. Using Eq. (5.10.33), which is $b_n(k_{\max})$ for the case $p = q = 1/2$ and $n$ even, we get de Moivre's approximation to the symmetric binomial distribution:

$$b_n(n/2+l) \approx \frac{2}{\sqrt{2\pi n}} \exp\left(-\frac{2l^2}{n}\right) \qquad (5.10.38)$$

Remarkably, two famous numbers in mathematics, $\pi = 3.1415\ldots$ and $e$, appear in this formula!
§
This is similar to ’s Gravesande’s approach.

Thus theoretically this works only for small i .
ŽŽ
We use e x when the term in the exponent is short and exp.: : :/ when that term is long or complex.
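Equation (5.10.38) is easy to test numerically. The following Julia sketch (mine) compares the exact symmetric binomial probabilities with de Moivre's approximation for n = 100:

# Compare the exact binomial PMF (p = 1/2) with de Moivre's approximation, Eq. (5.10.38).
n = 100
exact(l)    = binomial(big(n), n ÷ 2 + l) / big(2)^n          # b_n(n/2 + l), exact
demoivre(l) = 2 / sqrt(2π * n) * exp(-2l^2 / n)

for l in (0, 5, 10, 20)
    println("l = $l: exact = $(Float64(exact(l))), de Moivre = $(round(demoivre(l), sigdigits=5))")
end

Even for moderate n the two values agree to several digits near the peak, as the derivation suggests.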


Even though de Moivre did not draw his approximation, he mentioned the curve in his "The Doctrine of Chances" published in 1738 (when he was 71 years old). He even computed the two inflection points of the curve. And this is probably the first time the normal curve appears. [Figure: the normal curve.] Later on, Gauss and Laplace defined the normal distribution, about which we shall have more to say.

After getting this approximation, de Moivre used it to compute some probabilities. For example, with the help of Eq. (5.10.11), he computed the following probability

$$P(n/2 \le X \le n/2+d) \approx \frac{2}{\sqrt{2\pi n}}\sum_{l=0}^{d} \exp\left(-\frac{2l^2}{n}\right) \approx \frac{2}{\sqrt{2\pi n}}\int_0^d \exp\left(-\frac{2x^2}{n}\right) dx \qquad (5.10.39)$$

Note that he approximated the sum in his approximate binomial distribution by an integral. Thus, de Moivre did not think of a probability distribution function. And from that, it is easy to obtain, with a factor of two and a change of variable ($x = \sqrt{n}\, y$):

$$P(|X - n/2| \le d) \approx \frac{4}{\sqrt{2\pi}}\int_0^{d/\sqrt{n}} \exp\left(-2y^2\right) dy \qquad (5.10.40)$$

To evaluate the integral, de Moivre replaced the exponential function by its series and did a term-by-term integration. This is what Newton and the mathematicians of the 18th century did. We also discussed it in Section 4.16. He obtained a result of 0.682688 for $d/\sqrt{n} = 1/2$. As we're not in a calculus class, we can use a library to do this integral for us, see Listing 5.3. The result is 0.682689. Note that what de Moivre computed shows that 68% of the data is within one standard deviation of the mean. We shall see shortly that the standard deviation here is $0.5\sqrt{n}$.

Listing 5.3: Example of using the QuadGK package for numerical integration.
using QuadGK
# Integrate (4/sqrt(2π)) exp(-2x^2) from 0 to 1/2, cf. Eq. (5.10.40) with d/√n = 1/2.
integral, err = quadgk(x -> (4/sqrt(2*pi))*exp(-2*x^2), 0, 0.5, rtol=1e-8)

Continuing with $d/\sqrt{n} = 1$ and $d/\sqrt{n} = 3/2$, he obtained what is now referred to as the 68-95-99 rule. See Fig. 5.11 and check Listing A.18 for the code. This is the well-known bell-shaped normal curve. It is symmetric about zero: the part of the curve to the right of zero is a mirror image of the part to the left. [Figure 5.11: Bell-shaped normal curve.] Despite de Moivre's scientific eminence his main income was as a private tutor of mathematics, and he died in poverty. Desperate to get a chair in Cambridge he begged Johann Bernoulli to persuade Leibniz to write a supporting letter for him. Bernoulli did so in 1710, explaining to Leibniz that de Moivre was living a miserable life of poverty.


Indeed Leibniz had met de Moivre when he had been in London in 1673 and tried to obtain
a professorship for de Moivre in Germany, but with no success. Even his influential English
friends like Newton and Halley could not help him obtain a university post.
He was unmarried, and spent his closing years in peaceful study. De Moivre, like Cardano,
is famed for predicting the day of his own death. He found that he was sleeping 15 minutes
longer each night and summing the arithmetic progression, calculated that he would die on the
day that he slept for 24 hours. He was right!

Negative Binomial Distribution. Suppose that we have a coin with $P(H) = p$. We toss the coin until we observe $m$ heads, where $m \in \mathbb{N}$. We define $X$ as the total number of coin tosses in this experiment. Then $X$ is said to have a Pascal distribution with parameters $m$ and $p$. We write $X \sim Pascal(m, p)$. Note that $Pascal(1, p) = Geometric(p)$, and that by our definition the range of $X$ is given by $R_X = \{m, m+1, m+2, \ldots\}$. This is because we need to toss at least $m$ times to get $m$ heads.
Our goal is to find $P_X(k)$ for $k \in R_X$. It's easier to start with a concrete case, say $m = 3$. What is $P_X(4)$? In other words, what is the probability that we have to toss the coin 4 times to get 3 heads? The fact that we had to toss the coin 4 times indicates that in the first three tosses we got only 2 heads. This observation is the key to the solution of this problem. And in the final toss (the fourth one) we got a head. Thus,

$$P_X(4) = P(\text{2 heads from 3 tosses}) \times P(\text{1 head for the last toss})$$

The problem has become familiar, and we can compute $P_X(4)$:

$$P_X(4) = \binom{3}{2} p^2 (1-p)^1 \times p = \binom{3}{2} p^3 (1-p)^1$$

And with that, it is just one small step to get the general result:

$$P_X(k) = \binom{k-1}{m-1} p^m (1-p)^{k-m}, \quad k = m, m+1, \ldots \qquad (5.10.41)$$

Binomial distribution versus Pascal distribution. A binomial random variable counts the number of successes in a fixed number of independent trials. On the other hand, a negative binomial random variable counts the number of independent trials needed to achieve a fixed number of successes.

Poisson’s distribution. Herein, we’re going to present an approximation to the binomial distri-
bution when n is large, p is small and np is finite. Let’s introduce a new symbol  such that
np D . We start with bn .0/, and taking advantage of the fact that n is large, we will use some
approximations:
 
n  n
bn .0/ D .1 p/ D 1 (5.10.42)
n


Now, taking the natural logarithm of both sides of the above equation, we get

$$\ln b_n(0) = n \ln\left(1 - \frac{\lambda}{n}\right) \qquad (5.10.43)$$

Now, we use the series for $\ln(1-x)$ (check Taylor's series in Section 4.15.8 if this is not clear):

$$\ln(1-x) = -x - \frac{x^2}{2} - \frac{x^3}{3} - \frac{x^4}{4} - \cdots$$

With that, we can now write $\ln b_n(0)$ as (with $x = \lambda/n$)

$$\ln b_n(0) = -\lambda - \frac{\lambda^2}{2n} - \frac{\lambda^3}{3n^2} - \cdots \qquad (5.10.44)$$

For very large $n$, we get a good approximation of $b_n(0)$ by omitting the terms with $n$ in the denominator:

$$\ln b_n(0) \approx -\lambda \Longrightarrow b_n(0) \approx e^{-\lambda} \qquad (5.10.45)$$
And of course, we use the recursive formula, Eq. (5.10.14), to get the next term $b_n(1)$ and so on. But first, we also need an approximation (when $n$ is large) for the ratio $b_n(k)/b_n(k-1)$; using Eq. (5.10.12) with $p = \lambda/n$, $q = 1-p$:

$$\frac{b_n(k)}{b_n(k-1)} = \frac{(n-k+1)p}{kq} \approx \frac{\lambda}{k}$$

Now, starting with $b_n(0)$, we obtain $b_n(1)$, $b_n(2)$ and so on:

$$b_n(0) \approx e^{-\lambda}, \quad b_n(1) \approx \lambda e^{-\lambda}, \quad b_n(2) \approx \frac{\lambda^2}{2}\, e^{-\lambda}, \quad b_n(3) \approx \frac{\lambda^3}{2\cdot 3}\, e^{-\lambda}, \ \ldots$$

Thus, we have a formula for any $k$:

$$b_n(k) \approx \frac{\lambda^k e^{-\lambda}}{k!}$$

And this is now known as the Poisson distribution, named after the French mathematician Siméon Denis Poisson (1781 – 1840). A random variable $X$ is said to be a Poisson random variable with parameter $\lambda$, shown as $X \sim Poisson(\lambda)$, if its range is $R_X = \{0, 1, 2, 3, \ldots\}$, and its PMF is given by

$$P_X(k) = \frac{\lambda^k e^{-\lambda}}{k!} \quad \text{for } k \in R_X \qquad (5.10.46)$$
What should we do next after we have discovered the Poisson approximation to the binomial distribution? We should at least do two thingsŽŽ:
ŽŽ Actually we also need to check whether $\sum_{k=0}^{\infty} P_X(k) = 1$, i.e., $\sum_{k=0}^{\infty} \frac{\lambda^k e^{-\lambda}}{k!} = 1$.


1. Check the accuracy of the Poisson approximation. We can do this by computing $Binomial(n, p)$ and $Poisson(\lambda)$, with $\lambda = np$, for some values of $n$ and $p$, and for different $x$'s. I skip this step here for brevity.

2. Justify the need for the Poisson approximation. We're going to do this with one example next.

Suppose you're trying to get something to happen in a video game that is rare; maybe it happens 1% of the time you do something. You'd like to know how likely it is to happen at least once if you try, say, 100 times. Here we have $p = 1/100$, $n = 100$. So the binomial distribution gives us an exact answer, namely

$$P = 1 - \left(1 - \frac{1}{100}\right)^{100}$$

The result is 0.63396, with a calculator of course. Using the Poisson approximation with $\lambda = np = 1$, that probability is (easier)

$$P = 1 - e^{-1} = 0.632120$$
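A quick Julia check of these two numbers (my own snippet):

p, n = 1/100, 100
exact   = 1 - (1 - p)^n        # binomial answer: probability of at least one success
poisson = 1 - exp(-n * p)      # Poisson approximation with λ = np = 1
println((exact, poisson))      # ≈ (0.634, 0.632)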

5.10.4 Cumulative distribution function


The PMF is one way to describe the probability distribution of a discrete random variable. As we will see later on, the PMF cannot be defined for continuous random variables, because the PMF for any $x \in \mathbb{R}$ would be zero! Why is that? This is because there are infinitely many values of $x$ (within any interval $[a,b]$ there are infinitely many real numbers; still remember Hilbert's hotel?), so the probability of getting any one particular value of $x$ is zero (division by infinity is zero). And this is consistent with daily observations. We know that all measurements have a degree of uncertainty regardless of precision and accuracy. This is caused by two factors: the limitation of the measuring instrument (systematic error) and the skill of the experimenter making the measurements (random error). Thus, it is meaningless to say that I measure the height of my son and get exactly 1.43 m.
If we cannot have $P(X = x)$ for real $x$, then the only option left is $P(a \le X \le b)$, the probability that $X$ falls within a range. And the cumulative distribution function (CDF) of a random variable is what we need to describe the distribution of (continuous) random variables. The advantage of the CDF is that it can be defined for any kind of random variable, be it discrete, continuous, or mixed.
The cumulative distribution function (CDF) of a random variable $X$ is defined as

$$F_X(x) := P(X \le x), \quad \text{for all } x \in \mathbb{R}$$

Example. A fair coin is flipped twice. Let $X$ be the number of observed heads. Find the CDF of $X$. Note that here $X \sim Binomial(2, 1/2)$, the range of $X$ is $R_X = \{0, 1, 2\}$ and its PMF is given by

$$P_X(0) = P(X=0) = \frac{1}{4}, \quad P_X(1) = P(X=1) = \frac{2}{4}, \quad P_X(2) = P(X=2) = \frac{1}{4}$$

To find the CDF, we argue as follows. First, note that if $x < 0$, then $F_X(x) = P(X \le x) = 0$. Second, if $x \ge 2$, then $F_X(x) = P(X \le x) = 1$. Next, for $0 \le x < 1$, $F_X(x) = P(X \le x) = P(X = 0) = 1/4$, and so on. To summarize, the CDF of $X$ is:

$$F_X(x) = \begin{cases} 0, & \text{if } x < 0\\ \frac{1}{4}, & \text{if } 0 \le x < 1\\ \frac{3}{4}, & \text{if } 1 \le x < 2\\ 1, & \text{if } x \ge 2 \end{cases}$$

Now that we have seen a CDF, it's time to talk about its properties. By looking at the graph of this CDF, we can tell that

1. The range of $F_X(x)$ is $[0, 1]$;

2. When $x$ approaches $-\infty$, $F_X(x)$ approaches 0;

3. When $x$ approaches $+\infty$, $F_X(x)$ approaches 1;

4. The CDF is a non-decreasing function.

The first property is just a consequence of the second and third properties. The second property is just another way of saying that the probability of $X$ being smaller than $-\infty$ is zero. Similarly, the third property is the fact that the probability that something in the sample space occurs is one, as any $X$ must be smaller than infinity! About the last property: as we're adding up probabilities, the CDF must be non-decreasing. But we can prove it rigorously using the following result: for $a, b \in \mathbb{R}$ such that $a < b$:

$$P(a < X \le b) = F_X(b) - F_X(a) \qquad (5.10.47)$$

of which a proof is given in Fig. 5.12. As probability is always non-negative, the above results in $F_X(b) - F_X(a) \ge 0$, i.e., $F_X(b) \ge F_X(a)$.

Figure 5.12: Proof of Eq. (5.10.47).


5.10.5 Expected value


A roulette wheel has 36 slots (numbered from 1 to 36, half of them colored red, half black) and two zeros, colored green. Casinos offer even odds for betting on red or black at roulette. Suppose we bet $100 on red. The question is: on average, how much do we win or lose per game?
First, we compute the probability of getting a red; it is $18/38 = 9/19$. Thus the probability of not getting a red is $1 - 9/19 = 10/19$. Now, suppose we play 100 games. In 100 games, we will win about $(9/19)(100)$ games and lose about $(10/19)(100)$ games, thus the amount of money we gain or lose in 100 games is:

$$\left(\frac{9}{19}\right)(100)(\$100) + \left(\frac{10}{19}\right)(100)(-\$100) = (-\$5.26)(100)$$

Thus, per game, we will lose $5.26. What does this number mean? Obviously for each game, we either win $100 or lose $100. But in the long run, when we have played many games, on average we will have lost $5.26 per game.
We can see that this average amount can be computed by adding the product of the probability of winning $100 and $100 to the product of the probability of losing $100 and $-\$100$:

$$\left(\frac{9}{19}\right)(\$100) + \left(\frac{10}{19}\right)(-\$100) = -\$5.26$$

Let’s consider another example of rolling a die N times. Assume that among these N times,
we observe 1 n1 times, we observe 2 n2 times , we observe 3 n3 times, and so on. Now we
compute the average of all the numbers observed:

.1 C 1 C    C 1/ C .2 C 2 C    C 2/ C    C .6 C 6 C    C 6/
„ ƒ‚ … „ ƒ‚ … „ ƒ‚ …
n1 n2 n6
xD
N
.1/.n1 / C .2/.n2 / C    C .6/.n6 /
D
N

Now, assume that N is large, then ni =N D 1=6, which is the probability that we observe i for
i D 1; 2; : : : Thus,
n  n  n  1
1 2 6
x D .1/ C .2/ C    C .6/ D .1 C 2 C 3 C 4 C 5 C 6/  (ni =N D 1=6)
N N N 6
21 7
D D D 3:5
6 2

Thus the average value of rolling a die is $7/2$. In ??, we shall see that there is a law, called the law of large numbers, that makes this long-run statement precise.
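This long-run behaviour is easy to see in a virtual experiment; the Julia snippet below (mine, not from the text) rolls a fair die many times and watches the average approach 3.5:

# Monte Carlo check: the average of many die rolls approaches 7/2.
for N in (10, 1_000, 100_000)
    rolls = rand(1:6, N)
    println("N = $N: average = $(sum(rolls) / N)")
end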
Notice that in both examples the averaged number is the sum of the products of the random
variable times its probability. This leads to the following definition for the expected value.


Definition 5.10.3
If $X$ is a discrete random variable with values $\{x_1, x_2, \ldots, x_n\}$ and its PMF is $P_X(x_k)$, then the expected value of $X$, denoted by $E[X]$, is defined as:

$$E[X] = x_1 P_X(x_1) + x_2 P_X(x_2) + \cdots = \sum_k x_k P_X(x_k) \qquad (5.10.48)$$

History note 5.2: Blaise Pascal (1623-1662)

Blaise Pascal was the third of Étienne Pascal’s children. Pascal’s mother
died when he was only three years old. Pascal’s father had unorthodox
educational views and decided to teach his son himself. Étienne Pascal
decided that Blaise was not to study mathematics before the age of 15
and all mathematics texts were removed from their house. His curiosity aroused by this, Pascal started to work on geometry himself at the age of 12. He discovered that the sum of the angles of a triangle is two right angles and, when his father found out, he relented and allowed
Blaise a copy of Euclid. About 1647 Pascal began a series of experiments on atmospheric
pressure. By 1647 he had proved to his satisfaction that a vacuum existed. Rene Descartes
visited Pascal on 23 September. His visit only lasted two days and the two argued about
the vacuum which Descartes did not believe in. Descartes wrote, rather cruelly, in a letter
to Huygens after this visit that Pascal ...has too much vacuum in his head.

Now, we’re deriving another formula for the expected value of X, but in terms of the proba-
bility of the members of the sample space:
X
EŒX D X.s/p.s/ (5.10.49)
s2S

We shall prove the important and useful result that the expected value of a sum of random variables is equal to the sum of their expectations (i.e., $E[X+Y] = E[X] + E[Y]$ for two RVs $X$ and $Y$) using Eq. (5.10.49).
Proof of Eq. (5.10.49). Let’s denote by Si the event that X.Si / D xi for i D 1; 2; : : : That is,
Si D fs W X.s/ D xi g
For example, in tossing two dice, and let X be the total number of faces, we have x1 D 2 and
x2 D 3, with S2 D f.1; 2/; .2; 1/g are the outcomes that led to x2 . Moreover, let p.s/ D P .s/
be the probability that s is the outcome of the experiment. The proof then starts with the usual
definition of EŒX and replaces X D xi by Si , Fig. 5.7 can be helpful to see the connection
between s, S and X :
X X X
EŒX D xi PX .xi / D xi PX .X D xi / D xi P .Si /
i i i

Phu Nguyen, Monash University © Draft version


Chapter 5. Probability 574

P
We continue with replacing P .Si / by s2Si p.s/ (that is using the third axiom),
X X XX XX
EŒX D xi p.s/ D xi p.s/ D X.s/p.s/
i s2Si i s2Si i s2Si
P P P
And finally, because S1 ; S2 ; : : : are disjoint or mutually exclusive, i Si is just s2S , thus
X
EŒX D X.s/p.s/
s2S
PP
which
P concludes
P the proof.
P P Below, I will elaborate some steps which involves . For example,
i xi s2Si p.s/ D i s2Si xi p.s/, just use one concrete case:
X X X X XX
xi p.s/ D xi .p.s1 / C p.s2 // D .xi p.s1 / C xi p.s2 // D xi p.s/
i s2Si i i i s2Si

5.10.6 Functions of random variables


Consider this simple statistics problem: we want to find the average height $\mu$ of the students at a school. To this end, we randomly select $n$ students and measure their heights; we get $x_1, x_2, \ldots, x_n$, which are random variables. Now, it is reasonable to estimate $\mu$ by $\bar{x} = (x_1 + \cdots + x_n)/n$. Such an $\bar{x}$ is a function of random variables. Since we cannot be certain that $\bar{x}$ is an exact estimate of $\mu$, we need to find the probability distribution of $\bar{x}$.
Assume now that we are given a discrete RV $X$ along with its probability mass function and that we want to compute the expected value of some function of $X$, say $g(X)$. How can we accomplish this? One way is as follows: as $g(X)$ is itself a discrete random variable, it has a probability mass function, which can be determined from the PMF of $X$, see Fig. 5.13. Once we have determined the PMF of $g(X)$, we can compute $E[g(X)]$ by using the definition of the expected value.

Figure 5.13: Pictorial representation of the sample space $S$, the RV $X$, a function of the RV $Y = g(X)$, and its PMF.


Example 5.15
Let $X$ be a RV that takes on the values $-1, 0, 1$ with respective probabilities

$$P(X=-1) = 0.2, \quad P(X=0) = 0.5, \quad P(X=1) = 0.3$$

Compute $E[X^2]$; so $g(X) = X^2$ in this example.

First, we compute the PMF of $Y = X^2$, whose range is $\{0, 1\}$:

$$P(Y=0) = P(X=0) = 0.5, \qquad P(Y=1) = P(X=-1) + P(X=1) = 0.5$$

Second, the expected value of $Y$ is computed:

$$E[X^2] = E[Y] = (1)(0.5) + (0)(0.5) = 0.5 \qquad (5.10.50)$$

But there is a faster way of doing this. The expected value of $g(X)$, $E[g(X)]$, is simply given by

$$E[g(X)] = \sum_i g(x_i)\, P_X(x_i) \qquad (5.10.51)$$

And this result is known as the law of the unconscious statistician, or LOTUS. This is a theorem used to calculate the expected value of a function $g(X)$ of a random variable $X$ when one knows the probability distribution of $X$ but does not know the distribution of $g(X)$. The name comes from the fact that some statisticians present Eq. (5.10.51) as the definition of the expected value rather than as a theorem.
Before proving this result, let's check that it is in accord with the result obtained directly from the definition of $E[X^2]$ for the above example. Applying Eq. (5.10.51), we get

$$E[X^2] = (-1)^2(0.2) + (0)^2(0.5) + (1)^2(0.3) = 0.5$$

which is the same as the direct result. To see why the same result was obtained, we can do some massaging of the above expression:

$$
\begin{aligned}
E[X^2] &= (-1)^2(0.2) + (0)^2(0.5) + (1)^2(0.3)\\
&= (1)(0.2+0.3) + (0)(0.5) \quad (\text{grouping terms with equal } g(x_i))\\
&= (1)(0.5) + (0)(0.5)
\end{aligned}
$$

The last expression is exactly identical to Eq. (5.10.50). The proof of Eq. (5.10.51) proceeds
similarly.

Proof of Eq. (5.10.51). We start with $\sum_i g(x_i) P_X(x_i)$, then group terms with the same $g(x_i)$, and then transform it into $\sum_j y_j P_Y(y_j)$, which is $E[g(X)]$, where the $y_j$ are all the (distinct) values of $Y$:

$$
\begin{aligned}
\sum_i g(x_i) P_X(x_i) &= \sum_j \sum_{i:\, g(x_i)=y_j} g(x_i) P_X(x_i) \quad (\text{grouping step})\\
&= \sum_j \sum_{i:\, g(x_i)=y_j} y_j P_X(x_i) \quad (\text{replacing } g(x_i) = y_j)\\
&= \sum_j y_j \sum_{i:\, g(x_i)=y_j} P_X(x_i) = \sum_j y_j P_Y(y_j)
\end{aligned}
$$

The notation $\sum_{i:\, g(x_i)=y_j}$ means that the sum is over $i$, but only for those $i$ such that $g(x_i) = y_j$; that restriction is indicated by the subscript $i: g(x_i) = y_j$ under the sum sign.


5.10.7 Linearity of the expectation


In this section we shall discuss some properties of the expectation of random variables. For the
motivation, let’s consider an example first.

Expected value of the sum of two random variables. Let's roll two dice and denote by $S$ the sum of the faces. If we denote by $X$ the face of the first die and by $Y$ the face of the second die, then $S = X + Y$. Obviously $S$ is a discrete RV, and we can compute its PMF. Thus, we can compute its expected value. First, we list all possible values of $S$ together with the outcomes producing them:
$S = 2$:  $(1,1)$
$S = 3$:  $(1,2), (2,1)$
$S = 4$:  $(1,3), (3,1), (2,2)$
$S = 5$:  $(1,4), (4,1), (2,3), (3,2)$
$S = 6$:  $(1,5), (5,1), (2,4), (4,2), (3,3)$
$S = 7$:  $(1,6), (6,1), (2,5), (5,2), (3,4), (4,3)$
$S = 8$:  $(2,6), (6,2), (3,5), (5,3), (4,4)$
$S = 9$:  $(3,6), (6,3), (4,5), (5,4)$
$S = 10$: $(4,6), (6,4), (5,5)$
$S = 11$: $(5,6), (6,5)$
$S = 12$: $(6,6)$
Now, we can compute $P(S = x_j)$ for $x_j \in \{2, 3, \ldots, 12\}$, and then use Eq. (5.10.48) to compute the expected value:

$$
\begin{aligned}
E[S] &= 2\cdot\frac{1}{36} + 3\cdot\frac{2}{36} + 4\cdot\frac{3}{36} + 5\cdot\frac{4}{36} + 6\cdot\frac{5}{36}\\
&\quad + 7\cdot\frac{6}{36} + 8\cdot\frac{5}{36} + 9\cdot\frac{4}{36} + 10\cdot\frac{3}{36} + 11\cdot\frac{2}{36} + 12\cdot\frac{1}{36} = \frac{252}{36} = 7
\end{aligned}
$$
You might be asking what is special about this problem? Is it just another application of the concept of expected value? Hold on. Look at the result of 7 again. Rolling one die, the expected value is $7/2$ŽŽ; now rolling two dice, the expected value is 7. We should suspect that

$$E[X+Y] = E[X] + E[Y] \qquad (5.10.52)$$

which implies that the expected value of the sum of two random variables is equal to the sum of their individual expected values, regardless of whether they are independent. In calculus, the derivative of the sum of two functions is the sum of the derivatives. Here in the theory of probability, we see the same rule.

Proof of Eq. (5.10.52). Let $X$ and $Y$ be two random variables and $Z = X + Y$. We now use Eq. (5.10.49) for the proof:

$$
\begin{aligned}
E[Z] &= \sum_s Z(s)\, p(s) = \sum_s \big[X(s) + Y(s)\big] p(s)\\
&= \sum_s \big[X(s)\, p(s) + Y(s)\, p(s)\big] = \sum_s X(s)\, p(s) + \sum_s Y(s)\, p(s) = E[X] + E[Y]
\end{aligned}
$$

This proof also reveals that the property holds not only for two RVs but for any number of RVs. Thus, for $n \in \mathbb{N}$, we can write

$$E[nX] = E[\underbrace{X + X + \cdots + X}_{n \text{ terms}}] = E[X] + \cdots + E[X] = nE[X]$$

And from that, it is a short step to guess thatŽ

$$E[aX + b] = aE[X] + b \quad (a, b \in \mathbb{R}) \qquad (5.10.53)$$

Proof of Eq. (5.10.53). Let $X$ be a RV and $Y = g(X) = aX + b$. We now use Eq. (5.10.51), the LOTUS, for the proof:

$$
\begin{aligned}
E[g(X)] &= \sum_x g(x)\, P_X(x) = \sum_x (ax+b)\, P_X(x)\\
&= \sum_x ax\, P_X(x) + \sum_x b\, P_X(x) = a\sum_x x\, P_X(x) + b\sum_x P_X(x)\\
&= aE[X] + b \qquad \Big(\textstyle\sum_x P_X(x) = 1\Big)
\end{aligned}
$$


ŽŽ Check the paragraph before Definition 5.10.3 if this was not clear.
Ž First, from the fact that $E[nX] = nE[X]$ we generalize to $E[aX] = aE[X]$. We have seen mathematicians do this many times (e.g. check Eq. (2.24.2)).


5.10.8 Variance and standard deviation


It is easy to see that the expected value is not sufficient to describe a probability distribution. For example, consider the three distributions shown in Fig. 5.14. Although they all have an expected value of zero, they are quite different: the values of the third distribution vary a lot. We need to define a measure for this spread, and it is called the variance. To see the motivation behind the definition of the variance, we just need to know that the spread of a probability distribution indicates how far values are from the expected value. To say how far a number $a$ is from a number $b$, we can use either $|a-b|$ or $(a-b)^2$‘.

[Figure 5.14: Three distributions with the same expected value but different variances.]

Now, consider a RV $X$ with $E[X]$ denoted by $\mu$. The variance of $X$, designated by $\mathrm{Var}(X)$, is defined as the average value of the square of the difference between $X$ and the mean value, i.e., $(X-\mu)^2$. Thus, it is given by

$$\mathrm{Var}(X) := E[(X-\mu)^2] \qquad (5.10.54)$$

Why the square? Squaring always gives a non-negative value, so deviations above and below the mean cannot cancel each other and the variance will not be misleadingly zeroŽŽ. A natural question is: the absolute difference also has this property, so why can't we define the variance as $E[|X-\mu|]$? Yes, you can! The thing is that the definition in Eq. (5.10.54) prevails because it is mathematically easier to work with $x^2$ than with $|x|$. Again, just think about differentiating these two functions and you will see what we mean by that statement.
Note that $\mathrm{Var}(X)$ has a different unit than $X$. For example, if $X$ is measured in meters then $\mathrm{Var}(X)$ is in meters squared. To solve this issue, another measure, called the standard deviation, is defined. The standard deviation, usually denoted by $\sigma_X$, is simply the square root of the variance.
Instead of using the definition of the variance directly to compute it, we can use LOTUS to

Of course we prefer working with power functions, and .a b/2 is the lowest power function.
ŽŽ
You’re encouraged to think of an example to see this.


have a nicer formula for it (recall that $\mu = \sum_x x P_X(x)$):

$$
\begin{aligned}
\mathrm{Var}(X) = E[(X-\mu)^2] &= \sum_x (x-\mu)^2 P_X(x)\\
&= \sum_x (x^2 - 2\mu x + \mu^2) P_X(x)\\
&= \sum_x x^2 P_X(x) - 2\mu \sum_x x P_X(x) + \mu^2 \sum_x P_X(x)\\
&= E[X^2] - \mu^2 = E[X^2] - (E[X])^2
\end{aligned}
\qquad (5.10.55)
$$

This formula is useful as we know $E[X]$ (and thus its square) and we know how to compute $E[X^2]$ using the LOTUS. If you want to translate this formula into English, it is: the variance is the mean of the square minus the square of the mean. Eventually, nothing new is needed; it is just a combination of all the things we already know!
Let’s now compute Var.aX C b/. Why? To see if the variance is a linear operator or not.
Denoting Y D aX Cb, then Y D aCb, which is the expected value of Y (from Eq. (5.10.53)).
Now, we can write
Var.Y / D EŒ.Y Y /2  D EŒ.aX C b a b/2 
(5.10.56)
D EŒa2 .X /2  D a2 EŒ.X /2  D a2 Var.X/
Thus, we have
Var.aX C b/ D a2 Var.X/ ¤ aVar.X/ C b (5.10.57)
What else does the above equation tell us? Let’s consider a D 1, that is Y D X C b, then
Var.Y / D Var.X/. Does this make sense? Yes, noting that Y D X C b is a translation of X
(Section 4.2.2), and a translation does not distort the object (or the function), thus the spread of
X is preserved.

Sample variance. Herein we shall meet some terminology from statistics. For example, if we want to find out how much the average Australian earns, we do not want to survey everyone in the population (too many people), so we choose a small number of people from the population. For example, you might select 10 000 people. And that is called a sample. Why 10 000, you're asking? It is not easy to answer that question. That's why a whole field called design of experiments was developed, just to obtain unbiased samples. This is not discussed here.
Ok. Suppose now that we already have a sample with $n$ observations (or measurements) $x_1, x_2, \ldots, x_n$. The question now is: what is the variance of this sample? You might be surprised to see the following§

$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
§ When working with samples, we do not know the probabilities $p_i$, and thus we cannot use the definitions of the mean and the expected value directly. Instead we just include each output $x$ as often as it occurs. We get the empirical mean instead of the expected mean. Similarly we get the empirical variance.


Why $n-1$ and not $n$? In statistics, this is called Bessel's correction, named after Friedrich Bessel. The idea is that we need $S^2$ to match the population variance $\sigma^2$, i.e., to be an unbiased estimator of $\sigma^2$. As shown below, with $n$ in the denominator we cannot achieve this, and that's why $n-1$ is used‘.
Proof. First, we have the following identity (some intermediate steps are skipped; note that $\sum_i x_i = n\bar{x}$):

$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} (x_i^2 - 2x_i\bar{x} + \bar{x}^2) = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 \qquad (5.10.58)$$

Applying that identity to $y_i = x_i - \mu$ (with $\bar{y} = \bar{x} - \mu$) we have

$$\sum_{i=1}^{n} \big[(x_i - \mu) - (\bar{x} - \mu)\big]^2 = \sum_{i=1}^{n} (x_i - \mu)^2 - n(\bar{x} - \mu)^2$$

Now, we compute the expected value of the LHS of the above equation:

$$
\begin{aligned}
E\left[\sum_{i=1}^{n} \big[(x_i - \mu) - (\bar{x} - \mu)\big]^2\right] &= E\left[\sum_{i=1}^{n} (x_i - \mu)^2 - n(\bar{x} - \mu)^2\right]\\
&= \sum_{i=1}^{n} E\big[(x_i - \mu)^2\big] - nE\big[(\bar{x} - \mu)^2\big] \qquad (E[X+Y] = E[X] + E[Y])\\
&= \sum_{i=1}^{n} \mathrm{Var}(x_i) - n\,\mathrm{Var}(\bar{x})
\end{aligned}
\qquad (5.10.59)
$$

Now, we can compute the expected value of $S^2$:

$$
\begin{aligned}
E[S^2] &= E\left[\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2\right] = E\left[\frac{1}{n-1}\sum_{i=1}^{n} \big((x_i-\mu) - (\bar{x}-\mu)\big)^2\right]\\
&= \frac{1}{n-1}\left[\sum_i \mathrm{Var}(x_i) - n\,\mathrm{Var}(\bar{x})\right] \qquad (\text{used Eq. (5.10.59)})
\end{aligned}
$$

Note that as $x_1, x_2, \ldots, x_n$ are a random sample from a distribution with variance $\sigma^2$, we have (check Eq. (5.12.19) for the second result)

$$\mathrm{Var}(x_i) = \sigma^2, \qquad \mathrm{Var}(\bar{x}) = \frac{\sigma^2}{n}$$

Substituting these into $E[S^2]$, we obtain

$$E[S^2] = \frac{1}{n-1}\left[\sum_{i=1}^{n} \sigma^2 - \sigma^2\right] = \frac{1}{n-1}(n\sigma^2 - \sigma^2) = \sigma^2$$

Another explanation that I found is: one degree of freedom was accounted for in the sample mean. But I do
not understand this.


Thus the sample variance coincides with the population variance, which justifies the Bessel
correction. 
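A small simulation makes Bessel's correction tangible. The Julia sketch below (my own) draws many small samples from a fair die roll, whose population variance is 35/12 ≈ 2.9167, and compares the averages of the two estimators:

using Statistics   # for mean

# Compare the biased (divide by n) and unbiased (divide by n-1) variance estimators.
function bessel_demo(; n = 5, trials = 100_000)
    s2_unbiased = 0.0
    s2_biased   = 0.0
    for _ in 1:trials
        x  = rand(1:6, n)
        xb = mean(x)
        ss = sum((xi - xb)^2 for xi in x)
        s2_unbiased += ss / (n - 1)
        s2_biased   += ss / n
    end
    return s2_unbiased / trials, s2_biased / trials
end

unbiased, biased = bessel_demo()
println("population variance = ", 35/12)
println("average of S^2 (divide by n-1) = ", unbiased)   # ≈ 2.92
println("average with divide by n       = ", biased)     # ≈ 2.33, biased low by (n-1)/n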

5.10.9 Expected value and variance of special distributions


Having established the two vital concepts: expected value and variance, we are now going to
compute these parameters for the various discrete distributions considered in this section (e.g. the
binomial distribution). To summarize the results, Table 5.6 lists these quantities for the Bernoulli,
binomial, geometric, Pascal and Poisson distributions.

Table 5.6: Expected value, variance and standard deviation of special distributions (with $q = 1-p$).

X               Meaning                                           E[X]   Var(X)   σ
Bernoulli(p)                                                      p      pq       √(pq)
Binomial(n,p)   n coin tosses, X is # of heads observed           np     npq      √(npq)
Geometric(p)    X is # of coin tosses until a H is observed       1/p    q/p²     √q/p
Pascal(m,p)     X is # of coin tosses until m heads are observed  m/p    mq/p²    √(mq)/p
Poisson(λ)                                                        λ      λ        √λ

How were they computed? Of course by using the definitions of the expected value and variance, and massaging the algebraic expressions until the simplest form is achieved. I am going to give one example.

Example 5.16
Determine the expected value for the geometric distribution with the PMF given by $q^{k-1}p$ for $k = 1, 2, \ldots$ Using Eq. (5.10.48), we can straightforwardly write $E[X]$ as

$$E[X] = \sum_{x_k \in R_X} x_k P_X(x_k) = \sum_{k=1}^{\infty} k\, q^{k-1} p = p\sum_{k=1}^{\infty} k\, q^{k-1}$$

Now, the trouble is the red sum. To attack it, we need to use the geometric series,

$$\sum_{k=0}^{\infty} x^k = \frac{1}{1-x} \Longrightarrow \frac{d}{dx}\sum_{k=0}^{\infty} x^k = \sum_{k=1}^{\infty} k\, x^{k-1} = \frac{1}{(1-x)^2}$$

Thus, we have handled the red term; now come back to $E[X]$:

$$E[X] = p\sum_{k=1}^{\infty} k\, q^{k-1} = p\,\frac{1}{(1-q)^2} = \frac{1}{p}$$
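As a sanity check of this result (and of Table 5.6), the Julia snippet below (mine) simulates the geometric experiment, counting tosses until the first head, and compares the empirical mean with 1/p:

# Empirical check of E[X] = 1/p for the geometric distribution.
function tosses_until_head(p)
    k = 1
    while rand() >= p      # rand() < p means "head"
        k += 1
    end
    return k
end

p, trials = 0.3, 100_000
avg = sum(tosses_until_head(p) for _ in 1:trials) / trials
println("empirical mean = $avg, theoretical 1/p = $(1/p)")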


5.11 Continuous probability models

Whereas a discrete variable is a variable whose value is obtained by counting (e.g. the number
of marbles in a jar, the number of boys in a class and so on), a continuous variable is a variable
whose value is obtained by measuring: for example, the height of students in a class, the weight
of students in a class, the time it takes to get to school. This section is about probability models
for these continuous random variables. The central concept is the probability density function,
the function which gives us the probabilities associated with a continuous random variable (Section 5.11.1).
Then, expected value and variance are discussed in Section 5.11.2. Finally, common continuous
distributions (e.g. the normal distribution) are given in Section 5.11.3.

5.11.1 Probability density function


The table below (Table 5.7) gives the heights of fathers and their sons, based on a famous
experiment by Karl Pearson‘ around 1903. The number of cases is 1 078. Random noise was
added to the original data, to produce heights to the nearest 0.1 inch|| .

Table 5.7: Karl Pearson's height data (in inches).

    Row      Father    Son
    1        65.00     59.80
    2        63.30     63.20
    3        65.00     63.30
    ⋮        ⋮         ⋮
    1077     70.70     69.30
    1078     70.00     67.00

One good way to analyze a continuous data sample (such as the one in Table 5.7) is to use a
histogram. A histogram is built as follows. First, denote the range of the data (e.g. the fathers' heights)
by $[l, m]$, where $l$ and $m$ are the minimal and maximal values of the data. Second, we
"bin" (or "bucket") the range of values, that is, divide the entire range of values into a series of
intervals. Mathematically, the interval $[l, m]$ is partitioned into a finite set of bins $B_1, B_2, \ldots, B_L$.
Third, the relative frequency in each bin is recorded. To this end, let's denote by $n$ the number
of data observations (in the case of Pearson's data, it is 1078); for bin $j$, its frequency $f_j$
is defined (as it should be) as the ratio of the number of data points in this bin to $n$. In symbols, $f_j$ is
written as

Karl Pearson (1857-1936) was a British statistician, leading founder of the modern field of statistics.
||
You can download the data at https://www.randomservices.org/random/data/Pearson.html.


    f_j = \frac{1}{n}\sum_{i=1}^{n} \mathbf{1}\{x_i \in B_j\}, \quad \text{for } j = 1, 2, \ldots, L    (5.11.1)

where $\mathbf{1}\{x_i \in B_j\}$ returns 1 if $x_i$ is in bin $B_j$ and 0 otherwise.
The final step is to plot the bins and $f_j$. A bar plot where the $X$-axis represents the bin
ranges and the $Y$-axis gives information about frequency is used for this. Fig. 5.15a presents a
histogram for the fathers' heights.
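A footnote below points to Listing B.1 and encourages you to code Eq. (5.11.1) yourself. A minimal Julia sketch of that equation is given here; the synthetic data and the number of bins are placeholders (Pearson's data file is not loaded):

    using Random

    Random.seed!(1)
    x = 68.0 .+ 3.0 .* randn(1078)        # stand-in for the 1078 fathers' heights
    n = length(x)
    l, m = minimum(x), maximum(x)
    L = 15                                 # number of bins (arbitrary)
    edges = range(l, m; length=L + 1)      # boundaries of the bins B_1, ..., B_L

    f = zeros(L)                           # relative frequencies f_j of Eq. (5.11.1)
    for xi in x
        j = min(searchsortedlast(edges, xi), L)   # bin that contains xi
        f[j] += 1 / n
    end

    println(round.(f; digits=3))
    println(sum(f))                        # the frequencies add up to 1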

Figure 5.15: Fathers' heights: (a) probability histogram (frequency $f_j$ vs. father's height); (b) cumulative distribution function.

It is useful to assume that the CDF of a continuous random variable is a continuous function;
see Fig. 5.15b to see why. Then, recall from Eq. (5.10.47) that

    P(a < x \le b) = F_X(b) - F_X(a)

And from the fundamental theorem of calculus (Chapter 4), we know that

    F_X(b) - F_X(a) = \int_a^b f_X(x)\,dx, \quad \text{where } f_X(x) = \frac{dF_X(x)}{dx}    (5.11.2)

Thus, we can find the probability that $x$ falls within an interval $[a, b]$ in terms of the new function
$f_X(x)$:

    P(a < x \le b) = \int_a^b f_X(x)\,dx, \quad \text{or} \quad P(a \le x \le b) = \int_a^b f_X(x)\,dx    (5.11.3)

The function $f_X(x)$ is called the probability density function or PDF. Why that name? This is
because $f_X(x) = dF_X(x)/dx$, which is probability per unit length. Note that for a continuous RV

See Listing B.1 for the code. I used Julia packages to compute and plot the histogram. You’re encouraged to
code Eq. (5.11.1) if you want to learn programming.


writing $P(a < x \le b)$ or $P(a \le x \le b)$ is the same because $P(x = a) = 0$. Actually we have
seen something similar (i.e., probability being related to an integral) in Eq. (5.10.40).
The probability density function satisfies the following two properties (which are nothing but
the continuous version of Eq. (5.10.7)):

1. Probabilities are non-negative:

       f_X(x) \ge 0 \quad \text{for all } x \in \mathbb{R}

2. Probabilities sum to one:

       \int_{-\infty}^{+\infty} f_X(x)\,dx = 1    (5.11.4)

Geometrically, this equation indicates that the area under any PDF curve is one.

5.11.2 Expected value and variance

Recall that the expected value and the variance of a discrete RV $X$ are defined as

    E[X] := \sum_k x_k P_X(x_k), \qquad \text{Var}(X) := E[(X - \mu)^2]

From these we get the continuous counterparts, where the sum is replaced by an integral and the
PDF replaces the PMF:

    E[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx, \qquad \text{Var}(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f_X(x)\,dx    (5.11.5)

5.11.3 Special continuous distributions

Uniform distribution. This is the simplest type of continuous distribution. What we want is a
PDF that is constant (i.e., uniform) on the interval $[a, b]$. Because the PDF is constant, it has a
rectangular shape of width $b - a$ and of height $1/(b-a)$. Why? Because the area under any PDF
curve is one (Eq. (5.11.4)). Thus, a continuous random variable $X$ is said to have a uniform
distribution over the interval $[a, b]$, written $X \sim \text{Uniform}(a, b)$, if its PDF is given by

    f_X(x) = \begin{cases} \dfrac{1}{b-a}, & \text{if } a < x < b \\ 0, & \text{otherwise} \end{cases}    (5.11.6)

Standard normal distribution. de Moivre had derived an approximation to the binomial distri-
bution and it involves the exponential function of the form $e^{-x^2}$. Thus, there is a need to evaluate
the following integral (see Eq. (5.10.39)):

    I = \int_{-\infty}^{\infty} e^{-x^2}\,dx


Unfortunately it is impossible to find an antiderivative of $e^{-x^2}$. Note that if the integral were
$\int 2x e^{-x^2}\,dx$, then life would be easier. The key point is the factor $x$ in front of $e^{-x^2}$. If we go to
2D, then we can make this factor appear. Let's compute $I^2$ insteadŽ:

    I^2 = \left(\int_{-\infty}^{\infty} e^{-x^2}\,dx\right)\left(\int_{-\infty}^{\infty} e^{-y^2}\,dy\right) = \iint_{-\infty}^{\infty} e^{-(x^2+y^2)}\,dx\,dy

The next step is to switch to polar coordinates, in which $dx\,dy$ becomes $r\,dr\,d\theta$ (see Sec-
tion 7.8.2), and voilà:

    I^2 = \int_0^{2\pi}\left(\int_0^{\infty} e^{-r^2} r\,dr\right) d\theta = \pi \;\Longrightarrow\; I = \int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}

With that, we can define what is called a standard normal variable as follows. A continuous
random variable $Z$ is said to be a standard normal (or standard Gaussian) random variable,
denoted by $Z \sim N(0, 1)$, if its PDF is given byŽ

    Z \sim N(0,1): \quad f_Z(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right)    (5.11.7)

Why this form? Why not $(1/\sqrt{\pi})\,e^{-z^2}$? That one is also a legitimate PDF; actually it is
the form that Gauss used. However, the one in Eq. (5.11.7) prevails simply because with it the
variance is one (this is to be shown shortly), which is a nice number.
The CDF of a standard normal distribution is

    F_Z(z) = P(Z \le z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} \exp\left(-\frac{u^2}{2}\right) du := \Phi(z)    (5.11.8)

The integral in Eq. (5.11.8) does not have a closed form solution. Nevertheless, because of
the importance of the normal distribution, the values of this integral have been tabulated; see
Table 5.8 for such a tableŽŽ. Nowadays, it is available in calculators and in many programming
languages. Moreover, mathematicians introduced the short notation $\Phi$ to replace the lengthy
integral expression. Fig. 5.16 plots both $f_Z(z)$ and $\Phi(z)$.

To explain the shape of the bell curve, we use calculus. Let's compute the first and second
derivatives of $f_Z(z)$:

    f_Z(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right) \;\Longrightarrow\; f_Z'(z) = -z f_Z(z), \quad f_Z''(z) = (z^2 - 1) f_Z(z)

Thus, $f_Z(z)$ has a maximum at $z = 0$, and $f_Z''(z) < 0$ for $|z| < 1$: the curve is concave there.
And $f_Z''(z) > 0$ for $|z| > 1$: the curve is convex there. That's why the curve has a bell shape.
Ž Yes, sometimes by making a problem harder (here, going from 1D to 2D) we can actually find its solution.
Ž The factor $1/\sqrt{2\pi}$ in front of the exponential function is required because of Eq. (5.11.4).
 This means that there is no antiderivative written in elementary functions. The situation is similar to there being


Figure 5.16: Plot of the standard normal curve (a) and plot of the CDF $\Phi(z)$ (b), which is the area underneath the
normal curve from $-\infty$ to $z$. As the total area under the normal curve is one, half of the area is 0.5, thus
$\Phi(0) = 1/2$. Another property: $\Phi(-z) = 1 - \Phi(z)$. This property is useful as we only need to make the
table of $\Phi(z)$ for $z \ge 0$. Why do we have this property? Plot the normal curve, and mark two points $z$ and $-z$ on
the horizontal axis. Then, $1 - \Phi(z)$ is the area under the curve from $z$ to $\infty$ while $\Phi(-z)$ is the area from
$-\infty$ to $-z$. The normal curve is symmetric, thus the two areas must be equal.

Table 5.8: Table for $\Phi(z)$ for $z \ge 0$.

    z      0.00     0.01     0.02     0.03     0.04     0.05     0.06     0.07     0.08
    0.0    0.5000   0.5040   0.5080   0.5120   0.5160   0.5199   0.5239   0.5279   0.5319
    0.1    0.5398   0.5438   0.5478   0.5517   0.5557   0.5596   0.5636   0.5675   0.5714
    0.2    0.5793   0.5832   0.5871   0.5910   0.5948   0.5987   0.6026   0.6064   0.6103
    ⋮
    2.1    0.9821   0.9826   0.9830   0.9834   0.9838   0.9842   0.9846   0.9850   0.9854
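A footnote below mentions that Table 5.8 was produced with a Julia script. A minimal sketch of the idea, evaluating Eq. (5.11.8) with the composite trapezoidal rule (the step size is an arbitrary choice), could be:

    # Φ(z) = 0.5 + ∫_0^z (1/√(2π)) exp(-u²/2) du, via the trapezoidal rule
    function Phi(z; h=1e-4)
        f(u) = exp(-u^2 / 2) / sqrt(2π)
        N = max(1, ceil(Int, abs(z) / h))
        u = range(0.0, z; length=N + 1)
        return 0.5 + (sum(f.(u)) - 0.5 * (f(u[1]) + f(u[end]))) * step(u)
    end

    for z in (0.0, 0.05, 1.0, 2.1)
        println("Φ($z) ≈ ", round(Phi(z); digits=4))   # compare with Table 5.8
    end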

Now, using Eq. (5.11.5) we're going to find the expected value and the variance of $N(0, 1)$.

no formula in radicals for the roots of a general polynomial of high degree, e.g. five. The non-existence of an
elementary antiderivative of $e^{-x^2}$ was proved by the French mathematician Joseph Liouville (1809 – 1882).
ŽŽ Why do we need this table? It is useful for inverse problems where we need to find $z^*$ such that $\Phi(z^*) = a$ where
$a$ is a given value. This table was generated automatically (even the LATEX code to typeset it) using a Julia script.
For me it was simply a coding exercise for fun.


It can be shown that if $Z \sim N(0, 1)$ then§

    E[Z] = \int_{-\infty}^{\infty} z f_Z(z)\,dz = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} z e^{-z^2/2}\,dz = 0

    \text{Var}(Z) = \int_{-\infty}^{\infty} z^2 f_Z(z)\,dz = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} z^2 e^{-z^2/2}\,dz = 1

Normal distribution. A continuous random variable $X$ is said to be a normal (or Gaussian)
random variable, denoted by $X \sim N(\mu, \sigma^2)$, where $\mu$ is the expected value and $\sigma^2$ is the variance
of $X$, if its PDF is given by

    X \sim N(\mu, \sigma^2): \quad f_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)    (5.11.9)

How did mathematicians come up with the above form of the PDF for the normal dis-
tribution? Here is one way. The standard normal distribution has a mean of zero and a
variance of one, and its graph is centered around $z = 0$. Now, to have a distribution of the
same shape (exponential curve) but with mean $\mu$ and variance $\sigma^2$ different from one,
we need to translate and scale the standard normal curve (Section 4.2.2). This is achieved
with $X = \sigma Z + \mu$ where $Z \sim N(0, 1)$. We can see that

    E[X] = E[\sigma Z + \mu] = \sigma E[Z] + \mu = \sigma \cdot 0 + \mu = \mu
    \text{Var}(X) = \text{Var}(\sigma Z + \mu) = \sigma^2 \text{Var}(Z) = \sigma^2

So far so good; now, to get Eq. (5.11.9), we start with the CDF of $X$:

    F_X(x) = P(X \le x) = P(\sigma Z + \mu \le x) = P\left(Z \le \frac{x-\mu}{\sigma}\right) = \Phi\left(\frac{x-\mu}{\sigma}\right)

From that we can determine the PDF of $X$:

    f_X(x) = \frac{d}{dx} F_X(x) = \frac{d}{dx} \Phi\left(\frac{x-\mu}{\sigma}\right) = \frac{1}{\sigma} \Phi'\left(\frac{x-\mu}{\sigma}\right) = \frac{1}{\sigma} f_Z\left(\frac{x-\mu}{\sigma}\right)

Figure 5.17 shows this translating (with $x - \mu$) and scaling (with $(x-\mu)/\sigma$). Now we can
write the CDF:

    F_X(x) = P(X \le x) = \Phi\left(\frac{x-\mu}{\sigma}\right)    (5.11.10)

And thus we can compute $P(a \le X \le b)$ as

    P(a \le X \le b) = \Phi\left(\frac{b-\mu}{\sigma}\right) - \Phi\left(\frac{a-\mu}{\sigma}\right)    (5.11.11)
§ The first integral is zero because the integrand is an odd function. For the second integral, use integration
by parts.


Figure 5.17: Transformation of the standard normal curve $N(0,1)$ to get a normal curve ($N(2,2)$ here) with $\mu \ne 0$ and $\sigma \ne 1$.
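A quick numerical check of Eq. (5.11.11) and of the transformation $X = \sigma Z + \mu$: the minimal Julia sketch below (with arbitrary $\mu$ and $\sigma$) estimates $P(\mu - \sigma \le X \le \mu + \sigma)$ by Monte Carlo, which should approach $\Phi(1) - \Phi(-1) = 2\Phi(1) - 1 \approx 0.6827$ whatever $\mu$ and $\sigma$ are.

    using Random, Statistics

    Random.seed!(7)
    μ, σ = 2.0, 1.5                  # arbitrary mean and standard deviation
    a, b = μ - σ, μ + σ              # one SD around the mean

    x = μ .+ σ .* randn(10^6)        # samples of X = σZ + μ with Z ~ N(0,1)
    println(mean(a .<= x .<= b))     # ≈ Φ(1) - Φ(-1) ≈ 0.6827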

5.12 Joint discrete distributions


So far we have dealt with only a single variable; now it is time to consider more than one
variable. Life is complicated, and more often than not we have to deal with many variables. For example,
we might measure the height and weight of horses, or the IQ and birth weight of children, or the
frequency of exercise and the rate of heart disease in adults.
In such situations the random variables have a joint distribution that allows us to compute
probabilities of events involving both variables and understand the relationship between the
variables. The situation is simplest when the variables are independent. When they are not, we
use covariance and correlation as measures of the dependence between them.

5.12.1 Two jointly discrete variables


Suppose that we have two discrete random variables X and Y , and that X takes values
fx1 ; x2 ; : : : ; xn g and Y takes values fy1 ; y2 ; : : : ; ym g. The pair .X; Y / take values in the Carte-
sian product f.x1 ; y1 /; .x1 ; y2 /; : : : ; .xn ; ym /g; that is the joint range of X and Y : RX Y . Now,
we’re interested in the probability of the event X D xi and Y D yj , for xi ; yj in the ranges
of X; Y , respectively. We can define the so-called joint probability mass function, denoted by
PX Y .xi ; yj / as:
PX Y .xi ; yj / D P .X D xi ; Y D yj / D P ..X D xi / and .Y D yj // (5.12.1)
which gives us the probability of the joint outcome X D xi ; Y D yj .
Table 5.9 gives an example of a joint PMF. The joint PMF contains all the information
regarding the distributions of X and Y . This means that, for example, we can obtain the PMF of
X and Y from the joint PMF. For example, the probability that X D 129 is computed as, using
the law of total probability (Eq. (5.8.6))Ž
P .X D 129/ D P .X D 129; Y D 15/ C P .X D 129; Y D 16/ D 0:12 C 0:08 D 0:20
Ž
If not clear, see Y D 15 as B1 and Y D 16 as B2 .


Similarly, we computed P .X D 130/, and P .X D 131/. And we put them in the margins of
the original joint PFM table (Table 5.10). Because of this, the probability mass functions for X
and Y are often referred to as the Marginal Distributions for X and Y .
With that example, we now give the definition of the marginal distribution for X (the one for
Y is similar): X
PX .x/ WD PX Y .x; yj / for any x 2 RX (5.12.2)
yj 2RY

Table 5.9: Example of a joint PMF.

    y \ x    129    130    131
    15       0.12   0.42   0.06
    16       0.08   0.28   0.04

Table 5.10: Marginal PMFs obtained from the joint PMF.

    y \ x       129    130    131    P_Y(y_j)
    15          0.12   0.42   0.06   0.60
    16          0.08   0.28   0.04   0.40
    P_X(x_i)    0.20   0.70   0.10
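Eq. (5.12.2) amounts to summing the joint PMF table along rows or columns. A minimal Julia sketch with the numbers of Table 5.9 (rows indexed by $y = 15, 16$ and columns by $x = 129, 130, 131$):

    # joint PMF P_XY: rows correspond to y = 15, 16; columns to x = 129, 130, 131
    P = [0.12 0.42 0.06;
         0.08 0.28 0.04]

    PX = vec(sum(P, dims=1))    # marginal of X: [0.20, 0.70, 0.10]
    PY = vec(sum(P, dims=2))    # marginal of Y: [0.60, 0.40]

    println("P_X = ", PX)
    println("P_Y = ", PY)
    println(sum(P))             # the joint PMF sums to 1, cf. Eq. (5.12.3)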

A joint probability mass function is a probability, thus it has to satisfy the two properties in
Eq. (5.10.7):

    0 \le P_{XY}(x_i, y_j) \le 1, \qquad \sum_{i=1}^{n}\sum_{j=1}^{m} P_{XY}(x_i, y_j) = 1    (5.12.3)

Recall that the cumulative distribution function of a random variable $X$ is defined as

    F_X(x) := P(X \le x), \quad \text{for all } x \in \mathbb{R}

From this, we can, in a similar manner, define the joint cumulative distribution function for $X$
and $Y$:

    F_{XY}(x, y) := P(X \le x, Y \le y), \quad \text{for all } x, y \in \mathbb{R}    (5.12.4)

And of course, from the joint CDF $F_{XY}(x, y)$ we can determine the marginal CDFs of $X$ and $Y$:

    F_X(x) := P(X \le x, Y \le \infty) = \lim_{y \to \infty} F_{XY}(x, y)
    F_Y(y) := P(X \le \infty, Y \le y) = \lim_{x \to \infty} F_{XY}(x, y)    (5.12.5)

5.12.2 Conditional PMF and CDF


Suppose that we roll a die. Let X be the observed number. Find the probability mass function of
X given that we know the observed number was less than 5. Such a PMF is called a conditional
PMF. Specifically, the conditional PMF of $X$ given event $A$ is defined as

    P_{X|A}(x_i) = P(X = x_i \mid A) = \frac{P(X = x_i \text{ and } A)}{P(A)}    (5.12.6)

where Eq. (5.8.1) was used in the last equality. Now, instead of event $A$, we consider the case where it
is the event $Y = y_j$; then we have the conditional PMF of $X$ given $Y$:

    P_{X|Y}(x_i | y_j) = \frac{P(X = x_i \text{ and } Y = y_j)}{P(Y = y_j)} = \frac{P_{XY}(x_i, y_j)}{P_Y(y_j)}    (5.12.7)


5.12.3 Independence
Roll two dice. Let X be the number on the first die and let Y be the number on the second die.
Then both X and Y take values 1 to 6 and the joint pmf is PX Y .i; j / D 1=36 for all i and j
between 1 and 6. The joint probability table is shown in Table 5.11. It is obvious that the two
events X and Y are independent. Now, look at the mentioned table, we can observe that

PX Y .x D xi ; y D yj / D PX .xi /PY .yj /; i; j D 1; 2; : : : ; 6 (5.12.8)

(1=36 D .1=6/.1=6/). So, we suspect that for two random variables X and Y to be independent,
we should haveŽ

PX Y .x D xi ; y D yj / D PX .xi /PY .yj / 8.xi ; yj /

To check this definition of independence of two discrete RVs, we are going to use it to prove that
the conditional PMF is equal to the marginal PMF. In other words, knowing the value of $Y$ does
not provide any information about $X$. That is, we need to prove $P_{X|Y}(x_i|y_j) = P_X(x_i)$. Indeed,

    P_{X|Y}(x_i | y_j) = \frac{P_{XY}(x_i, y_j)}{P_Y(y_j)} = \frac{P_X(x_i) P_Y(y_j)}{P_Y(y_j)} = P_X(x_i)


Table 5.11: Joint PMF of rolling two dice.

    y \ x      1      2      3      4      5      6      P_Y(y_j)
    1          1/36   1/36   1/36   1/36   1/36   1/36   1/6
    2          1/36   1/36   1/36   1/36   1/36   1/36   1/6
    3          1/36   1/36   1/36   1/36   1/36   1/36   1/6
    4          1/36   1/36   1/36   1/36   1/36   1/36   1/6
    5          1/36   1/36   1/36   1/36   1/36   1/36   1/6
    6          1/36   1/36   1/36   1/36   1/36   1/36   1/6
    P_X(x_i)   1/6    1/6    1/6    1/6    1/6    1/6

5.12.4 Conditional expectation


We have probability and conditional probability. It is thus natural to have conditional expectation.
Given a random variable $X$ and an event $A$ that has occurred, the conditional expectation of $X$,
denoted by $E[X|A]$, is defined as

    E[X] = \sum_k x_k P_X(x_k) \;\Longrightarrow\; E[X|A] = \sum_k x_k P_{X|A}(x_k)    (5.12.9)

Ž
Actually we know that for two independent events A and B, P .A \ B/ D P .A/P .B/.


which is nothing but the ordinary expectation in which the normal PMF is replaced by the
conditional PMF. If A is the event Y D y, then we have EŒXjY D y:
    E[X | Y = y] = \sum_k x_k P_{X|Y}(x_k | y)    (5.12.10)
Obviously EŒXjY D y is a real number, and it depends on y. Suppose now that Y D
fy1 ; y2 ; : : : ; ym g, then for each yi we have a corresponding EŒX jY D yi . So, EŒX jY  is a
function of Y . As Y is a random variable, so is EŒXjY . It is natural to consider the expected
value of EŒXjY , that is EŒEŒXjY  and so on. Let’s consider one example first to see what is
EŒEŒXjY .

Example 5.17
Consider the table below. Let $Z = E[X|Y]$.
 find the conditional PMFs of $X$ given $Y = 0$ and $Y = 1$, i.e., $P_{X|Y}(x|0)$ and $P_{X|Y}(x|1)$
 find the PMF of $Z$
 find the expected value of $Z$, i.e., $E[Z]$

    y \ x      0      1      P_Y(y_j)
    0          1/5    2/5    3/5
    1          2/5    0      2/5
    P_X(x_i)   3/5    2/5

 Using Eq. (5.12.7), we compute $P_{X|Y}(0|0)$ as

    P_{X|Y}(0|0) = \frac{P_{XY}(0,0)}{P_Y(0)} = \frac{1/5}{3/5} = \frac{1}{3}

And from that $P_{X|Y}(1|0) = 1 - 1/3 = 2/3$. In the same manner, we have

    P_{X|Y}(0|1) = \frac{P_{XY}(0,1)}{P_Y(1)} = \frac{2/5}{2/5} = 1, \qquad P_{X|Y}(1|1) = 0

 Using Eq. (5.12.10) and noting that $Y \in \{0, 1\}$, we can obtain

    Z = E[X|Y] = \begin{cases} 2/3, & \text{with probability } 3/5 \\ 0, & \text{with probability } 2/5 \end{cases}

Therefore, we can write $P_Z(z)$ as

    P_Z(z) = \begin{cases} 3/5, & \text{if } z = 2/3 \\ 2/5, & \text{if } z = 0 \\ 0, & \text{otherwise} \end{cases}

And compute $E[Z] = (2/3)(3/5) + 0 \cdot (2/5) = 2/5$. Noting that $E[X]$ is also $2/5$.
Thus, $E[E[X|Y]] = E[X]$, at least for this example.
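The bookkeeping of this example is easy to automate. The following minimal Julia sketch recomputes $E[X|Y=y]$, $E[Z] = E[E[X|Y]]$ and $E[X]$ directly from the joint PMF table:

    # joint PMF of Example 5.17: rows are y = 0, 1; columns are x = 0, 1
    P  = [1/5 2/5;
          2/5 0.0]
    xs = [0.0, 1.0]

    PY = vec(sum(P, dims=2))                                  # marginal of Y
    EX_given_Y = [sum(xs .* P[j, :]) / PY[j] for j in 1:2]    # E[X | Y = y_j]

    EZ = sum(EX_given_Y .* PY)                 # E[ E[X|Y] ]
    EX = sum(xs .* vec(sum(P, dims=1)))        # E[X] from the marginal of X

    println(EX_given_Y)      # [2/3, 0.0]
    println(EZ, "  ", EX)    # both equal 2/5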


What we have seen in this example, that $E[X] = E[E[X|Y]]$, is known as the law of iterated
expectation:

    \text{The law of iterated expectation:} \quad E[X] = E[E[X|Y]]    (5.12.11)

Proof. We go from $E[E[X|Y]]$ to $E[X]$ using the definitions of the conditional expectation and
conditional probability:

    E[E[X|Y]] = \sum_y E[X|Y = y]\, P_Y(Y = y)    (def.)
              = \sum_y \sum_x x P_{X|Y}(x|y) P_Y(y)    (def. of red term)
              = \sum_y \sum_x x \frac{P_{XY}(x, y)}{P_Y(y)} P_Y(y)    (Eq. (5.12.7))
              = \sum_y \sum_x x P_{XY}(x, y) = \sum_x \sum_y x P_{XY}(x, y)    (switch sum order)
              = \sum_x x \left(\sum_y P_{XY}(x, y)\right) = \sum_x x P_X(x)    (Eq. (5.12.2))

If the step in which I switched the sums was not clear, just consider a concrete case in which
$X = \{x_1, x_2\}$ and $Y = \{y_1, y_2\}$; then you will see that $\sum_y \sum_x \cdot = \sum_x \sum_y \cdot$. Noting that
$\sum_x x P_X(x) = E[X]$. That's the end of our proof.
And from Eq. (5.12.11) we get a new way to compute $E[X]$. Surprisingly it is similar to the
law of total probability, see Eq. (5.8.6); it is often called the law of total expectation:

    \text{The law of total expectation:} \quad E[X] = \sum_y E[X|Y = y]\, P_Y(Y = y)    (5.12.12)

Expectation for independent random variables. The law of the unconscious statistician (LOTUS) for two
discrete random variables is:

    E[g(X, Y)] = \sum_{(x_i, y_j) \in R_{XY}} g(x_i, y_j) P_{XY}(x_i, y_j) = \sum_i \sum_j g(x_i, y_j) P_{XY}(x_i, y_j)    (5.12.13)

Now, we use the LOTUS to derive this result: if $X, Y$ are two independent RVs, then
$E[XY] = E[X]E[Y]$.
Proof. The proof applies Eq. (5.12.13) to the function $g = XY$, then uses $P_{XY}(x_i, y_j) =
P_X(x_i) P_Y(y_j)$ for the two independent RVs $X$ and $Y$:

    E[XY] = \sum_i \sum_j x_i y_j P_{XY}(x_i, y_j)
          = \sum_i \sum_j x_i y_j P_X(x_i) P_Y(y_j)    (5.12.14)
          = \left(\sum_i x_i P_X(x_i)\right)\left(\sum_j y_j P_Y(y_j)\right) = E[X]E[Y]


Definition 5.12.1
Random variables X1 ; X2 ; : : : ; Xn are said to be independent and identically distributed (i.i.d)
if they are independent and they have the same (marginal) distributions:

FX1 .x/ D FX2 .x/ D : : : D FXn .x/; 8x 2 R

5.12.5 Covariance
For two jointly distributed real-valued random variables $X$ and $Y$, we know that $E[X + Y] =
E[X] + E[Y]$. The question is: what about the variance of the sum, i.e., $\text{Var}(X + Y)$? Let's see
what we get. We start with the definition and use the linearity of the expectationŽ:

    \text{Var}(X + Y) = E[((X + Y) - E[X + Y])^2]    (def. Eq. (5.10.54))
                      = E[(X + Y - E[X] - E[Y])^2]    (linearity of E[X + Y])
                      = E[(X - E[X])^2] + E[(Y - E[Y])^2] + 2E[(X - E[X])(Y - E[Y])]
                      = \text{Var}(X) + \text{Var}(Y) + 2E[(X - E[X])(Y - E[Y])]

The variance of a sum is the sum of the variances plus something new: the red term. Let's massage
it and see what we get (recalling that $E[aX] = aE[X]$ from Eq. (5.10.53)):

    E[(X - E[X])(Y - E[Y])] = E[XY - X E[Y] - Y E[X] + E[X]E[Y]]
                            = E[XY] - E[Y]E[X] - E[X]E[Y] + E[X]E[Y]
                            = E[XY] - E[X]E[Y]

If $X = Y$, then the above becomes the variance of $Y$ (or of $X$; if not clear, check Eq. (5.10.55)).
And we know that, see Eq. (5.12.14), if $X, Y$ are independent, then $E[XY] = E[X]E[Y]$, and
the red term vanishes. So, what do we call the red term? We call it the covariance of $X$ and $Y$,
denoted by $\text{Cov}(X, Y)$ or $\sigma_{XY}$:

    \text{Cov}(X, Y) = \sigma_{XY} = E[(X - E[X])(Y - E[Y])] = E[XY] - E[X]E[Y]    (5.12.15)

Thus, the variance is a measure of the spread of a single variable w.r.t. its mean, while the
covariance is a measure of the joint variability of two variables. The covariance is in units obtained by multiplying the
units of the two variables. What are we going to do now? Compute some covariances? That's
important but not interesting: the software Microsoft Excel can do that. As usual in maths, we
will deduce properties of the covariance before actually computing it!

Properties of the covariance. The covariance can be seen as an operator with two inputs and it
looks similar to the dot product of two vectors. If we look at the properties of the dot product in
Ž The step from the 2nd equality to the third is: $(X + Y - E[X] - E[Y])^2 = [(X - E[X]) + (Y - E[Y])]^2 = \cdots$
using $(a + b)^2 = a^2 + b^2 + 2ab$; finally the linearity of the expected value, $E[X + Y] = E[X] + E[Y]$, is used again.


Box 11.2, we guess the following are true (the last one not coming from the dot product though):

    (a) Cov(X, X) = Var(X)
    (b) commutative law: Cov(X, Y) = Cov(Y, X)
    (c) distributive law: Cov(Z, X + Y) = Cov(Z, X) + Cov(Z, Y)    (5.12.16)
    (d) Cov(\alpha X, Y) = \alpha\, Cov(X, Y)
    (e) Cov(X + c, Y) = Cov(X, Y)

The proof is skipped as it is 100% based on the definition of the covariance, i.e., Eq. (5.12.15). The
first property says: if $Y$ always takes on the same values as $X$, we have the covariance of a variable
with itself (i.e., $\sigma_{XX}$), which is nothing but the variance.

Example 5.18
We consider the data given in Table 5.9 and use Eq. (5.12.15) to compute $\sigma_{XY}$. First, we need
the sample means: $\bar{X} = (129 + 130 + 131)/3 = 130$ and $\bar{Y} = (15 + 16)/2 = 15.5$. Then,
$\sigma_{XY}$ can be computed as, using the LOTUS Eq. (5.12.13):

    \sigma_{XY} = \sum_{i=1}^{3}\sum_{j=1}^{2} (X_i - \bar{X})(Y_j - \bar{Y}) P_{ij}
                = (129 - 130)(15 - 15.5)(0.12) + (129 - 130)(16 - 15.5)(0.08)
                + (130 - 130)(15 - 15.5)(0.42) + (130 - 130)(16 - 15.5)(0.28)
                + (131 - 130)(15 - 15.5)(0.06) + (131 - 130)(16 - 15.5)(0.04)

5.12.6 Variance of a sum of variables


Suppose we have a sum of several random variables, in particular $Y = X_1 + \cdots + X_n$. The
question is: what is $\text{Var}(Y)$? If $Y$ is just the sum of two variables, then we know thatŽŽ

    \text{Var}(X_1 + X_2) = \text{Var}(X_1) + \text{Var}(X_2) + 2\text{Cov}(X_1, X_2)

With that, it is only a small step to the general case $Y = \sum_{i=1}^{n} X_i$. It might help if we go
slowly with $n = 3$, i.e., $Y = X_1 + X_2 + X_3$; then $\text{Var}(Y) = \text{Cov}(Y, Y)$ can be written as

    \text{Var}(Y) = \text{Cov}(X_1 + X_2 + X_3, X_1 + X_2 + X_3)
                  = \text{Cov}(X_1, X_1 + X_2 + X_3) + \text{Cov}(X_2, X_1 + X_2 + X_3) + \text{Cov}(X_3, X_1 + X_2 + X_3)
                  = \text{Cov}(X_1, X_1) + \text{Cov}(X_1, X_2) + \text{Cov}(X_1, X_3) + \text{Cov}(X_2, X_1 + X_2 + X_3)
                    + \text{Cov}(X_3, X_1 + X_2 + X_3)
                  = \text{Var}(X_1) + \text{Var}(X_2) + \text{Var}(X_3) + 2\text{Cov}(X_1, X_2) + 2\text{Cov}(X_1, X_3) + 2\text{Cov}(X_2, X_3)    (5.12.17)
ŽŽ We can get this formula as follows: $\text{Var}(Y) = \text{Cov}(Y, Y)$; then use the distributive law of the covariance operator.


where in the second equality we used the distributive property in Eq. (5.12.16). Then, this
property is used again in the third equality. Doing the same thing for $\text{Cov}(X_2, X_1 + X_2 + X_3)$
and $\text{Cov}(X_3, X_1 + X_2 + X_3)$ we then obtain the final expression for $\text{Var}(Y)$. Now, we can go
to the general case (just Eq. (5.12.17) but with $\sum$ notation):

    \text{Var}(Y) = \text{Cov}(Y, Y) = \text{Cov}\left(\sum_{i=1}^{n} X_i, \sum_{j=1}^{n} X_j\right) = \sum_{i=1}^{n}\sum_{j=1}^{n} \text{Cov}(X_i, X_j)
                  = \sum_{i=1}^{n} \text{Var}(X_i) + 2\sum_{i<j} \text{Cov}(X_i, X_j), \quad \text{with } Y = X_1 + \cdots + X_n    (5.12.18)

If the $X_i$ are uncorrelated, all the $\text{Cov}(X_i, X_j)$ terms vanish, and thus we get the nice identityŽŽ

    \text{Var}\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} \text{Var}(X_i)    (5.12.19)

This statement is called the Bienaymé formula and was discovered in 1853. From it we can
deduce that $\text{Var}(\bar{X}) = \sigma^2/n$.
For any constants $a_1, \ldots, a_n$ and $b_1, \ldots, b_m$, we have

    \text{Cov}\left(\sum_{i=1}^{n} a_i X_i, \sum_{j=1}^{m} b_j Y_j\right) = \sum_{i=1}^{n}\sum_{j=1}^{m} a_i b_j \text{Cov}(X_i, Y_j)

5.12.7 Correlation coefficient


It is better to work with dimensionless quantities; we thus introduce the following dimensionless
variables $U$ and $V$:

    U = \frac{X - E[X]}{\sigma_X}, \qquad V = \frac{Y - E[Y]}{\sigma_Y}

Then, we compute the covariance of $U, V$, i.e., $\text{Cov}(U, V)$, and give it a name and a symbol:

    \rho_{XY} := \text{Cov}(U, V) = \text{Cov}\left(\frac{X - E[X]}{\sigma_X}, \frac{Y - E[Y]}{\sigma_Y}\right) \overset{(a)}{=} \text{Cov}\left(\frac{X}{\sigma_X}, \frac{Y}{\sigma_Y}\right) \overset{(b)}{=} \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y}

where, for (a) I used property (e) in Eq. (5.12.16). And for (b), I used the definition of the
covariance in Eq. (5.12.15) and the property $E[\alpha X] = \alpha E[X]$. The symbol $\rho_{XY}$ denotes
the correlation coefficient of $X$ and $Y$. It is a coefficient as it is dimensionless. So, $\rho_{XY}$ is the
covariance between the standardized versions of $X$ and $Y$.
ŽŽ
In English this rule is familiar: the var of sum is the sum of var. We have similar rules for the derivative, the
limit etc.

Irénée-Jules Bienaymé (1796 – 1878) was a French statistician. He built on the legacy of Laplace generalizing
his least squares method. He contributed to the fields of probability and statistics, and to their application to finance,
demography and social sciences.


Now, we are going to show that $-1 \le \rho_{XY} \le 1$. The proof uses Eq. (5.12.18) to compute
the variance of $X/\sigma_X \pm Y/\sigma_Y$:

    \text{Var}\left(\frac{X}{\sigma_X} \pm \frac{Y}{\sigma_Y}\right) = \text{Var}\left(\frac{X}{\sigma_X}\right) + \text{Var}\left(\frac{Y}{\sigma_Y}\right) \pm 2\text{Cov}\left(\frac{X}{\sigma_X}, \frac{Y}{\sigma_Y}\right)
        \overset{(a)}{=} \frac{1}{\sigma_X^2}\text{Var}(X) + \frac{1}{\sigma_Y^2}\text{Var}(Y) \pm \frac{2}{\sigma_X \sigma_Y}\text{Cov}(X, Y)    (5.12.20)
        = 2 \pm 2\rho_{XY}    (def. of \rho_{XY})

where, for (a) I used Eq. (5.10.57) (i.e., $\text{Var}(aX + b) = a^2\text{Var}(X)$).
But the variance of $X/\sigma_X \pm Y/\sigma_Y$ is non-negative, thus

    0 \le 2 \pm 2\rho_{XY} \;\Longrightarrow\; -1 \le \rho_{XY} \le 1

5.12.8 Covariance matrix


If we have two variables $X, Y$ we have one single $\text{Cov}(X, Y)$; what if we have more than two
variables? Let's investigate the case of three variables $X$, $Y$ and $Z$. Of course, we would have
$\text{Cov}(X, Y)$, $\text{Cov}(X, Z)$, $\text{Cov}(Y, Z)$, and so on. And if we put all of them in a matrix, we get the
so-called covariance matrix:

    \mathbf{C} = \begin{bmatrix} \text{Cov}(X,X) & \text{Cov}(X,Y) & \text{Cov}(X,Z) \\ \text{Cov}(X,Y) & \text{Cov}(Y,Y) & \text{Cov}(Y,Z) \\ \text{Cov}(X,Z) & \text{Cov}(Y,Z) & \text{Cov}(Z,Z) \end{bmatrix}

Note that the diagonal terms are the variances of the variables. Obviously this matrix $\mathbf{C}$ is a
symmetric matrix. Is that all we know about it? It turns out that there is also another property
hidden there. Let's investigate it. A $2 \times 2$ covariance matrix is sufficient to reveal the secret.
Without loss of generality, we consider only discrete random variables $X, Y$ with means $\bar{X}$ and
$\bar{Y}$, respectively. Thus, we have

    \mathbf{C} = \begin{bmatrix} \text{Cov}(X,X) & \text{Cov}(X,Y) \\ \text{Cov}(X,Y) & \text{Cov}(Y,Y) \end{bmatrix}, \qquad
    \begin{cases} \text{Cov}(X,X) = \sum_i P_i (X_i - \bar{X})^2 \\ \text{Cov}(X,Y) = \sum_i \sum_j P_{ij} (X_i - \bar{X})(Y_j - \bar{Y}) \end{cases}

There is a non-symmetry in the formulas of $\text{Cov}(X,X)$ and $\text{Cov}(X,Y)$: there is no $P_{ij}$ in the
former! Let's make it appear and something wonderful will show up (this is due to $P_i = \sum_j P_{ij}$;
check the marginal probability if this was not clear):

    \text{Cov}(X,X) = \sum_i P_i (X_i - \bar{X})^2 = \sum_i \sum_j P_{ij} (X_i - \bar{X})^2


With that, we can have a beautiful formula for $\mathbf{C}$, in which $\mathbf{C}$ is a sum of a bunch of matrices,
each matrix multiplied by a non-negative number (i.e., $P_{ij}$):

    \mathbf{C} = \sum_i \sum_j P_{ij} \begin{bmatrix} (X_i - \bar{X})^2 & (X_i - \bar{X})(Y_j - \bar{Y}) \\ (X_i - \bar{X})(Y_j - \bar{Y}) & (Y_j - \bar{Y})^2 \end{bmatrix}

What is special about the red matrix? It is equal to $\mathbf{U}\mathbf{U}^\top$, where $\mathbf{U} = (X_i - \bar{X}, Y_j - \bar{Y})$.
So what? Every matrix $\mathbf{U}\mathbf{U}^\top$ is positive semidefiniteŽŽ. Thus, $\mathbf{C}$ combines all these positive
semidefinite matrices with weights $P_{ij} \ge 0$: it is positive semidefinite. This turns out to be a
useful property, exploited in principal component analysis, which is an important tool in
statistics. Check Section 6.7 for a discussion on this topic.

Sample covariance. If we have $n$ samples and each sample has two measurements $X$ and $Y$,
hence we have $X = (x_1, \ldots, x_n)$ and $Y = (y_1, \ldots, y_n)$, then the sample covariance between
$X$ and $Y$ is defined as (noting the Bessel correction $n - 1$ in the denominator)

    \text{Cov}(X, Y) = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})    (5.12.21)

What does that actually mean? Assume that $X$ denotes the number of hours studied for
a subject and $Y$ is the mark obtained in that subject. We can use real data to compute the
covariance, and assume that the value is 90.34. What does this value mean? A positive value
of covariance indicates that both variables increase or decrease together, e.g. as the number of
hours studied increases, the grade also increases. A negative value, on the other hand, means that
while one variable increases the other decreases, or vice versa. And if the covariance is zero, the
two variables are uncorrelated.
Now, we derive the formula for the covariance matrix of the whole data set. We start with the
sample means:

    X: \; x_1 \; x_2 \; \cdots \; x_n, \qquad \bar{x} = \frac{1}{n}\sum_i x_i
    Y: \; y_1 \; y_2 \; \cdots \; y_n, \qquad \bar{y} = \frac{1}{n}\sum_i y_i

Then, we subtract the means from the data, to center the data:

    \mathbf{A} = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \\ y_1 & y_2 & \cdots & y_n \end{bmatrix} \;\Longrightarrow\;
    \bar{\mathbf{A}} = \begin{bmatrix} x_1 - \bar{x} & x_2 - \bar{x} & \cdots & x_n - \bar{x} \\ y_1 - \bar{y} & y_2 - \bar{y} & \cdots & y_n - \bar{y} \end{bmatrix}

And the covariance matrix is given by

    \mathbf{C} = \frac{1}{n-1}\,\bar{\mathbf{A}}\bar{\mathbf{A}}^\top
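A minimal Julia sketch of this recipe on made-up data (hours studied versus marks; the numbers are invented for illustration), comparing the hand-built matrix with the cov and var functions of the Statistics standard library:

    using Statistics

    x = [2.0, 4.0, 6.0, 8.0, 10.0]       # hours studied (invented data)
    y = [50.0, 58.0, 67.0, 80.0, 85.0]   # marks obtained (invented data)
    n = length(x)

    A = [(x .- mean(x))';                # centered data matrix (one variable per row)
         (y .- mean(y))']
    C = A * A' / (n - 1)                 # 2×2 sample covariance matrix

    println(C)
    println(cov(x, y))                   # equals C[1,2] (and C[2,1])
    println(var(x), "  ", var(y))        # equal the diagonal entries of C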

ŽŽ Check Section 11.10.6 for quadratic forms and positive definiteness of matrices. The proof goes:
$\mathbf{x}^\top(\mathbf{U}\mathbf{U}^\top)\mathbf{x} = \|\mathbf{U}^\top \mathbf{x}\|^2 \ge 0$.


5.13 Joint continuous variables


Similar to the probability of a single random variable, see Eq. (5.11.3), for two continuous RVs
$X$ and $Y$ we have (Fig. 5.18)

    P(a \le x \le b, \; c \le y \le d) = \int_a^b \int_c^d f_{XY}(x, y)\,dy\,dx, \qquad P((X, Y) \in A) = \iint_A f_{XY}(x, y)\,dx\,dy    (5.13.1)

And we call $f_{XY}(x, y)$ the joint probability density function.

Figure 5.18: From a probability density function $f_X(x)$ to a joint probability density function $f_{XY}(x, y)$.

5.14 Transforming density functions


Suppose that we have a continuous random variable $X$ with known CDF $F_X(x)$ and PDF $f_X(x)$.
Assume that we're interested in a derived random variable $Y$ defined as $Y = h(X)$ for some
function $y = h(x)$. The question is: what is the PDF of $Y$, i.e., $f_Y(y)$?
There are two ways to answer this question. In the first way, we first determine the CDF of
$Y$ in terms of $F_X(x)$, then we differentiate it to get $f_Y(y)$. Assume now that $y = h(x)$ is a
strictly increasing function. Thus, we have the inverse $x = g(y)$. Then, we can write $F_Y(y)$ as

    F_Y(y) = P(Y \le y) = P(h(X) \le y)
           = P(X \le g(y))    (y = h(x) is an increasing func.)    (5.14.1)
           = F_X(g(y))

Next, we're going to differentiate $F_Y(y)$ to get $f_Y(y)$:

    f_Y(y) = \frac{d}{dy} F_Y(y) = \frac{d}{dy} F_X(g(y))    (Eq. (5.14.1))
           = F_X'(g(y))\, g'(y) = f_X(g(y))\, g'(y)    (chain rule)    (5.14.2)

If the function $y = h(x)$ is a decreasing function, then by repeating this procedure we will get
$f_Y(y) = -f_X(g(y))\, g'(y)$, which is non-negative as $g'(y) \le 0$. So,

    Y = h(X) \;\Longrightarrow\; f_Y(y) = f_X(g(y))\, |g'(y)|    (5.14.3)


I present now the second way, which uses the fact that $P(a \le X \le b) = \int_a^b f_X(x)\,dx$. Again
suppose that $y = h(x)$ is such a function that for $x \in [a, b]$, $y \in [h(a), h(b)]$. We have

    P(h(a) \le Y \le h(b)) = P(a \le X \le b) = \int_a^b f_X(x)\,dx    (5.14.4)

Now comes the change of variable $x = g(y)$; substituting it into the above integral we obtain

    P(h(a) \le Y \le h(b)) = \int_{h(a)}^{h(b)} f_X(g(y))\, g'(y)\,dy    (5.14.5)
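As a sanity check of Eq. (5.14.3), take $X \sim \text{Uniform}(0,1)$ and $Y = h(X) = X^2$, so that $g(y) = \sqrt{y}$ and $f_Y(y) = f_X(\sqrt{y})\,|g'(y)| = 1/(2\sqrt{y})$ on $(0,1)$. The minimal Julia sketch below (interval endpoints chosen arbitrarily) compares a Monte Carlo estimate of $P(0.25 \le Y \le 0.5)$ with the value obtained from this density.

    using Random, Statistics

    Random.seed!(3)
    y = rand(10^6) .^ 2                 # samples of Y = X² with X ~ Uniform(0,1)

    a, b = 0.25, 0.5
    p_mc    = mean(a .<= y .<= b)       # Monte Carlo estimate
    p_exact = sqrt(b) - sqrt(a)         # ∫_a^b dy/(2√y) = √b - √a ≈ 0.2071

    println(p_mc, "  ", p_exact)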

5.15 Inequalities in the theory of probability


There is an adage in probability that says that behind every limit theorem lies a probability
inequality (i.e., a bound on the probability of some event happening). Since a large part of
probability theory is about proving limit theorems, mathematicians have developed a bewildering
number of inequalities. This section presents some common inequalities in probability.

5.15.1 Markov and Chebyshev inequalities


Let $X$ be any non-negative continuous random variable with PDF $f_X(x)$; we can write $E[X]$ as

    E[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx = \int_0^{\infty} x f_X(x)\,dx    (as X \ge 0)
         \ge \int_a^{\infty} x f_X(x)\,dx    (for any a > 0)
         \ge \int_a^{\infty} a f_X(x)\,dx    (since x \ge a)
         = a \int_a^{\infty} f_X(x)\,dx = a P(X \ge a)

Thus, we have proved the so-called Markov inequality:

    \text{Markov's inequality: } X \text{ any non-negative RV:} \quad P(X \ge a) \le \frac{E[X]}{a}
Markov’s inequality says that for a non-negative random variable X and any positive real number
a, the probability that X is greater than or equal to a is less than or equal to the expected value
of X divided by a. This is a tail bound because it imposes an upper limit on how big the right
tail at a can be.

Example 5.19
Suppose that an individual is randomly extracted from a population of individuals having
an average yearly income of $60 000. What is the probability that the extracted individual’s
income is greater than $200 000? In the absence of more information about the distribution of


income, we can still use Markov's inequality to calculate an upper bound on this probability:

    P(X \ge 200\,000) \le \frac{60\,000}{200\,000} = 0.3

Therefore, the probability of extracting an individual with an income greater than $200 000
is at most 30%.

Now, we apply Markov's inequality to get Chebyshev's inequality. Motivation: Markov's
inequality involves the expected value. Where is the variance? It is involved in Chebyshev's
inequality. Can we guess the form of this inequality? The variance is about the spread of $X$ with
respect to the mean. So, we would expect something like $P(|X - E[X]| \ge b) \le g(\text{Var}(X), b)$.
Note that because of symmetry when talking about spread, we have to have two tails involved:
that's where the term $|X - E[X]| \ge b$ comes into play.
Now, to get the Chebyshev inequality, we consider the non-negative RV $Y = (X - E[X])^2$.
We then have $P(Y \ge b^2) \le E[Y]/b^2$ according to Markov. But $E[Y] = \text{Var}(X)$, thus we
obtain the Chebyshev inequality:

    \text{Chebyshev's ineq.: any RV } X \text{ with finite variance and } b > 0: \quad P(|X - E[X]| \ge b) \le \frac{\text{Var}(X)}{b^2}

Now, it is more convenient to take $b = z\sigma$ ($z \in \mathbb{N}$), and we obtain the following form of
the Chebyshev inequality (with $\mu = E[X]$ and $\text{Var}(X) = \sigma^2$):

    P(|X - \mu| \ge z\sigma) \le \frac{\sigma^2}{z^2\sigma^2} = \frac{1}{z^2}

With this form, we can see that, no matter what the distribution of $X$ is, the bulk of the probability
is in the interval "expected value plus or minus a few SDs", or symbolically $\mu \pm z\sigma$. Indeed,
using $P(A) = 1 - P(A^c)$, we have $P(|X - \mu| < z\sigma) = 1 - P(|X - \mu| \ge z\sigma) \ge 1 - 1/z^2$. I
present some concrete cases for $z = 2, 3, 4$:

    P(\mu - 2\sigma < X < \mu + 2\sigma) \ge 1 - 1/2^2 = 75%
    P(\mu - 3\sigma < X < \mu + 3\sigma) \ge 1 - 1/3^2 = 88.88%
    P(\mu - 4\sigma < X < \mu + 4\sigma) \ge 1 - 1/4^2 = 93.75%
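These bounds are easy to test empirically. Here is a minimal Julia sketch with an Exponential(1) population (for which $\mu = \sigma = 1$; the choice of distribution is arbitrary): the observed two-tail probabilities should indeed sit below the Chebyshev bounds $1/z^2$.

    using Random, Statistics

    Random.seed!(11)
    x = randexp(10^6)        # Exponential(1) samples: μ = 1, σ = 1
    μ, σ = 1.0, 1.0

    for z in (2, 3, 4)
        tail = mean(abs.(x .- μ) .>= z * σ)
        println("z = $z:  P(|X-μ| ≥ zσ) ≈ ", round(tail; digits=4),
                "   bound 1/z² = ", round(1 / z^2; digits=4))
    end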

5.15.2 Chernoff’s inequality


Now we use moment generating functions (MGF) to derive another inequality. Refer to Sec-
tion 5.17.2 for a discussion on MGFs. As the MGF $m_X(t)$ is defined to be $E[e^{tX}]$, the event $X \ge c$
is equivalent to $e^{tX} \ge e^{tc}$ for a fixed $t > 0$ (as the exponential function is increasing and
non-negative). Now, we can write

    P(X \ge c) = P(e^{tX} \ge e^{tc})
               \le \frac{E[e^{tX}]}{e^{tc}}    (Markov's inequality for e^{tX})
               = \frac{m_X(t)}{e^{tc}} = m_X(t)\, e^{-tc}    (definition of m_X(t))

Thus, we obtain the Chernoff bound on the right tail:

    P(X \ge c) \le \min_{t > 0} m_X(t)\, e^{-tc}

5.16 Limit theorems


In this section, we will discuss two important theorems in probability: (1) the law of large
numbers (LLN) and (2) the central limit theorem (CLT). The LLN basically states that the
average of a large number of i.i.d. random variables converges to the expected value. The
CLT states that, under some conditions, the sum of a large number of random variables has an
approximately normal distribution.

5.16.1 The law of large numbers


In Section 5.10.5 we discussed the expected value of a random variable. Consider rolling a die;
we say that the expected value is 3.5. What this means is that, if we roll a die a large number
of times, then on average we will get 3.5. We can use a MC simulation to verify this (see the
figure in the margin, which plots the running average against the number of trials). As the number
of rolls in this run increases, the average of the values of all the
results approaches 3.5. The law of large numbers (LLN) is a theorem
that describes the result of performing the same experiment
a large number of times. According to the law, the average of the
results obtained from a large number of trials should be close to
the expected value and tends to become closer to the expected
value as more trials are performed.
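A minimal Julia sketch of the simulation behind the figure (one run of 1000 die rolls; the seed and the number of rolls are arbitrary):

    using Random

    Random.seed!(2023)
    rolls = rand(1:6, 1000)                    # 1000 rolls of a fair die
    running_avg = cumsum(rolls) ./ (1:1000)    # average after each roll

    println(running_avg[[10, 100, 1000]])      # drifts towards the expected value 3.5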


A sequence of independent and identically distributed random variables $X_1, X_2, \ldots, X_n$ is
considered. (If you do not like it, then think of the income of a population, and $n$ data points of this
income.) Then, for each $n$, we compute the sample mean

    \bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i

Then, we have a sequence of sample means: $\bar{X}_1, \bar{X}_2, \ldots$ If the mean of each random variable is
$\mu$, then the law of large numbers claims that the sequence $\{\bar{X}_n\}_{n=1}^{\infty}$ converges to $\mu$.Ž

There are two different versions of the law of large numbers that are described below. They
are called the strong law of large numbers and the weak law of large numbers.

Ž
A special form of the LLN (for a binary random variable) was first proved by Jacob Bernoulli. It took him over
20 years to develop a sufficiently rigorous mathematical proof which was published in his Ars Conjectandi (The
Art of Conjecturing) in 1713. He named this his "Golden Theorem" but it became generally known as "Bernoulli’s
theorem.


Theorem 5.16.1: Weak law of large numbers

Let $X_1, X_2, \ldots, X_n$ be iid random variables with expected value $\mu < \infty$. Then, for any $\epsilon > 0$,

    \lim_{n \to \infty} P(|\bar{X} - \mu| \ge \epsilon) = 0

Proof. It is obvious that we shall use Chebyshev's inequality; assume also that the variance is
finite, i.e., $\text{Var}(X) = \sigma^2 < \infty$. Then, we have

    P(|\bar{X} - \mu| \ge \epsilon) \le \frac{\text{Var}(\bar{X})}{\epsilon^2} = \frac{\sigma^2}{n\epsilon^2}

Therefore, we have

    0 \le \lim_{n \to \infty} P(|\bar{X} - \mu| \ge \epsilon) \le \lim_{n \to \infty} \frac{\sigma^2}{n\epsilon^2} = 0

Theorem 5.16.2: Strong law of large numbers
Let $X_1, X_2, \ldots, X_n$ be iid random variables with expected value $\mu < \infty$. Then the sample mean
converges to $\mu$ almost surely, i.e.,

    P\left(\lim_{n \to \infty} \bar{X}_n = \mu\right) = 1

5.16.2 Central limit theorem

We recall that de Moivre, while solving some probability problem concerning the binomial
distribution $b(k; n, p)$, had derived a normal approximation to the binomial distribution when
$n$ is large. As a binomial variable can be considered as a sum of $n$ independent Bernoulli
variables, we suspect that the sum of $n$ independent and identically distributed random variables
is approximately normally distributed. Now, we are going to check this hypothesis.

Mean of n uniformly distributed variables. In this test we consider $n$ uniformly distributed
RVs, i.e., $X_i \sim \text{Uniform}(1, 2)$ for $i = 1, 2, \ldots, n$. Note that each $X_i$ has an expected value of
$1.5$ and a variance of $1/12$ (i.e., a SD of $1/\sqrt{12} \approx 0.2887$). We now define a new variable $Y$ as the mean of the $X_i$:

    Y := \frac{X_1 + X_2 + \cdots + X_n}{n} := \bar{X}    (5.16.1)

What we want to see is whether $Y$ is approximately normally distributed when $n$ is sufficiently
large. Three cases are considered: $n = 5$, $n = 10$ and $n = 30$, and the results are given in
Fig. 5.19ŽŽ. A few observations can be made from these figures. First, the mean variable $Y$ has
an expected value of $1.5$, similar to that of $X_i$, and a SD of $1/(\sqrt{12}\sqrt{n})$. Second, and what
is more, even though each and every $X_i$ has a uniform distribution (with a rectangular
PDF), the distribution of their mean has the bell-shaped curve of the normal distribution; see
the red curve in Fig. 5.19c, which is the normal curve with $\mu = 1.5$ and $\sigma = 1/(\sqrt{12}\sqrt{30})$.
ŽŽ See Listing A.19 if you're interested in how this was done.


Figure 5.19: Histograms of the mean of $n$ uniformly distributed RVs $X_i \sim \text{Uniform}(1, 2)$: (a) $n = 5$, (b) $n = 10$, (c) $n = 30$. Note that each $X_i$ has an
expected value of $1.5$ and a variance of $1/12$.

It is quite simple to verify the observations on the expected value and SD of $Y$. Indeed, we
can compute $E[Y]$ and $\text{Var}(Y)$ using the linearity of the expected value and the properties of the
variance. Let's denote by $\mu$ and $\sigma^2$ the expected value and variance of the $X_i$ (all of them have the
same). Then,

    E[Y] = E[X_1]/n + E[X_2]/n + \cdots + E[X_n]/n = n\left(\frac{1}{n}\mu\right) = \mu    (5.16.2)

and,

    \text{Var}(Y) = \text{Var}\left(\frac{X_1 + X_2 + \cdots + X_n}{n}\right) = n\,\text{Var}\left(\frac{X_i}{n}\right) = \frac{\sigma^2}{n}    (5.16.3)

where in the second equality the Bienaymé formula, i.e., Eq. (5.12.19), was used to replace the
variance of a sum with the sum of variances.
Regarding the bell-shaped curve of $Y$ when $n$ is large, it is guaranteed by the central limit theorem
(CLT). According to this theorem (a proof of which is given in Section 5.17.4), $Y \sim N(\mu, \sigma^2/n)$.
Therefore, we have, for large $n$ (Eq. (5.11.11)):

    P(a \le Y \le b) = \Phi\left(\frac{b - \mu}{\sigma/\sqrt{n}}\right) - \Phi\left(\frac{a - \mu}{\sigma/\sqrt{n}}\right)    (5.16.4)

When is n sufficiently large? Another question that comes to mind is how large $n$ should
be so that we can use the CLT. The answer generally depends on the distribution of the $X_i$'s.
Nevertheless, as a rule of thumb it is often stated that if $n$ is larger than or equal to 30, then the
normal approximation is very good.
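A minimal Julia sketch of the experiment behind Fig. 5.19, without the plotting (this is only a sketch with arbitrary settings, not Listing A.19): draw many means of $n$ Uniform(1,2) samples and compare their mean and SD with $\mu = 1.5$ and $\sigma/\sqrt{n} = 1/(\sqrt{12}\sqrt{n})$.

    using Random, Statistics

    Random.seed!(5)
    n, nrep = 30, 100_000
    means = [mean(1.0 .+ rand(n)) for _ in 1:nrep]   # means of n Uniform(1,2) samples

    println(mean(means))                  # ≈ 1.5
    println(std(means))                   # ≈ 1/(√12 √n) ≈ 0.0527 for n = 30
    println(1 / (sqrt(12) * sqrt(n)))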


Theorem 5.16.3: Central limit theorem


Let $X_1, X_2, \ldots, X_n$ be iid random variables with expected value $\mu$ and variance $\sigma^2$. Then, the
random variable

    Z_n = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{X_1 + X_2 + \cdots + X_n - n\mu}{\sigma\sqrt{n}}

converges in distribution to the standard normal random variable as $n$ goes to infinity. That is,

    \lim_{n \to \infty} P(Z_n \le x) = \Phi(x) \quad \text{for all } x \in \mathbb{R}

Example 5.20
Test scores of all high school students in a state have mean 60 and variance 64. A random
sample of 100 ($n = 100$) students from one high school had a mean score of 58. Is there
evidence to suggest that this high school is inferior to others?
Let $\bar{X}$ denote the mean of $n = 100$ scores from a population with $\mu = 60$ and $\sigma^2 = 64$.
We know from the central limit theorem that $(\bar{X} - \mu)/(\sigma/\sqrt{n})$ is (approximately) a standard normal variable.
Thus,

    P(\bar{X} \le 58) = \Phi\left(\frac{58 - \mu}{\sigma/\sqrt{n}}\right) = \Phi\left(\frac{58 - 60}{8/\sqrt{100}}\right) = \Phi(-2.5) = 1 - \Phi(2.5) = 0.0062

History note 5.3: Pierre-Simon Laplace (1749-1827)


Pierre-Simon, marquis de Laplace was a French scholar and polymath
whose work was important to the development of engineering, math-
ematics, statistics, physics, astronomy, and philosophy. Laplace is re-
membered as one of the greatest scientists of all time. Sometimes re-
ferred to as the French Newton or Newton of France, he has been de-
scribed as possessing a phenomenal natural mathematical faculty supe-
rior to that of any of his contemporaries. He was Napoleon’s examiner
when Napoleon attended the École Militaire in Paris in 1784. Laplace
became a count of the Empire in 1806 and was named a marquis in 1817, after the Bourbon
Restoration.
Laplace attended a Benedictine priory school in Beaumont-en-Auge, as a day pupil, be-
tween the ages of 7 and 16. At the age of 16 Laplace entered Caen University. As he
was still intending to enter the Church, he enrolled to study theology. However, during
his two years at the University of Caen, Laplace discovered his mathematical talents and
his love of the subject. Credit for this must go largely to two teachers of mathematics
at Caen, C Gadbled and P Le Canu of whom little is known except that they realized
Laplace’s great mathematical potential. Once he knew that mathematics was to be his
subject, Laplace left Caen without taking his degree, and went to Paris. He took with him
a letter of introduction to d’Alembert from Le Canu.


5.17 Generating functions


In Section 4.17 we've got to know the so-called Bernoulli numbers:

    B_0 = 1, \; B_1 = -\frac{1}{2}, \; B_2 = \frac{1}{6}, \; B_3 = 0, \; B_4 = -\frac{1}{30}, \; B_5 = 0, \; B_6 = \frac{1}{42}, \; B_7 = 0, \ldots

There are infinitely many of them, and it seems impossible to understand them. But with Euler's defini-
tion, in 1755, of the Bernoulli numbers in terms of the following function

    \frac{x}{e^x - 1} = \sum_{n=0}^{\infty} B_n \frac{x^n}{n!}    (5.17.1)

we have discovered the recurrence relation between the $B_n$, Eq. (4.17.2). The function $x/(e^x - 1)$ is
called a generating function. It encodes the entire Bernoulli number sequence. Roughly speak-
ing, generating functions transform problems about sequences into problems about functions.
And by fooling around with this function we can explore the properties of the sequence it en-
codes. This is because we've got piles of mathematical machinery for manipulating functions
(e.g. differentiation and integration).
Now, we give another example showing the power of generating functions. If we observe
carefully we will see that, except for $B_1 = -1/2$, all the odd-indexed numbers $B_{2n+1}$ for $n \ge 1$ are zero.
Why? Let's fool around with the functionŽŽ:

    g(x) := \frac{x}{e^x - 1} - B_1 x = \frac{x}{e^x - 1} + \frac{x}{2} = \frac{x}{2}\,\frac{e^x + 1}{e^x - 1} = \frac{x}{2}\,\frac{e^{x/2} + e^{-x/2}}{e^{x/2} - e^{-x/2}}

We added the red term so that we can have a symmetric form ($e^x + 1$ is not symmetric but
$e^{x/2} + e^{-x/2}$ is). It's easy to see that $g(-x) = g(x)$, thus it is an even function. Therefore, with
Eq. (5.17.1),

    g(x) = 1 + \frac{B_2}{2!}x^2 + \frac{B_3}{3!}x^3 + \cdots \text{ is an even function} \;\Longrightarrow\; B_{2n+1} = 0 \text{ for } n \ge 1
George Pólya wrote in his book Mathematics and plausible reasoning in 1954 about gener-
ating functions:
A generating function is a device somewhat similar to a bag. Instead of carrying
many little objects detachedly, which could be embarrassing, we put them all in a
bag, and then we have only one object to carry, the bag.

5.17.1 Ordinary generating function


The ordinary generating function for the sequence $(a_0, a_1, a_2, \ldots)$ is the power seriesŽ:

    G(a_n; x) = \sum_{n=0}^{\infty} a_n x^n    (5.17.2)

ŽŽ
Why this function?
Ž
I wrote this part based on the lecture notes of the MIT course Mathematics for Computer Science.


The pattern here is simple: the nth term in the sequence (indexing from 0) is the coefficient of
x n in the generating function. There are a few other kinds of generating functions in common
use (e.g. x=ex 1, which is called an exponential generating function), but ordinary generating
functions are enough to illustrate the power of the idea, so we will stick to them and from now
on, generating function will mean the ordinary kind.
Remark: A generating function is a "formal" power series in the sense that we usually regard
$x$ as a placeholder rather than a number. Only in rare cases will we actually evaluate
a generating function by letting $x$ take a real number value, so we generally ignore
the issue of convergence.

Just looking at the definition in Eq. (5.17.2), there is no reason to believe that we’ve made any
progress in studying anything. We want to understand a sequence .a0 ; a1 ; a2 ; : : :/; how could
it possibly help to make an infinite series out of these! The reason is that frequently there’s a
simple, closed form expression for G.an I x/. The magic of generating functions is that we can
carry out all sorts of manipulations on sequences by performing mathematical operations on
their associated generating functions. Let’s experiment with various operations and characterize
their effects in terms of sequences.

Example 5.21
The generating function for the sequence $1, 1, 1, \ldots$ is $1/(1-x)$. This is because (if you still
remember the geometric series)

    \frac{1}{1-x} = 1 + x + x^2 + x^3 + \cdots \quad \text{where the coefficient of every } x^n \text{ is } 1

We can create different generating functions from this one. For example, if we replace $x$ by
$3x$, we have

    \frac{1}{1-3x} = 1 + 3x + 9x^2 + 27x^3 + \cdots \quad \text{which generates } 1, 3, 9, 27, \ldots

Multiplying this by $x$, we get

    \frac{x}{1-3x} = 0 + x + 3x^2 + 9x^3 + 27x^4 + \cdots \quad \text{which generates } 0, 1, 3, 9, 27, \ldots

which right-shifts the original sequence (i.e., $1, 3, 9, 27, \ldots$) by one. We can multiply the GF
by $x^k$ to right-shift the sequence $k$ times.

Solving difference equations. Assume that we have the sequence $1, 3, 7, 15, 31, \ldots$ which can
be defined as

    a_0 = 1, \quad a_1 = 3, \quad a_n = 3a_{n-1} - 2a_{n-2} \; (n \ge 2)

The question is: what is the generating function for this sequence? Let's denote by $f(x)$ that
function; thus we have (by definition of a generating function)

    f(x) = 1 + 3x + 7x^2 + 15x^3 + 31x^4 + \cdots    (5.17.3)


Now, the recurrence relation ($a_n = 3a_{n-1} - 2a_{n-2}$) can be rewritten as $a_n - 3a_{n-1} + 2a_{n-2} = 0$,
and we will multiply $f(x)$ by $-3x$ and also multiply $f(x)$ by $2x^2$ and add everything up, including
$f(x)$:

    f(x)       = 1 + 3x + 7x^2 + 15x^3 + 31x^4 + \cdots + a_n x^n + \cdots
    -3x f(x)   =   - 3x - 9x^2 - 21x^3 - 45x^4 - \cdots - 3a_{n-1} x^n - \cdots
    2x^2 f(x)  =        + 2x^2 +  6x^3 + 14x^4 + \cdots + 2a_{n-2} x^n + \cdots
    -------------------------------------------------------------------------
    f(x)\,[1 - 3x + 2x^2] = 1 \;\Longrightarrow\; f(x) = \frac{1}{1 - 3x + 2x^2}

where all the columns add up to zero except the first one, because of the recurrence relation
$a_n - 3a_{n-1} + 2a_{n-2} = 0$.
But why is having the generating function useful? Because it allows us to find a formula for
$a_n$; thus we no longer need to use the recurrence relation to get $a_n$ starting from $a_0, a_1, \ldots$ all
the way up to $a_{n-2}$Ž. The trick is to rewrite $f(x)$ in terms of simpler functions (using the partial
fraction decomposition discussed in Section 4.7.8) and then replace these simpler functions by
their corresponding power series. Now, we can decompose $f(x)$ easily with the 'apart' function
in SymPy§:

    f(x) = \frac{1}{1 - 3x + 2x^2} = -\frac{1}{1-x} + \frac{2}{1-2x}

Next, we write the series of these two fractional functionsŽŽ:

    -\frac{1}{1-x} = -1 - x - x^2 - x^3 - \cdots \;\Longrightarrow\; b_n = -1
    \frac{2}{1-2x} = 2 + 4x + 8x^2 + 16x^3 + \cdots \;\Longrightarrow\; c_n = 2^{n+1}

Thus, we can determine $a_n$:

    a_n = 2^{n+1} - 1

To conclude, generating functions provide a systematic method for solving recurrence/difference
equations. At this point, I recommend you apply this method to the Fibonacci sequence,
discussed in Section 2.10, to rediscover Binet's formula and many other properties of this
famous sequence.
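Before moving on, the closed form $a_n = 2^{n+1} - 1$ is easy to check against the recurrence with a few lines of Julia (a minimal sketch):

    # recurrence a_n = 3a_{n-1} - 2a_{n-2} with a_0 = 1, a_1 = 3
    a = zeros(Int, 11)          # a[k] stores a_{k-1} (Julia arrays are 1-based)
    a[1], a[2] = 1, 3
    for k in 3:11
        a[k] = 3 * a[k-1] - 2 * a[k-2]
    end

    println(a)                           # 1, 3, 7, 15, 31, ...
    println([2^k - 1 for k in 1:11])     # closed form 2^{n+1} - 1: the same numbers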
Evaluating sums. Suppose we need to evaluate the sum $s_n = \sum_{k=0}^{n} a_k$, which is the sum
of the first $n+1$ elements of a sequence $(a_0, a_1, \ldots)$. Herein, I present a technique to compute such sums
using generating functions. To this end, we need to use the Cauchy product formula for two
power series. Recall the Cauchy product formula for two power series, see Eq. (7.12.2) and the
Ž
So, we want a Ferrari instead of a Honda CRV.
§
Check Section 3.23 if you’re not sure about SymPy.
ŽŽ
Note that we also know the series of 1=.1 3x C 2x 2 /, but that series is simply the RHS of Eq. (5.17.3).


surrounding text for a discussion:

    \left(\sum_{n=0}^{\infty} a_n x^n\right)\left(\sum_{m=0}^{\infty} b_m x^m\right) = \sum_{n=0}^{\infty}\left(\sum_{k=0}^{n} a_k b_{n-k}\right) x^n

Now, we consider the product of two sequences. For two sequences and their associated gener-
ating functions,

    (a_0, a_1, \ldots) \longleftrightarrow A(x), \qquad (b_0, b_1, \ldots) \longleftrightarrow B(x)

we obtain a new sequence by multiplying the two generating functions:

    (c_0, c_1, \ldots) \longleftrightarrow A(x)B(x), \qquad c_n = \sum_{k=0}^{n} a_k b_{n-k}

using the above-mentioned Cauchy product formula.
As we aim for $\sum_{k=0}^{n} a_k$, not $\sum_{k=0}^{n} a_k b_{n-k}$, we need to work with a $B(x)$ such that $b_k = 1$ for
all $k$. That special $B(x)$ is $B(x) = 1/(1-x)$: all the $b_i$'s equal one, and thus we have (moving from $c$ to
$s$ for sum)

    (s_0, s_1, \ldots) \longleftrightarrow \frac{A(x)}{1-x}, \qquad s_n = \sum_{k=0}^{n} a_k    (5.17.4)

This is called the summation rule as it allows us to compute the sum in the box: that sum is
simply the coefficient of the term $x^n$ of the function $A(x)/(1-x)$. We have turned the problem of
sum evaluation into a problem of finding the coefficient of a function! Math is super cool, isn't it?
I provide one example below to demonstrate the idea.

Example 5.22
Suppose we want to compute the sum of the first $n$ squares

    s_n = \sum_{i=0}^{n} i^2

All we need to do is (1) determine $A(x)$, which is the generating function for the sequence
$(0, 1, 4, 9, \ldots)$, (2) multiply $A(x)$ by $1/(1-x)$, and (3) find the coefficients of that function.
First, we have

    A(x) = \frac{x(1+x)}{(1-x)^3}

Therefore,

    (s_0, s_1, \ldots) \longleftrightarrow \frac{x(1+x)}{(1-x)^4}

which means that $\sum_{i=0}^{n} i^2$ is nothing but the coefficient of $x^n$ in $F(x) = x(1+x)/(1-x)^4$. This
is a complicated function; let's break it into simpler ones:

    F(x) = \frac{x(1+x)}{(1-x)^4} = \frac{x}{(1-x)^4} + \frac{x^2}{(1-x)^4}


Now, we observe that the coefficient of $x^n$ in $F(x)$ is the sum of the coefficient of $x^{n-1}$ in $1/(1-x)^4$
and the coefficient of $x^{n-2}$ in $1/(1-x)^4$. The problem boils down to finding the coefficient of $x^k$
in $G(x) = 1/(1-x)^4$. How to do that? Taylor series is the answer. Recall that

    f(x) = f(0) + \frac{f'(0)}{1!}x + \frac{f''(0)}{2!}x^2 + \cdots + \frac{f^{(n)}(0)}{n!}x^n + \cdots

What does it tell us? It says that the $n$th coefficient of $f(x)$ is $f^{(n)}(0)/n!$. Now, we need the $n$th
derivative of $G(x)$. Let's compute the first, second derivatives and so on to find the pattern:

    G'(x) = \frac{4}{(1-x)^5}, \quad G''(x) = \frac{4 \cdot 5}{(1-x)^6}, \quad G'''(x) = \frac{4 \cdot 5 \cdot 6}{(1-x)^7}
    \;\Longrightarrow\; G^{(n)}(x) = \frac{(n+3)!}{6(1-x)^{n+4}}

Hence, the $n$th coefficient of $G(x)$ is

    \frac{G^{(n)}(0)}{n!} = \frac{(n+3)!}{6\,n!} = \frac{(n+3)(n+2)(n+1)}{6}

And from that we can determine the $(n-1)$th and $(n-2)$th coefficients, and finally $s_n$:

    s_n = \frac{n(n+1)(n+2)}{6} + \frac{(n-1)n(n+1)}{6} = \frac{n(n+1)(2n+1)}{6}

Same result as in Eq. (2.6.8).
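A one-line check of the result in Julia:

    n = 10
    println(sum(i^2 for i in 0:n), "  ", n * (n + 1) * (2n + 1) ÷ 6)   # both print 385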

5.17.2 Moment generating functions


There are several reasons to study moments and moment generating functions. One of them is
that moment generating functions make our life easier and the second is that these functions can
be used to prove the central limit theorem.

Moments, central moments. To motivate the introduction of moments in probability, let's look
at how the expected value and the variance were defined:

    \mu = E[X] = E[X^1], \qquad \text{Var}(X) = E[(X - \mu)^2]

It is then logical to define the $k$th moment of a random variable $X$ as $\mu_k = E[X^k]$. Why?
Because the first moment is the expected value; the second moment does not give us the variance
directly, but indirectly: $\sigma^2 = \mu_2 - \mu_1^2$. And mathematicians also define the $k$th central moment
of a random variable $X$ as $E[(X - \mu)^k]$. Thus, the variance is simply the second central moment.


Moment generating functions. The moment generating function (MGF) of a random variable
(discrete or continuous) $X$ is simply the expected value of $e^{tX}$:

    m(t) = E[e^{tX}]    (5.17.5)

Now, we will elaborate $m(t)$ to reveal the reason behind its name (and its definition). The idea
is to replace $e^{tX}$ by its Taylor series, then apply the linearity of the expected value, and we
shall see all the moments $\mu_k$:

    m(t) = E\left[1 + tX + \frac{(tX)^2}{2!} + \cdots + \frac{(tX)^k}{k!} + \cdots\right]
         = 1 + tE[X] + \frac{t^2 E[X^2]}{2!} + \cdots + \frac{t^k E[X^k]}{k!} + \cdots
         = 1 + \mu_1 t + \frac{\mu_2}{2!}t^2 + \cdots + \frac{\mu_k}{k!}t^k + \cdots
Compared with Eq. (5.17.2), which is the ordinary generating function for the sequence
.a0 ; a1 ; a2 ; : : :/, we can obviously see why m.t/, as defined in Eq. (5.17.5), is called the mo-
ment generating function; it encodes all the moments k of X . How? Simply by differentiating
m.t / and evaluate it at t D 0, we can retrieve any moment. For example, m0 .0/ D 1 , and
m00 .0/ D 2 .
We can now give a full definition of the moment generating function of either a discrete or
continuous random variable:
    \text{Discrete RV:} \quad m(t) = \sum_x e^{tx} P(x)
    \text{Continuous RV:} \quad m(t) = \int_{-\infty}^{\infty} e^{tx} f(x)\,dx    (5.17.6)

Next, we are going to consider some examples to see how powerful the MGFs are.

Example 5.23
We consider the geometric series, compute its moment generating function, and see what we
can get from it. First, m.t/ is given bya
X
1
pX t k
1
p et q pe t
tk k 1
m.t/ D e q pD .e q/ D D
q q 1 qe t 1 qe t
kD1 kD1

Now, it is easy to compute the expected value and variance:


p 1 pe t
EŒX  D 1 D m0 .0/ D D ; m0 .t/ D
.1 q/2 p .1 qe t /2
You can compare this and the procedure in Example 5.16 and conclude for yourself which
way is easier. You can use this method to fill in Table 5.6 for other distributions.
a
The red term is a geometric series.
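As a sanity check (my own addition, not part of the example), one can estimate E[X] for a geometric random variable by simulation and compare it with the 1/p obtained from the MGF; the code below is a minimal sketch in Python with NumPy.

    import numpy as np

    rng = np.random.default_rng(0)
    p = 0.3
    # NumPy's geometric distribution counts the number of trials until the first
    # success, i.e. it takes values 1, 2, 3, ... like the X used in this example
    samples = rng.geometric(p, size=1_000_000)

    print(samples.mean())   # close to 1/p
    print(1 / p)            # 3.333...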


Example 5.24
We now determine the MGF of a standard normal variable Z. We use the definition, Eq. (5.17.6), to compute it:

m_Z(t) = \int_{-\infty}^{\infty} e^{tz}\,\frac{1}{\sqrt{2\pi}} e^{-z^2/2}\,dz
       = e^{t^2/2}\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-(z^2 - 2tz + t^2)/2}\,dz
       = e^{t^2/2}\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-(z-t)^2/2}\,dz = e^{t^2/2}

Noting the red integral is simply one: it is the probability density function of N(t, 1).
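A quick numerical check of this result (my own sketch, not from the text), using Python with SciPy to evaluate the integral E[e^{tZ}] directly:

    import numpy as np
    from scipy.integrate import quad

    def mgf_standard_normal(t):
        # E[e^{tZ}] = integral of e^{tz} times the standard normal pdf
        integrand = lambda z: np.exp(t * z) * np.exp(-z**2 / 2) / np.sqrt(2 * np.pi)
        value, _ = quad(integrand, -np.inf, np.inf)
        return value

    for t in (0.0, 0.5, 1.0, 2.0):
        print(t, mgf_standard_normal(t), np.exp(t**2 / 2))  # the two numbers agree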

5.17.3 Properties of moment generating functions


Consider Y D X1 C X2 , where X1 ; X2 are independent random variables. Now we compute the
moment generating function of Y :
m_Y(t) = E[e^{tY}] = E[e^{t(X_1+X_2)}] = E[e^{tX_1} e^{tX_2}] = E[e^{tX_1}]\,E[e^{tX_2}] = m_{X_1}(t)\,m_{X_2}(t)  \qquad (5.17.7)

which says that the MGF of the sum \sum_i X_i is the product of the MGFs of the X_i. This is known as the convolution rule of the MGF. Note that we have used E[XY] = E[X]E[Y] for two independent RVs X and Y. It is obvious that we're going to generalize this result:

Y = \sum_{i=1}^{n} c_i X_i \quad\Longrightarrow\quad m_Y(t) = \prod_{i=1}^{n} m_{X_i}(c_i t)  \qquad (5.17.8)

This result gives us a powerful tool for determining the distribution of the sum of independent
random variables.
Now, we derive another property of the MGF. Consider now a transformation Y D aX C b,
and see what is the MGF of Y , especially how it is related to the MGF of X :
mY .t / D EŒe t Y  D EŒe t .aXCb/  D EŒe t aX e bt  D e bt EŒe .at /X /  D e bt mX .at/ (5.17.9)
We will use all these results in the next section when we prove the central limit theorem. I am not sure if they were developed for this or not. But note that, for mathematicians, considering the sum X_1 + X_2 or the transformation aX + b is a very natural thing to do.
From Example 5.24 and Eq. (5.17.9) we can determine the MGF for Y ~ N(\mu, \sigma^2). The idea is to use

X ~ N(0, 1):    m_X(t) = e^{t^2/2}
Y = aX + b:     m_Y(t) = e^{bt} m_X(at) = e^{bt} e^{a^2 t^2/2}

And we know from Section 5.11.3 that we have to use the transformation Y = \sigma X + \mu to have Y ~ N(\mu, \sigma^2). Therefore, we obtain

Y ~ N(\mu, \sigma^2):   m_Y(t) = \exp\left(\mu t + \frac{\sigma^2 t^2}{2}\right)  \qquad (5.17.10)


With this result and Eq. (5.17.8), we have the following theorem

Theorem 5.17.1
Let X_1, X_2, ..., X_n be independent normal variables with expected values \mu_1, ..., \mu_n and variances \sigma_1^2, ..., \sigma_n^2. Then, the random variable Y which is a linear combination of the X_i, i.e.,

Y = \sum_{i=1}^{n} c_i X_i

follows the normal distribution:

N\left(\sum_{i=1}^{n} c_i \mu_i,\; \sum_{i=1}^{n} c_i^2 \sigma_i^2\right)

This theorem indicates that the sum of independent normal random variables is itself a normal random variable.

Proof. Certainly we have to use Eqs. (5.17.8) and (5.17.10). To ease the presentation let's assume n = 2, then

m_Y(t) = \exp\left(c_1\mu_1 t + \frac{c_1^2\sigma_1^2 t^2}{2}\right)\exp\left(c_2\mu_2 t + \frac{c_2^2\sigma_2^2 t^2}{2}\right)
       = \exp\big((c_1\mu_1 + c_2\mu_2)t\big)\exp\left((c_1^2\sigma_1^2 + c_2^2\sigma_2^2)\frac{t^2}{2}\right)

which is exactly the MGF in Eq. (5.17.10) of a normal distribution with mean c_1\mu_1 + c_2\mu_2 and variance c_1^2\sigma_1^2 + c_2^2\sigma_2^2.

5.17.4 Proof of the central limit theorem


The central limit theorem is probably the most beautiful result in the mathematical theory of
probability. Thus, if I cannot somehow see its proof, I feel something is missing. Surprisingly,
the proof is based on the moment generating function concept. First, the CLT is recalled now.
Let X_1, X_2, ..., X_n be iid random variables with expected value \mu and variance \sigma^2. Then, the random variable

S_n^* = \frac{S_n - n\mu}{\sigma\sqrt{n}}, \qquad S_n = X_1 + X_2 + \cdots + X_n

converges in distribution to the standard normal random variable as n goes to infinity.
The plan of the proof: (i) compute the MGF of S_n^*, (ii) show that when n is large this MGF is approximately the MGF of N(0, 1), i.e., e^{t^2/2} (according to Example 5.24), and (iii) if two


variables have the same MGFs, then they have the same probability distribution (we need to
prove this, but it is reasonable so I accept it). Quite a simple plan, for a big theorem in probability.
The first thing is to write S_n^* as the sum of something:

S_n^* = \frac{X_1 + X_2 + \cdots + X_n - n\mu}{\sigma\sqrt{n}} = \frac{(X_1-\mu) + (X_2-\mu) + \cdots + (X_n-\mu)}{\sigma\sqrt{n}}
      = \sum_{i=1}^{n} \frac{X_i - \mu}{\sigma\sqrt{n}} = \frac{1}{\sqrt{n}}\sum_{i=1}^{n} X_i^*, \qquad X_i^* = \frac{X_i - \mu}{\sigma}

Why this particular form? Because we have the convolution rule that works for a sum. Using Eq. (5.17.7), the MGF of S_n^* is simply:

m_{S_n^*}(t) = \left[m_{X^*/\sqrt{n}}(t)\right]^n    (Eq. (5.17.7))
             = \left[m_{X^*}(t/\sqrt{n})\right]^n    (Eq. (5.17.9) with a = 1/\sqrt{n}, b = 0)     (5.17.11)

Now, we use Taylor's series to approximate m_{X^*}(t/\sqrt{n}) when n is large:

m_{X^*}(t/\sqrt{n}) \approx m_{X^*}(0) + m'_{X^*}(0)\frac{t}{\sqrt{n}} + m''_{X^*}(0)\frac{t^2}{2n} = 1 + \frac{t^2}{2n}

because m'_{X^*}(0) = E[X^*] = 0 and m''_{X^*}(0) = E[(X^*)^2] = 1. Therefore‡‡,

m_{S_n^*}(t) \approx \left(1 + \frac{t^2}{2n}\right)^n \approx e^{t^2/2}

So, we have proved that when n is large the MGF of S_n^* is approximately the MGF of N(0, 1). Thus, in the limit S_n^* has a standard normal distribution. Q.E.D.
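To see the theorem in action (my own sketch, not part of the proof), the following Python/NumPy snippet builds S_n^* from iid exponential random variables, whose distribution is far from normal, and compares a few sample quantiles with those of N(0, 1); the choice of the exponential distribution and of n is arbitrary.

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(1)
    n, trials = 200, 50_000
    mu, sigma = 1.0, 1.0                 # mean and std of the exponential(1) distribution

    X = rng.exponential(scale=1.0, size=(trials, n))
    S_star = (X.sum(axis=1) - n * mu) / (sigma * np.sqrt(n))

    for q in (0.05, 0.25, 0.5, 0.75, 0.95):
        print(q, np.quantile(S_star, q), norm.ppf(q))   # empirical vs N(0,1) quantiles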

History note 5.4: The Bernoullis


The city of Basel in Switzerland was one of many free cities in Europe and by the 17th century had become an important center of trade and commerce. The University of Basel became a noted institution largely through the fame of an extraordinary family–the Bernoullis. This family had come from Antwerp (Belgium) to Basel and the founder of the mathematical dynasty was Nicolaus Bernoulli (1623–1708). He had three sons, two of whom, James or Jacob (1654–1705) and Johann or John (1667–1748), became famous mathematicians. Both were pupils of Leibniz. James was a professor at Basel until his death in 1705. John, who had been a professor at Groningen, replaced him. It was John who gave Euler special instruction on
‡‡ This is because (1 + a/n)^n → e^a when n → ∞. Check Eq. (4.15.17) if this is not clear.


Saturdays when Euler was young. Johann also had a most famous private pupil, Guillaume François Antoine, Marquis de l'Hospital, who wrote the first ever calculus textbook, which was actually the notes that he had made from his lessons with Bernoulli.
John Bernoulli had three sons. Two of them, Nicolas II and Daniel, were mathematicians who befriended Euler. Both went to St. Petersburg in 1725 and Daniel secured a position for Euler at the Russian Academy.
The Bernoulli family had a habit of re-using first names through the generations and this leads to a great deal of confusion amongst people trying to understand the history of 18th century mathematics and physics (me included).

5.18 Multivariate normal distribution


The normal distribution is very important in probability theory and it shows up in many different
applications. We have discussed a single normal random variable previously in Section 5.11.3.
In this section we shall now talk about two or more normal random variables. Two important
concepts are multivariate normal distribution and the special case for two variables: the bivariate
normal distribution. The bivariate distribution helps us understand more the general multivariate
case, especially with the use of 3D plots and contour plots.
You need some knowledge of linear algebra to read this section; check Chapter 11 if needed. In what follows, (·)^⊤ denotes the transpose. And x^⊤y is the dot product of two n-vectors x and y, that is, x^⊤y = \sum_{i=1}^{n} x_i y_i. The zero vector is denoted by 0 = (0, 0, ..., 0)^⊤ and the identity matrix by I.

5.18.1 Random vectors and random matrices


When dealing with multiple random variables, it is useful to use vector and matrix notations.
This makes the formulas more compact and lets us use facts from linear algebra. A random
vector X 2 Rn is a vector .X1 ; X2 ; : : : ; Xn /> of scalar random variables Xi .
The expected value vector or the mean vector of the random vector X is

E[X] = (E[X_1], E[X_2], ..., E[X_n])^⊤     (5.18.1)

All the properties of the expected value work for random vectors. For example, we know that E[aX + b] = aE[X] + b, which is Eq. (5.10.53). We then have

E[AX + b] = AE[X] + b, \qquad A \in R^{m\times n}, \; b \in R^m     (5.18.2)

If you need a proof, then consider the ith element of the vector AX + b and compute its expected value using E[aX + b] = aE[X] + b. Details: E[\sum_j A_{ij}X_j + b_i] = \sum_j A_{ij}E[X_j] + b_i.
If we have random vectors, then we have random matrices. Not a big deal. Actually we
have met one random matrix: the covariance matrix. Now, we provide a formal definition of this


matrix. Consider again the random vector X = (X_1, X_2, ..., X_n)^⊤; we then have

X = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{bmatrix}, \quad
E[X] = \begin{bmatrix} E[X_1] \\ E[X_2] \\ \vdots \\ E[X_n] \end{bmatrix}, \quad
X - E[X] = \begin{bmatrix} X_1 - E[X_1] \\ X_2 - E[X_2] \\ \vdots \\ X_n - E[X_n] \end{bmatrix}

Next, we build the matrix A = (X - E[X])(X - E[X])^⊤, which is an n × n matrix:

A = \begin{bmatrix}
(X_1 - E[X_1])^2 & (X_1 - E[X_1])(X_2 - E[X_2]) & \cdots & (X_1 - E[X_1])(X_n - E[X_n]) \\
(X_2 - E[X_2])(X_1 - E[X_1]) & (X_2 - E[X_2])^2 & \cdots & (X_2 - E[X_2])(X_n - E[X_n]) \\
\vdots & \vdots & \ddots & \vdots \\
(X_n - E[X_n])(X_1 - E[X_1]) & (X_n - E[X_n])(X_2 - E[X_2]) & \cdots & (X_n - E[X_n])^2
\end{bmatrix}
Finally, the covariance matrix of X, denoted by C_X, is simply E[A]:

C_X = E[(X - E[X])(X - E[X])^⊤] = \begin{bmatrix}
Var(X_1) & Cov(X_1, X_2) & \cdots & Cov(X_1, X_n) \\
Cov(X_2, X_1) & Var(X_2) & \cdots & Cov(X_2, X_n) \\
\vdots & \vdots & \ddots & \vdots \\
Cov(X_n, X_1) & Cov(X_n, X_2) & \cdots & Var(X_n)
\end{bmatrix}
Let X be an n-dimensional random vector and the random vector Y ∈ R^m be defined as Y = AX + b, where A is a fixed m × n matrix and b is a fixed m-dimensional vector. And we want to compute C_Y–an m × m matrix–in terms of C_X. The result is

C_Y = A C_X A^⊤     (5.18.3)

Proof. First we compute E[Y] = AE[X] + b, then using the definition

C_Y = E[(Y - E[Y])(Y - E[Y])^⊤]
    = E[(AX + b - AE[X] - b)(AX + b - AE[X] - b)^⊤]
    = E[(A(X - E[X]))(A(X - E[X]))^⊤]
    = E[A(X - E[X])(X - E[X])^⊤ A^⊤]     ((AB)^⊤ = B^⊤A^⊤)
    = A E[(X - E[X])(X - E[X])^⊤] A^⊤ = A C_X A^⊤     (linearity of E)

If the last equality was not clear, the proof is similar to how we proved Eq. (5.18.2): it is always based on the fact that E[aX + b] = aE[X] + b. □
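A quick empirical check of Eq. (5.18.3) (my own sketch; the particular A, b and C_X below are arbitrary choices), using Python with NumPy:

    import numpy as np

    rng = np.random.default_rng(2)
    C_X = np.array([[2.0, 0.5, 0.0],
                    [0.5, 1.0, 0.3],
                    [0.0, 0.3, 1.5]])          # a chosen covariance matrix
    A = np.array([[1.0, 2.0, -1.0],
                  [0.0, 1.0,  3.0]])           # fixed 2x3 matrix
    b = np.array([1.0, -2.0])

    # draw samples of X with covariance C_X, then transform them
    X = rng.multivariate_normal(mean=np.zeros(3), cov=C_X, size=200_000)
    Y = X @ A.T + b

    print(np.cov(Y, rowvar=False))   # empirical covariance of Y
    print(A @ C_X @ A.T)             # the formula C_Y = A C_X A^T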
We have proved in Section 5.12.8 that the covariance matrix is semi-positive definite for
discrete random variables. Herein, we prove this for any random variables.
Theorem 5.18.1
Let X be a random vector of n elements, then its covariance matrix is always symmetric and
semi-positive definite (PSD).


Proof. A matrix A is PSD if x^⊤Ax ≥ 0 for all x. We use this and the fact that E[Y^2] ≥ 0 for all Y to prove the theorem. So, let's define Y = b^⊤(X - E[X]) with b being any fixed vector, then

0 ≤ E[Y^2] = E[Y Y^⊤] = E[b^⊤(X - E[X])(X - E[X])^⊤ b]
           = b^⊤ E[(X - E[X])(X - E[X])^⊤] b     (linearity of E)
           = b^⊤ C_X b

5.18.2 Functions of random vectors


Let X be an n-dimensional random vector with the joint PDF f_X(x). Let G: R^n → R^n be a continuous and invertible function with continuous partial derivatives, and let H = G^{-1}. Suppose that the random variable Y is given by Y = G(X), or X = G^{-1}(Y) = H(Y). The question is: what is f_Y(y)?

X = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{bmatrix}
  = \begin{bmatrix} H_1(Y_1, Y_2, ..., Y_n) \\ H_2(Y_1, Y_2, ..., Y_n) \\ \vdots \\ H_n(Y_1, Y_2, ..., Y_n) \end{bmatrix}

To answer that question we use the method described in Section 5.14 and the Jacobian in Section 7.8.6. The basic idea is the same:

P(Y \in A') = P(X \in A) = \int_A f_X(x)\,dA = \int_{A'} f_X(H(y))\,|J|\,dA     (5.18.4)

where J is the Jacobian, which is given by

J = \det\begin{bmatrix}
\frac{\partial H_1}{\partial y_1} & \frac{\partial H_1}{\partial y_2} & \cdots & \frac{\partial H_1}{\partial y_n} \\
\frac{\partial H_2}{\partial y_1} & \frac{\partial H_2}{\partial y_2} & \cdots & \frac{\partial H_2}{\partial y_n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial H_n}{\partial y_1} & \frac{\partial H_n}{\partial y_2} & \cdots & \frac{\partial H_n}{\partial y_n}
\end{bmatrix}

The red term is the pdf of Y. So, we have

f_Y(y) = f_X(H(y))\,|J|, \qquad J = \det A, \quad A_{ij} = \frac{\partial H_i}{\partial y_j}     (5.18.5)


Example 5.25
Let X be an n-dimensional random vector with pdf f_X(x) and let the random vector Y ∈ R^n be defined as Y = AX + b, where A is a fixed invertible n × n matrix and b is a fixed n-dimensional vector. And we want to compute f_Y(y).
First, we have X = A^{-1}(Y - b) = H(Y). Eq. (5.18.5) then gives

f_Y(y) = |\det A^{-1}|\, f_X(A^{-1}(y - b)) = \frac{1}{|\det A|} f_X(A^{-1}(y - b))

5.18.3 Multivariate normal distribution


Let X = (X_1, X_2, ..., X_n)^⊤ be a random vector and t = (t_1, t_2, ..., t_n)^⊤ a vector in R^n. The moment generating function for X is defined by (this is also called the joint moment generating function)

m_X(t) = E[e^{t^⊤X}] = E[\exp(t_1 X_1 + \cdots + t_n X_n)] = E\left[\exp\left(\sum_{i=1}^{n} t_i X_i\right)\right]

The plan of developing the multivariate normal distribution is to repeat what we have done for the univariate normal distribution. Let me recall what we have done:

X ~ N(0, 1):    m_X(t) = e^{t^2/2}
Y = aX + b:     m_Y(t) = e^{bt} m_X(at) = e^{bt} e^{a^2 t^2/2}

And with the transformation Y = \sigma X + \mu, to have Y ~ N(\mu, \sigma^2), we obtain the MGF of N(\mu, \sigma^2):

Y ~ N(\mu, \sigma^2):   m_Y(t) = \exp\left(\mu t + \frac{\sigma^2 t^2}{2}\right)
Let’s carry out that plan. The first step is to consider n independent standard normal variables
Z1 ; Z2 ; : : : ; Zn i.e., Zi Ï N.0; 1/. Let Z D .Z1 ; Z2 ; : : : ; Zn /> be a random vector and t D
.t1 ; t2 ; : : : ; tn /> . Now, we are going to compute MZ .t/:
>
MZ .t/ D EŒe t Z  D EŒe t1 Z1 CCtn Zn 
D EŒe t1 Z1  e t2 Z2      e tn Zn 
D EŒe t1 Z1 EŒe t2 Z2     EŒe  (Z1 ; : : : ; Zn are independent)
tn Zn
  (5.18.6)
t12 =2 tn2 =2 t12 tn2
De e D exp C  C (Example 5.24)

2 2
1 >
D exp 2
t t

This is nothing but a generalized result of Example 5.24.


The second step is, of course, to consider the variable X = AZ + \mu with \mu ∈ R^n and A an n × n matrix, which is nothing but the generalization of X = aZ + b. In what follows, we compute M_X(t):


M_X(t) = e^{t^⊤\mu} M_Z(A^⊤t)                                 (Eq. (5.17.9))
       = e^{t^⊤\mu} e^{\frac{1}{2}(A^⊤t)^⊤(A^⊤t)}             (Eq. (5.18.6))     (5.18.7)
       = \exp\left(t^⊤\mu + \frac{1}{2} t^⊤ \Sigma t\right), \qquad \Sigma = AA^⊤
Based on Eq. (5.18.7), it is not a surprise to see the following definition of a multivariate
normal distribution.

Definition 5.18.1
Let \mu ∈ R^n and let \Sigma be an n × n semi-positive definite matrix. A random vector X = (X_1, X_2, ..., X_n)^⊤ is said to have a multivariate normal distribution with parameters \mu and \Sigma if its multivariate moment generating function is

M_X(t) = \exp\left(t^⊤\mu + \frac{1}{2} t^⊤ \Sigma t\right)     (5.18.8)

The notation is: X ~ N(\mu, \Sigma).

This definition reduces to the single-variate normal distribution in Eq. (5.17.10) when n = 1. That is, \mu is simply a number \mu and the matrix \Sigma is simply the variance \sigma^2.
It is clear that, for Z = (Z_1, Z_2, ..., Z_n)^⊤ with n independent standard normal variables Z_1, Z_2, ..., Z_n, we have Z ~ N(0, I). Just compare Eq. (5.18.6) with Eq. (5.18.8). And with X = AZ + \mu, we have X ~ N(\mu, \Sigma). Just the same old story but now for random vectors.
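This construction X = AZ + \mu is also how one can sample from N(\mu, \Sigma) in practice: pick any A with AA^⊤ = \Sigma (for instance a Cholesky factor) and push standard normal samples through it. The sketch below (my own addition; the particular \mu and \Sigma are arbitrary) does this in Python with NumPy and checks the sample mean and covariance.

    import numpy as np

    rng = np.random.default_rng(3)
    mu = np.array([1.0, -2.0])
    Sigma = np.array([[2.0, 0.8],
                      [0.8, 1.0]])

    A = np.linalg.cholesky(Sigma)            # A A^T = Sigma
    Z = rng.standard_normal(size=(500_000, 2))
    X = Z @ A.T + mu                         # X = A Z + mu, one sample per row

    print(X.mean(axis=0))                    # close to mu
    print(np.cov(X, rowvar=False))           # close to Sigma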

5.18.4 Mean and covariance of multivariate normal distribution


First, let’s consider Z Ï N.0; I/ i.e., Z D .Z1 ; Z2 ; : : : ; Zn /> where the Zi are independent
N.0; 1/ random variables. Then,
EŒZ  D .EZ1 ; EZ2 ; : : : ; EZn /> D 0 (5.18.9)
Thus, the mean vector is the zero vector. Moving on to the covariance matrix, its ij th element is
given by
8
<1; if i D j
EŒ.Zi EZi /.Zj EZj / D EŒZi Zj  D EŒZi EŒZj  D (5.18.10)
:0; if i ¤ j

So, CZ D I.
Second, we consider X ~ N(\mu, \Sigma). Of course, X = AZ + \mu. Hence,

E[X] = E[AZ + \mu] = AE[Z] + \mu = \mu     (5.18.11)

And for the covariance, we use Eq. (5.18.3):

Cov(X) = Cov(AZ + \mu) = A\,Cov(Z)\,A^⊤ = AA^⊤ = \Sigma     (5.18.12)


5.18.5 The probability density function for N(\mu, \Sigma)

First, we determine the pdf for Z ~ N(0, I). Then, we consider the transformation X = AZ + \mu and use the result in Example 5.25 to get the pdf of N(\mu, \Sigma). That's the plan.
Noting that the Z_i are independent N(0, 1) variables, the pdf of Z is simply the product of the pdfs of all the Z_i, which we know:

f_Z(z) = \prod_{i=1}^{n} f_{Z_i}(z_i) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}} e^{-z_i^2/2} = \frac{1}{(2\pi)^{n/2}} \exp\left(-\frac{1}{2} z^⊤ z\right)     (5.18.13)

Now, we consider the transformation X = AZ + \mu, and the result from Example 5.25 gives us

f_X(x) = |\det A^{-1}|\, f_Z(A^{-1}(x - \mu))
       = |\det A^{-1}| \frac{1}{(2\pi)^{n/2}} \exp\left(-\frac{1}{2}(x-\mu)^⊤ A^{-\top} A^{-1} (x-\mu)\right)     (Eq. (5.18.13))
       = |\det A^{-1}| \frac{1}{(2\pi)^{n/2}} \exp\left(-\frac{1}{2}(x-\mu)^⊤ \Sigma^{-1} (x-\mu)\right)          (A^{-\top} A^{-1} = \Sigma^{-1})
       = \frac{1}{(2\pi)^{n/2}\sqrt{\det \Sigma}} \exp\left(-\frac{1}{2}(x-\mu)^⊤ \Sigma^{-1} (x-\mu)\right)
                                                                                                            (5.18.14)

where I have used the following relation between \det A^{-1} and \det \Sigma:

|\det A^{-1}| = \frac{1}{\sqrt{\det AA^⊤}} = \frac{1}{\sqrt{\det \Sigma}}

Proof. The proof is based on the following facts:

\det A^{-1} = \frac{1}{\det A}, \quad \det AB = (\det A)(\det B), \quad \det A^⊤ = \det A

Now, we can write

\det AA^⊤ = (\det A)(\det A^⊤) = (\det A)^2 \quad\Longrightarrow\quad |\det A| = \sqrt{\det AA^⊤}
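The final formula (5.18.14) is easy to implement directly; below is a minimal sketch (my own addition) that evaluates it with NumPy and, as a cross-check, compares it with SciPy's multivariate_normal for an arbitrarily chosen \mu and \Sigma.

    import numpy as np
    from scipy.stats import multivariate_normal

    def mvn_pdf(x, mu, Sigma):
        # Eq. (5.18.14): (2*pi)^(-n/2) * det(Sigma)^(-1/2) * exp(-0.5 (x-mu)^T Sigma^-1 (x-mu))
        n = len(mu)
        d = x - mu
        quad = d @ np.linalg.solve(Sigma, d)      # (x-mu)^T Sigma^{-1} (x-mu)
        return np.exp(-0.5 * quad) / np.sqrt((2 * np.pi) ** n * np.linalg.det(Sigma))

    mu = np.array([1.0, -1.0])
    Sigma = np.array([[2.0, 0.6],
                      [0.6, 1.0]])
    x = np.array([0.5, 0.0])

    print(mvn_pdf(x, mu, Sigma))
    print(multivariate_normal(mean=mu, cov=Sigma).pdf(x))   # should match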

5.18.6 The bivariate normal distribution


The multivariate normal distribution becomes a bivariate normal distribution for n = 2. In this case, we have X = (X_1, X_2)^⊤ and

\mu = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \qquad
\Sigma = \begin{bmatrix} Var(X_1) & Cov(X_1, X_2) \\ Cov(X_1, X_2) & Var(X_2) \end{bmatrix}
       = \begin{bmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{bmatrix}


where \rho is the correlation coefficient introduced in Section 5.12.7. Thus, the bivariate normal distribution is determined by five scalar parameters: \mu_1, \mu_2, \sigma_1, \sigma_2 and \rho.
Now, we compute all quantities involved in the final expression in Eq. (5.18.14):

\det \Sigma = (1 - \rho^2)\sigma_1^2\sigma_2^2, \qquad
\Sigma^{-1} = \frac{1}{\det \Sigma}\begin{bmatrix} \sigma_2^2 & -\rho\sigma_1\sigma_2 \\ -\rho\sigma_1\sigma_2 & \sigma_1^2 \end{bmatrix}

and,

(x - \mu)^⊤ \Sigma^{-1} (x - \mu) = \frac{1}{1-\rho^2}\left[\frac{(x_1-\mu_1)^2}{\sigma_1^2} + \frac{(x_2-\mu_2)^2}{\sigma_2^2} - \frac{2\rho(x_1-\mu_1)(x_2-\mu_2)}{\sigma_1\sigma_2}\right]
This term obviously measures the distance (not Euclidean) from x to \mu, and it is called the squared Mahalanobis distance‡.
Finally, the pdf of a bivariate normal distribution is given by

f(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\left(-\frac{1}{2(1-\rho^2)}\left[\frac{(x_1-\mu_1)^2}{\sigma_1^2} + \frac{(x_2-\mu_2)^2}{\sigma_2^2} - \frac{2\rho(x_1-\mu_1)(x_2-\mu_2)}{\sigma_1\sigma_2}\right]\right)     (5.18.15)

For the special case that \rho = 0, we then have

f(x_1, x_2) = \frac{1}{2\pi\sigma_1\sigma_2} \exp\left(-\frac{1}{2}\left[\frac{(x_1-\mu_1)^2}{\sigma_1^2} + \frac{(x_2-\mu_2)^2}{\sigma_2^2}\right]\right)
            = \frac{1}{\sqrt{2\pi}\sigma_1}\exp\left(-\frac{(x_1-\mu_1)^2}{2\sigma_1^2}\right) \frac{1}{\sqrt{2\pi}\sigma_2}\exp\left(-\frac{(x_2-\mu_2)^2}{2\sigma_2^2}\right)
            = f_{X_1}(x_1)\, f_{X_2}(x_2)     (5.18.16)

Now, for \mu_1 = \mu_2 = 0 and \sigma_1 = \sigma_2 = 1, from Eq. (5.18.16) we get the following PDF (note that I used (x, y) instead of (x_1, x_2))

f(x, y) = \frac{1}{2\pi}\exp\left(-\frac{1}{2}(x^2 + y^2)\right)

which is a circular Gaussian surface (see Fig. 5.20a): it is the 3D version of the well known bell curve. Using Eq. (5.18.15) with \mu_1 = \mu_2 = 0, \sigma_1 = \sigma_2 = 1 but \rho = 0.8, we get Fig. 5.20b.
The distributions plotted in Fig. 5.20 have the following covariance matrices:

\Sigma_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad \Sigma_2 = \begin{bmatrix} 1 & 0.8 \\ 0.8 & 1 \end{bmatrix}     (5.18.17)
‡ Named after Prasanta Chandra Mahalanobis (1893–1972), an Indian scientist and statistician.


(a) ρ = 0                    (b) ρ = 0.8

Figure 5.20: Visualization of the PDF of the bivariate normal distributions: generated using Asymptote. Asymptote is a descriptive vector graphics (programming) language that provides a natural coordinate-based framework for technical drawing. Labels and equations are typeset with LaTeX, the de-facto standard for typesetting mathematics.

There are two methods of plotting the bivariate normal distribution. One method is to plot a 3D graph (Fig. 5.20) and the other is to plot a contour graph (Fig. 5.21). A contour graph is a way of displaying three dimensions on a 2D plot. A 3D plot is sometimes difficult to visualize properly. The contour plot shows only two dimensions, i.e., x_1, x_2. The third dimension is defined by the colour. If two points have the same colour in the contour plot, then they have equal values for their third dimension. A contour plot is usually accompanied by a legend relating the colours to values.
And why is the contour of the bivariate normal distribution an ellipse? The answer is in the squared Mahalanobis distance: it is of the form y^⊤Ay, which is a quadratic form. And from Section 11.10.6, we know that y^⊤Ay = c is an ellipse.

(a) ρ = 0                    (b) ρ = 0.8

Figure 5.21: Contour plot of the PDF of the bivariate normal distributions: generated using Asymptote.


5.19 Review
I had a bad experience with probability in university. It is quite unbelievable that I have now managed to learn it at the age of 42 to a certain level of understanding. Here are some observations that I made:

• Probability had a humble starting point in games of chance. But mathematicians turned it into a rigorous branch of mathematics with some beautiful theorems (e.g. the central limit theorem) with applications in many diverse fields far from gambling activities;

• It is beneficial to carry out Monte-Carlo experiments to support the learning of probability;

• To learn probability for discrete random variables, we need to have first a solid understanding of counting methods (e.g. factorial, permutations and so on);

• Probability is very counter-intuitive, so rote memorization does not help. Facing a probability problem, we should sit back and solve it slowly.



Chapter 6
Statistics and machine learning

Contents
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 624

6.2 A brief introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 624

6.3 Statistical inference: classical approach . . . . . . . . . . . . . . . . . . . 624

6.4 Statistical inference: Bayesian approach . . . . . . . . . . . . . . . . . . 625

6.5 Least squares problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 625

6.6 Markov chains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 628

6.7 Principal component analysis (PCA) . . . . . . . . . . . . . . . . . . . . 631

6.8 Neural networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 632

• Statistics: a very short introduction by David J. Hand [29];

• Statistics with Julia: Fundamentals for Data Science, Machine Learning and Artificial Intelligence, by Yoni Nazarathy and Hayden Klok [52];


6.1 Introduction
6.1.1 What is statistics
6.1.2 Why study statistics
6.1.3 A brief history of statistics

6.2 A brief introduction

Table 6.1: Some terminologies in statistics.

Term                     Definition                                                    Example
Population               All members of a well-defined group                           All students of Monash University
Parameter                A characteristic of a population                              Average height of a population
Sample                   A subset of a population                                      2nd year students of Monash
Statistic                A characteristic of a sample                                  Average height of a sample
Descriptive statistics   Techniques that allow us to summarize the data of a sample    histogram plot, mean, variance etc.
Inferential statistics   Techniques that allow us to infer the properties of a         Bayesian statistics
                         population from a sample

6.3 Statistical inference: classical approach


The objective of statistics is to make inferences about a population based on information con-
tained in a sample. Populations are characterized by numerical descriptive measures called
parameters. Typical population parameters are the mean, the standard deviation and so on. Most
inferential problems can be formulated as an inference about one of these parameters.
So far we have considered problems like the following:

Let X be a normal random variable with mean \mu = 100 and variance \sigma^2 = 15. Find the probability that X > 100.


In statistical inference problems, the problem is completely different. In real life, we do not know the distribution of the population (i.e., of X). Most often, we use the central limit theorem to assume that X has a normal distribution, yet we still do not know the values of \mu and \sigma^2.
This brings us to the problem of estimation. We use sample data to estimate, for example, the mean of the population. If we just use a single number for the mean, we're doing a point estimation, whereas if we provide an interval for the mean, we're doing an interval estimation.

6.4 Statistical inference: Bayesian approach


6.5 Least squares problems
6.5.1 Problem statement
In many scientific problems experimental data are used to infer a mathematical relationship
among the variables being measured. In the simplest case there is one single independent variable
and one dependent variable. Then, the data come in the form of two measurements: one for
the independent variables and one for the dependent variable. Thus, we have a set of points
(x_i, y_i), and we are looking for a function that best approximates the relationship between the independent variable x and the dependent variable y. Once we have found that function (e.g. y = f(x)), then we can make predictions: given any x* (not in the experimental data), we can determine the corresponding y* = f(x*). Fig. 6.1 gives two examples.

Figure 6.1: Fitting a curve through a cloud of points.

Suppose that the function relating x and y is a linear function y = f(x) = ax + b, or a quadratic function y = ax^2 + bx + c. The problem is then to determine the parameters a, b (or a, b, c for the quadratic case) so that the model best fits the data. It was found‡‡ that to get the best fit, we need to minimize the sum of the squares of the errors, where the error is the difference between the data and the model evaluated at x_i:

minimize:  S = \sum_{i=1}^{n} e_i^2, \qquad e_i = y_i - f(x_i)

‡‡ By Roger Cotes, Legendre and Gauss. In 1809 Carl Friedrich Gauss published his method (of least squares) of calculating the orbits of celestial bodies.


Even though this problem can be solved by calculus (i.e., setting the derivatives of S w.r.t. a and b to zero), I prefer to use linear algebra to solve it. Why? To understand more about linear algebra! To this end, we introduce the error vector e = (e_1, e_2, ..., e_n) where e_i = y_i - f(x_i). Let's start with the simplest case where f(x) = \alpha x + \beta; then we can write the error vector as

e = \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}
  = \underbrace{\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}}_{b}
  - \underbrace{\begin{bmatrix} x_1 & 1 \\ x_2 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{bmatrix}}_{A}
    \underbrace{\begin{bmatrix} \alpha \\ \beta \end{bmatrix}}_{x}     (6.5.1)

In statistics, the matrix A is called the design matrix. Usually we have lots of data, thus this matrix is skinny, meaning that it has more rows than columns. Now the problem is to find x = (\alpha, \beta) to minimize S, which is equivalent to minimizing \|e\| (where \|v\| denotes the Euclidean norm), which is in turn equivalent to minimizing \|b - Ax\|. We have converted the problem to a linear algebra problem of solving Ax = b, but with a rectangular matrix. This overdetermined system is unsolvable in the traditional sense that no x* would make Ax* equal b. Thus, we ask for a vector x* that minimizes \|b - Ax\|; such a vector is called the least squares solution to Ax = b. So, we have the following definition:

Definition 6.5.1: Least squares problem

If A is an m × n matrix and b is in R^m, a least squares solution of Ax = b is a vector x* such that

\|b - Ax^*\| \le \|b - Ax\|

for all x in R^n.

6.5.2 Solution of the least squares problem

We are seeking a vector x that minimizes \|b - Ax\|. Note that Ax is a vector living in the column space of A. So the problem is now: find a vector y in C(A) that is closest to b. According to the best approximation theorem (Section 11.11.10), the solution is then the projection of b onto C(A):

Ax^* = proj_{C(A)}(b)

We do not have to solve this system to get x*; a bit of algebra leads to

b - Ax^* = b - proj_{C(A)}(b) = perp_{C(A)}(b)

which means that b - Ax* is perpendicular to the columns of A:

a_i \cdot (b - Ax^*) = 0, \quad i = 1, 2, ..., n

Thus, we obtain the following equation, known as the normal equation:

A^⊤A x^* = A^⊤ b \quad\Longrightarrow\quad x^* = (A^⊤A)^{-1} A^⊤ b     (6.5.2)


The solution given in the box holds only when rank(A) = n, i.e., all the columns of A are linearly independent‡‡. In that case, by theorem 11.5.5, which states that rank(A^⊤A) = rank(A), we have rank(A^⊤A) = n. An n × n matrix with rank n is invertible. That's why the normal equation has the unique solution expressed in terms of the inverse of A^⊤A.

Calculus based solution. We just need to rewrite S as

minimize:  S = \frac{1}{2}\|e\|^2 = \frac{1}{2} e^⊤ e

and with e = b - Ax, S becomes

S = \frac{1}{2}(b - Ax)^⊤(b - Ax) = \frac{1}{2} x^⊤ A^⊤A x - b^⊤A x + \frac{1}{2} b^⊤ b

Now, we use dS/dx = 0 (check Section 12.9.2 to know how to do differentiation with matrices):

\frac{dS}{dx} = A^⊤A x - A^⊤ b \quad\Longrightarrow\quad A^⊤A x - A^⊤ b = 0 \quad\Longrightarrow\quad x = (A^⊤A)^{-1} A^⊤ b

And we have obtained the same result.

Pseudoinverse. For a square matrix A the solution to Ax = b is written in terms of its inverse matrix: x = A^{-1}b. We should do the same thing for rectangular matrices! And that leads to the pseudoinverse matrix, whose definition comes from x^* = (A^⊤A)^{-1}A^⊤b.

Definition 6.5.2
If A is a matrix with linearly independent columns, then the pseudoinverse of A is the matrix A^+ defined by

A^+ = (A^⊤A)^{-1}A^⊤

Fitting a cloud of points with a parabola. The least squares method works just as well when f(x) = \alpha x^2 + \beta x + \gamma. Everything is the same, except that we have a bigger design matrix and three unknowns to solve for:

A = \begin{bmatrix} x_1^2 & x_1 & 1 \\ x_2^2 & x_2 & 1 \\ \vdots & \vdots & \vdots \\ x_n^2 & x_n & 1 \end{bmatrix}, \qquad
x = \begin{bmatrix} \alpha \\ \beta \\ \gamma \end{bmatrix}, \qquad
b = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}

Fitting a cloud of 3D points with a plane. So far we have just dealt with y = f(x). How about z = f(x, y)? No problem, the exact same method works too. Assume that we want to find the best plane z = \alpha x + \beta y + \gamma:

‡‡ When does that happen? Take Eq. (6.5.1) as an example: for this design matrix to have a rank of 2, there must be at least two different x_i's.


A = \begin{bmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ \vdots & \vdots & \vdots \\ x_n & y_n & 1 \end{bmatrix}, \qquad
x = \begin{bmatrix} \alpha \\ \beta \\ \gamma \end{bmatrix}, \qquad
b = \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix}
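To make this concrete, here is a minimal sketch (my own addition) that fits a parabola to noisy data by forming the design matrix and solving the normal equation (6.5.2) with NumPy; the cross-check with numpy.linalg.lstsq is shown only for comparison.

    import numpy as np

    rng = np.random.default_rng(4)
    x = np.linspace(-2, 2, 50)
    y = 1.5 * x**2 - 0.7 * x + 2.0 + rng.normal(scale=0.3, size=x.size)   # noisy data

    # design matrix for f(x) = alpha*x^2 + beta*x + gamma
    A = np.column_stack([x**2, x, np.ones_like(x)])

    # normal equation: x* = (A^T A)^{-1} A^T b
    params = np.linalg.solve(A.T @ A, A.T @ y)
    print(params)                                  # close to (1.5, -0.7, 2.0)

    # cross-check with a library least squares solver
    print(np.linalg.lstsq(A, y, rcond=None)[0])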

6.6 Markov chains


In Section 2.10 we have met Fibonacci with his famous sequence. His sequence is defined by
a linear recurrence equation. In that section I have presented Binet’s formula to compute the
nth Fibonacci number directly. In this section, a method based on linear algebra is introduced to
solve similar problems. We start with one example. To read this section you need linear algebra,
particularly on matrix diagonalization. Check Chapter 11.

Example 6.1
Consider the sequence (x_n) defined by the initial conditions x_1 = 1, x_2 = 5 and the recurrence relation x_n = 5x_{n-1} - 6x_{n-2} for n > 2. Our problem is to derive a direct formula for x_n (n ≥ 2) using matrices. To this end, we introduce the vector x_n = (x_n, x_{n-1}). With this vector, we can write the given recurrence using matrix notation:

\boldsymbol{x}_n = \begin{bmatrix} x_n \\ x_{n-1} \end{bmatrix}
= \begin{bmatrix} 5 & -6 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} x_{n-1} \\ x_{n-2} \end{bmatrix}
= \begin{bmatrix} 5 & -6 \\ 1 & 0 \end{bmatrix}\boldsymbol{x}_{n-1}

And we have obtained a recurrent formula \boldsymbol{x}_n = A\boldsymbol{x}_{n-1}. With that, we get

\boldsymbol{x}_3 = A\boldsymbol{x}_2, \quad \boldsymbol{x}_4 = A\boldsymbol{x}_3 = A^2\boldsymbol{x}_2, \ldots \;\Longrightarrow\; \boldsymbol{x}_n = A^{n-2}\boldsymbol{x}_2, \quad \boldsymbol{x}_2 = (5, 1)     (6.6.1)

Now our task is simply to compute A^k. With the eigenvalues 3, 2 and eigenvectors (3, 1) and (2, 1), it is easy to do so:

A^k = \begin{bmatrix} 3 & 2 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} 3^k & 0 \\ 0 & 2^k \end{bmatrix}\begin{bmatrix} 3 & 2 \\ 1 & 1 \end{bmatrix}^{-1}
    = \begin{bmatrix} 3^{k+1} - 2^{k+1} & -2(3^{k+1}) + 3(2^{k+1}) \\ 3^k - 2^k & -2(3^k) + 3(2^k) \end{bmatrix}

With that and the boxed equation, we can get x_n = 3^n - 2^n.
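A small check of this formula (my own addition), using NumPy's matrix power:

    import numpy as np

    A = np.array([[5, -6],
                  [1,  0]], dtype=np.int64)
    x2 = np.array([5, 1], dtype=np.int64)       # (x_2, x_1)

    for n in range(2, 10):
        xn = (np.linalg.matrix_power(A, n - 2) @ x2)[0]
        print(n, xn, 3**n - 2**n)               # the two values agree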

6.6.1 Markov chain: an introduction


Consider the following survey on people’s toothpaste preferences conducted by a market team,
taken from [56]. The sample consists of 200 people (in which 120 use brand A and 80 use B)
each of whom is asked to try two brands of toothpaste over several months. The result is: among


those using brand A in any month, 70% continue using it in the following month, while 30%
switch to brand B; of those using brand B, those numbers are 80% and 20%.
The question is: how many people will use each brand 1 month later? 2 months later? 10 months later? To answer the first question is very simple:

people using brand A after 1 month:  0.7(120) + 0.2(80) = 100
people using brand B after 1 month:  0.3(120) + 0.8(80) = 100

Nothing can be simpler but admittedly the maths is boring. Now comes the interesting part. We rewrite the above using matrix notation; this is what we get:

\underbrace{\begin{bmatrix} 0.7 & 0.2 \\ 0.3 & 0.8 \end{bmatrix}}_{P}
\underbrace{\begin{bmatrix} 120 \\ 80 \end{bmatrix}}_{\boldsymbol{x}_0}
= \underbrace{\begin{bmatrix} 100 \\ 100 \end{bmatrix}}_{\boldsymbol{x}_1}, \quad\text{or}\quad P\boldsymbol{x}_0 = \boldsymbol{x}_1

Let’s stop here and introduce some terminologies. What we are dealing with is called a Markov
chain with two states A and B. There are then four possibilities: a person in state A can stay in
that state or he/she can hop to state B and the person in state B can stay in it or move to A. The
probabilities of these four situations are the four numbers put in the matrix P.

Figure 6.2: A finite Markov chain with two states.

And from that we can see that the Markov chain satisfies the recursive formula \boldsymbol{x}_{k+1} = P\boldsymbol{x}_k, for k = 0, 1, 2, .... Alternatively, we can write

\boldsymbol{x}_1 = P\boldsymbol{x}_0, \quad \boldsymbol{x}_2 = P\boldsymbol{x}_1 = P^2\boldsymbol{x}_0, \ldots \;\Longrightarrow\; \boldsymbol{x}_k = P^k\boldsymbol{x}_0, \quad k = 1, 2, \ldots

where the vectors \boldsymbol{x}_k are called state vectors and P is called the transition matrix. Instead of working directly with the actual numbers of toothpaste users, we can use relative numbers:

\boldsymbol{x}_0 = \begin{bmatrix} 120/200 \\ 80/200 \end{bmatrix} = \begin{bmatrix} 0.6 \\ 0.4 \end{bmatrix} : \text{probability vector}

Why relative numbers? Because they add up to one! That's why vectors such as \boldsymbol{x}_0 are called probability vectors.
We're now ready to answer the question: how many people will use each brand after, let's say, 10 months? Using \boldsymbol{x}_k = P^k\boldsymbol{x}_0, we can compute \boldsymbol{x}_1, \boldsymbol{x}_2, \ldots and get the following result

\boldsymbol{x}_1 = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix}, \quad
\boldsymbol{x}_2 = \begin{bmatrix} 0.45 \\ 0.55 \end{bmatrix}, \quad \ldots, \quad
\boldsymbol{x}_9 \approx \begin{bmatrix} 0.4 \\ 0.6 \end{bmatrix}, \quad
\boldsymbol{x}_{10} \approx \begin{bmatrix} 0.4 \\ 0.6 \end{bmatrix}

Two observations can be made based on this result. First, all state vectors are probability vectors (i.e., the components of each vector add up to one). Second, the state vectors converge to a special vector (0.4, 0.6). It is interesting that once this state is reached, the state will never change:

\begin{bmatrix} 0.7 & 0.2 \\ 0.3 & 0.8 \end{bmatrix}\begin{bmatrix} 0.4 \\ 0.6 \end{bmatrix} = \begin{bmatrix} 0.4 \\ 0.6 \end{bmatrix}

This special vector is called a steady state vector. Thus, a steady state vector \boldsymbol{x} is one such that P\boldsymbol{x} = \boldsymbol{x}. What does this equation say? It says that \boldsymbol{x} is an eigenvector of P with corresponding eigenvalue of one.
All these results are of course consequences of the following two properties of the Markov matrix P:

1. Every entry is positive: P_{ij} > 0
2. Every column adds to 1: \sum_i P_{ij} = 1

Proof. [State vectors are probability vectors] Start with a state vector u; we need to prove that x = Pu is a probability vector, where P is a Markov matrix. We know that the components of u sum up to one. We need to translate that into mathematics, which is u_1 + u_2 + ... + u_n = 1, or better [1 1 ... 1]u = 1. So, to prove that x adds up to one, we just need to show that [1 1 ... 1](Pu) = 1. This is true because [1 1 ... 1](Pu) = ([1 1 ... 1]P)u = [1 1 ... 1]u, which is one. ([1 1 ... 1]P = [1 1 ... 1] because each column of P adds up to one.) □

Now, we need to study why \boldsymbol{x}_k = P^k\boldsymbol{x}_0 approaches a steady state vector when k → ∞. To this end, we need to be able to compute P^k. For that, we need its eigenvalues and eigenvectors:

\lambda_1 = 1, \quad \lambda_2 = 0.5, \quad
\boldsymbol{x}_1 = \begin{bmatrix} 2 \\ 3 \end{bmatrix}, \quad
\boldsymbol{x}_2 = \begin{bmatrix} +1 \\ -1 \end{bmatrix}, \quad
Q = \begin{bmatrix} 2 & +1 \\ 3 & -1 \end{bmatrix}, \quad
D = \begin{bmatrix} 1 & 0 \\ 0 & 0.5 \end{bmatrix}

And noting that P = QDQ^{-1}, thus

P = QDQ^{-1} \;\Longrightarrow\; P^k = QD^kQ^{-1} \;\Longrightarrow\; P^\infty = Q\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}Q^{-1} = \begin{bmatrix} 0.4 & 0.4 \\ 0.6 & 0.6 \end{bmatrix}
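The following Python/NumPy sketch (my own addition) reproduces these numbers: it iterates x_{k+1} = P x_k from x_0 = (0.6, 0.4) and also reads the steady state off the eigenvector of P associated with the eigenvalue 1.

    import numpy as np

    P = np.array([[0.7, 0.2],
                  [0.3, 0.8]])
    x = np.array([0.6, 0.4])

    for k in range(10):
        x = P @ x
    print(x)                          # close to (0.4, 0.6)

    # steady state from the eigenvector with eigenvalue 1, normalized to sum to one
    vals, vecs = np.linalg.eig(P)
    v = vecs[:, np.argmin(np.abs(vals - 1.0))].real
    print(v / v.sum())                # (0.4, 0.6)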


6.6.2 dd

6.7 Principal component analysis (PCA)


In many problems we have a matrix of data (measurements). For example, there are n samples
and for each sample we are measuring m variables. Thus the data matrix A has n columns and
m rows. Geometrically we have n points in the Rm space. Most often m > 3 which makes
visualization and understanding of this data very hard.
Principal component analysis provides a way to understand this data. The starting point is the covariance matrix S of the data (Section 5.12.5). This is a symmetric positive semidefinite matrix of dimension m × m. According to the spectral theorem (theorem 11.10.3), S has a spectral decomposition S = Q\Lambda Q^⊤ with real eigenvalues in \Lambda and orthonormal eigenvectors in the columns of Q:

S = Q\Lambda Q^{-1} = Q\Lambda Q^⊤ \quad\text{with } Q^⊤ = Q^{-1}

Now, if we label \lambda_1 the maximum of all the eigenvalues of S, then from Section 11.10.6, we know that

\lambda_1 = \max_{\|u\| = 1} u^⊤ S u

And this happens when u = u_1, where u_1 is the eigenvector corresponding to \lambda_1. We hence try to understand the geometric meaning of u_1^⊤Su_1. To this end, we confine ourselves to the 2D plane, i.e., m = 2, and we can then write (see Eq. (5.12.21) for S)

u_1^⊤ S u_1 = u_1^⊤\left(\frac{1}{n-1}\begin{bmatrix} \sum_i x_i^2 & \sum_i x_iy_i \\ \sum_i x_iy_i & \sum_i y_i^2 \end{bmatrix}\right)u_1
            = \frac{1}{n-1}\sum_i u_1^⊤ \boldsymbol{x}_i \boldsymbol{x}_i^⊤ u_1
            = \frac{1}{n-1}\sum_i (\boldsymbol{x}_i^⊤ u_1)^2, \qquad \boldsymbol{x}_i = (x_i, y_i)

Recognizing that |\boldsymbol{x}_i^⊤ u_1| is the length of the projection of \boldsymbol{x}_i on u_1, the term u_1^⊤Su_1 is then the sum of squares of the projections of all data points on the line with direction given by u_1 (Fig. 6.3). In summary, we have found the axis u_1 which gives the maximum variance of the data. This also means that this axis yields the minimum of the squared distances from the data points to the line.

Figure 6.3


If we wish, we can find a second axis given by, what else, the second eigenvector u_2 (corresponding to the second largest eigenvalue \lambda_2). Along this axis the variance is again as large as possible (among directions orthogonal to u_1). And we can continue with other eigenvectors; thus we can project our data points onto a k-dimensional space spanned by u_1, ..., u_k. We put these eigenvectors in the matrix Q_k–an m × k matrix–then Y = Q_k^⊤ A is the transformed data, living in a k-dimensional space where k ≤ m.
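A minimal PCA sketch along these lines (my own addition, in Python with NumPy): form the covariance matrix of centred data, take its eigenvectors, and project onto the leading k of them. The synthetic data set is arbitrary, and samples are stored as rows (the text stores them as columns; the projection is the same up to a transpose).

    import numpy as np

    rng = np.random.default_rng(5)
    # synthetic data: n = 500 samples of m = 3 correlated variables, one sample per row
    n = 500
    t = rng.normal(size=n)
    data = np.column_stack([t + 0.1 * rng.normal(size=n),
                            2 * t + 0.1 * rng.normal(size=n),
                            rng.normal(size=n)])

    centred = data - data.mean(axis=0)
    S = np.cov(centred, rowvar=False)          # m x m covariance matrix

    eigvals, Q = np.linalg.eigh(S)             # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]          # sort descending
    eigvals, Q = eigvals[order], Q[:, order]

    k = 2
    Y = centred @ Q[:, :k]                     # data projected onto the top-k axes
    print(eigvals)                             # variance captured along each axis
    print(Y.shape)                             # (500, 2)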

6.8 Neural networks



Chapter 7
Multivariable calculus

Contents
7.1 Multivariable functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 635
7.2 Derivatives of multivariable functions . . . . . . . . . . . . . . . . . . . . 639
7.3 Tangent planes, linear approximation and total differential . . . . . . . . 641
7.4 Newton’s method for solving two equations . . . . . . . . . . . . . . . . . 642
7.5 Gradient and directional derivative . . . . . . . . . . . . . . . . . . . . . 642
7.6 Chain rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 646
7.7 Minima and maxima of functions of two variables . . . . . . . . . . . . . 647
7.8 Integration of multivariable functions . . . . . . . . . . . . . . . . . . . . 657
7.9 Parametric surfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 676
7.10 Newtonian mechanics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 683
7.11 Vector calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 698
7.12 Complex analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 723

In Chapter 4 we have studied the calculus of functions of one variable e.g. functions ex-
pressed by y D f .x/. Basically, we studied curves in a 2D plane, the tangent to a curve at any
point on the curve (1st derivative) and the area under the curve (integral). Now it is time for the real world: functions of multiple variables. We will discuss functions of the form z = f(x, y), known as scalar-valued functions of two variables. A plot of z = f(x, y) gives a surface in a 3D space. Of course, we are going to differentiate z = f(x, y), and thus partial derivatives ∂f/∂x, ∂f/∂y naturally emerge. We also compute integrals of z = f(x, y), the double integrals ∬ f(x, y) dxdy, which can be visualized as the volume under the surface f(x, y). And triple integrals ∭ f(x, y, z) dxdydz appear when we deal with functions of three variables f(x, y, z). All of this is merely an extension of the calculus we know from Chapter 4. If there are some difficulties, they are just technical, not mental, as when we learned about the instantaneous speed of a moving car.

Then comes vector-valued functions used to describe vector fields. For example, if we want
to study the motion of a moving fluid, we need to know the velocity of all the fluid particles. The
velocity of a fluid particle is a vector field and is mathematically expressed as a vector-valued
function of the form v.x; y/ D .g.x; y/; h.x; y// in two dimensions. The particle position is
determined by its coordinates .x; y/ and its velocity by two functions: g.x; y/ for the horizontal
component of the velocity and h.x; y/ for the vertical component.
And with vector fields, we shall have vector calculus, which consists of differential calculus of vector fields and integral calculus of vector fields. In differential calculus of vector fields, we shall meet the gradient vector of a scalar field ∇f, the divergence of a vector field ∇·C and the curl of a vector field ∇×C. In the integral calculus, we have the line integral ∫ F·ds, surface integrals ∫_S C·n dA and volume integrals. And these integrals are linked together via Green's theorem, Stokes' theorem and Gauss' theorem. They are generalizations of the fundamental theorem of calculus (Table 7.1).
Table 7.1: Integral calculus of vector fields: a summary.

Theorem                    Formula
FTC                        \int_a^b \frac{df}{dx}\,dx = f(b) - f(a)
FTC of line integrals      \int_1^2 \nabla\phi \cdot d\boldsymbol{s} = \phi(2) - \phi(1)   (along C)
Green's theorem            \int_S \left(\frac{\partial C_y}{\partial x} - \frac{\partial C_x}{\partial y}\right) dA = \oint (C_x\,dx + C_y\,dy)
Stokes' theorem            \int_S (\nabla\times \boldsymbol{C})\cdot \boldsymbol{n}\,dA = \oint \boldsymbol{C}\cdot d\boldsymbol{s}
Gauss's theorem            \int_S \boldsymbol{C}\cdot \boldsymbol{n}\,dA = \int_V \nabla\cdot \boldsymbol{C}\,dV

This chapter starts with a presentation of multivariable functions in Section 7.1. The deriva-
tives of these functions are discussed in Section 7.2. Section 7.3 presents tangent planes and
linear approximations. Then, Newton’s method for solving a system of nonlinear equations is


treated in Section 7.4. The gradient of a scalar function and the directional derivative are given
in Section 7.5. The chain rules are introduced in Section 7.6. The problem of finding the extrema
of functions of multiple variables is given in Section 7.7. Two and three dimensional integrals
are given in Section 7.8. ??. Newtonian mechanics is briefly discussed in Section 7.10. Then
comes a big chapter on vector calculus (Section 7.11). A short introduction to the wonderful
field–complex analysis–is provided in Section 7.12.
Some knowledge on vector algebra and matrix algebra are required to read this chapter.
Section 11.1 in Chapter 11 provides an introduction to vectors and matrices.
I have used primarily the following books for the material presented herein:

• Calculus by Gilbert Strang [67];

• Calculus: Early Transcendentals by James Stewart† [63];

• The Feynman Lectures on Physics by Feynman [22];

• Vector calculus by Jerrold Marsden‡ and Anthony Tromba§ [46].

7.1 Multivariable functions


The concept of a function of one variable can be easily generalized to the case of functions of two or more variables. If a function returns a scalar we call it a scalar valued function, e.g. sin(x + y). We discuss such functions in Section 7.1.1. On the other hand, if a function returns a vector (multiple outputs), then it is called a vector valued function, see Section 7.1.4. One example is a helix curve given by (sin t, cos t, t).

7.1.1 Scalar valued multivariable functions


In the case of a function of two variables, we consider the set of ordered pairs (x, y) where x and y are both real numbers. If there is a law according to which each pair (x, y) is assigned to a single value of z, then we speak of a function of two variables. Usually, this function is denoted by z = f(x, y): R² → R. Similarly we can have a function taking three real numbers and producing a real number, mathematically written as T = g(x, y, z): R³ → R. For example, T = g(x, y, z) could be the temperature at a point inside the earth.
Visualizing functions of two variables is more difficult by hand as they represent surfaces in
a 3D space; these surfaces are formed by the set of all the points .x; y; z/ where z D f .x; y/,

† James Drewry Stewart (1941–2014) was a Canadian mathematician, violinist, and professor emeritus of mathematics at McMaster University. Stewart is best known for his series of calculus textbooks used for high school, college, and university level courses.
‡ Jerrold Eldon Marsden (1942–2010) was a Canadian mathematician. Marsden, together with Alan Weinstein, was one of the world leading authorities in mathematical and theoretical classical mechanics. He has laid much of the foundation for symplectic topology. The Marsden-Weinstein quotient is named after him.
§ Anthony Joseph Tromba (born 10 August 1943, Brooklyn, New York City) is an American mathematician, specializing in partial differential equations, differential geometry, and the calculus of variations.


(a) Paraboloid z = x² + y²        (b) Sphere x² + y² + z² = 1

Figure 7.1: Graph of a paraboloid and a sphere (drawn with Tikz). Source: tikz.net.

i.e., the set of points (x, y, f(x, y)). We need to use software for this task. For example, Fig. 7.1 shows a plot of the function z = x² + y² (which is a paraboloid, a generalization of the 2D parabola) and x² + y² + z² = 1, which is a sphere. In Fig. 7.2a we plot another function z = f(x, y) in which the surface is colored according to the value of z. In this way it is easy to see where the highest/lowest points of the surface are. Furthermore, it is the only way to visualize T = f(x, y, z) (Fig. 7.2b). This is because the graph of a function f(x, y, z) of three variables would be the set of points (x, y, z, f(x, y, z)) in four dimensions, and it is difficult to imagine what such a graph would look like.

(a) a surface        (b) a solid


Figure 7.2: Graph of the surface z(x, y) = sin(√(x² + y²))/√(x² + y²), in which the surface is colored according to the value of z (a), and a 3D solid colored by the temperature inside (b).


7.1.2 Level curves, level surfaces and level sets


Three-dimensional plots are more difficult to draw and visualize than two-dimensional plots.
Therefore, we now discuss another way of visualizing a multivariable function, borrowed from
mapmakers: using level sets, i.e., the set of points in the domain of a function where the function
is constant. The nice part of level sets is that they live in the same dimensions as the domain of
the function.

Definition 7.1.1: Level set


A level set of a real-valued function f of n real variables is a set where the function takes on a given constant value c (in the range of f), that is:

L_c(f) = \{(x_1, x_2, ..., x_n) \mid f(x_1, x_2, ..., x_n) = c\}     (7.1.1)

When the number of independent variables is two, a level set is called a level curve, also
known as contour line or isoline; so a level curve is the set of all real-valued solutions of an
equation in two variables x; y. When n D 3, a level set is called a level surface (or isosurface);
so a level surface is the set of all real-valued roots of an equation in three variables x; y; z.
For higher values of n, the level set is a level hypersurface.

Figure 7.3: Graph of the surface z(x, y) = sin(√(x² + y²))/√(x² + y²) and its level curves (a). In (b) only level curves are plotted and they are just 2D curves. Looking at the level curves we can say something about the function: the surface is steep where the level curves are close together, and it is flatter where they're farther apart.

A level set of a function of two variables f .x; y/ is a curve in the two-dimensional xy-plane,
called a level curve (Fig. 7.3). A level set of a function of three variables f .x; y; z/ is a
surface in three-dimensional space, called a level surface. For a constant value c in the range of
f .x; y; z/, the level surface of f is the implicit surface given by the graph of c D f .x; y; z/.

Domain, co-domain and range of a function. For the function z = f(x, y): R² → R, we say that the domain of this function is the entire 2D plane, i.e., R². Thus, the domain of a function


is the set of all inputs. We also say that the co-domain is R: the co-domain is the set of outputs. And finally the range of a function is a subset of its co-domain which contains the actual outputs. For example, if f(x, y) = x² + y², then its co-domain is all real numbers but its range is only the non-negative reals.

7.1.3 Multivariate calculus: an extension of univariate calculus


If we keep one variable, say y constant, then from z D f .x; y/ we obtain a function of a single
variable x, see Fig. 7.1b. We can then apply the calculus we know from Chapter 4 to this function.
That leads to partial derivatives. Using these two partial derivatives, we will have directional
derivative Du that gives the change in f .x; y/ along the direction u. Other natural extensions of
Chapter 4‘s calculus are summarized in Table 7.2. We will discuss them, but as you have seen,
they are merely extensions of calculus of functions of single variable.

Table 7.2: Multivariate calculus is simply an extension of univariate calculus.

                           f(x)                        f(x, y)
1st derivative             df/dx                       partial derivatives ∂f/∂x, ∂f/∂y
2nd derivative             d²f/dx²                     second partial derivatives ∂²f/∂x², ∂²f/∂y², ∂²f/∂x∂y, ∂²f/∂y∂x
directional derivative                                 D_u f(x, y) = u_x ∂f/∂x + u_y ∂f/∂y
integral                   ∫_a^b f(x) dx               double/triple integrals ∬ f(x, y) dxdy, ∭ f(x, y, z) dxdydz
extrema                    f_x(x_0) = 0                ∂f/∂x = ∂f/∂y = 0

7.1.4 Vector valued multivariable functions


Functions of the form r(t) = (f(t), g(t), h(t)), having one variable for input and a vector for output, are called single-variable vector-valued functions. The functions f(t), g(t) and h(t), which are the component functions of r(t), are each a single-variable real-valued function. Single-variable vector-valued functions can be denoted as r: R → R^n. The graph of such functions is a 3D curve, see Fig. 7.4a for one example.
A function of the form

f\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} f(x, y) \\ g(x, y) \end{bmatrix}     (7.1.2)

is a multi-variable vector-valued function, which maps a point in 2D space to another point in the same space. In Fig. 7.4b we show such a function: f(x, y) = (x² - y², 2xy). The black lines are the standard grid lines in a 2D Cartesian plane, and we apply this function to those lines to obtain the red lines. As can be seen, a square was transformed to a curved shape. This function is thus called a transformation.


(a) single-variable vector-valued function        (b) multi-variable vector-valued function

function

Figure 7.4: Vector valued multivariable functions.

7.2 Derivatives of multivariable functions


For functions of one variable, y = f(x), the derivative was defined as the ratio of the change in the function and the change in x when this change is approaching zero. For functions of two variables f(x, y), it is natural to consider changes in x and in y separately. And this leads to two derivatives: one with respect to x when y is held constant, and the other with respect to y when x is held constant.
For example, consider f(x, y) = x² + y². When x is changed to x + Δx (and y is held constant), the corresponding change in f is Δf given by

Δf = (x + Δx)² + y² - (x² + y²) = 2xΔx + (Δx)²

And thus Δf/Δx = 2x + Δx. The derivative with respect to x, denoted by ∂f/∂x, is therefore 2x.
So, we are ready to give the formal definition of the partial derivatives of functions of two variables:

\frac{\partial f}{\partial x} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x, y) - f(x, y)}{\Delta x}
\qquad
\frac{\partial f}{\partial y} = \lim_{\Delta y \to 0} \frac{f(x, y + \Delta y) - f(x, y)}{\Delta y}     (7.2.1)

In words, the partial derivative w.r.t. x is the ordinary derivative while holding the other variables (y) constant. Sometimes, people write f_x for ∂f/∂x. For y = f(x) the derivative f'(x_0) has the geometrical meaning of being the slope of the tangent to y = f(x) at the point (x_0, f(x_0)). Do we have something similar for z = f(x, y)? Of course, yes. See Fig. 7.5.
And of course, nothing can stop us from moving to second derivatives. From ∂f/∂x we have ∂²f/∂x²–the derivative w.r.t. x of ∂f/∂x–and ∂²f/∂x∂y–the derivative w.r.t. y of ∂f/∂x. And from ∂f/∂y we

Figure 7.5: Graph of z(x, y) = √(1 - x² - y²), which is the first quarter of the unit sphere. Now, let's consider P(1/2, 1/2, √2/2) on this sphere. We consider two curves C₁: z(x, 1/2) and C₂: z(1/2, y); the former is the intersection of the surface with the plane y = 1/2 and the latter is the intersection of the surface with the plane x = 1/2. Then C₁ is given by g(x) = z(x, 1/2), and thus z_x(1/2, 1/2) is the slope of the tangent to C₁ at P. Similarly z_y(1/2, 1/2) is the slope of the tangent to C₂ at P.

have ∂²f/∂y² and ∂²f/∂y∂x. To summarize, we write

\frac{\partial f}{\partial x} \;\rightarrow\; \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) =: \frac{\partial^2 f}{\partial x^2}\ (\text{or } f_{xx}), \qquad \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) =: \frac{\partial^2 f}{\partial x\partial y}\ (\text{or } f_{xy})
\frac{\partial f}{\partial y} \;\rightarrow\; \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) =: \frac{\partial^2 f}{\partial y\partial x}\ (\text{or } f_{yx}), \qquad \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) =: \frac{\partial^2 f}{\partial y^2}\ (\text{or } f_{yy})     (7.2.2)
where fxy and fyx are cross derivatives or mixed derivatives. The origin of partial derivatives
was partial differential equation such as the wave equation (Section 9.5.1). Briefly, eighteenth
century mathematicians and physicists such as Euler, d’Alembert and Daniel Bernoulli were
investigating the vibration of strings (to understand music), and there was a need to consider
partial derivatives.

Example 1. Let’s consider the function f .x; y/ D x 2 y 2 C xy C y, its first and second (partial)
derivatives are
fx D 2xy 2 C y fxx D 2y 2 fxy D 4xy C 1
2 2
fy D 2x y C x C 1 fyy D 2x fyx D 4xy C 1
The calculations were nothing special, but one thing special is @2 f =@y@x D @2 f =@y@x . Is it luck?
Let’s see another example.
Example 2. Let's consider the function f(x, y) = e^{xy²}; its first and second derivatives are

f_x = y²e^{xy²}       f_xx = y⁴e^{xy²}                        f_xy = 2ye^{xy²} + 2xy³e^{xy²}
f_y = 2xye^{xy²}      f_yy = 2xe^{xy²} + 4x²y²e^{xy²}         f_yx = 2ye^{xy²} + 2xy³e^{xy²}


Again, we get @2 f =@y@x D @2 f =@y@x . Actually, there is a theorem called Schwarz’s Theorem or
Clairaut’s TheoremŽŽ which states that mixed derivatives are equal if they are continuous.

7.3 Tangent planes, linear approximation and total differen-


tial
When we considered functions of one variable, we used the first derivative f'(x_0) to get the equation of the tangent line to f(x) at (x_0, f(x_0)). The equation of the tangent line is y = f(x_0) + f'(x_0)(x - x_0). And the tangent line led to linear approximation: near x_0 we can use the tangent line instead of the curve. Now, we're doing the same thing for f(x, y). But, instead of tangent lines we have tangent planes. The equation of a plane passing through the point (x_0, y_0, z_0) is given by (see Section 11.1.3 to know why)

a(x - x_0) + b(y - y_0) + c(z - z_0) = 0, \quad\text{or}\quad z = z_0 + A(x - x_0) + B(y - y_0)     (7.3.1)

Our task is now to determine the coefficients A and B in terms of f_x and f_y (we believe in the extension of elementary calculus to multi-dimensions). To determine A, we consider the plane y = y_0. The intersection of this plane and the surface z = f(x, y) is a curve in the x-z plane, see Fig. 7.5 for one example. The tangent to this curve at (x_0, y_0) is z = z_0 + f_x(x_0, y_0)(x - x_0), and thus A = f_x(x_0, y_0). Similarly, considering the plane x = x_0, we get B = f_y(x_0, y_0). The tangent plane is now written as

z = z_0 + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0)     (7.3.2)

Linear approximation. Around the point (x_0, y_0), we can approximate the (complicated) function f(x, y) by a simpler function–the equation of the tangent plane:

f(x, y) ≈ f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0)     (7.3.3)

which is called a linear approximation of f(x, y). Compared to the linear approximation of f(x): f(x_0) + f_x(x_0)(x - x_0), we can see the analogy. And we can also guess this is coming from a Taylor's series for f(x, y) where higher order terms are omitted.
Seeing the pattern from functions of a single variable to functions of two variables, we can now generalize the linear approximation to functions of n variables (n ∈ N and n ≥ 2):

f(\boldsymbol{x}_0) + (\boldsymbol{x} - \boldsymbol{x}_0)^⊤ \nabla f(\boldsymbol{x}_0)     (7.3.4)

We will discuss the notation ∇f shortly. Note that vector notation is being used: x = (x_1, x_2, ..., x_n) is a point in an n-dimensional space; refer to Section 11.1 in Chapter 11
‡‡ Alexis Claude Clairaut (13 May 1713 – 17 May 1765) was a French mathematician, astronomer, and geophysicist. He was a prominent Newtonian whose work helped to establish the validity of the principles and results that Sir Isaac Newton had outlined in the Principia of 1687.

Phu Nguyen, Monash University © Draft version


Chapter 7. Multivariable calculus 642

for a discussion on vectors.

Total differential. On the curve y = f(x), a finite change in x is Δx, and if we climb on the curve we move by an amount Δy. But if we move an infinitesimal amount dx along x and we follow the tangent to the curve, then we move by an amount dy = f′(x)dx. Now we do the same thing, but we're climbing on a surface. Using Eq. (7.3.2), we write

dz = (∂f/∂x)dx + (∂f/∂y)dy     (7.3.5)

and we call dz a total differential.
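As a quick numerical sanity check (my own illustration, not from the text), the sketch below compares f(x, y) = x²y² + xy + y near (1, 1) with the tangent-plane value from Eq. (7.3.3); the step sizes are arbitrary choices.

```python
# Linear (tangent-plane) approximation of f(x, y) = x^2*y^2 + x*y + y near (1, 1)
def f(x, y):
    return x**2 * y**2 + x*y + y

def fx(x, y):                  # partial derivative f_x = 2xy^2 + y
    return 2*x*y**2 + y

def fy(x, y):                  # partial derivative f_y = 2x^2*y + x + 1
    return 2*x**2*y + x + 1

x0, y0 = 1.0, 1.0
for dx, dy in [(0.1, -0.05), (0.01, 0.02)]:
    exact  = f(x0 + dx, y0 + dy)
    approx = f(x0, y0) + fx(x0, y0)*dx + fy(x0, y0)*dy    # Eq. (7.3.3)
    print(dx, dy, exact, approx, abs(exact - approx))     # the error shrinks roughly quadratically
```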

7.4 Newton’s method for solving two equations


We have used the linear approximation to solve f(x) = 0. The same idea works for a system of two equations of the following form

f(x, y) = 0
g(x, y) = 0     (7.4.1)

Their linear approximations lead to (where (x₀, y₀) is the starting point):

f(x₀, y₀) + f_x(x₀, y₀)Δx + f_y(x₀, y₀)Δy = 0
g(x₀, y₀) + g_x(x₀, y₀)Δx + g_y(x₀, y₀)Δy = 0     (7.4.2)

which is a system of linear equations for the two unknowns Δx and Δy. Formally, we can express the solution as

[Δx; Δy] = −[f_x(x₀, y₀), f_y(x₀, y₀); g_x(x₀, y₀), g_y(x₀, y₀)]⁻¹ [f(x₀, y₀); g(x₀, y₀)]     (7.4.3)

With that we update the solution as x₀ + Δx and y₀ + Δy, and the iterative process is repeated until convergence. This Newton method has been applied to solve practical problems that involve millions of unknowns. In the above, A⁻¹ means the inverse of the matrix A; we refer to Chapter 11 for details.
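Here is a minimal sketch of the iteration in Eq. (7.4.3) in Python; the particular system f = x² + y² − 4, g = xy − 1, the starting point and the tolerance are my own choices, purely for illustration.

```python
import numpy as np

def F(v):                                   # the system f(x,y) = 0, g(x,y) = 0
    x, y = v
    return np.array([x**2 + y**2 - 4.0, x*y - 1.0])

def J(v):                                   # Jacobian matrix of (f, g)
    x, y = v
    return np.array([[2*x, 2*y],
                     [  y,   x]])

v = np.array([2.0, 0.5])                    # starting point (x0, y0)
for _ in range(20):
    delta = np.linalg.solve(J(v), -F(v))    # solve Eq. (7.4.2) for (dx, dy)
    v = v + delta
    if np.linalg.norm(delta) < 1e-12:       # stop when the update is tiny
        break
print(v, F(v))                              # a root of the system, residual ~ 0
```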

7.5 Gradient and directional derivative


For functions of a single variable y = f(x) there is only one derivative that measures the rate of change of the function with respect to the change in x. The x axis is the only direction we can go! With functions of two variables z = f(x, y) there are infinitely many directions to go; on a plane, we can go in any direction. We can go along the west–east direction, to get f_x. Or, we can go

along the north–south direction to get f_y. These are described by partial derivatives. We can also go along a direction u = (u₁, u₂), and the change in f(x, y) is then described by the so-called directional derivative:

change in x:  ∂f/∂x = lim_{Δx→0} [f(x + Δx, y) − f(x, y)] / Δx
change in y:  ∂f/∂y = lim_{Δy→0} [f(x, y + Δy) − f(x, y)] / Δy
change in x, y:  D_u f = lim_{h→0} [f(x + hu₁, y + hu₂) − f(x, y)] / h
where u is a unit vector (as only its direction is important). OK, so we have the definition of a new kind of derivative. It generalizes the old partial derivatives: when u = i = (1, 0) we get back f_x, and when u = j = (0, 1) we get back f_y. But how can we actually compute the directional derivative at a certain point (x₀, y₀) and for a given u?
One example will reveal the secret. What is simpler than f(x, y) = x² + y²? Let's compute D_u of this simple function at (x₀, y₀):

D_u f(x₀, y₀) = lim_{h→0} [(x₀ + u₁h)² + (y₀ + u₂h)² − x₀² − y₀²] / h
             = lim_{h→0} [2x₀u₁h + 2y₀u₂h + (u₁h)² + (u₂h)²] / h
             = 2x₀u₁ + 2y₀u₂

What is this result? It is f_x(x₀, y₀)u₁ + f_y(x₀, y₀)u₂. This result makes sense as it reduces to the old partial derivatives. Indeed, when u is a unit vector along an axis, e.g. u = (1, 0), we get the familiar result f_x(x₀, y₀), and with u = (0, 1) we get the familiar result f_y(x₀, y₀). So, we guess it is correct in the general case. But we need a proof.

Proof. [Proof of D_u f(x₀, y₀) = f_x(x₀, y₀)u₁ + f_y(x₀, y₀)u₂.] Of course, we start off with the definition of the directional derivative evaluated at a particular point (x₀, y₀):

D_u f(x₀, y₀) = lim_{h→0} [f(x₀ + hu₁, y₀ + hu₂) − f(x₀, y₀)] / h     (7.5.1)

Then we're stuck, as there is no concrete expression for f(x, y) for us to manipulate. Here we need a change of view. Note that in the above equation x₀, y₀ and u₁, u₂ are all fixed numbers; only h is a variable. Thus, we can define a new function of a single variable, g(z), as

g(z) := f(x(z), y(z)),   x(z) = x₀ + u₁z,   y(z) = y₀ + u₂z

What are we going to do with this new function? We differentiate it, using the chain rule (Section 7.6):

g′(z) = f_x (dx/dz) + f_y (dy/dz) = f_x u₁ + f_y u₂

We're on a good track, as we have obtained f_x u₁ + f_y u₂–the suspect we're looking for. From this, we have

g′(0) = f_x(x₀, y₀)u₁ + f_y(x₀, y₀)u₂

Now, we just need to prove that g′(0) is nothing but the RHS of Eq. (7.5.1). That is,

g′(0) =? lim_{h→0} [f(x₀ + hu₁, y₀ + hu₂) − f(x₀, y₀)] / h

Indeed, we can compute g′(0) using the definition of the derivative and replacing g with f (we need it to appear now):

g′(0) = lim_{h→0} [g(h) − g(0)] / h = lim_{h→0} [f(x₀ + hu₁, y₀ + hu₂) − f(x₀, y₀)] / h

The French mathematician, theoretical physicist, engineer, and philosopher of science Henri
Poincaré (1854 – 1912) once said ‘Mathematics is the art of giving the same name to different
things’. Herein we see the same expression of Du f but as the normal derivative of g.z/. That’s
the art. This can also be seen in the following joke

A team of engineers were required to measure the height of a flag pole. They only
had a measuring tape, and were getting quite frustrated trying to keep the tape
along the pole. It kept falling down, etc. A mathematician comes along, finds out
their problem, and proceeds to remove the pole from the ground and measure it
easily. When he leaves, one engineer says to the other: "Just like a mathematician!
We need to know the height, and he gives us the length!"

We now have a rule to compute the directional derivative of any function. But there is one more thing in its formula: (∂f/∂x)u₁ + (∂f/∂y)u₂ is actually the dot productŽŽ between the vector u and another vector, which we do not know yet, with components (f_x, f_y).
We now give the rule for the directional derivative of a function f(x, y, z) and define the gradient vector, denoted by ∇f (read nabla f or del f):

D_u f = ∇f · u,   ∇f = (∂f/∂x)i + (∂f/∂y)j + (∂f/∂z)k     (7.5.2)

In words, the gradient of a function f(x, y, z) at any point is a 3D vector with components (f_x, f_y, f_z). The gradient vector of a scalar function is significant as it gives us the direction of steepest ascent. That is because the directional derivative indicates the change of f in the direction given by u. Among all directions, due to the property of the dot product, this change is maximum when u is parallel to ∇f (note that ||u|| = 1):

D_u f = ∇f · u = ||∇f|| ||u|| cos θ  ⟹  max D_u f = ||∇f||  when θ = 0


ŽŽ
Refer to Section 11.1.2 if you need a refresh on the concept of the dot product of two vectors.


where θ is the angle between u and ∇f; the notation ||∇f|| means the Euclidean length of ∇f, which is ||∇f|| = √(f_x² + f_y² + f_z²).
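The formula D_u f = ∇f · u is easy to test numerically. Below is a small sketch of mine (the function, the point and the direction are arbitrary illustrative choices) comparing it with the limit definition for f(x, y) = x² + y² at (1, 2) along u = (0.6, 0.8).

```python
import numpy as np

f    = lambda x, y: x**2 + y**2
grad = lambda x, y: np.array([2*x, 2*y])          # the gradient (f_x, f_y)

x0, y0 = 1.0, 2.0
u = np.array([0.6, 0.8])                          # unit vector, ||u|| = 1

via_gradient = grad(x0, y0) @ u                   # nabla f dot u
h = 1e-6
via_limit = (f(x0 + h*u[0], y0 + h*u[1]) - f(x0, y0)) / h   # limit definition of D_u f

print(via_gradient, via_limit)                    # both are approximately 4.4
```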
Let's see what the gradient vector looks like geometrically. For the function f(x, y) = x² + y², we plot its gradient fieldŽ 2xi + 2yj superimposed with the level curves of f(x, y) in Fig. 7.6a. We can see that the gradient vectors are perpendicular to the level curves. This is because going along a level curve does not change f: D_u f = ∇f · u = 0 when u is perpendicular to ∇f.
So far we have considered functions of two variables only. How about functions of three variables, w = f(x, y, z)? We believe that at any point P = (x₀, y₀, z₀) on the level surface f(x, y, z) = c the gradient ∇f is perpendicular to the surface. By this we mean it is perpendicular to the tangent of any curve that lies on the surface and goes through P (see Fig. 7.6b).

(a) level curve (b) level surface

Figure 7.6: The gradient vector ∇f (the red arrows in (a)) is perpendicular to the level curves of f(x, y) (a), and the gradient vector is also perpendicular to level surfaces (b).

Proof. Let's consider a level surface S given by f(x, y, z) = c, and a point P(x₀, y₀, z₀) on this surface. Then we consider a curve C lying on the surface and passing through P (see Fig. 7.6b). We describe C parametrically by C: (x(t), y(t), z(t)). Now, for any point on C we have

f(x(t), y(t), z(t)) = c

(because points on C are also on S). Using the chain rule, we differentiate the above equation:

(∂f/∂x)(dx/dt) + (∂f/∂y)(dy/dt) + (∂f/∂z)(dz/dt) = 0

So the dot product of ∇f and the vector (dx/dt, dy/dt, dz/dt) is zero; hence ∇f is perpendicular to r′ = (dx/dt, dy/dt, dz/dt), which is nothing but the tangent vector to the curve C. ∎
Ž What is a gradient field? At a point (x₀, y₀) the gradient of f(x, y) is a vector. Hence, for every point of the 2D plane we have a gradient vector; together they make a vector field, called the gradient field. We have more to say about fields in Section 7.11.


Why does (f_x, f_y, f_z) make a vector?

It is not true that every three numbers make a vector. For example, we cannot make a vector from (f_xx, f_y, f_z). How do we prove that (f_x, f_y, f_z) is indeed a vector? We use the fact that the dot product of two vectors is a scalar. To this end, we consider two nearby points P₁(x, y, z) and P₂(x + Δx, y + Δy, z + Δz). Assume that the temperature at P₁ is T₁ and the temperature at P₂ is T₂. Obviously T₁ and T₂ are scalars: they are independent of the coordinate system we use. The difference in temperature ΔT is also a scalar, and it is given by

ΔT = (∂T/∂x)Δx + (∂T/∂y)Δy + (∂T/∂z)Δz

Since ΔT is a scalar and (Δx, Δy, Δz) is a vector (joining P₁ to P₂), we can deduce that (T_x, T_y, T_z) is a vector.

7.6 Chain rules


Chain rules are for function composition. With multiple variables, there are many possibilities. Without loss of generality, we consider the following three cases:

1. f(z) with z = g(x, y). We need f_x and f_y;

2. f(x, y) with x = x(t), y = y(t). We need df/dt, as f is just a function of t;

3. f(x, y) with x = x(u, v), y = y(u, v). We need f_u and f_v.

Case 1: f(z) with z = g(x, y). For example, with f(z) = e^z and z = x² + y² we get f(x, y) = e^{x²+y²}. Thus f_x = 2xe^{x²+y²} and f_y = 2ye^{x²+y²}. The rule is:

∂f/∂x = (∂f/∂z)(∂z/∂x),   ∂f/∂y = (∂f/∂z)(∂z/∂y)     (7.6.1)

Case 2: f(x, y) with x = x(t), y = y(t). This is simple: df/dt = f_x dx/dt + f_y dy/dt. When t changes by Δt, both x and y change, by Δx ≈ (dx/dt)Δt and Δy ≈ (dy/dt)Δt, respectively. These changes lead to a change in f:

Δf ≈ f_x Δx + f_y Δy = (∂f/∂x)(dx/dt)Δt + (∂f/∂y)(dy/dt)Δt

Dividing by Δt and letting it go to zero, we get the formula: df/dt = f_x dx/dt + f_y dy/dt.

Case 3: f(x, y) with x = x(u, v), y = y(u, v). By holding v constant and using the chain rule of case 2, we can write ∂f/∂u = (∂f/∂x)(∂x/∂u) + (∂f/∂y)(∂y/∂u). Doing the same thing for ∂f/∂v, and putting these two together, we have:

∂f/∂u = (∂f/∂x)(∂x/∂u) + (∂f/∂y)(∂y/∂u)
∂f/∂v = (∂f/∂x)(∂x/∂v) + (∂f/∂y)(∂y/∂v)     (7.6.2)

This rule can be re-written in matrix form as:

[∂f/∂u; ∂f/∂v] = [∂x/∂u, ∂y/∂u; ∂x/∂v, ∂y/∂v] [∂f/∂x; ∂f/∂y]     (7.6.3)
We can generalize this to the case of a function of n variables, f(x₁, x₂, ..., x_n), where the variables depend on m other variables, x_i = x_i(u₁, u₂, ..., u_m) for i = 1, 2, ..., n; then we have

∂f/∂u_j = (∂f/∂x₁)(∂x₁/∂u_j) + (∂f/∂x₂)(∂x₂/∂u_j) + ⋯ + (∂f/∂x_n)(∂x_n/∂u_j)   (1 ≤ j ≤ m)
        = Σ_{i=1}^{n} (∂f/∂x_i)(∂x_i/∂u_j) = (∂f/∂x_i)(∂x_i/∂u_j)   (Einstein's summation rule on the dummy index i)
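If you want to verify case 3 symbolically, a short sympy sketch (my own, with the arbitrary choices f = x²y, x = u cos v, y = u sin v) does the job: the chain-rule route and the substitute-then-differentiate route give the same ∂f/∂u.

```python
import sympy as sp

u, v, X, Y = sp.symbols('u v X Y')
F = X**2 * Y                                  # f as a function of x, y (written with X, Y)
x = u * sp.cos(v)                             # x = x(u, v)
y = u * sp.sin(v)                             # y = y(u, v)

# chain rule, Eq. (7.6.2): f_u = f_x * x_u + f_y * y_u
chain = (sp.diff(F, X)*sp.diff(x, u) + sp.diff(F, Y)*sp.diff(y, u)).subs({X: x, Y: y})

# direct route: substitute first, then differentiate with respect to u
direct = sp.diff(F.subs({X: x, Y: y}), u)

print(sp.simplify(chain - direct))            # 0: both routes agree
```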

7.7 Minima and maxima of functions of two variables


This section discusses the problem of finding the minima/maxima of the function z = f(x, y). We shall see that this problem is similar to the case of y = f(x) with only one change: partial derivatives are used. Stationary points are discussed in Section 7.7.1. The Taylor series of multivariate functions is given in Section 7.7.2, and the multi-index notation is presented in Section 7.7.3. This notation provides us with a convenient way to write partial derivatives, and thus we can write the Taylor series of any order for a function of n variables. Quadratic forms are then discussed in Section 7.7.4. For a spring of stiffness k the potential energy is (1/2)kx²; this is a quadratic form, albeit the simplest one. For unknown reasons, quadratic forms appear again and again in physics and mathematics. And finally, constrained minimization problems (e.g. finding the minimum of x² + y² subject to x + y = 5) using the Lagrange multiplier method are the topic of Section 7.7.5.

7.7.1 Stationary points and partial derivatives


Consider a general function of two variables z = f(x, y), the graph of which is shown in Fig. 7.7. Similar to functions of one variable, at stationary points (which can be a local minimum, local maximum, absolute minimum, etc.) the tangent planes are horizontal. So, at a stationary point (x₀, y₀) the two first partial derivatives are zero (check Eq. (7.3.2) for the equation of a plane if this is not clear):

f_x(x₀, y₀) = f_y(x₀, y₀) = 0     (7.7.1)


Figure 7.7: Graph of a function of two variables z D f .x; y/ with a colorbar representing the height
z. Using a colorbar is common in visualizing functions, especially functions of three variables. We can
quickly spot the highest/lowest points based on the color.

Saddle point. If we consider the function z = y² − x², the stationary point is (0, 0) using Eq. (7.7.1). But this point cannot be a minimum or a maximum point, see Fig. 7.8. We can see that f(0, 0) = 0 is a maximum along the x-direction but a minimum along the y-direction. Near the origin the graph has the shape of a saddle, and so (0, 0) is called a saddle point of f.

Figure 7.8: Graph of z = y² − x², which has a saddle point at (0, 0).

Minimum or maximum or saddle point. For y = f(x), we need to use the second derivative at the stationary point x₀, f″(x₀), to decide whether x₀ is a minimum, maximum or inflection point. How did the second derivative help? It decides whether the curve y = f(x) is below the tangent at x₀ (i.e., if y″(x₀) < 0 then x₀ is a maximum point, as we're going downhill) or above the tangent (i.e., if y″(x₀) > 0 then x₀ is a minimum point). We believe this reasoning also applies to f(x, y). The difficulty is that we now have three second derivatives f_xx, f_yy, f_xy, not one!


The idea is to replace the general function f(x, y) by a quadratic function of the form ax² + bxy + cy², for which finding the extremum is straightforward (using only algebra). The means to do this is the Taylor series expansion of f(x, y), see Section 7.7.2, around the stationary point (x₀, y₀) up to second order (as the bending of a surface depends on the second order terms only):

f(x, y) ≈ f(x₀, y₀) + f_x(x₀, y₀)(x − x₀) + f_y(x₀, y₀)(y − y₀)
        + (f_xx(x₀, y₀)/2)(x − x₀)² + (f_yy(x₀, y₀)/2)(y − y₀)² + f_xy(x₀, y₀)(x − x₀)(y − y₀)

At a stationary point (x₀, y₀) the first two partial derivatives are zero, so the above simplifies to

f(x, y) ≈ (f_xx(x₀, y₀)/2)(x − x₀)² + (f_yy(x₀, y₀)/2)(y − y₀)² + f_xy(x₀, y₀)(x − x₀)(y − y₀)

where the constant term f(x₀, y₀) was skipped, as it does not affect the character of the stationary point (it does change the extremum value of the function, but we're not interested in that here).
Now, we can write f(x, y) in the following quadratic form (we have multiplied the above equation by two, and assumed (x₀, y₀) = (0, 0); if that is not the case we can always do a translation to make it so):

f(x, y) = ax² + 2bxy + cy²,   a = f_xx(0, 0),   b = f_xy(0, 0),   c = f_yy(0, 0)     (7.7.2)
which can be re-written as

f(x, y) = a[(x + by/a)² + ((ac − b²)/a²) y²]     (7.7.3)

from which we can conclude:

if a > 0 and ac > b²:  f(x, y) > 0 for all x, y  →  minimum at (0, 0)
if a < 0 and ac > b²:  f(x, y) < 0 for all x, y  →  maximum at (0, 0)
if ac < b²:  the two parts have opposite signs  →  saddle point at (0, 0)

This is called the second derivatives test. Fig. 7.9 confirms this test. It is helpful to examine the contour plots of the surfaces in Fig. 7.9 to understand geometrically when a function has a min/max/saddle point. Fig. 7.10 tells us that around a max/min point the level curves are ovals, because going in any direction will decrease/increase the function. On the other hand, around a saddle point the level curves are hyperbolas (xy = c).
Often, as a means to remember the condition on the sign of D = ac − b², D is written as the determinant of the following 2 × 2 matrix containing all the second partial derivatives of f:

D = det [f_xx, f_xy; f_xy, f_yy] = f_xx f_yy − f_xy² = ac − b²     (7.7.4)

This matrix is special, as it stores all the second derivatives of f(x, y). It must have a special name. It is called the Hessian matrix, named after the German mathematician Ludwig Otto Hesse (22 April 1811 – 4 August 1874).
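A small sketch of the test in code: the helper below (my own wrapper, not from the text) evaluates the Hessian entries of a given f at a stationary point and reports min/max/saddle according to the sign of D = f_xx f_yy − f_xy².

```python
import sympy as sp

x, y = sp.symbols('x y')

def classify(f, point):
    """Second derivatives test at a stationary point given as a dict {x: x0, y: y0}."""
    a = sp.diff(f, x, 2).subs(point)          # f_xx
    c = sp.diff(f, y, 2).subs(point)          # f_yy
    b = sp.diff(f, x, y).subs(point)          # f_xy
    D = a*c - b**2
    if D > 0:
        return 'minimum' if a > 0 else 'maximum'
    if D < 0:
        return 'saddle point'
    return 'test is inconclusive'

print(classify(x**2 + x*y + y**2, {x: 0, y: 0}))    # minimum
print(classify(y**2 - x**2,       {x: 0, y: 0}))    # saddle point
```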


(a) z = x² + 10xy + y²   (b) z = x² + xy + y²   (c) z = x² + xy − y²

Figure 7.9: Examples to verify the second derivatives test.

(a) z = x² + 10xy + y²   (b) z = x² + xy + y²

Figure 7.10: Contour plots of z = x² + 10xy + y² (saddle point) and z = x² + xy + y² (minimum point).

7.7.2 Taylor’s series of scalar valued multivariate functions

We have Taylor series of functions of a single variable y = f(x). Of course, we also have Taylor series for multivariate functions, z = f(x, y) or h = f(x, y, z), etc. First, we develop a second order Taylor polynomial that approximates a function of two variables z = f(x, y).
The second order Taylor polynomial has the general form T(x, y) = a + bx + cy + dxy + ex² + fy². We find the coefficients a, b, c, ... by matching the function at (0, 0) and all


derivatives up to second order at (0, 0). The same old idea we have met in univariate calculus:

f(0, 0) = T(0, 0)  ⟹  a = f(0, 0)
f_x(0, 0) = T_x(0, 0)  ⟹  b = f_x(0, 0)
f_y(0, 0) = T_y(0, 0)  ⟹  c = f_y(0, 0)
f_xy(0, 0) = T_xy(0, 0)  ⟹  d = f_xy(0, 0)
f_xx(0, 0) = T_xx(0, 0)  ⟹  e = (1/2) f_xx(0, 0)
f_yy(0, 0) = T_yy(0, 0)  ⟹  f = (1/2) f_yy(0, 0)

Thus, the second order Taylor series for z = f(x, y) is written as

f(x) ≈ f(0) + f_x(0)x + f_y(0)y + (1/2)[f_xx(0)x² + 2f_xy(0)xy + f_yy(0)y²]     (7.7.5)
Now, we rewrite this equation using vector–matrix notation; the advantage is that the same equation holds for functions of more than 2 variables:

f(x) ≈ f(0) + [f_x(0)  f_y(0)] [x; y] + (1/2) [x  y] [f_xx(0), f_xy(0); f_xy(0), f_yy(0)] [x; y]
     ≈ f(0) + ∇fᵀ(0) x + (1/2) xᵀ H(0) x   (x = [x; y])     (7.7.6)
     ≈ f(0) + xᵀ ∇f(0) + (1/2) xᵀ H(0) x

where the linear part f_x(0)x + f_y(0)y is re-written as the dot productŽŽ of the gradient vector and x. The quadratic term is re-written as xᵀHx, as any quadratic form (to be discussed in Section 7.7.4) can be written in this form. As the dot product of two vectors is symmetric, we can write the linear term in another way, as in the final expression.
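Sympy can rebuild Eq. (7.7.6) term by term. The sketch below is my own illustration (the function e^x sin y is an arbitrary choice): it forms f(0) + ∇f(0)ᵀx + ½xᵀH(0)x and compares it with the function near the origin.

```python
import sympy as sp

x, y = sp.symbols('x y')
f = sp.exp(x) * sp.sin(y)

grad = sp.Matrix([sp.diff(f, x), sp.diff(f, y)])      # gradient vector
H    = sp.hessian(f, (x, y))                          # matrix of second partials
X    = sp.Matrix([x, y])

at0 = {x: 0, y: 0}
# second-order Taylor polynomial about (0, 0), Eq. (7.7.6)
T2 = f.subs(at0) + (grad.subs(at0).T * X)[0] + sp.Rational(1, 2)*(X.T * H.subs(at0) * X)[0]

print(sp.expand(T2))                                  # x*y + y
print(f.subs({x: 0.1, y: 0.2}).evalf(), T2.subs({x: 0.1, y: 0.2}))   # ~0.2196 vs 0.22
```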

7.7.3 Multi-index notation


In the previous section I intentionally wrote the Taylor series for z = f(x, y) up to second order terms only. This was simply because it would be tedious to include higher order terms. For functions of a single variable, we are able to write the full Taylor series:

f(x) = Σ_{n=0}^{∞} [f⁽ⁿ⁾(x₀)/n!] (x − x₀)ⁿ

The question now is: can we have the same formula as above for y = f(x₁, x₂, ..., x_n)? The answer is yes, and to that end mathematicians have developed the so-called multi-index, which generalizes the concept of an integer index to an ordered tuple of indices.
ŽŽ
Refer to Section 11.1.2 if you need a refresh on the concept of the dot product of two vectors.


A multi-index α = (α₁, α₂, ..., α_n) is an n-tuple of non-negative integers. For example, α = (1, 2, 3), or α = (2, 3, 6). The norm of a multi-index α is defined to be

|α| = α₁ + α₂ + ⋯ + α_n

For a vector x = (x₁, x₂, ..., x_n) ∈ ℝⁿ, define

x^α = ∏_{i=1}^{n} x_i^{α_i}

Also, define the factorial of a multi-index α by

α! = ∏_{i=1}^{n} α_i!

Finally, to write partial derivatives we define the differential operator

∂^α = D^α = (∂/∂x₁)^{α₁} (∂/∂x₂)^{α₂} ⋯ (∂/∂x_n)^{α_n},   (∂/∂x_i)^{α_i} := ∂^{α_i}/∂x_i^{α_i}     (7.7.7)

With this multi-index notation, the Taylor series for a function y = f(x₁, ..., x_n) is given by

f(x) = Σ_{|α|=0}^{∞} [D^α f(x₀)/α!] (x − x₀)^α

To understand this, let's consider y = f(x₁, x₂). The first term in the sum is the one with |α| = 0, which is α = (0, 0). Then α! = 0!0! = 1, and D^α f = f according to Eq. (7.7.7). The second term is the one with |α| = 1, which can be α = (1, 0) or α = (0, 1). For the former we have D^α f = ∂f/∂x₁, and for the latter D^α f = ∂f/∂x₂. The third term, |α| = 2: α = (2, 0), α = (1, 1) and α = (0, 2). We then have, using Eq. (7.7.7):

• If α = (2, 0), then D^α f = f_{x₁x₁};
• If α = (0, 2), then D^α f = f_{x₂x₂};
• If α = (1, 1), then D^α f = f_{x₁x₂}.

7.7.4 Quadratic forms


Quadratic forms are homogeneous polynomials of second degree. Let's denote by x₁, x₂, x₃ the variables; then the following are quadratic forms in x₁, x₂, x₃ (a₁, a₂, ... are real constants):

Q(x₁) = a₁x₁²
Q(x₁, x₂) = a₁x₁² + a₂x₁x₂ + a₃x₂²     (7.7.8)
Q(x₁, x₂, x₃) = a₁x₁² + a₂x₁x₂ + a₃x₁x₃ + a₄x₂x₃ + a₅x₂² + a₆x₃²

Note that we have not used the conventional x, y, z; instead we have used x₁, x₂, x₃. This is because if we generalize our quadratic forms to the case of, let's say, 100 variables we would run out of symbols using x, y, z, ...
Now, we re-write the quadratic form Q(x₁, x₂) = a₁x₁² + a₂x₁x₂ + a₃x₂² as follows

Q(x₁, x₂) = a₁₁x₁² + a₁₂x₁x₂ + a₂₁x₂x₁ + a₂₂x₂² = Σ_{i=1}^{2} Σ_{j=1}^{2} a_{ij} x_i x_j

so that it can be written using matrix–vector notation as:

Q(x₁, x₂) = [x₁  x₂] [a₁₁, a₁₂; a₂₁, a₂₂] [x₁; x₂] = xᵀAx   (x := [x₁; x₂])

So we have just demonstrated that any quadratic form can be expressed in the form xᵀAx. Let's do that for the particular quadratic form Q(x₁, x₂) = x₁² + 5x₁x₂ + 3x₂²:

Q(x₁, x₂) = [x₁  x₂] [1, 1; 4, 3] [x₁; x₂] = [x₁  x₂] [1, 5/2; 5/2, 3] [x₁; x₂]

It is certain that we prefer the second matrix–which is symmetric, i.e. a₁₂ = a₂₁ = 5/2–over the non-symmetric one (the first). So, any quadratic form can be expressed in the form xᵀAx where A is a symmetric matrix. We need a proof, because we used the strong words any quadratic form, while we just had one example.

Proof. Suppose xᵀBx is a quadratic form where B is not symmetric. Since it is a scalar, we get the same thing when we transpose it:

xᵀBx = xᵀBᵀx

Thus, we can write

xᵀBx = (1/2)(xᵀBx + xᵀBᵀx) = (1/2) xᵀ(B + Bᵀ)x

The matrix (1/2)(B + Bᵀ) is our symmetric matrix A. ∎

Why quadratic forms? Because, for unknown reasons, they show up again and again in mathematics, physics, engineering and economics. The simplest example is (1/2)kx², the energy of a spring of stiffness k. There is more to say about quadratic forms in Section 11.10.6, such as the positive definiteness of a quadratic form.
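The symmetrisation step (B + Bᵀ)/2 is easy to see numerically; the sketch below is my own (the non-symmetric B is the one used above) and checks that xᵀBx = xᵀAx with A = (B + Bᵀ)/2 for a few random vectors.

```python
import numpy as np

B = np.array([[1.0, 1.0],
              [4.0, 3.0]])            # non-symmetric matrix of Q = x1^2 + 5*x1*x2 + 3*x2^2
A = (B + B.T) / 2                     # symmetric part: [[1, 5/2], [5/2, 3]]

rng = np.random.default_rng(0)
for _ in range(3):
    v = rng.standard_normal(2)
    print(v @ B @ v, v @ A @ v)       # the two quadratic forms always agree
```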

If you’re not familiar with matrices, refer to Chapter 11.


7.7.5 Constraints and Lagrange multipliers

Now we consider a constrained minimization problem: find the minima (or maxima) of a function z = f(x, y) subject to the constraint g(x, y) = k. You might be thinking: we can definitely solve it by first solving for x in terms of y using the constraint equation, then substituting that x into the function f(x, y) to get a function of one variable (y). There are, however, many issues with this approach. First, for a complex g(x, y) = k it is not possible to solve for x in terms of y or vice versa. Second, why eliminate x and not y? We are destroying the symmetry of the problem. Third, this technique is hard (or impossible) to apply to problems with many constraints. OK, we need a new way. And it is quite often true that new problems require new mathematics.
Joseph-Louis Lagrange (1736 – 1813), an Italian-born mathematician and astronomer, found that new mathematics, and herein we reproduce his method. Let's start with an example: finding the minima/maxima of z(x, y) = x² + 2y² with the constraint x² + y² = 1. Wherever the extremum points are, they must be on the unit circle centered at the origin in the xy plane. On this plane we also plot the level curves x² + 2y² = c, which are ellipses, where c ranges from zero to infinityŽ. It is then clear that the extremum points of z(x, y) are where a level curve touches the constraint curve. From Fig. 7.11, we know that they are the points (±1, 0) and (0, ±1).


(a) z = x² + 2y² and its level curves   (b) x² + 2y² = c and x² + y² = 1

Figure 7.11: Graph of z = x² + 2y² and its level curves (a), and the level curves together with the constraint (red) curve x² + y² = 1 in a plane (b). We start with the smallest ellipse x² + 2y² = c₁ in (b); c₁ cannot be the minimum value of z(x, y), as the constraint is not satisfied. So we keep climbing the mountain z(x, y) until we're at the second smallest ellipse, let's say x² + 2y² = c₂, which touches the constraint curve. Here we have two minima, marked by "min", both giving the minimum value c₂ of z(x, y). If we continue going up we will touch the constraint curve a second time: that's where z(x, y) attains its maximum value.

At the touching point of the two curves, the tangents are the same. In other words, the normal

Ž Noting that z = x² + 2y² ≥ 0 for all x, y.


vectors are parallel:

∇f(x, y) = λ∇g(x, y),   or   f_x = λg_x,   f_y = λg_y     (7.7.9)

where λ is a real number. These are two equations for the three unknowns x, y, λ. But do not forget the constraint x² + y² = 1. Three equations for three unknowns. Perfect.
Without constraints, the necessary condition for a function f(x, y) to be stationary at (x₀, y₀) is ∇f(x₀, y₀) = 0. With the constraint g(x, y) = k, we have instead Eq. (7.7.9). With a bit of algebra, we can recover the old criterion of a zero gradient. Let's introduce a new function L(x, y, λ), called the Lagrange function, as

L(x, y, λ) := f(x, y) − λ(g(x, y) − k)     (7.7.10)

And we now compute the gradient of L:

∇L = (f_x − λg_x,  f_y − λg_y,  k − g)

The condition ∇L = 0 then reproduces Eq. (7.7.9) and g(x, y) = k. So, by adding one more unknown λ to the problem and building a new function L(x, y, λ), Lagrange turned a constrained minimization problem into an unconstrained minimization problem! This method is now known as the Lagrange multiplier method, and λ is called a Lagrange multiplier. Once ∇L = 0 has been solved, we possibly get a few solutions (x̄ᵢ, ȳᵢ); the largest of the f(x̄ᵢ, ȳᵢ) is the maximum we're looking for, and the smallest of the f(x̄ᵢ, ȳᵢ) is the minimum we sought.
As an example, we consider the problem given in Fig. 7.11. Eq. (7.7.9) and the constraint give us the following system of equations to solve for x, y, λ:

2x = 2λx,   4y = 2λy,   x² + y² = 1

From the first equation we either get x = 0 (which leads to y = ±1 from the constraint) or λ = 1. From the second equation we obtain either y = 0 (which leads to x = ±1 from the constraint) or λ = 2. So, we have 4 points (x, y, λ): (0, 1, 2), (0, −1, 2), (−1, 0, 1), (1, 0, 1). These points are exactly the ones we found graphically (shown in Fig. 7.11b). Evaluating f at these four points:

f(0, 1) = 2,   f(0, −1) = 2,   f(−1, 0) = 1,   f(1, 0) = 1

So the maximum of f is 2, at (0, ±1), and the minimum of f is 1, at (±1, 0).
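The same four points drop out if we hand ∇L = 0 to a computer algebra system. The sketch below is my own illustration (sympy and the symbol names are my choices) for f = x² + 2y² with g = x² + y² − 1.

```python
import sympy as sp

x, y, lam = sp.symbols('x y lambda', real=True)
f = x**2 + 2*y**2
g = x**2 + y**2 - 1                      # constraint written as g = 0
L = f - lam*g                            # Lagrange function, Eq. (7.7.10)

eqs = [sp.diff(L, v) for v in (x, y, lam)]          # nabla L = 0
sols = sp.solve(eqs, (x, y, lam), dict=True)
for s in sols:
    print(s, 'f =', f.subs(s))           # f = 1 at (+-1, 0), f = 2 at (0, +-1)
```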

Justifying Lagrange multipliers. We need a proof in addition to the above geometrical derivation. The constraint curve g(x, y) = k can be parametrized as C(t): (x(t), y(t)). Now, assume that P(x₀, y₀) is a point on the surface z(x, y) where the function attains a (local) maximum. Of course P is on the constraint curve C; it is C(t₀). We build the function h(t) = f(x(t), y(t)). This function has a maximum at t₀, so we have

h′(t₀) = 0  ⟺  (∂f/∂x)(x₀, y₀) x′(t₀) + (∂f/∂y)(x₀, y₀) y′(t₀) = 0  ⟺  ∇f(P) · r′(P) = 0
@x dt @y dt


Hence, at P, ∇f is perpendicular to the constraint curveŽ. But ∇g is also perpendicular to the curve; thus ∇f and ∇g are parallel at P.
The geometric interpretation also reveals that the method still works for a function of three variables f(x, y, z) and a constraint g(x, y, z) = k. Instead of level curves, we have level surfaces. And only when the level surface f(x, y, z) = c touches the level surface g(x, y, z) = k do we have a critical point, at that touching point.

Meaning of the multiplier.

Two constraints. After one constraint comes, of course, two constraints, and then multiple constraints. For two constraints, we have to move to functions of three variables. Otherwise, two constraints g(x, y) = c₁ and h(x, y) = c₂ already decide the critical point. Nothing left for Lagrange to do!
We start with a concrete example. Consider the function f(x, y, z) = x² + y² + z² and the two constraints g(x, y, z) = x + y + z = 9 and h(x, y, z) = x + 2y + 3z = 20. Find the maximum/minimum of f. The two constraints are two planes and they meet in a line C. Now we consider different level surfaces of f(x, y, z) = x² + y² + z² = c; they are concentric spheres of radius √c (centered at (0, 0, 0)). When we increase c from 0 we have expanding spheres, and one of them will touch the line C at a point P. At that special point P, we have (for the reason, check Fig. 7.6b):

the gradient of f is perpendicular to C
the gradient of g is perpendicular to C
the gradient of h is perpendicular to C

Therefore, all three vectors ∇f, ∇g, ∇h lie in the same plane perpendicular to C. Thus, ∇f is a linear combination of ∇g and ∇h. In other words,

∇f = λ₁∇g + λ₂∇h     (7.7.11)

With one constraint g(x, y) = k we have ∇f = λ₁∇g. You can see the pattern here. The method still works for any level surfaces g(x, y, z) = c₁ and h(x, y, z) = c₂, not just planes. Why? Because the gradient vector of a level surface at a point on the surface is perpendicular to it at that point. Now, we build the Lagrange function L(x, y, z, λ₁, λ₂) = f(x, y, z) − λ₁(g(x, y, z) − c₁) − λ₂(h(x, y, z) − c₂). Setting the gradient of L to zero, ∇L = 0, will give us Eq. (7.7.11) and the constraints g(x, y, z) = c₁ and h(x, y, z) = c₂.
Let's get back to the specific case of f(x, y, z) = x² + y² + z² with the two constraints g(x, y, z) = x + y + z = 9 and h(x, y, z) = x + 2y + 3z = 20. Eq. (7.7.11) gives us:

2x = λ₁ + λ₂,   2y = λ₁ + 2λ₂,   2z = λ₁ + 3λ₂

Substituting these x, y, z into g(x, y, z) = x + y + z = 9 and h(x, y, z) = x + 2y + 3z = 20, we obtain two equations for λ₁, λ₂: λ₁ = 2 and λ₂ = 2. Then, we get (x, y, z) = (2, 3, 4);

Ž This result is nothing but the fact that the directional derivative along the constraint curve is zero at P.


that’s the point P we are looking for.

Proof of the AM-GM inequality. Still remember the AM-GM inequality, which states that

(x₁ + x₂ + ⋯ + x_n)/n ≥ ⁿ√(x₁x₂⋯x_n)

and which was proved by Cauchy with his ingenious backward–forward induction method? Well, with Lagrange and calculus the proof is super easy. We demonstrate the proof for n = 3.
We consider the following function, with its constraint:

f(x) = ∛(x₁x₂x₃)   s.t.   x₁ + x₂ + x₃ = c

Then we construct the Lagrange function L(x):

L(x) := ∛(x₁x₂x₃) − λ(x₁ + x₂ + x₃ − c)

And then we compute the derivatives of L with respect to x₁, x₂, x₃; ∇L = 0 gives us

L_{x₁} = (1/3)(x₁x₂x₃)^{−2/3} x₂x₃ − λ = 0
L_{x₂} = (1/3)(x₁x₂x₃)^{−2/3} x₁x₃ − λ = 0
L_{x₃} = (1/3)(x₁x₂x₃)^{−2/3} x₁x₂ − λ = 0

Solving this system of equations (easy) gives us x₁ = x₂ = x₃; then from the constraint x₁ + x₂ + x₃ = c, we get:

x₁ = x₂ = x₃ = c/3

Therefore, the maximum of f(x) is ∛((c/3)³), which is c/3, or (1/3)(x₁ + x₂ + x₃). In other words,

∛(x₁x₂x₃) ≤ (x₁ + x₂ + x₃)/3

7.7.6 Inequality constraints and Lagrange multipliers

7.8 Integration of multivariable functions


This section is about the integration of z = f(x, y) and w = f(x, y, z). We start with the double integral ∬_D f(x, y) dxdy in Section 7.8.1. Then we move to double integrals in polar coordinates (Section 7.8.2). Next is the triple integral ∭_D f(x, y, z) dxdydz (Section 7.8.3). Triple integrals using spherical and cylindrical coordinates are treated in Section 7.8.4. The most important triple integral in physics–Newton's shell theorem–is given in Section 7.8.5. It is thanks to this theorem that we can treat massive objects such as the sun as a point mass with all mass concentrated at its center. Then, change of variables and the Jacobian are discussed in Section 7.8.6.
One application of double/triple integrals is the center of mass and the moment of inertia, which are important in physics and engineering. Section 7.8.7 is devoted to these topics. And finally, barycentric coordinates–which are closely related to the center of mass–are the topic of Section 7.8.8.


7.8.1 Double integrals


In elementary calculus we know that ∫_a^b f(x)dx can be seen geometrically as the area of a 2D region bounded by x = a, x = b, y = 0 and y = f(x). Its natural extension is a double integral that measures the volume of a 3D region. To define this region, consider a region R in the xy plane, and for every point (x, y) in R compute f(x, y)–the height of the surface at this point.

Figure 7.12: Double integrals as volumes under a surface z = f(x, y).

For 1D integrals we divide the interval [a, b] into many sub-intervals and compute the area as the sum of the areas of all the rectangles (Fig. 7.12). We do the same thing here: the region R is divided into many rectangles Δx_iΔy_i. For a point (x_i, y_i) inside such a rectangle, we compute the height f(x_i, y_i) of a box (the 3D counterpart of a rectangle in 2D). Then the volume is approximated as the sum of the volumes of all these boxes, that is the sum of f(x_i, y_i)Δx_iΔy_i. When there are infinitely many such boxes, we get the true volume and define it as a doubleŽ integral:

volume = lim_{n→∞} Σ_{i=1}^{n} f(x_i, y_i)Δx_iΔy_i = ∬_R f(x, y) dxdy     (7.8.1)

To compute a double integral we proceed as shown in Fig. 7.13. First, we consider a plane perpendicular to the x axis and we fix it; this plane intersects the 3D region whose volume we're trying to determine. The area of the intersection (the crossed area in the referred figure) is A(x) = ∫ f(x, y) dy. Multiplying this area by the thickness dx we get a volume A(x)dx, and integrating this we get the sought-for volume:

∬_R f(x, y) dxdy = ∫_0^b [∫_0^a f(x, y) dy] dx = ∫_0^a [∫_0^b f(x, y) dx] dy     (7.8.2)

And of course we can do it the other way around; that is why I also wrote the second formula. Note that the process has been simplified by considering a rectangle for R. In the general case, the integration limits are functions of y and x. The next example shows how to handle this situation.
Ž
And that is how mathematicians use the notation with two integral signs.


Figure 7.13: Double integral as an integral (with x) of an integral with y.

Example 7.1
Compute the volume under f(x, y) = x − 2y over the triangular base (see Fig. 7.13b). Using Eq. (7.8.2), we can write:

∬_R (x − 2y) dxdy = ∫_0^1 [∫_0^{1−x} (x − 2y) dy] dx = ∫_0^1 [xy − y²]_0^{1−x} dx

And finally,

∬_R (x − 2y) dxdy = ∫_0^1 (−2x² + 3x − 1) dx = −1/6
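Iterated integrals like this one can be checked with sympy's `integrate`; a small sketch of mine (the library choice is not part of the text):

```python
import sympy as sp

x, y = sp.symbols('x y')

# volume under f(x, y) = x - 2y over the triangle 0 <= x <= 1, 0 <= y <= 1 - x
inner = sp.integrate(x - 2*y, (y, 0, 1 - x))      # integrate in y first
V = sp.integrate(inner, (x, 0, 1))                # then in x
print(V)                                          # -1/6
```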

7.8.2 Double integrals in polar coordinates


To demonstrate that it is sometimes difficult to compute certain double integrals using x and y, let us compute the mass of a semi-circular plate of unit radius and unit density (refer to Section 7.8.7 if you're not familiar with how the mass of a continuous object is computed):

m = ∬_R dxdy = 2∫_0^1 [∫_0^{√(1−x²)} dy] dx = 2∫_0^1 √(1 − x²) dx = 2∫_0^{π/2} cos²u du = π/2

Not hard, but still a bit of work. Using polar coordinates (which are suitable for circles) is much easier. In polar coordinates, double integrals are given by

∬_R f(x, y) dxdy = ∬_S f(r cos θ, r sin θ) r dr dθ     (7.8.3)


The key point is going from dxdy to r dr dθ, not dr dθ. A non-rigorous proof is given here. We chop the integration domain R into infinitely many tiny polar rectangles; the integral is then written as the Riemann sum of the areas of these polar rectangles multiplied by the integrand evaluated at the centers of the rectangles. So, we just need to compute the area of one polar rectangle, which is r dr dθ (see figure).
Getting back to the problem of determining the mass of the semi-circular plate, it is much easier with polar coordinates:

m = ∬ r dr dθ = ∫_0^1 r dr ∫_0^π dθ = π/2
As another example of the usefulness of polar coordinates, consider the following integral:

A = ∫_{−∞}^{∞} e^{−x²} dx = √π

which was computed using polar coordinates; see Section 5.11.3 for details.
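For the record, sympy reproduces both routes to the semi-circular mass; a small sketch of mine, purely as a check:

```python
import sympy as sp

x, r, theta = sp.symbols('x r theta')

cartesian = 2*sp.integrate(sp.sqrt(1 - x**2), (x, 0, 1))       # pi/2, the x-y route
polar     = sp.integrate(r, (r, 0, 1), (theta, 0, sp.pi))      # same mass with r dr dtheta
print(cartesian, polar)                                        # pi/2  pi/2
```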

7.8.3 Triple integrals


At this point we have seen ∫ f(x)dx with the differential length dx, and ∬ f(x, y) dxdy with the differential area dxdy. We can naturally write ∭ f(x, y, z) dxdydz, with the differential volume dxdydz. We cannot interpret this triple integral as a volume, but if f(x, y, z) = 1 we get ∭ dxdydz, which is a volume.

7.8.4 Triple integrals in cylindrical and spherical coordinates


Certain integrals are easier to deal with using not Cartesian coordinates but cylindrical or spherical coordinates. This section presents these two coordinate systems. In a cylindrical coordinate system, we specify a point by (r, θ, z), see Fig. 7.14. The differential volume dxdydz becomes r dr dθ dz. Without the z-component, cylindrical coordinates are just the polar coordinates we met in Section 7.8.2.
In a spherical coordinate system, we specify a point by (ρ, θ, φ), see Fig. 7.15; ρ is the distance from the origin (similar to r in polar coordinates; ρ is used instead of r to avoid confusion with cylindrical coordinates), θ is the same as the angle in polar coordinates, and φ is the angle between the z-axis and the line from the origin to the point. If you still remember Section 3.16.2, the angle θ is the longitude, which increases as we travel east around the Equator. The angle φ equals 0 at the North Pole and π at the South Pole.
Referring to the middle figure in Fig. 7.15, we're now going to derive the relation between Cartesian and spherical coordinates. In the right triangle OPQ we have PQ = ρ cos φ, and PQ is the z coordinate of P. We also have OQ = ρ sin φ. Then, in the Oxy plane that we're familiar with, we can compute x and y (I also include z):

x = OQ cos θ = ρ sin φ cos θ,   y = ρ sin φ sin θ,   z = ρ cos φ



Figure 7.14: Cylindrical coordinates (r, θ, z): simply a 3D version of polar coordinates. The differential volume is r dr dθ dz, which has the dimension of length cubed (r, dr, dz).

That's how to convert between Cartesian and spherical coordinates. How do we know the above is correct? Checking that x² + y² + z² = ρ² is the way.
The differential volume dxdydz becomes ρ² sin φ dρ dθ dφ, see Fig. 7.16. This differential volume has the dimension of length cubed (ρ², dρ).

Figure 7.15: Spherical coordinates (ρ, θ, φ).

Volume of a sphere of radius R. It is now quite straightforward to compute the volume of a



Figure 7.16: Spherical coordinates. The differential volume dxdydz becomes ρ² sin φ dρ dθ dφ.

sphere using spherical coordinates and a triple integral of the function f(x, y, z) = 1:

V = ∭ ρ² sin φ dρ dθ dφ = ∫_0^R ρ² dρ ∫_0^π sin φ dφ ∫_0^{2π} dθ = (R³/3)(2)(2π) = 4πR³/3
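The same computation in sympy, as a quick check (my own sketch, not part of the text):

```python
import sympy as sp

rho, theta, phi, R = sp.symbols('rho theta phi R', positive=True)

# volume of the sphere of radius R in spherical coordinates
V = sp.integrate(rho**2 * sp.sin(phi),
                 (rho, 0, R), (theta, 0, 2*sp.pi), (phi, 0, sp.pi))
print(V)          # 4*pi*R**3/3
```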
What’s next is one of the most important triple integrals in the history of physics: Newton’s
gravitational attraction.

7.8.5 Newton’s shell theorem


Newton's theory of gravitation states that every object pulls every other object: two objects of masses M and m attract each other with a force that is proportional to the product of their masses and inversely proportional to the square of the distance r between them. In mathematical symbols, his law is expressed as (Section 7.10.8):

F = GMm/r²

So, when applied to massive objects (such as the sun and planets), the theory implies that, regardless of the enormous size of the sun (or of any planet), it acts as if its mass were all concentrated at its center. And that is Newton's shell theorem. Newton proved that the gravitational pull of a sphere (of total mass M) on a point mass m located at a distance D from its center is given by GMm/D², as if the mass of the sphere were all concentrated at its center. Herein, we use a triple integral with spherical coordinates to prove this theorem. For simplicity, we prove the equivalent statement for the potential energy U: we're going to prove that U = −GMm/D, see Eq. (7.11.16).
Proof. Denote the density of the sphere by ρ̄ (i.e., ρ̄ = M/V with V = (4/3)πR³) and consider an infinitesimal volume dV located at a distance q from the point mass m, which is located at


(0, 0, D); the gravitational potential energy is given by (refer to Section 7.11.4 for a discussion of the concept of gravitational potential energy)

U_sphere = −∭ Gm ρ̄ dV/q = −Gm ρ̄ ∭ (ρ² sin φ/q) dρ dθ dφ     (7.8.4)

where in the second equality we used spherical coordinates. Using the law of cosines, or the generalized Pythagorean theorem, we can compute q in terms of D, ρ and φ:

q² = D² + ρ² − 2ρD cos φ = u     (7.8.5)

Assume for now that ρ is fixed; from Eq. (7.8.5) we have du = 2ρD sin φ dφ and q = √u. With this, Eq. (7.8.4) becomes:

U_sphere = −(Gm ρ̄/2D) ∭ (ρ/√u) du dθ dρ
         = −(Gm ρ̄/2D) ∫_0^R ρ [∫ du/√u] [∫_0^{2π} dθ] dρ     (7.8.6)
         = −(Gm ρ̄/D)(2π) ∫_0^R ρ[(D + ρ) − (D − ρ)] dρ

where 2π comes from ∫_0^{2π} dθ; for the integral ∫ du/√u, the limits are (D − ρ)² (with φ = 0) and (D + ρ)² (with φ = π), see Eq. (7.8.5). Finally, we do the integration along the ρ direction:

U_sphere = −(Gm ρ̄/D)(4π) ∫_0^R ρ² dρ = −(Gm ρ̄/D)(4π)(R³/3) = −GMm/D     (7.8.7)


7.8.6 Change of variables and the Jacobian



Assume that we need to evaluate this double integral R .3x C 6y/2 dA where R is the shaded
region bounded by the four straight lines shown in Fig. 7.17 (left). Even though it is possible to
directly calculate this integral, it is tedious. We can use a change of variables as shown in the
figure to simplify the integral. Indeed, the integration limits are now constants.
Another example of change of variables is given in Fig. 7.18. This is to demonstrate that
straight edges in the uv plane can be transformed to curves in the xy plane. What are the
transformations? The four curves in the xy plane are given by
x x
xy D 1; xy D 4; D 1; D4
y y
Thus, the region R0 is R0 W f.x; y/ W 1  x=y  4; 1  xy  4g. Therefore, if we consider the
following transformation:
x
u D xy; v D
y


[Figure: the region R in the xy plane, bounded by x + 2y = ±2 and x − 2y = ±2 (area 4), is mapped by u = x + 2y, v = x − 2y (i.e. x = (u + v)/2, y = (u − v)/4) to the square R′ with −2 ≤ u, v ≤ 2 (area 16).]

Figure 7.17: Change of variables x = f(u, v), y = g(u, v).



Figure 7.18: Straight edges in the uv plane can be transformed to curved edges in the xy plane.

we can transform R′ to the region R in the uv plane shown in Fig. 7.18. You can consult this geogebra link to play with 2D transformations.
Actually, we have seen changes of variables before: integration by substitution in Section 4.7.7 and double integrals using polar coordinates in Section 7.8.2. We again believe in patterns and search for a formula for double and triple integrals based on single integrals and the polar coordinate formula Eq. (7.8.3). So, we put them together in the equations below:

∫_{x=g(a)}^{x=g(b)} F(x) dx = ∫_a^b F(g(u)) g′(u) du,   x = g(u)
∬_R F(x, y) dxdy = ∬_{R′} F(f(u, v), g(u, v)) □ dudv
∭_R F(x, y, z) dxdydz = ∭_{R′} F(f(u, v, w), g(u, v, w), h(u, v, w)) □ dudvdw

And our task is to find the unknown factor □, which plays the role of g′(u) when we replace dx by du. For double integrals this quantity is denoted by J_uv and called the Jacobian of the transformation from the uv plane to the xy plane. What should J_uv be? From the first equation above, for ∫_a^b f(x)dx, we guess that J_uv should be a function of f_u, f_v, g_u, g_v, i.e. all the first derivatives of f and g. If you know linear algebra, precisely linear transformations (Section 11.6), you'll see that J_uv is the determinant of a matrix containing all these first

derivatives. In what follows we explain where this matrix comes from. We note in passing that for completeness we have included triple integrals, but we do not have to consider double and triple integrals separately: what works for double integrals will work for triple integrals.

Local linearity of transformations and the Jacobian matrix. Let's come back to the transformation in Fig. 7.17. That is a linear transformation from a square in the uv plane to a rhombus in the xy plane (check Section 11.6 if the term linear transformation is new to you), and the equation of the transformation is

[x; y] = [1/2, 1/2; 1/4, −1/4] [u; v]     (7.8.8)

Thus, from linear algebra, the area of the rhombus is the area of the square (which is 16) scaled by the absolute value of the determinant of the transformation matrix (which is |−1/4|): that area is then 16 × 1/4 = 4, which is correct (as computed in the standard way of plane geometry, see Fig. 7.17, left).
But most transformations are nonlinear (the one in Fig. 7.18 is one of them: lines are transformed into curves). In that case, how can we use linear transformations to find the area? The answer is: linear approximations turn a curve into a line (its tangent) and a square into a parallelogram; then the theory of linear transformations can be used.
Let's consider the following transformation:

[x; y] = [f(u, v); g(u, v)] = [u² − v²; 2uv]

Now we consider small (i.e., infinitesimal) changes in u and v, namely Δu and Δv, and see how x and y change:

[(u + Δu)² − (v + Δv)²; 2(u + Δu)(v + Δv)] − [u² − v²; 2uv] ≈ [2uΔu − 2vΔv; 2uΔv + 2vΔu] = [2u, −2v; 2v, 2u] [Δu; Δv]

As can be seen, since for infinitesimal changes (Δu)² and (Δv)² are negligible, we have obtained an approximation to the change in f and g in terms of a matrix containing the four partial derivatives: f_u = 2u, f_v = −2v, g_u = 2v, g_v = 2u. This matrix is special and it has a name: the Jacobian matrix, named after the German mathematician Carl Gustav Jacob Jacobi (1804 – 1851). Generally, we then have:

[dx; dy] = [f_u, f_v; g_u, g_v] [du; dv]     (7.8.9)

where the matrix is the Jacobian matrix. Globally the transformation is nonlinear, but locally (when we zoom in) the transformation is linear.
To find J_uv, consider a point (u₀, v₀) and a rectangle of sides du and dv with one vertex at (u₀, v₀), see Fig. 7.19. The vector (du, 0) becomes (f_u du, g_u du) according to Eq. (7.8.9)

whereas the vector (0, dv) becomes (f_v dv, g_v dv). The rectangle in the uv-plane has an area of dudv, whereas the transformed rectangle, which is a parallelogram, has an area of |f_u g_v − f_v g_u| dudv. Thus,

J_uv = |det [f_u, f_v; g_u, g_v]| = |(∂f/∂u)(∂g/∂v) − (∂g/∂u)(∂f/∂v)|     (7.8.10)

As the determinant can be positive, zero or negative, we need to use its absolute value.


Figure 7.19: Finding the Jacobian of the transformation Juv .

OK. How can we be sure that our J_uv is correct? The answer is easy: just apply it to a case that we're familiar with: polar coordinates. In polar coordinates we use r, θ, which play the role of u, v:

x = r cos θ,  y = r sin θ  ⟹  J_uv = |det [cos θ, −r sin θ; sin θ, r cos θ]| = r

Thus dxdy = r dr dθ.
We come back to the problem in Fig. 7.17. The determinant of the transformation is given by

det [1/2, 1/2; 1/4, −1/4] = −1/4

Therefore, J_uv = 1/4 and

∬_R (3x + 6y)² dxdy = ∬_{R′} 9u² |J_uv| dudv = (9/4) ∫_{−2}^{2} ∫_{−2}^{2} u² dudv = 48
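The bookkeeping of Eq. (7.8.10) can be delegated to sympy; the sketch below (mine, not from the text) recomputes the Jacobian of x = (u + v)/2, y = (u − v)/4 and then the integral above.

```python
import sympy as sp

u, v = sp.symbols('u v')
x = (u + v)/2
y = (u - v)/4

J = sp.Matrix([[sp.diff(x, u), sp.diff(x, v)],
               [sp.diff(y, u), sp.diff(y, v)]])
Juv = sp.Abs(J.det())                                   # |det| = 1/4

integrand = (3*x + 6*y)**2 * Juv                        # equals 9*u**2/4
I = sp.integrate(integrand, (u, -2, 2), (v, -2, 2))
print(Juv, I)                                           # 1/4   48
```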

For 2D integrals J_uv is related to the determinant of a 2 × 2 matrix, and thus for 3D integrals it is related to the determinant of a 3 × 3 matrix containing all nine first partial derivatives:

J_uv = |det [f_u, f_v, f_w; g_u, g_v, g_w; h_u, h_v, h_w]|     (7.8.11)


which should not be a surprise. And of course we check the correctness of this 3D J_uv by applying it to triple integrals using spherical coordinates. We don't provide the details; one just needs to know how to compute the determinant of a 3 × 3 matrix. The determinant in Eq. (7.8.11) for spherical coordinates, i.e.

x = ρ sin φ cos θ,   y = ρ sin φ sin θ,   z = ρ cos φ

should come out as ρ² sin φ, so that dxdydz = ρ² sin φ dρ dθ dφ.

7.8.7 Masses, center of mass, and moments


To introduce the concept of center of mass, let us first consider a system of two masses. We can write Newton's 2nd law (check Section 7.10 for details) for these two masses as

Newton's 2nd law for mass 1:  F₁^ext + F₁₂ = ṗ₁
Newton's 2nd law for mass 2:  F₂^ext + F₂₁ = ṗ₂     (7.8.12)

where F₁^ext is the external force applied on mass 1, F₂^ext is the external force applied on mass 2, F₁₂ is the force applied on mass 1 due to mass 2, and p₁ is the linear momentum of mass 1. By summing these two equations, we obtain the total momentum of the system (and its time rate of change):

p = p₁ + p₂  ⟹  ṗ = ṗ₁ + ṗ₂     (7.8.13)

Using Eq. (7.8.12) and Newton's third law, which states that F₁₂ = −F₂₁, these two forces cancel out, leaving only the external forces in ṗ:

ṗ = F^ext,   (F^ext := F₁^ext + F₂^ext)     (7.8.14)

We can generalize this to a system of any number of masses to get ṗ = F^ext.


If we throw a basketball in the air, we will see that it falls along a parabolic trajectory. If we throw a bunch of balls connected by strings, we also observe a parabola. But what exactly is moving along this parabola? The concept of center of mass answers this question.
Assume that the masses of the n balls are m₁, m₂, ..., m_n and that their position vectors are r₁, r₂, ..., r_n. If we define R_CM–the position vector of the center of mass of all the balls–as

R_CM = (m₁r₁ + m₂r₂ + ⋯ + m_n r_n)/(m₁ + m₂ + ⋯ + m_n) = Σ_{i=1}^{n} m_i r_i / Σ_{i=1}^{n} m_i     (7.8.15)

we can then write the system momentum as if all the mass were concentrated at this center of mass:

p = m₁ṙ₁ + m₂ṙ₂ + ⋯ + m_n ṙ_n = M Ṙ_CM,   M = Σ_i m_i     (7.8.16)

By differentiating p = M Ṙ_CM with respect to time we get F^ext = M R̈_CM. This equation is significant, as it implies that the center of mass moves exactly as if it were a single particle of


mass M (the mass of the whole system), subject to the net external force on the system. This is
why we can treat extended objects such as planets as if they were point particles.
Even though the math is simple, how we know beforehand the introduction of the center of
mass will be useful? We might not know. The idea is to reduce a complicated problem (involving
many particles for example) to the simple problem of a single particle that we’re familiar with.
In a Cartesian coordinate system, the position of the center of mass is given by

R_CM = Σ_i m_i r_i / M  ⟹  x_CM = Σ_i m_i x_i / M,   y_CM = Σ_i m_i y_i / M,   z_CM = Σ_i m_i z_i / M     (7.8.17)

We can appreciate the usefulness of vector notation: an equation using this notation is really three equations, one for each of the three directions. We note in passing that one could use the Einstein summation convention and write m_i y_i/M without the Σ symbol, see Section 11.2 for details. Mathematicians call R_CM a convex combination (in less jargon, a weighted average of the r_i).
Let's play with Eq. (7.8.17) and surely something fun will come to us. We shall consider only the x direction, because if we can understand that one, we can understand the other two. Now, assume that the object is divided into little pieces (N of them), all of which have the same mass m. Then,

x_CM = Σ_i m_i x_i / M = m Σ_i x_i / (mN) = Σ_i x_i / N

In words, x_CM is the average of all the x's, if the masses are equal. Now, suppose we have only two masses, one of mass 2m and the other of mass m. Then we have x_CM = (2x₁ + 1x₂)/3. In other words, every mass is counted a number of times proportional to its mass. From this it can be seen that x_CM is somewhere larger than the smallest x and smaller than the largest x. The same holds for y_CM and z_CM. Thus, the CM lies within the envelope of the masses (Fig. 7.20).

[Figure: four masses m₁, ..., m₄ in the plane, with x₁ < x_CM = Σ_i m_i x_i / Σ_i m_i < x₄ and y₁ < y_CM = Σ_i m_i y_i / Σ_i m_i < y₄.]

Figure 7.20: The center of mass of n masses lies within the envelope of the masses.

Center of mass of solids. What is the center of mass of a continuous object; e.g. a steel disk?


Of course, integral calculus is the answer. The sums in Eq. (7.8.17) become integrals:

x_CM = (1/M) ∭ x ρ dxdydz,   M = ∭ ρ dxdydz     (7.8.18)

where ρ is the density (ρ dxdydz is the mass dm of an infinitesimal piece). Thus, for objects whose density does not vary from point to point, the geometric centroid and the center of mass coincide.

Recall that for a particle of mass m, its moment of inertia with respect to an axis is I = mr², see Section 11.1.5. Extending this to a system of N particles, we have I = Σ_α m_α r_α², and for a continuum we have dI = r² dm, and thus:

I_z = ∫_B ρ(x² + y²) dV = ∭ ρ(x² + y²) dxdydz     (7.8.19)

This is the moment of inertia of a solid B when it is rotating about the z-axis. Similarly, with respect to the other two axes, we have:

I_x = ∫_B ρ(y² + z²) dV
I_y = ∫_B ρ(x² + z²) dV     (7.8.20)

Now, if we consider plane figures, i.e. objects whose thickness is negligible compared with the other dimensions, we can set z = 0 in Eq. (7.8.20), and thus

I_z = ∫_B ρ(x² + y²) dA = ∫_B ρx² dA + ∫_B ρy² dA = I_y + I_x     (7.8.21)

All are two-dimensional integrals, as signified by dA. When ρ = 1, we have

I_x = ∫_B y² dA,   I_y = ∫_B x² dA     (7.8.22)

which are known as the second moments of area. The second moment of area is a measure of the 'efficiency' of a shape in resisting bending caused by loading perpendicular to the beam axis (Fig. 7.21). It appeared for the first time in the Euler–Bernoulli theory of slender beams.

Example 7.2
Determine the center of gravity and the moment of inertia of a semi-circular disk of radius a made of a material with constant density ρ.
First we compute the mass. It is given by (Eq. (7.8.18), using polar coordinates)

M = ρ∬ r dθ dr = ρ ∫_0^a r dr ∫_0^π dθ = ρπa²/2


[Figure: a rectangular cross section of width b and height h, for which I_x = ∬ y² dxdy = ∫_{−b/2}^{b/2} (∫_{−h/2}^{h/2} y² dy) dx = bh³/12; of two beams carrying the same load F, the one with the larger h is the stronger.]
Figure 7.21: The second moment of area is a measure of the 'efficiency' of a shape in resisting bending caused by loading perpendicular to the beam axis. For example, a beam of rectangular cross section has I_x = bh³/12, so the height h of the beam decides the bending resistance. That's why the top beam is stronger than the bottom one.

Then we determine the center of gravity (due to symmetry, only the y-component is non-zero):

y_CM = (1/M) ρ∬ y r dθ dr = (ρ/M) ∬ r² sin θ dθ dr = (ρ/M) ∫_0^a r² dr ∫_0^π sin θ dθ = 4a/(3π)

And the moment of inertia is given by:

I_y = ρ∬ x² r dr dθ = ρ∬ r³ cos²θ dr dθ = ρ ∫_0^a r³ dr ∫_0^π (1 + cos 2θ)/2 dθ = ρπa⁴/8

Fig. 7.22 presents a summary of how to determine the center of mass for discrete (a) and continuous objects (b). Particularly interesting is the way the center of mass of a compound object is determined. In Fig. 7.22(c), we have an object consisting of two rectangles. As we can treat each rectangle as a point mass located at its own (known) center of mass, Fig. 7.22(c), the CM of the compound object can be computed using Eq. (7.8.17). When the thickness t of the compound object is constant and it has a uniform density (i.e., the density is the same everywhere), we can convert from mass to area (A) and obtain the following equation

x_CM = Σ_i x_i A_i / Σ_i A_i     (7.8.23)

for the CM of any 2D compound shape. The shape in Fig. 7.22(c) is the cross section of a T-beam (or tee beam), used in civil engineering. Thus, civil engineers use Eq. (7.8.23) frequently.
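A tiny sketch of Eq. (7.8.23) for a T-section; the dimensions below are my own, purely illustrative choices (a 20 × 80 web with a 100 × 20 flange on top, measured from the bottom).

```python
# Centroid of a T-section by Eq. (7.8.23); dimensions in mm (illustrative only)
parts = [
    # (area, y-coordinate of the part's own centroid, measured from the bottom)
    (20 * 80, 40.0),           # web: 20 wide, 80 tall -> centroid at 40
    (100 * 20, 80 + 10.0),     # flange: 100 wide, 20 thick, on top -> centroid at 90
]

A_total = sum(A for A, _ in parts)
y_cm = sum(A * y for A, y in parts) / A_total
print(y_cm)      # about 67.8 mm above the bottom, pulled towards the heavier flange
```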
In many cases, we remove material from a shape to make a new one, see Fig. 7.23. In that
case, the CM of the object is given by
$$\mathbf{x}_{CM} = \frac{\mathbf{x}_1 A_1 - \mathbf{x}_2 A_2}{A_1 - A_2} \qquad (7.8.24)$$
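To make Eqs. (7.8.23) and (7.8.24) concrete, here is a small Python sketch with made-up dimensions (the T-section and the plate below are hypothetical examples, not taken from a real design): a 10-by-2 flange sitting on a 2-by-8 web, and a 6-by-4 plate with a unit-radius hole removed.

import math

# T-section built from two rectangles (all dimensions hypothetical)
A1, y1 = 10*2, 8 + 1      # flange area and centroid height (it sits on top of the web)
A2, y2 = 2*8, 4           # web area and centroid height
yCM = (A1*y1 + A2*y2) / (A1 + A2)            # Eq. (7.8.23)
print(yCM)                                   # 6.777..., measured from the bottom

# 6 x 4 plate with a unit-radius hole centred at (4, 2): Eq. (7.8.24)
A_plate, x_plate = 6*4, 3.0
A_hole,  x_hole  = math.pi*1**2, 4.0
xCM = (A_plate*x_plate - A_hole*x_hole) / (A_plate - A_hole)
print(xCM)                                   # slightly less than 3, as expected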



[Figure content: (a) particles: $x_{CM} = \sum_i m_i x_i/\sum_i m_i$, $y_{CM} = \sum_i m_i y_i/\sum_i m_i$; (b) a continuous body $V$ with density $\rho$: $x_{CM} = \int_V \rho x\,dV / \int_V \rho\,dV$; (c) a compound object of two rectangles: $y_{CM} = (m_1y_1+m_2y_2)/(m_1+m_2) = (A_1y_1+A_2y_2)/(A_1+A_2)$ with $m_i = \rho A_i t$.]
2 (mi = ρAi t)

Figure 7.22: Center of mass: from particles (a) to continuous objects (b) and compound objects (c).

Figure 7.23: Center of mass of objects obtained by material removal.

Proof of Eq. (7.8.23). Assume that we have a compound object $V$ which consists of two sub-domains: $V_1$ with mass $m_1$ and $V_2$ with mass $m_2$. Now, we can write the center of mass of the compound object as
$$\mathbf{x}_{CM} = \frac{\int_V \mathbf{x}\,dV}{m_1+m_2} = \frac{\int_{V_1}\mathbf{x}\,dV + \int_{V_2}\mathbf{x}\,dV}{m_1+m_2} = \frac{m_1}{m_1+m_2}\,{\color{red}\frac{\int_{V_1}\mathbf{x}\,dV}{m_1}} + \frac{m_2}{m_1+m_2}\,{\color{red}\frac{\int_{V_2}\mathbf{x}\,dV}{m_2}}$$
The red terms are CMs of sub-domains 1 and 2. Thus, we can treat the sub-domains as points
and compute the CM using the familiar formula. 
Example 2. Determine the moment of inertia of a rod of length $L$ with $\rho = 1$ with respect to various points: the left extreme $A$ and the center $O$ (Fig. 7.24). Could you guess which case has a lower moment of inertia?



Figure 7.24: Moment of inertia of a thin rod of length L.

As the rod is very thin, we only have 1D integrals. So, the moments of inertia w.r.t A and O
are, see Fig. 7.24a:
$$I_A = \int_0^L x^2\,dx = \frac{L^3}{3}, \qquad I_O = \int_{-L/2}^{L/2} x^2\,dx = \frac{L^3}{12} \qquad (7.8.25)$$

And the fact that $I_A > I_O$ indicates it is easier to turn the rod around $O$–its center of gravity.
This is consistent with our daily experiences.
Now, if we ask the following question, various interesting things show up: about which point along the rod is the moment of inertia minimum? Let's denote by $I(t)$ the moment of inertia w.r.t a point located at a distance $t$ from $A$. We can compute $I(t)$ as (Fig. 7.24b):
$$I(t) = \int_0^L (x-t)^2\,dx = \int_0^L x^2\,dx + \int_0^L t^2\,dx - 2\int_0^L xt\,dx = \frac{L^3}{3} + t^2L - tL^2$$
And differential calculus helps us to find t such that I.t/ is minimum:
$$\frac{dI(t)}{dt} = 2tL - L^2 = 0 \Longrightarrow t = \frac{L}{2} \qquad (7.8.26)$$
The first thing to notice is that instead of integrating and then differentiating, we can do the reverse. That is, we differentiate the function in the integral and then do the integration:
$$\frac{dI(t)}{dt} = \int_0^L \frac{d(x-t)^2}{dt}\,dx = -2\int_0^L (x-t)\,dx = -L^2 + 2tL$$

And we have got the same result. So, there must be a theorem about this. It is called the Leibniz rule for differentiating under the integral sign:
$$I(t) = \int_a^b f(x,t)\,dx \Longrightarrow \frac{dI(t)}{dt} = \int_a^b \frac{\partial f(x,t)}{\partial t}\,dx \qquad (7.8.27)$$
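Both results–the minimising point $t = L/2$ and the Leibniz rule–are easy to check numerically. The Python sketch below (with an arbitrary $L = 3$) differentiates under the integral sign and compares with the closed form $2tL - L^2$.

import numpy as np
from scipy.integrate import quad

L = 3.0                                 # arbitrary rod length

def I(t):                               # I(t) = integral from 0 to L of (x - t)^2 dx
    return quad(lambda x: (x - t)**2, 0, L)[0]

def dIdt(t):                            # differentiate under the integral sign (Leibniz rule)
    return quad(lambda x: -2*(x - t), 0, L)[0]

ts = np.linspace(0, L, 301)
t_min = ts[np.argmin([I(t) for t in ts])]
print(t_min)                            # ~ 1.5 = L/2, the centre of the rod
print(dIdt(1.0), 2*1.0*L - L**2)        # both are -3.0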
Parallel axis theorem. In the problem of the calculation of the moment of inertia of a rod
of length L, we have IA D L3=3 and IO D L3=12. If we ask this question: what is the relation


between these two quantities, we will get something interesting. Let’s first compute the difference
between them:
$$I_A - I_O = \frac{L^3}{3} - \frac{L^3}{12} = \frac{L^3}{4}$$
And this difference must depend on the distance between $A$ and $O$ which is $L/2$, thus we write
$$I_A - I_O = \frac{L^3}{4} = \left(\frac{L}{2}\right)^2 L$$

Now, we anticipate the following result: if $O'$ is at a distance $d$ from the CM $O$, the moment of inertia w.r.t. $O'$ is given by:
$$I_{O'} = I_O + d^2\,L$$
Next, we extend this result to 3D objects and obtain the so-called parallel axis theorem, which
facilitates the calculation of the moment of inertia of a solid about an arbitrary axis: we just need
to compute the moment of inertia wrt the CM and use this theorem if we need the moment of
inertia wrt any axis.

Figure 7.25: Parallel axis theorem: two parallel axes, one passing through the CM and the other a distance $d$ away: $d^2 = x_{CM}^2 + y_{CM}^2$.

We consider an object $B$ with density $\rho$ (Fig. 7.25). A set of coordinate axes is used where $O$ is at the origin. In this coordinate system, the center of mass of the object is located at $(x_{CM}, y_{CM}, z_{CM})$. Let $I_{CM}$ be the moment of inertia of $B$ with respect to an axis passing through the CM. Now we're determining the moment of inertia $I_z$ w.r.t. an axis passing through $O$ by considering an infinitesimal $dm = \rho\,dV$ located at $(x, y)$:
$$\begin{aligned}
I_z &= \int_B \rho(x^2+y^2)\,dV\\
&= \int_B \rho\left[(x_{CM}+x')^2 + (y_{CM}+y')^2\right]dV\\
&= \int_B \rho(x_{CM}^2+y_{CM}^2)\,dV + \int_B \rho(x'^2+y'^2)\,dV + {\color{blue}2x_{CM}\int_B \rho\,x'\,dV} + {\color{blue}2y_{CM}\int_B \rho\,y'\,dV}\\
&= Md^2 + I_{CM} + 0 + 0
\end{aligned} \qquad (7.8.28)$$


And that gives us the parallel axis theorem that states:

$$I_z = I_{CM} + Md^2 \qquad (7.8.29)$$

You can find ICM for many common solids in textbooks, and from that, the parallel axis theorem
allows us to compute the moment of inertia about an arbitrary axis.
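As a small check of Eq. (7.8.29)–my own example, not from a textbook table–take a $b \times h$ rectangle of unit density: the second moment about the centroidal $x$-axis is $bh^3/12$, and the theorem predicts $bh^3/12 + (h/2)^2 bh = bh^3/3$ about the bottom edge. The Python sketch below confirms this numerically ($b = 2$, $h = 3$ are arbitrary).

import numpy as np

b, h = 2.0, 3.0
n = 1000
x = np.linspace(0, b, n, endpoint=False) + b/(2*n)   # cell midpoints
y = np.linspace(0, h, n, endpoint=False) + h/(2*n)
X, Y = np.meshgrid(x, y)
dA = (b/n) * (h/n)

I_edge = np.sum(Y**2) * dA                 # about the bottom edge (y = 0)
I_cm   = np.sum((Y - h/2)**2) * dA         # about the centroidal axis
M, d   = b*h, h/2                          # "mass" (area, rho = 1) and distance between the axes

print(I_cm,  b*h**3/12)                    # ~ 4.5
print(I_edge, I_cm + M*d**2, b*h**3/3)     # all ~ 18.0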
But wait, why are the blue integrals in Eq. (7.8.28) zero? This is due to one property of the CM:
$$x_{CM} = \frac{\int_B x\,dV}{\int_B dV} \Longrightarrow \int_B (x - x_{CM})\,dV = 0 \Longrightarrow \int_B x'\,dV = 0$$
Actually we know this result without realizing it, see Table 7.3.
Table 7.3: $\sum_i (x_i - \bar{x}) = 0$ where $\bar{x}$ is the arithmetic average of the $x_i$'s.

  x_i    x̄      x_i − x̄
  1.0    3.0    -2.0
  2.0    3.0    -1.0
  3.0    3.0     0.0
  4.0    3.0     1.0
  5.0    3.0     2.0

7.8.8 Barycentric coordinates


In this section the barycentric coordinates, discovered by the German mathematician and theo-
retical astronomer August Ferdinand Möbius (1790 – 1868), are presented. These coordinates
are based on the concept of the center of mass in physics. Let’s consider three point masses mA ,
mB and mC placed at the three vertices of the triangle ABC with coordinates xA ; xB ; x C . We
know that its center of mass is point P :
$$\mathbf{x}_P = \frac{m_A}{M}\mathbf{x}_A + \frac{m_B}{M}\mathbf{x}_B + \frac{m_C}{M}\mathbf{x}_C$$
with $M = m_A + m_B + m_C$.
Conversely, given a triangle ABC , what masses/weights must be put at the vertices to balance
at some point Q? The solution to this problem defines a new coordinate system relative to the
given positions A; B and C : it is possible to locate a point P on a triangle with three numbers
.1 ; 2 ; 3 /. These three numbers are called the barycentric coordinates of P . The barycentric


coordinates of a point relative to a triangle are the masses that we would have to place at the
vertices of the triangle for its center of mass to be at that point.
We then have:
$$1 = \xi_1 + \xi_2 + \xi_3, \qquad x = \xi_1 x_A + \xi_2 x_B + \xi_3 x_C, \qquad y = \xi_1 y_A + \xi_2 y_B + \xi_3 y_C \qquad (7.8.30)$$
The second and third equations convert the barycentric coordinates to Cartesian coordinates.
They are just Eq. (7.8.17).
Now, we need to determine the barycentric coordinates of the three vertices. It is straightforward to see that the barycentric coords of $A$ are $(1, 0, 0)$: using Eq. (7.8.30) with $(1, 0, 0)$ results in $(x_A, y_A)$. Another way to see this is: the only way for the center of mass to be at $A$ is when $m_A$ is very large compared with $m_B$ and $m_C$; thus $\xi_1 = m_A/M \approx m_A/m_A = 1$. Similarly, the coords of $B$ are $(0, 1, 0)$ and of $C$ are $(0, 0, 1)$.
From that we can see that every point on the edge $BC$ has $\xi_1 = 0$ (this makes sense as the only case where the center of mass is on $BC$ is when the mass at $A$ is zero). The point is within the triangle if $0 \le \xi_1, \xi_2, \xi_3 \le 1$. If any one of the coordinates is less than zero or greater than one, the point is outside the triangle. If any of them is zero, $P$ is on one of the lines joining the vertices of the triangle. See Fig. 7.26.
Figure 7.26: Barycentric coordinates $(\xi_1, \xi_2, \xi_3)$ of points in a triangle.

Next, we’re showing that the line 1 D a (e.g. 1 D 1=3) is parallel to the edge BC or the
line  D 0. Using Eq. (7.8.30) with 1 D 1=3 (hence 2 C 3 D 2=3), we can obtain .x; y/ as
1 1 2
x D xA C 2 xB C 3 xC D xA C xC C 2 .xB xC /
3 3 3 (7.8.31)
1 1 2
y D yA C 2 yB C 3 yC D yA C yC C 2 .yB yC /
3 3 3

We have learnt in Section 11.1.3 that the above line has the direction vector $\mathbf{x}_B - \mathbf{x}_C$, which is edge $BC$. Therefore, the line $\xi_1 = 1/3$ is parallel to $BC$.
Now, we carry out some algebraic manipulations to xP to show that there is nothing entirely
new about barycentric coordinates. To this end, we replace $\xi_1$ by $1 - \xi_2 - \xi_3$, and we compute $\mathbf{x}_P - \mathbf{x}_A$ which is the relative position of $P$ w.r.t $A$:
$$\mathbf{x}_P - \mathbf{x}_A = \left[(1-\xi_2-\xi_3)\mathbf{x}_A + \xi_2\mathbf{x}_B + \xi_3\mathbf{x}_C\right] - \mathbf{x}_A = \xi_2(\mathbf{x}_B - \mathbf{x}_A) + \xi_3(\mathbf{x}_C - \mathbf{x}_A)$$
Or,
$$\overrightarrow{AP} = \xi_2\,\overrightarrow{AB} + \xi_3\,\overrightarrow{AC} \qquad (7.8.32)$$
So, if we use the vertex $A$ as the origin and the two edges $AB$ and $AC$ as the two basis vectors, we have an oblique coordinate system, and in this system any point $P$ is specified with two coordinates $(\xi_2, \xi_3)$: $\overrightarrow{AP}$ is simply a linear combination of these two basis vectors with the coefficients being $\xi_2$ and $\xi_3$.
One question arises: why don't we just use Eq. (7.8.32)? If we look at this equation carefully, one thing comes to us: it is not symmetric! Why is $A$ the origin? How about $B$ and $C$? On the other hand, with the barycentric coordinates $(\xi_1, \xi_2, \xi_3)$, everything is symmetric. There is no origin!

Geometrical meaning. The point $P$ divides the triangle $ABC$ into three sub-triangles $PBC$, $PAB$ and $PAC$. It can be shown that the barycentric coordinates $(\xi_1, \xi_2, \xi_3)$ are actually the ratios of the areas of these sub-triangles to that of the big triangle:
$$\xi_1 = \frac{\text{area of }PBC}{\text{area of }ABC}, \qquad \xi_2 = \frac{\text{area of }PAC}{\text{area of }ABC}, \qquad \xi_3 = \frac{\text{area of }PAB}{\text{area of }ABC}$$
One way to prove this is to use Eq. (11.1.21) to compute the areas of $PBC$ and $ABC$, noting that the Cartesian coords of $P$ are $\xi_1\mathbf{x}_A + \xi_2\mathbf{x}_B + \xi_3\mathbf{x}_C$. Because of this property, $(\xi_1, \xi_2, \xi_3)$ are also called the areal coordinates.
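The area-ratio formulas give a direct way to compute barycentric coordinates in code. The Python sketch below (with an arbitrary triangle and an arbitrary interior point, chosen only for illustration) computes $(\xi_1, \xi_2, \xi_3)$ from signed areas and then verifies Eq. (7.8.30) by reconstructing the Cartesian point.

import numpy as np

def area(p, q, r):
    # signed area of triangle pqr via the 2D cross product
    return 0.5 * ((q[0]-p[0])*(r[1]-p[1]) - (r[0]-p[0])*(q[1]-p[1]))

A, B, C = np.array([0., 0.]), np.array([4., 0.]), np.array([1., 3.])
P = np.array([1.5, 1.0])                     # an arbitrary point inside the triangle

S  = area(A, B, C)
xi = np.array([area(P, B, C)/S, area(A, P, C)/S, area(A, B, P)/S])
print(xi, xi.sum())                          # the three coordinates sum to 1

P_rec = xi[0]*A + xi[1]*B + xi[2]*C          # Eq. (7.8.30)
print(P_rec)                                 # recovers [1.5, 1.0]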

7.9 Parametric surfaces


Functions of the form r.t/ D .f .t/; g.t/; h.t// having one variable for input and a vector for
output, are called single-variable vector-valued functions. Such single-variable vector-valued
functions can be denoted as r W R ! R3 . And the graph of such functions is a 3D curve. One
can see that this function actually transforms a line segment lying on the number line to a curve
in a 3D space (Fig. 7.27).
And in the same manner, a two-variable vector-valued function defined as $\mathbf{r}: \mathbb{R}^2 \to \mathbb{R}^3$ describes a surface. This function transforms a domain in the $u$–$v$ plane to a surface living in a 3D space (Fig. 7.27). We actually do this kind of transformation in our daily lives when we roll a paper to make a cylinder, see for example Fig. 7.28. If we use a ruled paper, i.e., one with two sets of perpendicular lines, in the transformed paper these lines become curves which are called grid curves. The surface functions take two variables, $u$ and $v$, because a parametric surface can be seen as a "warped" version of a rectangular grid. The vector function "warps" this grid into
a three-dimensional surface.


Figure 7.27: Parametric curves and parametric surfaces.

(a) (b)

Figure 7.28: Paper rolling is a transformation r W R ! R3 .

7.9.1 Parametric representation of common solids


Let’s consider a cylinder with a radius a and height h in which the base is in the xy plane
(Fig. 7.28b). The parametric representation of such cylinder is

$$x(u,v) = a\cos u, \qquad y(u,v) = a\sin u, \qquad z(u,v) = v \qquad (7.9.1)$$

which is an extension of the parametric representation of circles. The domain of this function is $[0, 2\pi]\times[0, h]$ for a full cylinder. One advantage of the parametric representation is that it is easy to generate only part of the surface: simply limit $u$ to $[0, \pi]$ and we will have a half cylinder.
From spherical coordinates, we immediately have the parametric representation for a sphere


of radius $\rho$:
$$x(\phi,\theta) = \rho\sin\phi\cos\theta, \qquad y(\phi,\theta) = \rho\sin\phi\sin\theta, \qquad z(\phi,\theta) = \rho\cos\phi, \qquad \phi\in[0,\pi],\ \theta\in[0,2\pi] \qquad (7.9.2)$$
Some plots of the sphere are given in Fig. 7.29. Sometimes you see $x(u,v) = \rho\sin(\pi u)\cos(2\pi v)$ with $(u,v)\in[0,1]\times[0,1]$. This form is obtained by normalizing the two angle parameters, so that they vary in the unit interval.

(a) $0\le\phi\le\pi/2$, $0\le\theta\le\pi$ (b) $0\le\phi\le\pi/2$, $0\le\theta\le\pi/2$

Figure 7.29: Plots of sphere using a parametric representation.

Our next destination is the paraboloid z D x 2 C y 2 . The first parametric representation is


obtained when we use x and y as the parameters:

$$x(u,v) = u, \qquad y(u,v) = v, \qquad z(u,v) = u^2 + v^2, \qquad u, v \in [-a, a]$$

And a plot using this equation is given in Fig. 7.30a. It is not nice at all! Furthermore, it is impossible to plot only a part of the paraboloid. We need a better parametric representation. Note that this paraboloid is a surface of revolution obtained by revolving the curve $z = y^2$ (or $z = x^2$) around the $z$ axis. The level curves are circles of the form $x^2 + y^2 = r$, thus the parametric representation is:
$$x(u,v) = v\cos u, \qquad y(u,v) = v\sin u, \qquad z(u,v) = v^2, \qquad u\in[0,2\pi],\ v\in[0,a]$$

In Figs. 7.30b and 7.30c I plot two paraboloids using this parametric representation.
We now consider an ellipsoid obtained by revolving the ellipse x 2 =a2 C y 2 =b 2 D 1 around
the x axis. First, we parametrize the ellipse as x D a cos u and y D b sin u. As the revolution


(a) parametric 1 (b) parametric revol (c) parametric revol

Figure 7.30: Plots of the paraboloid z D x 2 Cy 2 using an explicit formula and a parametric representation.
Obviously, all surfaces can be parameterized in more than one way.

is around the $x$ axis, we consider a point $P$ on the ellipse $x^2/a^2 + y^2/b^2 = 1$; this point is $P(a\cos u, b\sin u)$ for, let's say, $u = \pi/3$ as shown in Fig. 7.31. Now, we rotate this point $P$ around the $x$ axis a full round. What we obtain is a circle of radius $b\sin u$. The parametric equation of this circle is thus $(b\sin u)\cos v, (b\sin u)\sin v$. Therefore, the parametric equation of this ellipsoid is:
$$x(u,v) = a\cos u, \qquad y(u,v) = b\sin u\sin v, \qquad z(u,v) = b\sin u\cos v, \qquad u\in[0,\pi],\ v\in[0,2\pi]$$
See Fig. 7.31 for an illustration of the meaning of u; v. It is obvious that this parametric repre-
sentation works for any parametric curve written as .f .u/; g.u//. If we rotate this curve around
the x axis then we have
$$x(u,v) = f(u), \qquad y(u,v) = g(u)\sin v, \qquad z(u,v) = g(u)\cos v$$
See Fig. 7.31b for a surface of revolution obtained with $y = 2 + \cos x$.
We move now to doughnuts. In geometry, a torus (plural tori) is a surface of revolution
generated by revolving a circle in three-dimensional space about an axis that is coplanar with the
circle. It is thus a surface of revolution, so we know how to derive a parametric representation
of it. Let’s consider a circle in the zx plane with a center at .a; 0/ and radius b. The parametric
equation of this circle is:
$$x = b\cos u + a, \qquad z = b\sin u, \qquad 0 \le u \le 2\pi$$
Now we rotate this circle around the $z$ axis and we get a torus:
$$x = (b\cos u + a)\cos v, \qquad y = (b\cos u + a)\sin v, \qquad z = b\sin u, \qquad u, v \in [0, 2\pi] \qquad (7.9.3)$$


(a) (b)

Figure 7.31: Parametric representation of a surface of revolution: (a) the ellipsoid obtained by revolving the ellipse $x^2/a^2 + y^2/b^2 = 1$ around the $x$ axis. The ellipse $x^2/a^2 + y^2/b^2 = 1$ is the blue curve; (b) $y = 2 + \cos x$.

(a) $0\le v\le\pi$ (b) $0\le v\le 2\pi$

Figure 7.32: Torus. The mesh lines are shown in (a) only.

See Fig. 7.32 for an illustration with $a = 3$ and $b = 1$.
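If you want to reproduce a plot like Fig. 7.32 yourself, the Python sketch below evaluates Eq. (7.9.3) on a grid of $(u, v)$ values and draws the surface with matplotlib ($a = 3$, $b = 1$ as in the figure; the grid resolution is an arbitrary choice).

import numpy as np
import matplotlib.pyplot as plt

a, b = 3.0, 1.0
u, v = np.meshgrid(np.linspace(0, 2*np.pi, 60), np.linspace(0, 2*np.pi, 60))

# Eq. (7.9.3): rotate the circle (b cos u + a, b sin u) about the z axis
x = (b*np.cos(u) + a) * np.cos(v)
y = (b*np.cos(u) + a) * np.sin(v)
z = b*np.sin(u)

ax = plt.figure().add_subplot(projection='3d')
ax.plot_surface(x, y, z)
ax.set_box_aspect((1, 1, 0.3))    # keep the doughnut from looking squashed
plt.show()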


After all these interesting surfaces we now discuss how to parametrize a plane. Let's consider a plane going through a point $\mathbf{P}_0 = (x_0, y_0, z_0)$ and containing two vectors $\mathbf{a} = (a_x, a_y, a_z)$ and $\mathbf{b} = (b_x, b_y, b_z)$ (these two vectors must not be parallel, i.e., they must be independent). From Section 11.1.3 we know that the plane can be written as $\mathbf{P} = \mathbf{P}_0 + u\mathbf{a} + v\mathbf{b}$. Written explicitly, the parametric representation of the plane is:
$$x(u,v) = x_0 + ua_x + vb_x, \qquad y(u,v) = y_0 + ua_y + vb_y, \qquad z(u,v) = z_0 + ua_z + vb_z \qquad (7.9.4)$$

7.9.2 Practical implementation detail


This section discusses how computers generate the plot of parametric surfaces. The idea is
simple: a number of points on the surface are generated, these points are then connected to make
the so-called grid curves. If needed, the surface is shaded. The algorithm is given in Listing 7.1.
Note that the parameters are normalized to fall within the unit interval.


Listing 7.1: Algorithm to generate points on a parametric surface.


m = number of steps in v direction;
n = number of steps in u direction;
delta_v = 1.0/m; delta_u = 1.0/n;
// loop over the (m+1) x (n+1) grid of normalized parameter values
for(int i = 0; i <= m; ++i){
  v = i * delta_v;                 // v runs from 0 to 1 inclusive
  for(int j = 0; j <= n; ++j){
    u = j * delta_u;               // u runs from 0 to 1 inclusive
    P = (x(u,v), y(u,v), z(u,v));  // store this surface point
  }
}

7.9.3 Tangent plane and normal vector


To find the tangent plane at a point $P(u_0, v_0)$ on a parametric surface $S$, we need to find the two tangent vectors of $S$ at $P$ and use the cross product to get the normal. First, we fix $u$, and thus get a curve $C_v$ lying on $S$; the tangent to this curve at $P$ is (Fig. 7.33):
$$\mathbf{r}_v = \frac{\partial x}{\partial v}(u_0,v_0)\,\mathbf{i} + \frac{\partial y}{\partial v}(u_0,v_0)\,\mathbf{j} + \frac{\partial z}{\partial v}(u_0,v_0)\,\mathbf{k} \qquad (7.9.5)$$
Second, we fix $v$, and get a curve $C_u$ lying on $S$; the tangent to this curve at $P$ is:
$$\mathbf{r}_u = \frac{\partial x}{\partial u}(u_0,v_0)\,\mathbf{i} + \frac{\partial y}{\partial u}(u_0,v_0)\,\mathbf{j} + \frac{\partial z}{\partial u}(u_0,v_0)\,\mathbf{k} \qquad (7.9.6)$$
Finally, the normal $\mathbf{N}$ to the surface at $P(u_0, v_0)$ is obtained by taking the cross product of the two tangent vectors:
$$\mathbf{N} = \mathbf{r}_u \times \mathbf{r}_v$$
I refer to Fig. 7.33 for a demonstration, in which the surface is actually a Bézier surface discussed
in Section 7.9.5.

(a) (b)

Figure 7.33: Tangent plane and normal at $P$ on a parametric surface $S$. The plane's equation is obtained with Eq. (7.9.4) where $\mathbf{a} = \mathbf{r}_u$ and $\mathbf{b} = \mathbf{r}_v$.


7.9.4 Surface area and surface integral


The problem we’re now interested in is: compute the area of a general parametric surface.
Assuming that the parameter domain is a rectangle which is divided into sub-rectangles of sides $\Delta u, \Delta v$ (Fig. 7.34a). We call these sub-rectangles the patches. On the surface $S$ the marked patch transforms to the patch $S_{ij}$, which is a curved surface (Fig. 7.34b). If we can compute the
area of this patch (approximately), then we sum up the areas of all the patches, and when the
number of patches tends to infinity, we shall obtain the sought for surface area. That’s the plan.
(a) (b)

Figure 7.34: Area of a parametric surface S .

The area of $S_{ij}$ turns out to be easy for we know the tangents $\mathbf{r}_u$ and $\mathbf{r}_v$. We replace $S_{ij}$ by a parallelogram of sides $\|\Delta u\,\mathbf{r}_u\|$ and $\|\Delta v\,\mathbf{r}_v\|$. This parallelogram lies in the tangent plane to the surface at $P_{ij}$ (see Fig. 7.33b). Thus, the area of $S_{ij}$ is:
$$\text{area of one patch} = \|\mathbf{r}_u \times \mathbf{r}_v\|\,\Delta u\,\Delta v$$

We now sum up all the areas:


$$\text{area of surface} = \sum \|\mathbf{r}_u \times \mathbf{r}_v\|\,\Delta u\,\Delta v$$
And we replace the above Riemann sum by a double integral:
$$\text{area of surface} = \iint \|\mathbf{r}_u \times \mathbf{r}_v\|\,du\,dv \qquad (7.9.7)$$
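Eq. (7.9.7) is easy to evaluate numerically once $\mathbf{r}_u$ and $\mathbf{r}_v$ are known. As a check of my own (not from the text), the Python sketch below computes the area of a sphere of radius $\rho = 2$ from the parametric representation Eq. (7.9.2) and compares it with $4\pi\rho^2$.

import numpy as np

rho = 2.0
n = 400
phi   = np.linspace(0, np.pi,   n, endpoint=False) + np.pi/(2*n)     # cell midpoints
theta = np.linspace(0, 2*np.pi, n, endpoint=False) + 2*np.pi/(2*n)
P, T = np.meshgrid(phi, theta)
du, dv = np.pi/n, 2*np.pi/n

# tangent vectors r_phi and r_theta of Eq. (7.9.2), written out analytically
r_phi   = np.stack([ rho*np.cos(P)*np.cos(T),  rho*np.cos(P)*np.sin(T), -rho*np.sin(P)], axis=-1)
r_theta = np.stack([-rho*np.sin(P)*np.sin(T),  rho*np.sin(P)*np.cos(T),  np.zeros_like(P)], axis=-1)

area = np.sum(np.linalg.norm(np.cross(r_phi, r_theta), axis=-1)) * du * dv   # Eq. (7.9.7)
print(area, 4*np.pi*rho**2)     # both ~ 50.27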

7.9.5 Bézier surfaces


In Section 4.14 we have met Bézier curves. It is time to extend them to Bézier surfaces, which
are very powerful parametric surfaces as they are extensively used in practice. Recall that a
Bézier curve, denoted by Bn .u/, with n C 1 control points P k , k D 0; 1; :::; n is described by
the following parametric equation:
$$\mathbf{B}_n(u) = \sum_{k=0}^{n}\binom{n}{k}(1-u)^{n-k}u^k\,\mathbf{P}_k = \sum_{k=0}^{n}B_{k,n}(u)\,\mathbf{P}_k \qquad (7.9.8)$$


where Bk;n .u/ is the Bernstein basis polynomial. (Note that I have intentionally changed from t
to u for the parameter). The idea now is to introduce another Bézier curve Bm .v/ with control
points Ql , l D 0; 1; :::; m. We then build a Bézier surface S.u; v/ by multiplying Bn .u/ and
$\mathbf{B}_m(v)$:
$$\mathbf{S}(u,v) = \left(\sum_{k=0}^{n}B_{k,n}(u)\,\mathbf{P}_k\right)\left(\sum_{l=0}^{m}B_{l,m}(v)\,\mathbf{Q}_l\right)$$
We put the control points $\mathbf{P}_k$ and $\mathbf{Q}_l$ into one single 2D array $\mathbf{C}_{k,l}$, which is an $(n+1)\times(m+1)$ matrix whose entries are 3D points:
$$\mathbf{S}(u,v) = \sum_{k=0}^{n}\sum_{l=0}^{m}B_{k,n}(u)\,B_{l,m}(v)\,\mathbf{C}_{k,l}$$

As an example, we consider a bi-cubic Bézier surface with 4  4 control points (Fig. 7.35).

(a) (b)

Figure 7.35: A bi-cubic Bézier surface with $4\times 4$ control points.
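Evaluating a point on a Bézier surface only needs the Bernstein polynomials and the double sum above. The Python sketch below does exactly that for a hypothetical $4\times4$ control net (a bi-cubic patch); the control points are made up for illustration.

import numpy as np
from math import comb

def bernstein(k, n, u):
    return comb(n, k) * (1 - u)**(n - k) * u**k

def bezier_surface(C, u, v):
    """Evaluate S(u, v) as the double sum of B_{k,n}(u) B_{l,m}(v) C[k, l]."""
    n, m = C.shape[0] - 1, C.shape[1] - 1
    S = np.zeros(3)
    for k in range(n + 1):
        for l in range(m + 1):
            S += bernstein(k, n, u) * bernstein(l, m, v) * C[k, l]
    return S

# a hypothetical 4 x 4 control net: a flat grid with two raised interior points
C = np.array([[[i, j, 0.0] for j in range(4)] for i in range(4)], dtype=float)
C[1, 1, 2] = C[2, 2, 2] = 2.0

print(bezier_surface(C, 0.0, 0.0))   # a corner: coincides with control point C[0, 0]
print(bezier_surface(C, 0.5, 0.5))   # an interior point pulled towards the raised net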

7.10 Newtonian mechanics


Back in the fifteenth century it was still debatable about whether the earth orbits the sun or vice
versa. The discussion was purely philosophical. It was Tycho Brahe who came up with the idea
of making accurate observations of the orbit of the planets. Based on these observations, it should
be easy to deduce what is orbiting about what. So, Tycho Brahe spent many years measuring the positions of planets. And he hired Kepler to be his assistant. After Brahe's death, Kepler, based on his former boss's data, discovered his three laws of planetary motion. Inspired
by these laws, Newton came up with his universal gravitation theory and derived Kepler’s laws
as consequences of his gravitation theory.
This section is a brief introduction to Newtonian mechanics. The aim is to introduce some
applications of differential and integral calculus in the description of motion. And the section
also presents how the French astronomer Urbain Le Verrier discovered Neptune with only paper,
pencil and of course mathematics. In Francois Arago’s apt phrase, Le Verrier had discovered a
planet "with the point of his pen".


7.10.1 Aristotle’s motion


Aristotle believed that there are two kinds of motion for inanimate matter, natural and unnatural.
Unnatural (or "violent") motion is when something is being pushed, and in this case the speed
of motion is proportional to the force of the push. This was probably deduced from watching
boats and oxcarts . Natural motion is when something is seeking its natural place in the universe,
such as a stone falling, or fire rising.
For the natural motion of heavy objects falling to earth, Aristotle asserted that the speed of
fall was proportional to the weight, and inversely proportional to the density of the medium the
body was falling through. However, these remarks are very brief and vague, and certainly not
quantitative.
Actually, these views of Aristotle did not go unchallenged even in ancient Athens. Thirty
years or so after Aristotle’s death, Strato pointed out that a stone dropped from a greater height
had a greater impact on the ground, suggesting that the stone picked up more speed as it fell
from the greater height.

7.10.2 Galileo’s motion


Galileo set out his ideas about falling bodies, and about projectiles in gen-
eral, in a book called "Two New Sciences". The two were the science of
motion, which became the foundation-stone of physics, and the science of
materials and construction, an important contribution to engineering.
A biography by Galileo’s pupil Vincenzo Viviani stated that Galileo had
dropped balls of the same material, but different masses, from the Leaning
Tower of Pisa to demonstrate that their time of descent was independent of
their mass. It is an amazing feeling to see this ourselves, and you can go to
this YouTube webpage, see also the next figure.

History note 7.1: Galileo Galilei (1564–1642)


Galileo di Vincenzo Bonaiuti de’ Galilei was an Italian astronomer,
physicist and engineer, sometimes described as a polymath, from Pisa.
Galileo has been called the "father of observational astronomy", the
"father of modern physics", the "father of the scientific method", and the
"father of modern science". Although Galileo considered the priesthood
as a young man, at his father’s urging he instead enrolled in 1580 at
the University of Pisa for a medical degree. In 1581, when he was
studying medicine, he noticed a swinging chandelier swinging in larger
and smaller arcs. By comparison with his heartbeat, he observed that the chandelier took
the same amount of time to swing back and forth, no matter how far it was swinging. At
home, he set up two pendulums of equal length and swung one with a large sweep and
the other with a small sweep and found that they kept time together. Up to this point,
Galileo had deliberately been kept away from mathematics, since a physician earned a


higher income than a mathematician. However, after accidentally attending a lecture on


geometry, he decided to study mathematics and natural philosophy instead of medicine.

7.10.3 Kepler’s laws


Based on the data that Brahe had collected and his own genius Kepler has discovered the fol-
lowing laws of planetary motion. Kepler’s three laws describe the orbits of planets around the
Sun. The laws modified the heliocentric theory of Nicolaus Copernicus, replacing its circular
orbits with elliptical trajectories, and explaining how planetary velocities vary. The three laws
state that (see Fig. 7.36):

 Law 1: Each planet orbits in an ellipse with one focus at the sun (Fig. 7.36a);

 Law 2: The vector from the sun to a planet sweeps out area at a steady rate: $dA/dt$ = constant; the shaded areas in Fig. 7.36b are equal.

 Law 3: The length of the planet's year (or period) is $T = ka^{3/2}$ where $a$ is the maximum distance from the center, and $k = 2\pi/\sqrt{GM}$ is the same for all planets; $G$ is a constant, called the universal gravitational constant, $M$ is the mass of the sun (see Section 7.10.8 for detail).

The elliptical orbits of planets were indicated by calculations of the orbit of Mars. From this,
Kepler inferred that other bodies in the Solar System, including those farther away from the Sun,
also have elliptical orbits. The second law helps to establish that when a planet is closer to the
Sun, it travels faster. Unlike the first and second laws that describe the motion characteristics of
a single planet, the third law makes a comparison between the motion characteristics of different
planets. The comparison being made is that the ratio of the squares of the periods (i.e., T 2 ) to the
cubes of their average distances (i.e., a3 ) from the sun is the same for every one of the planets.
Thus, the third law expresses that the farther a planet is from the Sun, the slower its orbital speed,
and vice versa.


(a) Kepler’s 1st law (b) Kepler’s 2nd law

Figure 7.36: Keplers’ laws of planetary motion: illustration of the first and second law.


History note 7.2: Johannes Kepler (1571–1630)


Johannes Kepler was a German astronomer, mathematician, and as-
trologer. He is a key figure in the 17th-century scientific revolution,
best known for his laws of planetary motion, and his books Astrono-
mia nova, Harmonices Mundi, and Epitome Astronomiae Coperni-
canae. These works also provided one of the foundations for Newton’s
theory of universal gravitation. He was introduced to astronomy at an
early age and developed a strong passion for it that would span his
entire life. At age six, he observed the Great Comet of 1577, writing
that he "was taken by his mother to a high place to look at it. In 1580,
at age nine, he observed another astronomical event, a lunar eclipse, recording that he
remembered being "called outdoors" to see it and that the moon "appeared quite red".

7.10.4 Newton’s laws of motion


Kepler had three laws for planetary motions and Newton also developed his three own laws of
motion. We learnt them by heart in high school. And on the surface they look very simple and
indeed the equation is so simple (F D ma). The three laws of motion are:

 Law 1: states that if a body is at rest or moving at a constant speed in a straight line, it will
remain at rest or keep moving in a straight line at constant speed unless it is acted upon
by a force.

 Law 2: is a quantitative description of the changes that a force can produce on the motion
of a body. It states that the time rate of change of the momentum of a body is equal in both
magnitude and direction to the force imposed on it. The momentum of a body is equal to
the product of its mass and its velocity. In symbols, this law is written as F D ma.

 Law 3: states that when two bodies interact, they apply forces to one another that are equal
in magnitude and opposite in direction. The third law is also known as the law of action
and reaction.

The first law is known as the law of inertia and was first formulated by Galileo Galilei. This law
is very counter-intuitive: if we go shopping with a cart and we stop pushing it, it goes for a short distance and stops. The law of inertia seems wrong! As explained in the wonderful book Evolution of Physics by Einstein and Infeld, it was only with imagination that Galilei resolved the problem: there is actually friction acting on the cart. If we could remove it (by having a very smooth road for example) the cart would indeed go further. And with an ideally perfectly smooth road, it goes
forever.


We focus now on the 2nd law, which is written fully as

$$F_x = ma_x = m\frac{d^2x}{dt^2} = m\frac{dv_x}{dt}, \qquad F_y = ma_y = m\frac{d^2y}{dt^2} = m\frac{dv_y}{dt}, \qquad F_z = ma_z = m\frac{d^2z}{dt^2} = m\frac{dv_z}{dt} \qquad (7.10.1)$$
How are we going to use it? First we need to know the force, we then resolve it into three
components Fx ; Fy and Fz , and finally we solve Eq. (7.10.1). How to do that is the subject of
the next section.
Eq. (7.10.1) are what mathematicians refer to as ordinary differential equations with the well
known abbreviation ODEs. Precisely they are second order ODEs as they contain the second
time derivative d 2 x=dt 2 . Scientists like to call them dynamical equations because they describe
the evolution in time (i.e., dynamics) of the system. Chapter 9 discusses differential equations
in detail.
Newton gave us the 2nd law which requires force so he had to give us some forces. And he
did. In Section 7.10.8 I present his force of gravitation. For other forces, he gave us the third law
which in many cases helps us to remove interaction forces (usually unknown) between bodies.

7.10.5 Dynamical equations: meaning and solutions


To illustrate what Eq. (7.10.1) can predict we consider an object of mass $m$ located at a height
h above the earth. The mass of the earth is M and its radius is R. According to Newton’s theory
of gravitation presented in Section 7.10.8, the earth is pulling the object with a force F pointing
to the center of the earth and has a magnitude of

$$F = \frac{GMm}{(R+h)^2}$$

Since $h$ is tiny compared with $R$, we can approximate $(R+h)^2 = R^2 + 2Rh + h^2 \approx R^2$. Thus,

$$F = \frac{GM}{R^2}\,m = mg, \qquad g = \frac{GM}{R^2}$$
where g is called the acceleration of gravity. The quantity mg is called the weight of the object,
which is how hard gravity is pulling on it. With

$$G = 6.673\times 10^{-11}\ \mathrm{N\,m^2/kg^2}, \qquad M = 5.972\times 10^{24}\ \mathrm{kg}, \qquad R = 6.37\times 10^6\ \mathrm{m}$$
one can determine that $g = 9.81\ \mathrm{m/s^2}$.


With the gravitational force known, let’s solve the first real problem using calculus. The
problem is: we are shooting a basket ball or firing a gun; describe its motion. These projectile
motions occur in a plane. Let’s use the xy plane with x being horizontal and y vertical. For


Figure 7.37: Projectile motion. [Generated using Asymptote.]

simplicity the initial position of the object (with mass m) is at the origin. The initial velocity of
the object is .v0 cos ˛; v0 sin ˛/ (Fig. 7.37). Our task now is to solve the following dynamical
equations:
$$0 = m\frac{d^2x}{dt^2}, \qquad -mg = m\frac{d^2y}{dt^2}$$
Solving the first equation for x.t/, we get

$$v_x(t) = v_0\cos\alpha, \qquad x(t) = (v_0\cos\alpha)\,t \qquad (7.10.2)$$

which agrees with the law of inertia: no force on the x direction, the velocity (in the horizontal
direction) is then constant. Now, solving the second equation for y.t/, we get

$$\frac{d^2y}{dt^2} = -g \Longrightarrow v_y(t) = -gt + v_0\sin\alpha \Longrightarrow y(t) = (v_0\sin\alpha)\,t - \frac{1}{2}gt^2 \qquad (7.10.3)$$

Putting together x.t/ and y.t/ we get the complete trajectory of the projectile:

$$x(t) = (v_0\cos\alpha)\,t, \qquad y(t) = (v_0\sin\alpha)\,t - \frac{1}{2}gt^2 \qquad (7.10.4)$$

What this equation provides is this: starting with the initial position (which is $(0, 0)$ in this particular example) and initial velocity, it predicts the position of the projectile at any time instant $t$. One question here is: what is the shape of the trajectory? Eliminating $t$ will reveal that. From Eq. (7.10.2), we have $t = x/(v_0\cos\alpha)$, and substituting that into Eq. (7.10.3) we get
$$y = (\tan\alpha)\,x - \frac{1}{2}\frac{g}{v_0^2\cos^2\alpha}\,x^2 \qquad (7.10.5)$$

A parabola! We can do a few more things with this: determining when the object hits the ground,
and how far. The power of Newton’s laws of motions is in the prediction of the motion of planets,
see Section 7.10.9 for detail.
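With Eq. (7.10.4) in hand, the time of flight and the range follow from setting $y(t) = 0$. The Python sketch below uses assumed values $v_0 = 20$ m/s and $\alpha = 30°$ (arbitrary, for illustration) and checks the result against the trajectory.

import numpy as np

g, v0, alpha = 9.81, 20.0, np.radians(30.0)    # assumed launch speed and angle

t_flight = 2 * v0 * np.sin(alpha) / g          # from y(t) = 0 in Eq. (7.10.4)
x_range  = v0 * np.cos(alpha) * t_flight       # horizontal distance travelled

print(t_flight)                                 # ~ 2.04 s
print(x_range)                                  # ~ 35.3 m

# the projectile is indeed back on the ground at t_flight:
y = v0*np.sin(alpha)*t_flight - 0.5*g*t_flight**2
print(abs(y) < 1e-9)                            # True (up to round-off)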


7.10.6 Motion along a curve (Cartesian)


With calculus of functions of one variable we have studied motion along a straight line. An extension along this line is to study motion along a curved path. For example, what is the trajectory of a rocket as time passes? Such a trajectory is defined by a position vector $\mathbf{R}(t)$ given by
$$\mathbf{R}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k} \qquad (7.10.6)$$
The position vector gives the position of the object in motion at any time instant (Fig. 7.38). If the motion is in a plane, we just omit the third term in the above equation. Such a position vector is mathematically called a vector-valued function as we assign to each number $t$ a vector $\mathbf{R}$.


Figure 7.38: Position vector $\mathbf{R}(t)$ and a change in position vector $\Delta\mathbf{R}(t)$. Sometimes the notation $\mathbf{r}(t)$ is also used.

Knowing the function, the first step is to do the differentiation, which gives us the velocity vector $\mathbf{v}(t)$. To this end, we consider two time instants: at $t$ the position vector is $\mathbf{R}(t)$ and at $t + \Delta t$ the position vector is $\mathbf{R}(t+\Delta t)$. Then, the velocity is computed as (one note about the notation is in order: vectors are typeset by italic boldface minuscule characters like $\mathbf{a}$)Ž
$$\mathbf{v}(t) = \lim_{\Delta t\to 0}\frac{\Delta\mathbf{R}}{\Delta t} = \lim_{\Delta t\to 0}\frac{[x(t+\Delta t)-x(t)]\,\mathbf{i} + [y(t+\Delta t)-y(t)]\,\mathbf{j} + [z(t+\Delta t)-z(t)]\,\mathbf{k}}{\Delta t} = \frac{dx}{dt}\mathbf{i} + \frac{dy}{dt}\mathbf{j} + \frac{dz}{dt}\mathbf{k} \qquad (7.10.7)$$
What does this equation tell us? It tells us that differentiating a vector-valued function amounts to differentiating the three component functions (they are ordinary functions of a single variable).
The formula is simple because the unit vectors (i.e., i ,j ,k) are fixed. As we shall see later, this
is not the case with polar coordinates, and the velocity vector has more terms.
The speed (of the object) is then given by kv.t/k, the length of the velocity vector. The
direction of motion is given by the tangent vector T .t/, which is given by v=kvk. The tangent is a
unit vector, as we’re only interested in the direction.
Ž
Implicitly we used the rule of limit: limit of sum is sum of limits.


The acceleration is just the derivative of the velocity:

$$\mathbf{a}(t) = \frac{d\mathbf{v}}{dt} = \frac{d^2\mathbf{R}}{dt^2} = \frac{d^2x}{dt^2}\mathbf{i} + \frac{d^2y}{dt^2}\mathbf{j} + \frac{d^2z}{dt^2}\mathbf{k} \qquad (7.10.8)$$

Now, we generalize the rules of differentiation of ordinary functions to vector functions.


Let’s consider two vector valued functions u.t/ and v.t/ and a scalar function f .t/, we have the
following rules:
d
(a) Œu C v D u0 C v0
dt
d
(b) Œf .t/u D f 0 .t/u C f .t/u0
dt (7.10.9)
d
(c) Œu  v D u0  v C u  v0
dt
d
(d) Œu  v D u0  v C u  v0
dt

These rules can be verified quite straightforwardly. These rules are just some maths exercises,
but amazingly we shall use the rule (d) to prove that the orbit of the earth around the sun is a
plane curve.
And with all of this, we can study a variety of motions. In what follows, we present uniform
circular motion as an example of application of the maths.

Uniform motion along a circle. Uniform circular motion can be described as the motion of an
object in a circle at a constant speed. This might be a guest on a carousel at an amusement park,
a child on a merry-go-round at a playground, a car with a lost driver navigating a round-about
or "rotary", a yo-yo on the end of a string, a satellite in a circular orbit around the Earth, or the
Earth in a (nearly) circular orbit around our Sun.
At all instances, the object is moving tangentially to the circle. Since the direction of the
velocity vector is the same as the direction of the object’s motion, the velocity vector is directed
tangent to the circle as well. As an object moves in a circle, it is constantly changing its direction.
Therefore, it is accelerating (even though the speed is constant).
Let’s denote by ! the angular velocity of the object (the SI unit of angular velocity is radians
per second). Then, we can write its position vector, and differentiating this vector gives us the
velocity vector, which is then differentiated to give us the acceleration vector (assuming that the
radius of the circular path is r):

" # " # " #


r cos !t r! sin !t r! 2 cos !t
R.t / D H) v.t/ D H) a.t/ D (7.10.10)
r sin !t Cr! cos !t r! 2 sin !t

The speed–the length of $\mathbf{v}$–is thus $r\omega$. The acceleration vector $\mathbf{a}(t)$ can also be written as $-\omega^2\mathbf{R}(t)$: that explains the term centripetal acceleration. The word centripetal comes from the Latin words centrum (meaning center) and petere (meaning to seek). Thus, centripetal takes the meaning 'center seeking'. Without this acceleration, the object would move in a straight line, according to Newton's laws of motion. About the magnitude, we have $a = v^2/r$ (for $a = \|\mathbf{a}\| = r\omega^2$ and $v = r\omega$). We plot the position vector, velocity vector and acceleration vector in Fig. 7.39.

Figure 7.39
7.10.7 Motion along a curve (Polar coordinates)
We have described motion along a curve in which space is mathematically represented by a Cartesian coordinate system. Herein, we do the same thing but with polar coordinates. A point in this system is written as $(r, \theta)$, and similar to $\mathbf{i}$ and $\mathbf{j}$–the unit vectors in a Cartesian system–we also have $\hat{\mathbf{r}}$, the unit vector in the radial direction, and $\hat{\boldsymbol{\theta}}$, the unit vector in the angular direction (Fig. 7.40).

Figure 7.40: Unit vectors in a polar coordinate system. The most important observation is that while $\hat{\mathbf{r}}$ and $\hat{\boldsymbol{\theta}}$ are constant in length (because they are both unit vectors), they are not constant in direction. In other words, they are vector-valued functions that change from point to point. Note that $\|\mathbf{r}\| = \sqrt{x^2 + y^2}$.

The unit vector in the radial direction rO is given by


$$\hat{\mathbf{r}} := \frac{\mathbf{r}}{\|\mathbf{r}\|} = \frac{x}{\sqrt{x^2+y^2}}\,\mathbf{i} + \frac{y}{\sqrt{x^2+y^2}}\,\mathbf{j} = \cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j} \qquad (7.10.11)$$
Knowing $\hat{\mathbf{r}}$ allows us to determine the unit vector in the tangential direction $\hat{\boldsymbol{\theta}}$ as the two vectors are perpendicular to each other. Collectively, they are written as
$$\hat{\mathbf{r}} = +\cos\theta\,\mathbf{i} + \sin\theta\,\mathbf{j}, \qquad \hat{\boldsymbol{\theta}} = -\sin\theta\,\mathbf{i} + \cos\theta\,\mathbf{j} \qquad (7.10.12)$$
As both of them are functions of $\theta$ only, their derivatives with respect to $r$ are zero. We need their derivatives w.r.t $\theta$:
$$\frac{d\hat{\mathbf{r}}}{d\theta} = -\sin\theta\,\mathbf{i} + \cos\theta\,\mathbf{j} = \hat{\boldsymbol{\theta}}, \qquad \frac{d\hat{\boldsymbol{\theta}}}{d\theta} = -\cos\theta\,\mathbf{i} - \sin\theta\,\mathbf{j} = -\hat{\mathbf{r}} \qquad (7.10.13)$$

We’re now ready to compute the derivative of these unit vectors w.r.t time (following Newton,
use the notation fP to denote the time derivative of f .t/):
d rO d rO d
D D P O
dt d dt
(7.10.14)
d O d O d
D D P rO
dt d dt
Now, we proceed to determine the velocity and acceleration. First, the velocity is
$$\mathbf{r} = r\,\hat{\mathbf{r}} \Longrightarrow \frac{d\mathbf{r}}{dt} = \dot{r}\,\hat{\mathbf{r}} + r\frac{d\hat{\mathbf{r}}}{dt} = \dot{r}\,\hat{\mathbf{r}} + r\dot{\theta}\,\hat{\boldsymbol{\theta}} \qquad (7.10.15)$$
And therefore, the acceleration is
$$\begin{aligned}
\frac{d^2\mathbf{r}}{dt^2} &= \frac{d}{dt}\left(\dot{r}\,\hat{\mathbf{r}} + r\dot{\theta}\,\hat{\boldsymbol{\theta}}\right)\\
&= \ddot{r}\,\hat{\mathbf{r}} + \dot{r}\frac{d\hat{\mathbf{r}}}{dt} + \dot{r}\dot{\theta}\,\hat{\boldsymbol{\theta}} + r\ddot{\theta}\,\hat{\boldsymbol{\theta}} + r\dot{\theta}\frac{d\hat{\boldsymbol{\theta}}}{dt}\\
&= (\ddot{r} - r\dot{\theta}^2)\,\hat{\mathbf{r}} + (2\dot{r}\dot{\theta} + r\ddot{\theta})\,\hat{\boldsymbol{\theta}}
\end{aligned} \qquad (7.10.16)$$

where use was made of Eq. (7.10.14).


So, Newton’s 2nd law in polar coordinates is written as

$$F_r = m(\ddot{r} - r\dot{\theta}^2), \qquad F_\theta = m(2\dot{r}\dot{\theta} + r\ddot{\theta}) \qquad (7.10.17)$$

Another way to come up with the velocity and acceleration is using $\mathbf{r} = re^{i\theta}$.

Using the complex exponential, we can write
$$\hat{\mathbf{r}} = e^{i\theta}, \qquad \hat{\boldsymbol{\theta}} = ie^{i\theta} \qquad (7.10.18)$$
As multiplying with $i$ is a 90° rotation, it is clear that $\hat{\mathbf{r}}$ is perpendicular to $\hat{\boldsymbol{\theta}}$. Now, we can differentiate $\mathbf{r} = re^{i\theta}$ w.r.t time:
$$\mathbf{r} = re^{i\theta} \Longrightarrow \frac{d\mathbf{r}}{dt} = \dot{r}e^{i\theta} + ire^{i\theta}\dot{\theta} = \dot{r}\,\hat{\mathbf{r}} + r\dot{\theta}\,\hat{\boldsymbol{\theta}}$$
which is exactly what we obtained in Eq. (7.10.15). For the acceleration, doing something similar:
$$\frac{d\mathbf{r}}{dt} = \dot{r}e^{i\theta} + ire^{i\theta}\dot{\theta} \Longrightarrow \frac{d^2\mathbf{r}}{dt^2} = \ddot{r}e^{i\theta} + i\dot{r}e^{i\theta}\dot{\theta} + i\dot{r}e^{i\theta}\dot{\theta} + i^2re^{i\theta}\dot{\theta}^2 + ire^{i\theta}\ddot{\theta}$$
and we got Eq. (7.10.16).
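Eq. (7.10.16) can also be verified symbolically. The Python/sympy sketch below (my own check) differentiates the Cartesian position $(r\cos\theta, r\sin\theta)$ twice and projects the result onto $\hat{\mathbf{r}}$ and $\hat{\boldsymbol{\theta}}$, recovering the radial and tangential acceleration components.

import sympy as sp

t  = sp.symbols('t')
r  = sp.Function('r')(t)
th = sp.Function('theta')(t)

# Cartesian position of a point with polar coordinates (r(t), theta(t))
x, y = r*sp.cos(th), r*sp.sin(th)
acc  = sp.Matrix([sp.diff(x, t, 2), sp.diff(y, t, 2)])

rhat = sp.Matrix([ sp.cos(th), sp.sin(th)])   # radial unit vector
that = sp.Matrix([-sp.sin(th), sp.cos(th)])   # tangential unit vector

print(sp.simplify(acc.dot(rhat)))   # equals  r'' - r*theta'^2
print(sp.simplify(acc.dot(that)))   # equals  2*r'*theta' + r*theta''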


7.10.8 Newton’s gravitation


In 1687 Newton published his work on gravity in his classic Mathematical Principles of Natural
Philosophy. He stated that every object pulls on every other object, and that two objects are pulled together by a force that is proportional to the product of their masses and inversely proportional to the square of the distance between them. In mathematical symbols, his law is expressed as:
$$F = \frac{GMm}{r^2} \qquad (7.10.19)$$
where G is a constant, called the universal gravitational constant, that has been experimentally
measured by CavendishŽ about 100 years after Newton’s death. The value of this constant is
$G = 6.673\times 10^{-11}\ \mathrm{N\,m^2/kg^2}$. In physics, Eq. (7.10.19) is known as an inverse square law. Why
such a name? Because electric forces also obey this kind of law. Again, you see that the same
mathematics apply for different physical phenomena.
How did Newton come up with Eq. (7.10.19)? It can be based on Kepler's third law as shown in
the box below.

Assume that a planet of mass $m$ orbits the sun in a circle of radius $r$ with uniform speed $v$. It is not correct that a planet orbits with a uniform speed. But note that we are just trying to guess what the form of a law looks like. Note also that Newton never knew $G$ in his own equation Eq. (7.10.19)! The period $T$ of the planet, which is the time for it to complete one trip around the sun, is given by $T = 2\pi r/v$ (nothing but time = distance/speed), and we need $T^2$:
$$T = \frac{2\pi r}{v} \Longrightarrow T^2 = \frac{4\pi^2 r^2}{v^2}$$
We then determine $v$ in terms of the force $F$ using Newton's 2nd law and $a = v^2/r$ (check Eq. (7.10.10) and the discussion below it):
$$F = ma, \quad a = \frac{v^2}{r} \Longrightarrow v^2 = ar = \frac{Fr}{m}$$
Thus, the squared period $T^2$ becomes
$$T^2 = \frac{4\pi^2 r^2}{v^2} = \frac{4\pi^2 m r}{F} \qquad (7.10.20)$$
And Kepler's third law says $T^2 \propto r^3$, so
$$\frac{4\pi^2 m r}{F} \propto r^3 \Longrightarrow F \propto \frac{m}{r^2} \qquad (7.10.21)$$
But the planet also pulls the sun of mass $M$ with the same force (Newton's third law), thus $F$ should be proportional to $M$ too. Thus, $F \propto \frac{Mm}{r^2}$. Eventually, $F = \text{constant}\times\frac{Mm}{r^2}$,
and that constant is G–for gravity. Mathematics cannot give you G; for that we need
physicists.

Ž
Henry Cavendish (1731–1810) was an English natural philosopher, scientist, and an important experimental
and theoretical chemist and physicist.


Why is Eq. (7.10.19) called the universal law of gravitation? That is because it is the force described by Eq. (7.10.19) that governs the motion of planets around the sun, the orbit of the moon around the earth and the fall of an apple from a tree to the earth's surface. How did Newton prove this? He used the following facts: (i) the earth can be approximated as a mass concentrated at its center (obtained via his Shell theorem proved in Section 7.8.5) and (ii) the acceleration due to gravity near the earth's surface, $g$, is 9.81 m/s$^2$, probably determined by Galileo.
So, using Newton’s 2nd law and Eq. (7.10.19), we can compute the acceleration of an apple
and the moon as

$$\text{Apple:}\quad a_a = \frac{F}{m_a} = \frac{Gm_E}{r_E^2}, \qquad \text{Moon:}\quad a_m = \frac{Gm_E}{r_M^2}$$

where rE is the radius of the earth (a known quantity), rM is the distance from the earth center
to the moon (also known) and mE is the mass of the earth (we do not really need it as it will be
canceled in the algebraic manipulations); ma is the mass of the apple and note that as the apple
is near to the earth surface the distance from the center of the earth to the apple is simply rE . We
can thus compute am :
$$a_m = \left(\frac{r_E}{r_M}\right)^2 a_a = \left(\frac{6.37\times10^6\ \mathrm{m}}{3.84\times10^8\ \mathrm{m}}\right)^2 \times 9.81 = 2.7\times10^{-3}\ \mathrm{m/s^2} \qquad (7.10.22)$$

The acceleration of the moon can also be computed using another way (Eq. (7.10.20)):

$$a_m = \frac{v^2}{r_M} = \frac{4\pi^2 r_M^2}{T^2 r_M} = \frac{4\pi^2 r_M}{T^2} = \frac{(4\pi^2)\,3.84\times10^8\ \mathrm{m}}{(2.36\times10^6\ \mathrm{s})^2} = 2.72\times10^{-3}\ \mathrm{m/s^2} \qquad (7.10.23)$$
where $T \approx 27$ days is the period of the moon. The amazing agreement of the two values of the
moon acceleration proved the universality of Newton’s law of gravity.
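The two estimates are easy to reproduce; the Python sketch below simply redoes the arithmetic of Eqs. (7.10.22) and (7.10.23) with the rounded data quoted above.

import math

g  = 9.81        # m/s^2, acceleration at the earth's surface
rE = 6.37e6      # m, radius of the earth
rM = 3.84e8      # m, earth-moon distance
T  = 2.36e6      # s, period of the moon (about 27 days)

a_scaled = (rE/rM)**2 * g            # Eq. (7.10.22): inverse-square scaling of g
a_orbit  = 4*math.pi**2*rM / T**2    # Eq. (7.10.23): centripetal acceleration

print(a_scaled)   # ~ 2.70e-3 m/s^2
print(a_orbit)    # ~ 2.72e-3 m/s^2 -- the same, to within the rounded inputs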

All planets orbit the sun on a plane. Using (d) in Eq. (7.10.9), we compute

$$\frac{d}{dt}(\mathbf{r}\times\mathbf{v}) = \mathbf{v}\times\mathbf{v} + \mathbf{r}\times\mathbf{a} = \mathbf{0} + \mathbf{r}\times\mathbf{a} = \mathbf{r}\times\mathbf{a}$$
Now comes one observation: $\mathbf{r}\times\mathbf{a} = \mathbf{0}$ because $\mathbf{r}$ and $\mathbf{a}$ are parallel. Why is that? From Newton's 2nd law (i.e., $\mathbf{F} = m\mathbf{a}$) we deduce that $\mathbf{F}$ and $\mathbf{a}$ are parallel. But Newton's gravitation tells us that $\mathbf{F}$ is parallel to $\mathbf{r}$. To conclude, $\frac{d}{dt}(\mathbf{r}\times\mathbf{v}) = \mathbf{0}$, or $\mathbf{r}\times\mathbf{v}$ is a constant vector $\mathbf{H}$. And $\mathbf{r}$ stays always in the plane perpendicular to $\mathbf{H}$. See Fig. 4.2b for a beautiful picture of this plane.


7.10.9 From Newton’s universal gravitation to Kepler’s laws


Proof of 2nd law. I provide two proofs of Kepler's 2nd law. Recall that Kepler's 2nd law simply means that $dA/dt$ is constant, see Fig. 7.41. Let's put the origin (of the coordinate system) at the sun, which leads to the net torque acting on a planet $P$ being zero. This is because the sun's gravitational pull is a central force and the cross product of two parallel vectors is a zero vector.
If the net torque is zero, then the angular momentum is constant (Section 11.1.5). It can be seen
that dA=dt is proportional to the length of the angular momentum (which is constant), and thus
dA=dt is constant.

[Figure content: $\overrightarrow{PQ} = \mathbf{v}\,dt$, $dA = \frac{1}{2}\|\mathbf{r}\times\mathbf{v}\,dt\|$, $\frac{dA}{dt} = \frac{1}{2}\|\mathbf{r}\times\mathbf{v}\| = \frac{1}{2m}\|\mathbf{r}\times\mathbf{p}\|$.]

Figure 7.41: Proof of Kepler's 2nd law. For illustration purposes, the area $dA$ was magnified. The angular momentum is $\mathbf{l} = \mathbf{r}\times\mathbf{p}$.

As the second proof, we compute the angular momentum. We start with the angular momentum $\mathbf{l} = \mathbf{r}\times\mathbf{p}$, and $\mathbf{p} = m\mathbf{v}$, $\mathbf{v} = \dot{r}\,\hat{\mathbf{r}} + r\dot{\theta}\,\hat{\boldsymbol{\theta}}$ according to Eq. (7.10.15):
$$\begin{aligned}
\mathbf{l} = \mathbf{r}\times\mathbf{p} &= mr(\hat{\mathbf{r}}\times\mathbf{v}), && (\mathbf{r} = r\,\hat{\mathbf{r}})\\
&= mr\left[\hat{\mathbf{r}}\times(\dot{r}\,\hat{\mathbf{r}} + r\dot{\theta}\,\hat{\boldsymbol{\theta}})\right] && \text{(Eq. (7.10.15))}\\
&= mr^2\dot{\theta}\,(\hat{\mathbf{r}}\times\hat{\boldsymbol{\theta}}), && (\hat{\mathbf{r}}\times\hat{\mathbf{r}} = \mathbf{0})
\end{aligned} \qquad (7.10.24)$$
From this, we can determine the length of the angular momentum as $l = mr^2\omega$, where $\omega = \dot{\theta}$, because $\|\hat{\mathbf{r}}\times\hat{\boldsymbol{\theta}}\| = 1$ for two perpendicular unit vectors. From Fig. 7.41 and following the steps in Eq. (7.10.24) but without the mass $m$, we get
$$\frac{dA}{dt} = \frac{1}{2}\|\mathbf{r}\times\mathbf{v}\| = \frac{1}{2m}\|\mathbf{r}\times\mathbf{p}\| = \frac{1}{2}r^2\dot{\theta}\,\|\hat{\mathbf{r}}\times\hat{\boldsymbol{\theta}}\| = \frac{1}{2}r^2\omega = \frac{l}{2m}$$
Since the angular momentum $l$ is conserved, we arrive at the conclusion that $dA/dt$ is constant. This proof shows us that as the planet is orbiting the sun, when it is close to the sun ($r$ is small), it speeds up ($\omega$ is bigger as $l = mr^2\omega$ is constant).

Proof of 1st law. We use Newton's 2nd law in polar coordinates, i.e., Eq. (7.10.17), together with Newton's universal gravity to deduce Kepler's 1st law. The only force is the Sun's gravitational pull written as
$$\mathbf{F} = -\frac{GMm}{r^2}\,\hat{\mathbf{r}} \qquad (7.10.25)$$

Introducing this force into Eq. (7.10.17), we get the following system of two equations:
$$\ddot{r} - r\dot{\theta}^2 = -\frac{GM}{r^2}, \qquad 2\dot{r}\dot{\theta} + r\ddot{\theta} = 0 \qquad (7.10.26)$$
The solution of this system of equations is the orbit of the planet and it should be an equation for an ellipse (but we need to prove this). From the second equation in Eq. (7.10.26), we have
$$\frac{d}{dt}(r^2\dot{\theta}) = 0 \iff r^2\dot{\theta} = h = \text{constant}$$

The trick is to use a new variable $q = 1/r$Ž. In terms of $q$, $r^2\dot{\theta} = h$ becomes $hq^2 = \dot{\theta}$. Let's compute $dr/dt$ first:
$$r = \frac{1}{q} \Longrightarrow \dot{r} = -\frac{\dot{q}}{q^2} = -\frac{1}{q^2}\frac{dq}{d\theta}\frac{d\theta}{dt} = -h\frac{dq}{d\theta}$$
Then, using the above expression of $\dot{r}$ we compute $\ddot{r}$, which is what we want, see Eq. (7.10.26):
$$\frac{d^2r}{dt^2} = \frac{d}{dt}\left(-h\frac{dq}{d\theta}\right) = -h\frac{d}{d\theta}\left(\frac{dq}{d\theta}\right)\frac{d\theta}{dt} = -h\frac{d^2q}{d\theta^2}\,\dot{\theta} = -h^2q^2\frac{d^2q}{d\theta^2} \qquad (7.10.27)$$
where, in the last equality, we used the result that $hq^2 = \dot{\theta}$.
We’re now ready to re-write the first equation of Eq. (7.10.26) in terms of h; q;  :

d 2q 1 d 2q
h2 q 2 C .hq 2 2
/ D GM q 2
H) Cq DC ; .C D GM=h2 / (7.10.28)
d 2 q d 2
The boxed equation is a so-called differential equation (DE). We have more to say about dif-
ferential equations in Chapter 9, but briefly a DE is an equation that contains derivatives of
some function that we’re trying to find e.g. f .x/ C f 0 .x/ D 2. How are we going to solve the
above boxed equation? Solving DEs is not easy, but in this case it turns out that the solution is
something we know. What is the boxed equation saying to us? It tells us to find a function (i.e.,
q) such that its second derivative equals minus itself (the constant C is not important). We know
that $\cos\theta$ is such a function. So, the solution to this equation is $q = C + D\cos\theta$. Now, forget $q$–it's just a means to an end–we need $r$ which is
$$r = \frac{1}{C + D\cos\theta}$$
But this is the equation of a conic section (Section 4.13.2). We need astronomical data to
determine C and D and from that to deduce that this is indeed the equation of an ellipse.
At this moment, you might be thinking ’but the orbit of planets around the Sun was known
to be an ellipse thanks to Kepler'. It is indeed easier to work on a problem whose solution we know beforehand. But Newton's universal gravity theory is more powerful than that. It can predict things that we never knew of.
Ž
Don’t ask me why this new variable. I have no idea.


7.10.10 Discovery of Neptune


To understand why Newton’s universal gravity law is considered one of the best of human kind,
let’s see how it helped to predict the planet Neptune before Neptune was directly observed.
By 1847, the planet Uranus had completed nearly one full orbit since its discovery by William
Herschel in 1781, and astronomers had detected a series of irregularities in its path that could not
be entirely explained by Newton’s universal gravitation theory. So, either was Newton wrong
or there was a mysterious planet that people at that time did not know of. In 1845, the French
astronomer Urbain Le Verrier (1811–1877) and the British mathematician and astronomer John
Couch Adams (1819–1892)– an undergraduate at Cambridge University, who both believed in
Newton, separately began calculations to determine the nature and position of this unknown
planet. On September 23, 1846, Johann Galle head of the Berlin Observatory used Le Verrier’s
calculations to find Neptune only 1ı off Le Verrier’s predicted position. The planet was then
located 12ı off Adams’ prediction. In Francois Arago’s apt phrase, Le Verrier had discovered a
planet "with the point of his pen".
Although Newton’s theory of gravity was a great achievement Newton himself could
not explain the mechanism of his own theory. As the sun and the earth are separated
by a large distance and there is nothing in between, how the force of gravity commu-
Remark
nicates is the big issue. And this is only resolved hundreds years later when Albert
Einstein came up with his theory of general relativity. We shall touch upon on this in
Chapter 8 when we talk about tensors.

7.10.11 Newton and the Great Plague of 1665–1666


Between the summer of 1665 and the spring of 1667, Isaac Newton at the age of 22 made two
extended visits to his family farm in Woolsthorpe to escape the plague affecting Cambridge. The
bubonic ’Great Plague’ of 1665–1666 was the worst outbreak of plague in England since the
black death of 1348. It spread rapidly throughout the country. London lost roughly 15% of its
population, and the villagers of Eyam, Derbyshire, became famous for their heroic quarantine
to halt the spread of the disease.
Many town-dwellers, like Newton, retreated to the relative safety of the countryside. What is
different is how he set his mind to work in this period. There he remained secluded for eighteen
months, during which time he not only discovered the universal law of gravity but changed the
face of science.
About these years of wonder, in a letter to Pierre Des Maizeaux, written in 1718, Newton
wrote these words

In the beginning of the year 1665 I found the method of approximating series and
the rule for reducing any dignity [power] of any binomial into such a series. The
same year in May I found the method of tangents of Gregory and Slusius, and in
November had the direct method of fluxions and the next year [1666] in January had
the theory of colours and in May following I had entrance into the inverse method
of fluxions. And the same year I began to think of gravity extending to the orb of the


moon ... All this was in the two plague years of 1665 and 1666, for in those days I
was in the prime of my age for invention and minded Mathematics and Philosophy
more than at any time since.

7.11 Vector calculus


Vector calculus is the calculus of vector fields or vector-valued functions. It was developed for electromagnetism and thus it is best studied together with electromagnetism. This tradition started with Richard Feynman in his lectures on physics and with the book Div, Grad, Curl, and All That: An Informal Text on Vector Calculus by Schey [60]. We also follow this approach mainly because we wanted to learn electromagnetism–a branch of physics that underlies everything in our modern world.
All of electromagnetism is contained in the Maxwell equations:

$$\begin{aligned}
\nabla\cdot\mathbf{E} &= \frac{\rho}{\epsilon_0} && (7.11.1\text{a})\\
\nabla\times\mathbf{E} &= -\frac{\partial\mathbf{B}}{\partial t} && (7.11.1\text{b})\\
c^2\,\nabla\times\mathbf{B} &= \frac{\partial\mathbf{E}}{\partial t} + \frac{\mathbf{j}}{\epsilon_0} && (7.11.1\text{c})\\
\nabla\cdot\mathbf{B} &= 0 && (7.11.1\text{d})
\end{aligned}$$
where $\nabla$ is the gradient vector operator, $\nabla\cdot\mathbf{E}$ is the divergence of the electric field $\mathbf{E}$, $\nabla\times\mathbf{E}$ is the curl of $\mathbf{E}$; $\mathbf{B}$ is the magnetic field.
When the electric and magnetic fields do not depend on the time, i.e., the charges are permanently fixed in space or, if they do move, they move as a steady flow, all of the terms in Eq. (7.11.1) which are time derivatives of the fields are zero. And we get two sets of equations. One for electrostatics:
$$\nabla\cdot\mathbf{E} = \frac{\rho}{\epsilon_0} \quad (7.11.2\text{a}), \qquad \nabla\times\mathbf{E} = \mathbf{0} \quad (7.11.2\text{b})$$
and one for magnetostatics:
$$\nabla\times\mathbf{B} = \frac{\mathbf{j}}{c^2\epsilon_0} \quad (7.11.3\text{a}), \qquad \nabla\cdot\mathbf{B} = 0 \quad (7.11.3\text{b})$$
Looking at these two sets of equations, we can see that electrostatics is a neat example of a
vector field with zero curl and a given divergence. And magnetostatics is a neat example of a
vector field with zero divergence and a given curl.
To summarize, the central object of vector calculus is vector fields C . And to this object, we
will of course do differentiation and integration, which leads to differential calculus of vector
fields and integral calculus of vector fields, and connections between them:


 differential calculus: we have the divergence $\nabla\cdot\mathbf{C} = \dfrac{\partial C_x}{\partial x} + \dfrac{\partial C_y}{\partial y} + \dfrac{\partial C_z}{\partial z}$, and we have the curl $\nabla\times\mathbf{C}$;

 integral calculus: we have the line integral $\int_C \mathbf{C}\cdot d\mathbf{s}$ and the surface integral $\int_S \mathbf{C}\cdot\mathbf{n}\,dA$;

 the fundamental theorem of calculus that links line integrals to surface integrals and volume integrals: we have Green's theorem, Stokes' theorem and Gauss' theorem. They are all generalizations of $\int_a^b \frac{df}{dx}\,dx = f(b) - f(a)$.

7.11.1 Vector fields


In vector calculus and physics, a vector field is an assignment of a vector to each point in space.
A vector field in the plane (for instance), can be visualized as a collection of arrows with a given
magnitude and direction, each attached to a point in the plane. Vector fields are often used to
model, for example, the speed and direction of a moving fluid throughout space, or the strength
and direction of some force, such as the magnetic or gravitational force, as it changes from one
point to another point.
Generally a 3D vector field F can be described as:

$$\mathbf{F} = M(x,y,z,t)\,\mathbf{i} + N(x,y,z,t)\,\mathbf{j} + P(x,y,z,t)\,\mathbf{k} \qquad (7.11.4)$$

So, a 3D vector field is similar to three ordinary functions. If the field does not depend on time $t$, we have a static field, and in the above equation $t$ is omitted. And for a plane vector field we have $\mathbf{F} = M(x,y,t)\,\mathbf{i} + N(x,y,t)\,\mathbf{j}$. Fig. 7.42 gives some plane vector fields which you can think of as the velocity field of a fluid.


(a) $x\mathbf{i} + y\mathbf{j}$ (b) $x\mathbf{i} - y\mathbf{j}$ (c) $-y\mathbf{i} + x\mathbf{j}$

Figure 7.42: Some vector fields.
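Plots like Fig. 7.42 are a few lines of matplotlib. The Python sketch below draws the three fields of the figure on a coarse grid with quiver.

import numpy as np
import matplotlib.pyplot as plt

x, y = np.meshgrid(np.linspace(-2, 2, 15), np.linspace(-2, 2, 15))
fields = [(x, y), (x, -y), (-y, x)]           # the three fields of Fig. 7.42
titles = ['xi + yj', 'xi - yj', '-yi + xj']

fig, axes = plt.subplots(1, 3, figsize=(12, 4))
for ax, (u, v), title in zip(axes, fields, titles):
    ax.quiver(x, y, u, v)     # one arrow per grid point
    ax.set_title(title)
    ax.set_aspect('equal')
plt.show()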

7.11.2 Central forces and fields


Probably the two most well known vector fields are gravitational force and electric force. They
are both central forces. The gravitational force was discovered by Newton and the electric force


by Charles CoulombŽŽ . In mathematical symbols, they are written as


$$\text{Gravitational force:}\quad \mathbf{F} = -G\frac{Mm}{r^2}\,\hat{\mathbf{r}}, \qquad \text{Electric force:}\quad \mathbf{F} = \frac{1}{4\pi\epsilon_0}\frac{qq_0}{r^2}\,\hat{\mathbf{r}}$$

Remarkably these two very different forces have the same mathematical format: they are inversely proportional to the square of the distance $r$ between two masses $M$, $m$ or two charges $q$ and $q_0$, and they are proportional to the product of the two masses or charges. They are known as inverse square
laws. As these forces are along the line connecting the two masses (or charges), they are called
central forces.

Figure 7.43: Gravitational force between two masses M and m and electric force between two charges
q0 and q.

$$\mathbf{E} = \frac{1}{4\pi\epsilon_0}\,\frac{q}{r^2}\,\hat{\mathbf{u}} \quad (7.11.5)$$

Figure 7.44

Fig. 7.45
ŽŽ
Charles-Augustin de Coulomb (1736–1806) was a French officer, engineer, and physicist. He is best known
as the eponymous discoverer of what is now called Coulomb’s law, the description of the electrostatic force of
attraction and repulsion. He also did important work on friction. The SI unit of electric charge, the coulomb, was
named in his honor in 1880.


Figure 7.45: Michael Faraday’s lines of force.

7.11.3 Work done by a force and line integrals


To introduce the concept of line integral, let us go back to the well-known conservation of energy
principle, which states, for 1D motion along a vertical line (e.g. free falling motion), that the sum
of the kinetic energy and potential energy is constant
$$\underbrace{0.5\,mv^2}_{\text{K.E.}} + \underbrace{mgh}_{\text{P.E.}} = \text{const} \quad (7.11.6)$$

And we want to verify whether this principle is correct. We use Newton's second law
$F = ma = m\,dv/dt$, but focus on energy aspects. Let's calculate the change of the kinetic
energy $T$:
$$\frac{dT}{dt} = \frac{d}{dt}\left(\frac{1}{2}mv^2\right) = mv\frac{dv}{dt} = \left(m\frac{dv}{dt}\right)v = Fv \quad (7.11.7)$$
Since $F = -mg$ and $v = dh/dt$, we get (assuming the mass is constant)
$$\frac{dT}{dt} = Fv = -mg\frac{dh}{dt} = -\frac{d}{dt}(mgh)$$
Hence, the change in the kinetic energy turns into potential energy, and thus Eq. (7.11.6) is
indeed correct.
So, from Newton’s law we have discovered an interesting fact about energy conservation.
But it was only for the simple problem of free fall. Will this energy principle work for other
cases? Let’s check! In 3D, the kinetic energy T for a particle of mass m traveling along a 3D
curve is given by
$$T = \frac{1}{2}\left(mv_x^2 + mv_y^2 + mv_z^2\right)$$
Thus, its rate of change is
$$\frac{dT}{dt} = mv_x\frac{dv_x}{dt} + mv_y\frac{dv_y}{dt} + mv_z\frac{dv_z}{dt} = \mathbf{F}\cdot\mathbf{v} \quad (7.11.8)$$
The term $\mathbf{F}\cdot\mathbf{v}$ is called the power. Replacing $\mathbf{v}$ by $d\mathbf{s}/dt$, where $d\mathbf{s} = (dx, dy, dz)^\top$, we then have
$$\frac{dT}{dt} = \mathbf{F}\cdot\mathbf{v} = \mathbf{F}\cdot\frac{d\mathbf{s}}{dt} \quad (7.11.9)$$

Even though the trajectory is a 3D curve, the only non-zero force component is $F_z = -mg$, thus
we have
$$\frac{dT}{dt} = (-mg)\frac{dz}{dt} = -\frac{d}{dt}(mgz)$$
And again, energy conservation works.
We have a tiny change of $T$ w.r.t. a tiny change in time, Eq. (7.11.9). Integral calculus gives
us the total change when the particle traverses the entire path, denoted by $C$. From Eq. (7.11.9)
we obtain $dT = \mathbf{F}\cdot d\mathbf{s}$, and integrating this gives us the total change of the kinetic energy
$$\Delta T = \int_C \mathbf{F}\cdot d\mathbf{s} \quad (7.11.10)$$

This integral (a significant integral) is named a line integral of a vector field. In mechanics, this
integral is called the work done by a force. And Eq. (7.11.10) is known as the work-kinetic
energy theorem: the change in a particle’s KE as it moves from point 1 to point 2 (the end points
of C ) is the work done by the force.
Let’s say a few words about the unit of work. As work is defined as force multiplied with
distance, its SI unit is Newton  meter, which is one JouleŽ .
Don’t let the name line integral fool you, the integral path C is actually a curve. As F  d s
Rb
is a number the line integral is simply an extension of a f .x/dx. Instead of moving on the
horizontal x line from .a; 0/ to .b; 0/, now we traverse a spatial curve C . Obviously when this
curve happens to be the horizontal line, the line integral is reduced to the ordinary integral. So,
actually nothing is too new here.
For the evaluation of a line integral it is convenient to use a parametric representation of the
curve $C$. That is, $C: (x(t), y(t))$ for $a \le t \le b$. Then, Eq. (7.11.10) becomes, for a 2D vector
field $\mathbf{F} = M(x,y)\,\mathbf{i} + N(x,y)\,\mathbf{j}$:
$$\int_C \mathbf{F}\cdot d\mathbf{s} = \int_a^b \begin{bmatrix} M(x(t),y(t)) \\ N(x(t),y(t)) \end{bmatrix}\cdot\begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} dt \quad (7.11.11)$$
The final integral is simply an integral of the form $\int_a^b f(t)\,dt$, which can be evaluated using
standard techniques of calculus. In what follows, we present a few examples.

Example 7.3
Let’s consider this vector field F D yi C xj (see Fig. 7.42c), and the path is the full unit
circle centered at .2; 0/, and it is traversed counter-clockwise. First, we parametrize C , then
just apply Eq. (7.11.11):
) (
x D 2 C cos t dx D sin tdt
H) ; F d s D . sin t/. sin t/dt C.2Ccos t/.cos t/dt
y D sin t dy D C cos tdt
Ž
One joule is equal to the amount of work done when a force of 1 newton displaces a mass through a distance
of 1 metre in the direction of the force applied. It is named after the English physicist James Prescott Joule (1818–
1889).

Phu Nguyen, Monash University © Draft version


Chapter 7. Multivariable calculus 703

H
So, (the symbol to designate that the curve is closed)
I Z 2
F  ds D .1 C 2 cos t/dt D 2
0

The result is positive which is expected because the force and the path are both counter-
clockwise.
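For readers who like to check such results numerically, here is a minimal sketch (my own illustrative code, not part of the text) that evaluates the line integral of Example 7.3 by discretizing the parametrization; the field and path are exactly those of the example.

```python
# Numerical check of Example 7.3: F = (-y, x) around the unit circle centered at (2, 0).
import numpy as np

t = np.linspace(0.0, 2.0 * np.pi, 100001)
x, y = 2.0 + np.cos(t), np.sin(t)
dxdt, dydt = -np.sin(t), np.cos(t)

integrand = (-y) * dxdt + x * dydt        # F . ds/dt
print(np.trapz(integrand, t))             # ~ 6.2832 = 2*pi, as found analytically
```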

Example 7.4
Let’s consider this vector field F D 2xi C 2yj (see Fig. 7.42a), and the path is the full
unit circle centered at .2; 0/. Note that the vector field F is the gradient of this scalar field
D x2 C y 2.
We have
) (
x D 2 C cos t dx D sin tdt
H) ; F d s D .4C2 cos t/. sin t/dt C.2 sin t/.cos t/dt
y D sin t dy D C cos tdt

Thus, I Z 2
F  ds D 4 sin tdt D 4 cos t 2
0 D 0 (7.11.12)
0
So, the line integral of a gradient field along a closed curve is zero! Let’s see would we also
get zero if the path is not a closed curve. Assume the path is just the first quarter of the circle,
and the line integral is
Z Z =2
F  ds D 4 sin tdt D 4
0
which is not zero.

Now, we suspect that there is something special about the line integral of a gradient vector.
But a line integral is a generalization of $\int_a^b f(x)\,dx$, which satisfies the fundamental theorem of
calculus:
$$\int_a^b \frac{dF}{dx}\,dx = F(b) - F(a)$$
So, the equivalent counterpart for line integrals should look like this:
$$\int_{1,\ \text{along } C}^{2} \nabla\psi\cdot d\mathbf{s} = \psi(2) - \psi(1)$$
And it turns out that our guess is correct. Suppose that we have a scalar field $\psi(x,y)$ and two
points 1 and 2. We denote by $\psi(1)$ the field at point 1 and by $\psi(2)$ the field at point 2. A curve
$C$ joins these two points (Fig. 7.46). We have the following theorem:


Theorem 7.11.1: Fundamental Theorem For Line Integrals

$$\int_{1,\ \text{along } C}^{2} \nabla\psi\cdot d\mathbf{s} = \psi(2) - \psi(1) \quad (7.11.13)$$

which states that the line integral along the curve $C$ of the dot product of a gradient $\nabla\psi$ (a
vector field) with $d\mathbf{s}$ (another vector, the infinitesimal line segment) equals the
difference of $\psi$ evaluated at the two end points of the curve $C$.

It is because of this theorem that the integral in Eq. (7.11.12) is zero, as the two end points
are the same. It is also because of this theorem that the line integral of a gradient vector is path-
independent. That is, no matter how we go from point 1 to point 2, the integral is the same.

$$\int_1^2 \nabla\psi\cdot d\mathbf{s} = \lim_{n\to\infty}\sum_{i=1}^{n}(\nabla\psi\cdot\Delta\mathbf{s})_i$$

Figure 7.46: Line integral as a Riemann sum. The curve $C$ joining points 1 and 2 is divided into small
segments $\Delta\mathbf{s}_i$ passing through the intermediate points $a, b, c, \ldots, e$.

Proof. [Proof of theorem 7.11.1]. We use the definition of an integral as a Riemann sum to prove
the above theorem. To this end, we divide the curve $C$ into many segments (Fig. 7.46). Then,
we can write the integral as
$$\int_1^2 \nabla\psi\cdot d\mathbf{s} = \lim_{n\to\infty}\sum_{i=1}^{n}(\nabla\psi\cdot\Delta\mathbf{s})_i$$
Now, what is $\nabla\psi\cdot\Delta\mathbf{s}$? It is the change of $\psi$ along $\Delta\mathbf{s}$. Remember the directional derivative?
Here the direction is along the curve. So, we can compute this term for all $n$ segments:
$$(\nabla\psi\cdot\Delta\mathbf{s})_1 = \psi(a) - \psi(1),\qquad (\nabla\psi\cdot\Delta\mathbf{s})_2 = \psi(b) - \psi(a),\qquad \ldots,\qquad (\nabla\psi\cdot\Delta\mathbf{s})_n = \psi(2) - \psi(e)$$
If we sum up all the finite differences we get $\psi(2) - \psi(1)$.


7.11.4 Work of gravitational and electric forces


We have seen previously that the work done by gravity near the earth's surface in moving a mass
from point 1 to point 2 depends only on the vertical distance between these two end points.
In other words, the work, which is a line integral, is independent of the path. No matter which
path the object takes, the work is the same! What an interesting thing. Now we are going
to examine whether this holds for harder cases.

$$\mathbf{F} = -\frac{GMm}{r^2}\,\hat{\mathbf{r}},\qquad \hat{\mathbf{r}} = \frac{1}{r}(x,y,z)^\top,\qquad r^2 = x^2 + y^2 + z^2,\qquad \mathbf{F}\cdot d\mathbf{s} = -\frac{GMm}{r^2}\,\frac{1}{r}(x\,dx + y\,dy + z\,dz)$$

Figure 7.47: Work of the gravitational force in moving a mass $m$ at $(x,y,z)$ from point 1 to point 2 along
a curved path $C$. Note that $d\mathbf{s} = (dx, dy, dz)^\top$. There is a minus sign in the force because $\mathbf{F}$ and $\hat{\mathbf{r}}$ are
in opposite directions.

We compute the work done by gravity in moving a mass $m$ from point 1 to point 2 along
a curved path $C$ as shown in Fig. 7.47. The origin of the coordinate system is placed at the earth, of
mass $M$. Referring to Fig. 7.47 for the computation of $\mathbf{F}\cdot d\mathbf{s}$, we have
$$\mathbf{F}\cdot d\mathbf{s} = -\frac{GMm}{r^2}\,\frac{1}{r}(x\,dx + y\,dy + z\,dz)$$
But, as $r^2 = x^2 + y^2 + z^2$, we have $r\,dr = x\,dx + y\,dy + z\,dz$, thus
$$\mathbf{F}\cdot d\mathbf{s} = -\frac{GMm}{r^2}\,dr$$

So the work is written as
$$W = \int_C \mathbf{F}\cdot d\mathbf{s} = -GMm\int \frac{dr}{r^2} = GMm\left(\frac{1}{r_2} - \frac{1}{r_1}\right) \quad (7.11.14)$$
And this work is also independent of the path! And if $C$ is a closed path, $W$ is zero.
We know that the work done is equal to the change in the kinetic energy (that is, $W = \Delta T$).
And Eq. (7.11.14) shows that the work done is also a change of something: the RHS of that equation
is the difference of two terms, which indicates a change of something that we label as $U$. Our
aim is now to find the expression for $U$. We have $W = \Delta T$ and $W = -\Delta U$, so
$$W = +\Delta T \;\text{ and }\; W = -\Delta U \;\Longrightarrow\; \Delta(T + U) = 0 \quad \text{(energy is conserved)} \quad (7.11.15)$$


From Eqs. (7.11.14) and (7.11.15) we can determine the expression for $U$:
$$GMm\left(\frac{1}{r_2} - \frac{1}{r_1}\right) = -\Delta U \;\Longrightarrow\; U(r) = -\frac{GMm}{r} \quad (7.11.16)$$
And $U(r)$ is called the potential of the gravitational force.
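A small numerical experiment makes the path independence tangible. The sketch below is my own illustration (the two paths and the normalization $GMm = 1$ are arbitrary choices, not from the text): it computes the work of the inverse-square force along a straight path and a wiggly path between the same end points and compares both with Eq. (7.11.14).

```python
# Path independence of the work of F = -(GMm/r^2) r_hat, with GMm set to 1.
import numpy as np

def work(path, n=200001):
    """Approximate W = int_C F . ds along a parametrized path r(t), t in [0, 1]."""
    t = np.linspace(0.0, 1.0, n)
    pts = path(t)                        # shape (3, n)
    r = np.linalg.norm(pts, axis=0)
    F = -pts / r**3                      # -(1/r^2) * (pts / r)
    ds = np.gradient(pts, t, axis=1)     # d(position)/dt
    return np.trapz(np.sum(F * ds, axis=0), t)

p1, p2 = np.array([2.0, 0.0, 0.0]), np.array([0.0, 3.0, 1.0])
straight = lambda t: p1[:, None] * (1 - t) + p2[:, None] * t
wiggly   = lambda t: straight(t) + np.array([0.0, 0.0, 1.0])[:, None] * np.sin(np.pi * t)

expected = 1.0 / np.linalg.norm(p2) - 1.0 / np.linalg.norm(p1)   # GMm (1/r2 - 1/r1)
print(work(straight), work(wiggly), expected)                    # all three agree closely
```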

7.11.5 Fluxes and Divergence


To introduce the concept of flux, let us consider the problem of heat
conduction, first solved by Joseph Fourier in 1822. Assume we have a
thin slab made of a certain material. One face of the slab is heated to
a temperature $T_1$ and the other face is held at $T_2 < T_1$. Experiments
demonstrate that there is heat flow through the slab. The amount of
thermal energy per unit time that flows through the slab, denoted by $Q$, is
given by (in the SI system the unit of $Q$ is W, or J/s)
$$Q = k\,\frac{(T_1 - T_2)A}{d} \quad \text{(W = J/s)} \quad (7.11.17)$$
where $k$ is the thermal conductivity of the material (SI unit W/(m K)). This equation was
obtained from experimental observations: the rate of heat conduction through a slab is
proportional to the temperature difference across the slab ($T_1 - T_2$) and the heat transfer area
($A$), but it is inversely proportional to the thickness of the slab $d$.
Now if we shrink the slab thickness $d$ to zero, so that we have the derivative of the temperature,
and divide the above equation by $A$ (and thus get rid of it), we get the following differential
form of the one dimensional Fourier law for heat conduction:
$$q = -k\,\frac{dT}{dx} \quad \text{(W/m}^2\text{)} \quad (7.11.18)$$

where q is the heat flux density. The word flux comes from Latin: fluxus means "flow", and
fluere is "to flow".
Now we move to heat conduction in a three dimensional body of complicated geometry. The
generalization of Eq. (7.11.18) is
$$\mathbf{q} = -k\,\nabla T \quad (7.11.19)$$
where $\nabla T$ denotes the gradient of the temperature field.

Ž
Named after James Watt (1736–1819), a Scottish inventor, mechanical engineer, and chemist.
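As a quick illustration of Eq. (7.11.19), the following short sketch (with an arbitrarily chosen temperature field, my own example) evaluates $\mathbf{q} = -k\nabla T$ symbolically.

```python
# q = -k grad(T) for an assumed temperature field T(x, y, z).
import sympy as sp

x, y, z, k = sp.symbols('x y z k', real=True, positive=True)
T = 100 - 5*x**2 - 3*y**2 - z**2                 # arbitrary temperature field
q = [-k * sp.diff(T, var) for var in (x, y, z)]
print(q)   # [10*k*x, 6*k*y, 2*k*z]: heat flows from hot to cold regions
```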


Figure 7.48: An open or closed surface with an infinitesimal surface area $dA$ with $\mathbf{n}$ being its unit normal
vector pointing outward. By unit normal I mean that $\|\mathbf{n}\| = 1$.

The question now is: what is the thermal energy crossing
a surface? The surface can be open or closed (Fig. 7.48). To
compute that amount of energy, we divide the surface into many
many small parts of area $dA$, compute the energy crossing each
part and sum all those energies. Now comes the key observation:
if the surface element is not perpendicular to $\mathbf{q}$, the amount of
thermal energy crossing it is smaller, as the tangential component of the flow does not contribute
to the flow across the surface. The amount of thermal energy passing through $dA$ per unit time
is then given by $\mathbf{q}\cdot\mathbf{n}\,dA$. The proof goes like this. Assume that the heat flux density $\mathbf{q}$ is
perpendicular to a surface of area $A_1$. Thus, the amount of thermal energy crossing this surface
per unit time is $Q = qA_1$. This same amount of energy passes through a tilted surface of area
$A_2 = A_1/\cos\theta$. Thus, the heat flux through $A_2$ is
$$\frac{Q}{A_1/\cos\theta} = \frac{qA_1}{A_1}\cos\theta = q\cos\theta = \mathbf{q}\cdot\mathbf{n}$$
In the last step, Eq. (11.1.6) relating the dot product and the cosine of the angle was used, noting
that $\|\mathbf{n}\| = 1$.
And the total flux of heat through a surface $S$ is the sum of all the fluxes through the small
surface elements $dA$:
$$\text{heat flux} = \int_S \mathbf{q}\cdot\mathbf{n}\,dA$$
Now we generalize this concept of heat flux to any vector field $\mathbf{C}$:
$$\text{flux} = \int_S \mathbf{C}\cdot\mathbf{n}\,dA \quad (7.11.20)$$
So in vector calculus a flux is a surface integral of the normal component of a vector.


Imagine that we have a volume V with surface S (Fig. 7.49). Now we cut that volume into
two volumes V1 and V2 by a plane Sab . The first volume is enclosed by surface S1 which consists


of a part of the original surface Sa and Sab . The second volume is bounded by surface S2 which
consists of the other part of the original surface Sb and Sab . If we compute the flux of a vector
field C through the surface S1 and the flux through S2 , we get:
$$\text{flux through } S_1: \int_{S_a}\mathbf{C}\cdot\mathbf{n}\,dA + \int_{S_{ab}}\mathbf{C}\cdot\mathbf{n}_1\,dA \qquad\qquad \text{flux through } S_2: \int_{S_b}\mathbf{C}\cdot\mathbf{n}\,dA + \int_{S_{ab}}\mathbf{C}\cdot\mathbf{n}_2\,dA$$
Noting that $\mathbf{n}_2 = -\mathbf{n}_1$, when we sum these two fluxes the two $S_{ab}$ terms cancel out, and we obtain
this interesting fact about flux: the flux through the complete outer surface S can be considered
as the sum of the fluxes out of the two pieces into which the volume was broken. And nothing
can stop us from dividing V1 into two little pieces and regardless of how we divide the original
volume we always get that the flux through the original outer surface S is equal to the sum of
the fluxes out of all the little interior pieces.

Figure 7.49: The flux through the complete outer surface $S$ of a volume $V$ can be considered as the sum
of the fluxes out of the two pieces $(V_1, S_1)$ and $(V_2, S_2)$ into which the volume was broken by the cut $S_{ab}$.

We continue that division process until we get an infinitesimal little piece. And that is a
very small cube. Now, we’re going to compute the flux of a vector field C through the faces
of an infinitesimal cube. And of course we choose a special cube, one that is aligned with the
coordinate axes (Fig. 7.50).

Figure 7.50: Flux of a vector field C through the faces of an infinitesimal cube.
The fluxes through faces 1 and 2, defined by $\int \mathbf{C}\cdot\mathbf{n}\,dA$, are given by (note that the normals of
these faces are parallel to the $x$ direction, so the other components of $\mathbf{C}$ are irrelevant)
$$\text{flux through face 1} = -C_x(1)\,\Delta y\,\Delta z,\qquad \text{flux through face 2} = +C_x(2)\,\Delta y\,\Delta z$$

And as the cube is tiny, the field is constant over these faces. So, for face 1, the field is Cx .1/
where 1 is any point on this face.
Along the $x$ direction, the field is changing, so we have
$$C_x(2) = C_x(1) + \frac{\partial C_x}{\partial x}\Delta x \quad (7.11.21)$$
which is correct as $\Delta x$ is small. Thus, we can compute the flux through faces 1/2, and similarly
for faces 3/4 and 5/6. They are given by
$$\text{flux through faces 1/2} = \frac{\partial C_x}{\partial x}\Delta x\Delta y\Delta z,\qquad \text{flux through faces 3/4} = \frac{\partial C_y}{\partial y}\Delta x\Delta y\Delta z,\qquad \text{flux through faces 5/6} = \frac{\partial C_z}{\partial z}\Delta x\Delta y\Delta z$$
which gives us the total flux through all six faces of the small cube with surface $S$:
$$\int_S \mathbf{C}\cdot\mathbf{n}\,dA = \left(\frac{\partial C_x}{\partial x} + \frac{\partial C_y}{\partial y} + \frac{\partial C_z}{\partial z}\right)\Delta V \quad (7.11.22)$$

where $\Delta V = \Delta x\Delta y\Delta z$ is the volume of the cube. The term in brackets is given a special name: the
divergence of $\mathbf{C}$. Thus, the divergence of a 3D vector is defined as
$$\nabla\cdot\mathbf{C} = \operatorname{div}\mathbf{C} = \frac{\partial C_x}{\partial x} + \frac{\partial C_y}{\partial y} + \frac{\partial C_z}{\partial z} \quad (7.11.23)$$

What does Eq. (7.11.22) mean? It tells us that, for an infinitesimal cube, the outward flux of the
cube is equal to the divergence of the vector multiplied with the volume of the cube. To better
understand the meaning of this new divergence concept, we consider three vector fields and
compute the corresponding divergences (Fig. 7.51). Think of these vector fields as the velocities
of some moving fluid. Now put a sphere at the origin and the fluid can go in and out of this
sphere. In Fig. 7.51a, r  C > 0 indicates that, due to Eq. (7.11.22), the fluid is moving out of the
sphere. On the contrary, in Fig. 7.51b, the fluid is entering the sphere, thus r  C < 0. Finally, the
fluid in Fig. 7.51c is just swirling around: there is no fluid moving out of the sphere–r  C D 0.
If the divergence cannot describe a rotating fluid, then we need another concept. And indeed, the
curl of the fluid velocity field does just that (Section 7.11.7).


Figure 7.51: Some 2D vector fields and their divergences: (a) $x\mathbf{i} + y\mathbf{j}$ with $\nabla\cdot\mathbf{C} = 2 > 0$, (b) $-x\mathbf{i} - y\mathbf{j}$ with
$\nabla\cdot\mathbf{C} = -2 < 0$ and (c) $-y\mathbf{i} + x\mathbf{j}$ with $\nabla\cdot\mathbf{C} = 0$. You're recommended to watch this amazing animation for a
better understanding of the meaning of the divergence and curl.
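The divergences quoted in the caption of Fig. 7.51 are easy to verify with symbolic differentiation; the short sketch below (illustrative only) does exactly that for the three fields.

```python
# Divergence of the three plane fields of Fig. 7.51.
import sympy as sp

x, y = sp.symbols('x y', real=True)
fields = {'(x, y)': (x, y), '(-x, -y)': (-x, -y), '(-y, x)': (-y, x)}
for name, (Cx, Cy) in fields.items():
    print(name, '-> div =', sp.diff(Cx, x) + sp.diff(Cy, y))   # 2, -2, 0 respectively
```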

7.11.6 Gauss’s theorem


Gauss’ theorem is a relation between a volume integral (a triple integral) and a surface integral.
Assume that we have a solid with volume V and it is enclosed by a surface S . We have a vector
field C inside the volume. If we divide this solid into many many infinitesimal cubes, then for
each cube, from Eq. (7.11.22), we have

$$\int_{\text{cube faces}} \mathbf{C}\cdot\mathbf{n}\,dA = \nabla\cdot\mathbf{C}\,\Delta V \quad (7.11.24)$$

And if we sum up all these tiny cubes, the right hand side of Eq. (7.11.24) is the volume integral
of the divergence of C . How about the left hand side? It is the flux of C through the solid surface
$S$; see the discussion related to Fig. 7.49. And that is Gauss's theorem, or Gauss's divergence
theorem:
$$\text{Gauss's divergence theorem:}\qquad \int_S \mathbf{C}\cdot\mathbf{n}\,dA = \int_V \nabla\cdot\mathbf{C}\,dV \quad (7.11.25)$$

In Section 9.5.2 I provide one application of Gauss’ divergence theorem to derive the three
dimensional heat conduction equation.

This proof is however not mathematically rigorous. It is certainly true that any domain can be cut up into


cubes/boxes. But most domains have a curved boundary, so the domain is unlikely to be a union of boxes. It is not
uncommon to argue that by taking the boxes to be smaller and smaller we can approximate any reasonable domain
better and better, and hence taking some sort of limit, the divergence theorem follows for any such domain.
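To build confidence in Eq. (7.11.25) one can verify it on a simple domain. The sketch below is my own example (the unit cube and the field $\mathbf{C} = (x, y^2, z^3)$ are arbitrary choices) and computes both sides symbolically.

```python
# Gauss' divergence theorem on the unit cube [0,1]^3 for C = (x, y**2, z**3).
import sympy as sp

x, y, z = sp.symbols('x y z', real=True)
C = (x, y**2, z**3)

# volume integral of div C
divC = sum(sp.diff(Ci, vi) for Ci, vi in zip(C, (x, y, z)))
vol = sp.integrate(divC, (x, 0, 1), (y, 0, 1), (z, 0, 1))

# flux through the six faces, outward normals along +/- each axis
flux  = sp.integrate(C[0].subs(x, 1) - C[0].subs(x, 0), (y, 0, 1), (z, 0, 1))
flux += sp.integrate(C[1].subs(y, 1) - C[1].subs(y, 0), (x, 0, 1), (z, 0, 1))
flux += sp.integrate(C[2].subs(z, 1) - C[2].subs(z, 0), (x, 0, 1), (y, 0, 1))

print(vol, flux)   # both equal 3
```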


7.11.7 Circulation of a fluid and curl


We have met the line integral of a force field, of the form $\int_1^2 \mathbf{F}\cdot d\mathbf{s}$. This has the physical meaning
of the work done by the force in moving an object from point 1 to point 2 on a curved path. Now,
we study the line integral again but with two differences: instead of a force field, we consider
the velocity field of a moving fluid, and the path is a closed one.
To start with we take a very special path: the boundary of a rectangle living on the xy plane
(Fig. 7.52). The rectangle is infinitesimally small, the path of the integral is its boundary with a
counter-clockwise orientation as indicated.
Figure 7.52: Circulation of $\mathbf{C}$ around a small rectangle of sides $\Delta x \times \Delta y$ in the $xy$ plane, with sides
labelled (1)-(4) and traversed counter-clockwise. The area of this small rectangle is $\Delta a$, and the figure
anticipates the result $\oint \mathbf{C}\cdot d\mathbf{s} = (\nabla\times\mathbf{C})_z\,\Delta a$.

Now, the circulation of the fluid around the rectangle is the line integral, along the rectangle
boundary, of the tangential component of the vector field, i.e., of $\mathbf{C}\cdot d\mathbf{s}$. The line integral is broken into
four integrals along the four sides. Take side 1 for example; using the mean value theorem
for integrals, i.e., Eq. (4.11.3), we can write
$$\int_{\text{side 1}} \mathbf{C}\cdot d\mathbf{s} = C_x(1)\,\Delta x$$
where $C_x(1)$ is the value of $C_x$ evaluated at some point on side 1. The precise location of this
point does not matter. Doing similarly for the other sides, the integral is given by
$$\oint \mathbf{C}\cdot d\mathbf{s} = C_x(1)\,\Delta x + C_y(2)\,\Delta y - C_x(3)\,\Delta x - C_y(4)\,\Delta y \quad (7.11.26)$$

Similarly to what we have done to get the divergence, we group the terms in pairs:
$$\text{circulation along sides 1/3} = [C_x(1) - C_x(3)]\,\Delta x = -\frac{\partial C_x}{\partial y}\Delta x\Delta y,\qquad \text{circulation along sides 2/4} = [C_y(2) - C_y(4)]\,\Delta y = +\frac{\partial C_y}{\partial x}\Delta x\Delta y$$


Substituting this into Eq. (7.11.26), we obtain:
$$\oint \mathbf{C}\cdot d\mathbf{s} = \left(\frac{\partial C_y}{\partial x} - \frac{\partial C_x}{\partial y}\right)\Delta x\Delta y \quad (7.11.27)$$
Now is the time to check whether what we have obtained really captures the tendency to rotate.
Just use the examples shown in Fig. 7.51. For the left and middle figures, the term $C_{y,x} - C_{x,y} = 0$,
and obviously these two fluids are not rotating. For the right figure, $C_{y,x} - C_{x,y} = 2$,
and the fluid in that figure is curling counter-clockwise.
If you're still not convinced, consider now the uniform circular motion discussed in
Section 7.10.6. A disk is rotating around the $z$ axis with an angular velocity $\omega$. A point $P(x,y,z)$
on the rim of the disk (of radius $r$) has a velocity vector
$$\mathbf{v} = -\omega y\,\mathbf{i} + \omega x\,\mathbf{j},\quad\text{or}\quad \mathbf{v}(x,y,z) = (-\omega y, \omega x)$$
If we plot this velocity field it looks exactly like the one given in Fig. 7.51c. Then, the bracketed
term in Eq. (7.11.27), applied to $\mathbf{v}$ (instead of $\mathbf{C}$, noting they're both vectors), is given by
$$\frac{\partial v_y}{\partial x} - \frac{\partial v_x}{\partial y} = 2\omega$$
Indeed that term is an indication of a rotation.
Instead of considering a rectangle in the $xy$ plane, we can consider rectangles in the $yz$ and
$zx$ planes. Altogether, the circulations are given by
$$\text{rectangle in } xy \text{ plane:}\quad \oint \mathbf{C}\cdot d\mathbf{s} = \left(\frac{\partial C_y}{\partial x} - \frac{\partial C_x}{\partial y}\right)\Delta x\Delta y$$
$$\text{rectangle in } yz \text{ plane:}\quad \oint \mathbf{C}\cdot d\mathbf{s} = \left(\frac{\partial C_z}{\partial y} - \frac{\partial C_y}{\partial z}\right)\Delta y\Delta z$$
$$\text{rectangle in } zx \text{ plane:}\quad \oint \mathbf{C}\cdot d\mathbf{s} = \left(\frac{\partial C_x}{\partial z} - \frac{\partial C_z}{\partial x}\right)\Delta z\Delta x$$
The three terms in the brackets are the three Cartesian components of a vector called the curl of
$\mathbf{C}$, written as $\nabla\times\mathbf{C}$ (read "del cross C"), where $\times$ is the cross product (see Section 11.1.5
for a discussion of the cross product between two vectors). One way to memorize the formula
for the curl of a vector field is to use the determinant of the following $3\times 3$ matrix:
$$\begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \partial/\partial x & \partial/\partial y & \partial/\partial z \\ C_x & C_y & C_z \end{vmatrix} = \left(\frac{\partial C_z}{\partial y} - \frac{\partial C_y}{\partial z}\right)\mathbf{i} + \left(\frac{\partial C_x}{\partial z} - \frac{\partial C_z}{\partial x}\right)\mathbf{j} + \left(\frac{\partial C_y}{\partial x} - \frac{\partial C_x}{\partial y}\right)\mathbf{k} \quad (7.11.28)$$

Now we return to Eq. (7.11.27) and observe that the term in the brackets is just the
$z$ component of $\nabla\times\mathbf{C}$. And $\Delta x\Delta y$ is the area $\Delta a$ of our little rectangle. Thus,
$$\oint \mathbf{C}\cdot d\mathbf{s} = (\nabla\times\mathbf{C})\cdot\mathbf{n}\,\Delta a \quad (7.11.29)$$
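The determinant mnemonic of Eq. (7.11.28) is easy to turn into a few lines of code. The sketch below (illustrative only) applies it to the rotating-disk velocity field $\mathbf{v} = (-\omega y, \omega x, 0)$ discussed above and recovers the value $2\omega$ along $z$.

```python
# Curl of the rotating-disk velocity field, component by component from Eq. (7.11.28).
import sympy as sp

x, y, z, w = sp.symbols('x y z omega', real=True)
C = sp.Matrix([-w*y, w*x, 0])

curl = sp.Matrix([
    sp.diff(C[2], y) - sp.diff(C[1], z),
    sp.diff(C[0], z) - sp.diff(C[2], x),
    sp.diff(C[1], x) - sp.diff(C[0], y),
])
print(curl.T)   # Matrix([[0, 0, 2*omega]]): the curl points along z with magnitude 2*omega
```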


7.11.8 Curl and Stokes’ theorem


And guess what we are going to do now? Having established the line integral around the closed
boundary of a flat rectangle, we now move to the harder problem: the line integral around a
spatial curve $\Gamma$ which is the boundary of some surface $S$. The idea is similar to what we have
done to get the Gauss theorem. The surface $S$ is divided into many many small parts, each of which
can be considered as a flat rectangle (left picture in Fig. 7.53). The line integral around $\Gamma$ is
then the sum of the line integrals around the small rectangles $\Gamma_i$ (along a common side the line
integrals cancel each other, similar to Fig. 7.49).
Figure 7.53: The line integral around $\Gamma$ is the sum of the line integrals around the small rectangles $\Gamma_i$.
Along the common side the line integrals cancel each other: on side (2) of the left flat rectangle the line
integral is $C_y(2)\,\Delta y$ and on side (3) of the right rectangle it is $-C_y(2)\,\Delta y$. When summed up they cancel
each other. Note that the two rectangles are intentionally drawn apart for illustration purposes only.

Thus we have
$$\oint_{\Gamma} \mathbf{C}\cdot d\mathbf{s} = \sum_i \oint_{\Gamma_i} \mathbf{C}\cdot d\mathbf{s} = \sum_i (\nabla\times\mathbf{C})\cdot\mathbf{n}\,\Delta a_i = \int_S (\nabla\times\mathbf{C})\cdot\mathbf{n}\,dA$$
which is the Stokes theorem, or the Kelvin-Stokes theorem. It is named after Lord Kelvin and
George Stokes.
$$\text{Stokes' theorem:}\qquad \int_S (\nabla\times\mathbf{C})\cdot\mathbf{n}\,dA = \oint_{\Gamma} \mathbf{C}\cdot d\mathbf{s} \quad (7.11.30)$$

7.11.9 Green’s theorem


If we return to 2D planes, then Stokes theorem becomes Green’s theorem, which is named after
the British mathematical physicist George Green. As C is now a two dimensional vector field,
the integrand in the surface integral is simply the z-component of the curl of C . Thus, Green’s
theorem states that
$$\text{Green's theorem:}\qquad \int_S \left(\frac{\partial C_y}{\partial x} - \frac{\partial C_x}{\partial y}\right)dA = \oint (C_x\,dx + C_y\,dy) \quad (7.11.31)$$


That’s how physicists present a theorem. Mathematicians are completely different. Here is how
a mathematician presents Green’s theorem.
Theorem 7.11.2: Green’s theorem
Let C be a positively oriented, piecewise smooth, simple closed curve in the plane and let
D be the region bounded by C . If P .x; y/ and Q.x; y/ are two continuously differentiable
functions on D, then
$$\int_D \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right)dA = \int_C (P\,dx + Q\,dy)$$

The main content is of course the same but with rigor. To use the theorem properly we
need to pay attention to the conditions mentioned in the theorem, especially about the curve C
(Fig. 7.54). For example, if the curve is open, forget Green’s theorem.

Figure 7.54: Illustration of positively oriented, piecewise smooth, simple closed curves.

History note 7.3: George Green (14 July 1793 – 31 May 1841)
George Green (14 July 1793–31 May 1841) was a British mathemati-
cal physicist who wrote An Essay on the Application of Mathematical
Analysis to the Theories of Electricity and Magnetism in 1828. The
essay introduced several important concepts, among them a theorem
similar to the modern Green’s theorem, the idea of potential functions
as currently used in physics, and the concept of what are now called
Green’s functions. Green was the first person to create a mathematical
theory of electricity and magnetism and his theory formed the foun-
dation for the work of other scientists such as James Clerk Maxwell, William Thomson,
and others. His work on potential theory ran parallel to that of Carl Friedrich Gauss.
The son of a prosperous miller and a miller by trade himself, Green was almost completely
self-taught in mathematical physics; he published his most important work five years
before he went to the University of Cambridge at the age of 40. He graduated with a BA
in 1838 as a 4th Wrangler (the 4th highest scoring student in his graduating class, coming


after James Joseph Sylvester who scored 2nd).

7.11.10 Curl free and divergence free vector fields


7.11.11 Grad, div, curl and identities
While working on a scalar function of several variables, e.g. $T(x,y,z)$, we discovered the gradient
vector, denoted by $\nabla f$ or $\operatorname{grad} f$. This gradient vector allows us to answer the question of how
much the function will change along any direction. It also has a meaning of its own: it provides the direction
of maximum change.
On the other hand, while working with vector fields, we have discovered two new things: the
divergence of a vector field $\mathbf{C}$, denoted by $\nabla\cdot\mathbf{C}$ or $\operatorname{div}\mathbf{C}$, and the curl of a vector field, $\nabla\times\mathbf{C}$
or $\operatorname{curl}\mathbf{C}$.
For a function $f(x,y,z)$, its gradient is a vector defined as
$$\nabla f = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right)$$
Now we do something remarkable: we remove $f$ from the above, and define a gradient operator
as:
$$\nabla = \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right)$$
And this operator is a vector. But it is not a vector on its own. We have to attach it to something
else so that it has a meaning. What can we do with this vector? Recall that we can multiply
a vector with a scalar, we can do a dot product for two vectors and finally we can do a cross
product for two vectors. Now, we define all these operations for our new vector r with a scalar
$f$ and a vector field $\mathbf{C}$:
$$\text{scalar multiplication:}\quad \nabla f = \left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}\right)$$
$$\text{dot product:}\quad \nabla\cdot\mathbf{C} = \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right)\cdot(C_x, C_y, C_z) = \frac{\partial C_x}{\partial x} + \frac{\partial C_y}{\partial y} + \frac{\partial C_z}{\partial z}$$
$$\text{cross product:}\quad \nabla\times\mathbf{C} = \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right)\times(C_x, C_y, C_z) \quad (7.11.32)$$
What have we achieved? Except for $\nabla f$ (which is where we started), we have obtained the
divergence and curl of a vector field, which match the definitions discovered previously when
we were doing physics!
Having now the new stuff, we’re going to find the rules for them. And of course we base our
thinking on the rules that we know for the differentiation of functions of a single variable. For
two functions f .x/ and g.x/, we know the sum and product rule:
$$\text{sum rule:}\quad \frac{d}{dx}(f + g) = \frac{df}{dx} + \frac{dg}{dx} \qquad\qquad \text{product rule:}\quad \frac{d}{dx}(fg) = \frac{df}{dx}g + \frac{dg}{dx}f$$

From this sum rule, now considering $f(x,y,z)$, $g(x,y,z)$ and two vector fields $\mathbf{a}$ and $\mathbf{b}$, we
have the sum rules
$$\text{sum rule 1:}\quad \nabla(f + g) = \nabla f + \nabla g,\qquad \text{sum rule 2:}\quad \nabla\cdot(\mathbf{a} + \mathbf{b}) = \nabla\cdot\mathbf{a} + \nabla\cdot\mathbf{b},\qquad \text{sum rule 3:}\quad \nabla\times(\mathbf{a} + \mathbf{b}) = \nabla\times\mathbf{a} + \nabla\times\mathbf{b} \quad (7.11.33)$$
We have not one sum rule but three, because we have three combinations of $\nabla$, $f$ and $\mathbf{a}$ as shown
in Eq. (7.11.32). The proof is straightforward, so we just present the proof of the second sum
rule:
$$\nabla\cdot(\mathbf{a} + \mathbf{b}) = \frac{\partial(a_x + b_x)}{\partial x} + \frac{\partial(a_y + b_y)}{\partial y} + \frac{\partial(a_z + b_z)}{\partial z} = \frac{\partial a_x}{\partial x} + \frac{\partial a_y}{\partial y} + \frac{\partial a_z}{\partial z} + \frac{\partial b_x}{\partial x} + \frac{\partial b_y}{\partial y} + \frac{\partial b_z}{\partial z} = \nabla\cdot\mathbf{a} + \nabla\cdot\mathbf{b}$$

In some books, you can see the following proof, which is similar but adopts index notation; the
vector is $\mathbf{a} = (a_1, a_2, a_3)$ and the coordinates are $x_1, x_2, x_3$:
$$\nabla\cdot(\mathbf{a} + \mathbf{b}) = \sum_{i=1}^{3}\frac{\partial(a_i + b_i)}{\partial x_i} = \sum_{i=1}^{3}\left(\frac{\partial a_i}{\partial x_i} + \frac{\partial b_i}{\partial x_i}\right) = \sum_{i=1}^{3}\frac{\partial a_i}{\partial x_i} + \sum_{i=1}^{3}\frac{\partial b_i}{\partial x_i}$$

The pro of this notation is that it saves space, and it works for vectors in $\mathbb{R}^n$ for any $n$, not just three.
Now come the product rules. First, from $\nabla f$ we have $\nabla(fg)$ and $\nabla(\mathbf{a}\cdot\mathbf{b})$. Second, from
$\nabla\cdot\mathbf{a}$ we have $\nabla\cdot(f\mathbf{a})$ and $\nabla\cdot(\mathbf{a}\times\mathbf{b})$. Third, from $\nabla\times\mathbf{a}$ we have $\nabla\times(f\mathbf{a})$ and $\nabla\times(\mathbf{a}\times\mathbf{b})$.
In total, we have six product rules; they are given by
$$\begin{aligned}
\text{product rule 1:}&\quad \nabla(fg) = g\nabla f + f\nabla g\\
\text{product rule 2:}&\quad \nabla(\mathbf{a}\cdot\mathbf{b}) = \,?\\
\text{product rule 3:}&\quad \nabla\cdot(f\mathbf{a}) = f(\nabla\cdot\mathbf{a}) + \nabla f\cdot\mathbf{a}\\
\text{product rule 4:}&\quad \nabla\cdot(\mathbf{a}\times\mathbf{b}) = (\nabla\times\mathbf{a})\cdot\mathbf{b} - (\nabla\times\mathbf{b})\cdot\mathbf{a}\\
\text{product rule 5:}&\quad \nabla\times(f\mathbf{a}) = f(\nabla\times\mathbf{a}) + (\nabla f)\times\mathbf{a}\\
\text{product rule 6:}&\quad \nabla\times(\mathbf{a}\times\mathbf{b}) = \,?
\end{aligned} \quad (7.11.34)$$

Proof of rules 1 and 3 is simple (and rules 1/3 have the same form). Product rule 5 can be
guessed from rule 3 and can be proved straightforwardly. The proof follows the same idea as
the proof of the sum rules. The form of rule 4 can be guessed: $\nabla\cdot(\mathbf{a}\times\mathbf{b})$ is a scalar, and
if the pattern of the derivative of $fg$ still applies, $\nabla\cdot(\mathbf{a}\times\mathbf{b})$ should consist of two scalar terms:
one involving the dot product of the curl of $\mathbf{a}$ with the other vector, and the other term containing
the dot product of the curl of $\mathbf{b}$ with the other vector. What is weird is the minus sign, not a plus.

Second derivatives The grad, div and curl operators involve only first derivatives. How about
second derivatives?

 Start with a scalar $f(x,y,z)$; we have $\nabla f$, which is a vector. And for a vector we can do
a div and a curl, so we will have $\nabla\cdot(\nabla f)$ and $\nabla\times(\nabla f)$;

 Start with a vector field $\mathbf{C}$; we have $\nabla\cdot\mathbf{C}$, which is a scalar, and for a scalar we can do a
grad on it: $\nabla(\nabla\cdot\mathbf{C})$;

 Start with a vector field $\mathbf{C}$; we have $\nabla\times\mathbf{C}$, which is a vector, and for a vector we can do
a div on it, $\nabla\cdot(\nabla\times\mathbf{C})$, or we can do a curl on it, $\nabla\times(\nabla\times\mathbf{C})$.

We now compute all these possibilities and see what we will get. Let's start with $\nabla\cdot(\nabla f)$:
$$\nabla\cdot(\nabla f) = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} = \nabla^2 f = \Delta f$$
So, $\nabla\cdot(\nabla f)$ is a scalar and is called the Laplacian of $f$, denoted by $\nabla^2 f$. This operator appears
again and again in physics (and engineering). We can define the Laplacian of a vector field $\mathbf{C}$ as
a vector field whose components are the Laplacians of the components of the vector:
$$\nabla^2\mathbf{C} = (\nabla^2 C_x, \nabla^2 C_y, \nabla^2 C_z)$$
Moving on to $\nabla\times(\nabla f)$, which is the curl of the grad of $f$. It is a zero vector, due to the
property of partial derivatives $\frac{\partial^2 f}{\partial x\partial y} = \frac{\partial^2 f}{\partial y\partial x}$. It is interesting that $\nabla\cdot(\nabla\times\mathbf{C})$, which is the div of
a curl, is also zero.
We now summarize all the results:
$$\begin{aligned}
\text{Laplacian:}&\quad \nabla\cdot(\nabla f) = \nabla^2 f = \Delta f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}\\
\text{Curl of a grad:}&\quad \nabla\times(\nabla f) = \mathbf{0}\\
\text{Div of a curl:}&\quad \nabla\cdot(\nabla\times\mathbf{C}) = 0\\
\text{Grad of a div:}&\quad \nabla(\nabla\cdot\mathbf{C}): \text{nothing special}\\
\text{Curl of a curl:}&\quad \nabla\times(\nabla\times\mathbf{C}) = \nabla(\nabla\cdot\mathbf{C}) - \nabla^2\mathbf{C}
\end{aligned} \quad (7.11.35)$$
You can check the last formula by computing the components of $\nabla\times\mathbf{C}$, and then computing
the curl of that vector, and you will see the RHS appear. The formula itself is not important; what is
important is that the curl of a curl does not give us anything new.
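Two of the identities in Eq. (7.11.35) can be confirmed symbolically in a few lines; the sketch below (illustrative, with arbitrarily chosen $f$ and $\mathbf{C}$) uses sympy's vector module.

```python
# curl(grad f) = 0 and div(curl C) = 0, checked symbolically.
import sympy as sp
from sympy.vector import CoordSys3D, gradient, divergence, curl

N = CoordSys3D('N')
x, y, z = N.x, N.y, N.z

f = sp.sin(x*y) + z**3 * y
C = (x**2 * y) * N.i + sp.cos(z) * N.j + (x*y*z) * N.k

print(curl(gradient(f)))      # 0 (the zero vector)
print(divergence(curl(C)))    # 0
```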


7.11.12 Integration by parts


Integration by parts is:
$$\int_a^b f\left(\frac{dg}{dx}\right)dx = -\int_a^b \left(\frac{df}{dx}\right)g\,dx + [fg]_a^b$$
which comes from the product rule and the fundamental theorem of calculus (ordinary calculus).
And of course, we're going to develop a 3D version of integration by parts. And the machinery
is similar: a product rule and the fundamental theorem of vector calculus.
Starting with this product rule (check Eq. (7.11.34)),
$$\nabla\cdot(f\mathbf{a}) = f(\nabla\cdot\mathbf{a}) + \nabla f\cdot\mathbf{a}$$
Integrating both sides of it over a volume $B$ with boundary surface $\partial B$, we get
$$\int_B \nabla\cdot(f\mathbf{a})\,dV = \int_B f(\nabla\cdot\mathbf{a})\,dV + \int_B \nabla f\cdot\mathbf{a}\,dV$$
And using Gauss' divergence theorem for the LHS to convert it to a surface integral on the
boundary, we obtain
$$\int_{\partial B} (f\mathbf{a})\cdot\mathbf{n}\,dS = \int_B f(\nabla\cdot\mathbf{a})\,dV + \int_B \nabla f\cdot\mathbf{a}\,dV$$
And a bit of rearrangement gives us:
$$\int_B f(\nabla\cdot\mathbf{a})\,dV = -\int_B \nabla f\cdot\mathbf{a}\,dV + \int_{\partial B} (f\mathbf{a})\cdot\mathbf{n}\,dS \quad (7.11.36)$$

From this result, we can obtain the gradient theorem. Let's consider a constant vector $\mathbf{a}$ and
a smooth function $u$ in place of $f$. From Eq. (7.11.36) we get (as $\nabla\cdot\mathbf{a} = 0$)
$$\int_B \nabla u\cdot\mathbf{a}\,dV = \int_{\partial B} (u\mathbf{a})\cdot\mathbf{n}\,dS$$
And since this holds for any constant vector $\mathbf{a}$, we get the gradient theorem:
$$\int_V \nabla u\,dV = \int_S u\,\mathbf{n}\,dA \quad (7.11.37)$$

7.11.13 Green’s identities


Green’s identities are a set of three identities in vector calculus relating the bulk with the bound-
ary of a region on which differential operators act. They are named after the mathematician
George Green, who discovered Green’s theorem.


First identity. Assume two scalar functions $u(x,y)$ and $v(x,y)$ (the extension to $u(x,y,z)$ is
straightforward); we then have
$$(vu_x)_x = v_x u_x + vu_{xx}, \qquad (vu_y)_y = v_y u_y + vu_{yy}$$
where the notation $u_x$ means the first derivative of $u$ with respect to $x$. Adding up these identities
gives
$$\nabla\cdot(v\nabla u) = \nabla v\cdot\nabla u + v\Delta u \quad (7.11.38)$$
Integrating both sides of it over a volume $B$ with boundary surface $\partial B$, we get
$$\int_B \nabla\cdot(v\nabla u)\,dV = \int_B \nabla v\cdot\nabla u\,dV + \int_B v\Delta u\,dV$$
Now, using again the Gauss divergence theorem for the LHS, we have
$$\int_{\partial B} (v\nabla u)\cdot\mathbf{n}\,dS = \int_B \nabla v\cdot\nabla u\,dV + \int_B v\Delta u\,dV$$
which is known as the first Green's identity.


Note that $\nabla u\cdot\mathbf{n}$ is the directional derivative of $u$ along the direction of $\mathbf{n}$. Usually mathe-
maticians define the directional derivative in the outward normal direction as:
$$\frac{\partial u}{\partial n} := \nabla u\cdot\mathbf{n}$$
With this new term, the first Green's identity can also be written as, for a pair $(u, v)$,
$$\int_B v\Delta u\,dV = -\int_B \nabla v\cdot\nabla u\,dV + \int_{\partial B} v\frac{\partial u}{\partial n}\,dS$$

Second identity. Writing the first Green’s identity for two pairs, .u; v/ and .v; u/ we get
Z Z Z
@u
vudV D rv  rudV C dS v
@B @n
ZB ZB
Z
@v
uvdV D ru  rvdV C u dS
B B @B @n

What we do next? We subtract the second from the first, as the red terms cancel each other:
Z Z  
@v @u
.uv vu/ dV D u v dS
B @B @n @n

and this is the second Green’s identity.


7.11.14 Kronecker and Levi-Civita symbols


In the proof of the sum rule 2 in Eq. (7.11.33) we have shown that using indicial notation shortens
the proof. The question now is how to prove the product rule 4 in Eq. (7.11.34) in the same way?
But before that, try to prove this rule the usual way and you would understand how tedious the
algebra is.
Consider the three dimensional Euclidean space $\mathbb{R}^3$ with three orthonormal vectors $\mathbf{e}_i$ $(i =
1, 2, 3)$. Then any vector, say $\mathbf{a}$, can be written as§
$$\mathbf{a} = a_1\mathbf{e}_1 + a_2\mathbf{e}_2 + a_3\mathbf{e}_3 = a_i\mathbf{e}_i \quad (7.11.39)$$

where we have used the Einstein summation rule in the last equality. We can write the dot product
of two vectors $\mathbf{a}$ and $\mathbf{b}$ as
$$\mathbf{a}\cdot\mathbf{b} = (a_i\mathbf{e}_i)\cdot(b_j\mathbf{e}_j) = a_i b_j\,\mathbf{e}_i\cdot\mathbf{e}_j$$

Now, as we know that the three basis vectors are orthonormal, we can easily compute the dot
product of any two of them; it is given by
$$\mathbf{e}_i\cdot\mathbf{e}_j = \begin{cases} 1 & \text{if } i = j\\ 0 & \text{otherwise} \end{cases} \quad (7.11.40)$$
And we introduce the Kronecker delta symbol, which is defined by
$$\delta_{ij} = \begin{cases} 1 & \text{if } i = j\\ 0 & \text{otherwise} \end{cases} \quad (7.11.41)$$
Hence
$$\mathbf{a}\cdot\mathbf{b} = a_i b_j \delta_{ij} = a_i b_i = a_j b_j = a_1 b_1 + a_2 b_2 + a_3 b_3$$
So, the dot product gave us a new symbol $\delta_{ij}$. The cross product should lead to a new symbol too.
Let's discover it. The cross product of two vectors $\mathbf{a}$ and $\mathbf{b}$ is a vector denoted by $\mathbf{a}\times\mathbf{b}$:
$$\mathbf{a}\times\mathbf{b} = a_i\mathbf{e}_i \times b_j\mathbf{e}_j = a_i b_j\,\mathbf{e}_i\times\mathbf{e}_j \quad (7.11.42)$$
And of course we're going to compute $\mathbf{e}_i\times\mathbf{e}_j$ (we know how to compute the cross product of
two vectors). The results are
$$\begin{aligned}
\mathbf{e}_1\times\mathbf{e}_1 &= \mathbf{0} & \mathbf{e}_1\times\mathbf{e}_2 &= \mathbf{e}_3 & \mathbf{e}_1\times\mathbf{e}_3 &= -\mathbf{e}_2\\
\mathbf{e}_2\times\mathbf{e}_1 &= -\mathbf{e}_3 & \mathbf{e}_2\times\mathbf{e}_2 &= \mathbf{0} & \mathbf{e}_2\times\mathbf{e}_3 &= \mathbf{e}_1\\
\mathbf{e}_3\times\mathbf{e}_1 &= \mathbf{e}_2 & \mathbf{e}_3\times\mathbf{e}_2 &= -\mathbf{e}_1 & \mathbf{e}_3\times\mathbf{e}_3 &= \mathbf{0}
\end{aligned} \quad (7.11.43)$$
§
We move away from i , j and k and use e i as we are now using indicial notation. It is important to remember
that these objects are vectors even though they also have an index.
Ž
Obviously named after Leopold Kronecker a German mathematician.


This allows us to write
$$\mathbf{e}_j\times\mathbf{e}_k = \epsilon_{ijk}\,\mathbf{e}_i \quad (7.11.44)$$
where $\epsilon_{ijk}$ is the permutation symbol or the Levi-Civita symbol, which is defined by
$$\epsilon_{ijk} = \begin{cases} +1 & \text{if } (i,j,k) \text{ is } (1,2,3),\ (2,3,1), \text{ or } (3,1,2)\\ -1 & \text{if } (i,j,k) \text{ is } (3,2,1),\ (1,3,2), \text{ or } (2,1,3)\\ 0 & \text{if } i = j,\ j = k, \text{ or } k = i \end{cases} \quad (7.11.45)$$
Figure 7.55: For the indices $(i,j,k)$ in $\epsilon_{ijk}$, the values $1, 2, 3$ occurring in the cyclic order $(1,2,3)$
correspond to $\epsilon = +1$, while values occurring in the reverse cyclic order correspond to $\epsilon = -1$. Tullio Levi-
Civita (1873–1941) was an Italian mathematician, most famous for his work on absolute differential
calculus (tensor calculus) and its applications to the theory of relativity, but who also made significant
contributions in other areas. He was a pupil of Gregorio Ricci-Curbastro, the inventor of tensor calculus.

The cross product is now written as
$$\mathbf{a}\times\mathbf{b} = a_j b_k\,\mathbf{e}_j\times\mathbf{e}_k = a_j b_k \epsilon_{ijk}\,\mathbf{e}_i \quad (7.11.46)$$
Denote by $\mathbf{c}$ the cross product $\mathbf{a}\times\mathbf{b}$; then we have $\mathbf{c} = a_j b_k \epsilon_{ijk}\,\mathbf{e}_i$, i.e., the components of $\mathbf{c}$
are $c_i = a_j b_k \epsilon_{ijk}$, written explicitly
$$c_1 = a_j b_k \epsilon_{1jk} = a_2 b_3 - a_3 b_2,\qquad c_2 = a_j b_k \epsilon_{2jk} = a_3 b_1 - a_1 b_3,\qquad c_3 = a_j b_k \epsilon_{3jk} = a_1 b_2 - a_2 b_1$$
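The explicit components above are easy to reproduce in code. The following sketch is my own illustration (not from the text): it builds $\epsilon_{ijk}$ from a closed-form product and uses $c_i = \epsilon_{ijk}a_j b_k$ to recover numpy's cross product (indices run 0-2 in code rather than 1-3).

```python
# Cross product via the Levi-Civita symbol, compared with numpy.
import numpy as np

def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}: returns +1, -1 or 0."""
    return (i - j) * (j - k) * (k - i) / 2

a = np.array([1.0, 2.0, 3.0])
b = np.array([-1.0, 0.5, 4.0])

c = np.array([sum(eps(i, j, k) * a[j] * b[k]
                  for j in range(3) for k in range(3)) for i in range(3)])
print(c, np.cross(a, b))   # identical: [ 6.5 -7.   2.5]
```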
We’re now ready to prove the product rule 4 in Eq. (7.11.34) in a much elegant manner. First
it is necessary to express the curl of a vector using the Levi-Civita symbol:
@
a  b D aj bk ij k e i H) r  a D bk ij k e i D bk;j ij k e i (7.11.47)
@xj
where the notation bk;j means partial derivative of bk with respect to xj .
@
r  .a  b/ D .aj bk ij k / D .aj bk ij k /;i
@xi
D ij k aj;i bk C ij k aj bk;i
D .kij aj;i /bk aj j i k bk;i
„ ƒ‚ … „ ƒ‚ …
.ra/b a.rb/


where the minus sign comes from the fact that $\epsilon_{ijk} = -\epsilon_{jik}$, a property that can be directly seen from its
definition.
The Levi-Civita symbol comes back again and again whenever you have tensors, so it is a
very important thing to understand. Let me just emphasize that no matter how complicated the
Levi-Civita Symbol is, life would be close to unbearable if it wasn’t there! In fact, it wasn’t until
Levi-Civita published his work on tensor analysis that Albert Einstein was able to complete his
work on General Relativity. That permutation symbol is that useful!

7.11.15 Curvilinear coordinate systems


In this section we present the div, grad, curl and Laplacian operators in curvilinear coordinate sys-
tems. We have seen such coordinate systems: polar coordinates, cylindrical coordinates and
spherical coordinates. To illustrate the problem, let's consider a scalar function $f(x,y)$ whose
gradient vector is $(f_x, f_y)$. The question is: what is the gradient vector of $f(r,\theta)$ when a polar
coordinate system is used?

Figure 7.56: Polar coordinates.

The first solution that comes to mind is: starting from $\nabla f = f_x\mathbf{i} + f_y\mathbf{j}$, we convert $x, y$ and $\mathbf{i}, \mathbf{j}$
to $r, \theta$ and $\hat{\mathbf{r}}, \hat{\boldsymbol{\theta}}$, the latter two being the unit basis vectors of a polar coordinate system. Referring
to Fig. 7.56, we have
$$\mathbf{i} = \cos\theta\,\hat{\mathbf{r}} - \sin\theta\,\hat{\boldsymbol{\theta}},\qquad \mathbf{j} = \sin\theta\,\hat{\mathbf{r}} + \cos\theta\,\hat{\boldsymbol{\theta}} \quad (7.11.48)$$
And we also have
$$r = \sqrt{x^2 + y^2},\qquad \theta = \arctan(y/x) \quad (7.11.49)$$
Now, we can do the variable conversion, as it is purely algebraic:
$$\begin{aligned}
\nabla f &= \frac{\partial f}{\partial x}\mathbf{i} + \frac{\partial f}{\partial y}\mathbf{j}\\
&= \left(\frac{\partial f}{\partial r}\frac{\partial r}{\partial x} + \frac{\partial f}{\partial\theta}\frac{\partial\theta}{\partial x}\right)(\cos\theta\,\hat{\mathbf{r}} - \sin\theta\,\hat{\boldsymbol{\theta}}) + \left(\frac{\partial f}{\partial r}\frac{\partial r}{\partial y} + \frac{\partial f}{\partial\theta}\frac{\partial\theta}{\partial y}\right)(\sin\theta\,\hat{\mathbf{r}} + \cos\theta\,\hat{\boldsymbol{\theta}})\\
&= \ldots \quad \text{(using Eq. (7.11.49))}\\
&= \frac{\partial f}{\partial r}\hat{\mathbf{r}} + \frac{1}{r}\frac{\partial f}{\partial\theta}\hat{\boldsymbol{\theta}}
\end{aligned} \quad (7.11.50)$$
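Eq. (7.11.50) can also be verified symbolically: pick an $f$ given in polar form, differentiate its Cartesian expression, and project the gradient onto $\hat{\mathbf{r}}$ and $\hat{\boldsymbol{\theta}}$. The sketch below (my own check, with an assumed test function and $x, y > 0$ so that $\theta = \arctan(y/x)$ is valid) does this.

```python
# Check grad f = f_r r_hat + (1/r) f_theta theta_hat for a test function.
import sympy as sp

x, y = sp.symbols('x y', positive=True)
r_xy, th_xy = sp.sqrt(x**2 + y**2), sp.atan(y / x)   # Eq. (7.11.49), first quadrant

r, th = sp.symbols('r theta', positive=True)
f = r**2 * sp.cos(th)                                # arbitrary test function f(r, theta)
f_xy = f.subs({r: r_xy, th: th_xy})                  # the same function written in x, y

grad = sp.Matrix([sp.diff(f_xy, x), sp.diff(f_xy, y)])
r_hat, th_hat = sp.Matrix([x, y]) / r_xy, sp.Matrix([-y, x]) / r_xy

# both differences simplify to zero, confirming Eq. (7.11.50)
print(sp.simplify(grad.dot(r_hat) - sp.diff(f, r).subs({r: r_xy, th: th_xy})))
print(sp.simplify(grad.dot(th_hat) - (sp.diff(f, th) / r).subs({r: r_xy, th: th_xy})))
```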


Cylindrical

7.12 Complex analysis


Mathematicians used to be skeptical about the imaginary number $i$ with $i^2 = -1$, but when a geo-
metrical meaning of it was established, they embraced complex numbers and started doing
wild things with them. They developed another branch of mathematics called complex analysis.
Complex analysis refers to the calculus of complex-valued functions $f(z)$ of a single complex
variable $z$.
This section presents a brief introduction to this amazing branch of mathematics.

7.12.1 Functions of complex variables


In this section, we shall discuss functions of complex variables, and how to visualize them.
Similar to functions of real variables, which map a real number to a real number, i.e., $f: \mathbb{R}\to\mathbb{R}$,
a function of one complex variable $z = x + iy$ maps $z$ to a new complex number $w = u + iv$
according to a certain rule $f: \mathbb{C}\to\mathbb{C}$.
Let's play with some complex functions. Consider the simple function $w = f(z) = z^2$.
With $z = x + iy$, we have $w = (x+iy)^2 = x^2 - y^2 + i(2xy)$. Thus, the real part of $f(z)$
is $u = x^2 - y^2$ and the imaginary part is $v = 2xy$. Let's move to another complex function.
What is $f(z) = \sin z$? Using the trigonometric identity $\sin(a+b) = \sin a\cos b + \sin b\cos a$,
we write
$$\sin z = \sin(x + iy) = \sin x\cos(iy) + \sin(iy)\cos x = \underbrace{\sin x\cosh y}_{u(x,y)} + i\,\underbrace{\sinh y\cos x}_{v(x,y)}$$
(where the identities $\cos(iy) = \cosh y$ and $\sin(iy) = i\sinh y$ were used; check Eq. (3.15.6)).

Exponential function. The exponential of a complex number is defined as
$$e^z := 1 + \frac{z}{1!} + \frac{z^2}{2!} + \frac{z^3}{3!} + \cdots \quad (7.12.1)$$

which is reasonable given the fact that this definition is consistent with the definition of $y = e^x$.
Now, we want to check whether $e^{z_1}e^{z_2} = e^{z_1+z_2}$ using Eq. (7.12.1). Why that? Because that is
the rule that the ordinary exponential function obeys. The new exponential function should obey
it too! We have
$$e^{z_1} = 1 + \frac{z_1}{1!} + \frac{z_1^2}{2!} + \frac{z_1^3}{3!} + \cdots,\qquad e^{z_2} = 1 + \frac{z_2}{1!} + \frac{z_2^2}{2!} + \frac{z_2^3}{3!} + \cdots$$
Chapter 7. Multivariable calculus 724

And therefore, the product e z1 e z2 :


  
z1 z2 z1 z12 z2 z22
e e D 1C C C  1C C C 
1Š 2Š 1Š 2Š
What we’re currently dealing with is a product of two power series. It’s better
P1to develop a
formula
P1 for that and we get back to e e later. Considering two power series nD0 an x , and
z1 z2 n

mD0 m x , their product is given by


m
b
! 1 !
X1 X
an x n bm x m
nD0 mD0

To get the formula, let’s try the first few terms, and hope for a pattern:
.a0 C a1 x C a2 x 2 C    /.b0 C b1 x C b2 x 2 C    / D .a0 b0 /x 0 C .a0 b1 C a1 b0 /x 1 C
C .a0 b2 C a1 b1 C a2 b0 /x 2 C   

If we look at the term .a0 b1 Ca1 b0 /x 1 we can see that the sum of the indices equals the exponent
of x 1 (a0 b1 has the indices sum to 1 for example). With this, we have discovered the Cauchy
product formula for two power series
! 1 ! !
X
1 X X1 Xn
an x n bm x m D ak bn k x n (7.12.2)
nD0 mD0 nD0 kD0

With this tool, we go back to tackle the quantity $e^{z_1}e^{z_2}$, writing $e^{z_1}$ and $e^{z_2}$ as power series, and using
the Cauchy product formula and the binomial theorem:
$$e^{z_1}e^{z_2} = \left(\sum_{n=0}^{\infty}\frac{z_1^n}{n!}\right)\left(\sum_{m=0}^{\infty}\frac{z_2^m}{m!}\right) = \sum_{n=0}^{\infty}\sum_{k=0}^{n}\frac{z_1^k z_2^{n-k}}{k!(n-k)!} \;\text{(Cauchy product)} = \sum_{n=0}^{\infty}\frac{1}{n!}\sum_{k=0}^{n}\frac{n!}{k!(n-k)!}z_1^k z_2^{n-k} \;\text{(insert } n!\text{)} = \sum_{n=0}^{\infty}\frac{(z_1+z_2)^n}{n!} \;\text{(binomial theorem)} = e^{z_1+z_2}$$
Logarithm. The task now is to define $\ln z$. We define the logarithm of a complex variable as the
inverse of the exponential of a complex variable. Start with $w = u + iv \in \mathbb{C}$ and compute $z = e^w$
as defined in Eq. (7.12.1). Now, the logarithm of $z$ is defined as
$$\ln z = w$$
Writing $z = re^{i\theta}$, we can also express it another way because $z = e^w$:
$$z = e^w = e^{u+iv} = e^u e^{iv}$$
Now, we have the same complex variable written in two forms, $z = re^{i\theta}$ and $z = e^u e^{iv}$, so we can
deduce that
$$r = e^u \;(\Longrightarrow u = \ln r),\qquad v = \theta + 2\pi n$$


Finally, the logarithm of a complex number is given by
$$\ln z = \ln r + i(\theta + 2\pi n),\qquad r = |z|,\quad \theta = \arg z \quad (7.12.3)$$

Powers. We know how to compute $(3 + 2i)^n$ using de Moivre's formula. But we do not know
what $(3 + 2i)^{2+3i}$ is. Given a complex variable $z$ and a complex constant $a$, we define $z^a$ in the
same manner as for real numbers:
$$z^a := e^{a\ln z}$$
Note that the RHS of this equation is completely meaningful: we know $\ln z$, thus $a\ln z$ and its
exponential. Now, using Eq. (7.12.3) for $\ln z$, we obtain the expression for $z^a$:
$$z^a = \exp\big(a[\ln r + i(\theta + 2\pi n)]\big) \quad (7.12.4)$$

As the formula involves $n$, $z^a$ can be multi-valued or not, depending on $a$. Let's compute the $n$th
roots of $z$, that is, Eq. (7.12.4) with $a = 1/n$:
$$z^{1/n} = \sqrt[n]{z} = \exp\left(\frac{1}{n}\ln r + i\frac{\theta}{n} + i\frac{2\pi m}{n}\right) = \exp\left(\frac{1}{n}\ln r\right)\exp\left(i\frac{\theta}{n}\right)\exp\left(i\frac{2\pi m}{n}\right) = \sqrt[n]{r}\,e^{i(\theta/n + 2\pi m/n)}$$
With the special case of $z = 1$ (with $r = 1$, $\theta = 0$), the $n$th roots of one are thus given by
$$\sqrt[n]{1} = e^{i(2\pi/n)m}$$
which are the vertices of a regular $n$-sided polygon inscribed in the unit circle.
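A short numerical illustration of this last formula (my own example, with $n = 5$):

```python
# The 5th roots of unity lie on the unit circle and each one raised to the 5th power is 1.
import numpy as np

n = 5
roots = np.exp(1j * 2 * np.pi * np.arange(n) / n)
print(np.round(roots, 3))
print(np.allclose(roots**n, 1.0))   # True
```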

7.12.2 Visualization of complex functions


Domain coloring approach. It is easy to see that we need a four dimensional space
to visualize a complex function (we need $x, y, u, v$). As we live in a 3D space, it is difficult to
visualize a 4D space. Therefore, we need a different way. One way is called domain coloring, in
which we assign a color to each point of the complex plane. By assigning points on the complex
plane different colors and brightness, domain coloring allows a four dimensional complex
function to be easily represented and understood.
The procedure is as follows. To each point in the domain of the function i.e., to each .x; y/,
do

 construct z D f .x C iy/;

 compute its argument arg z and its magnitude jzj;

 assign arg z with a hue following the color wheel, and the magnitude jzj by other means,
such as brightness or saturation (there are many options for this).


 convert from HSV to RGB.


The final result is a matrix of pixels of different RGB values. Fig. 7.57 shows the domain coloring
plots of f .z/ D sin z 1 and f .z/ D tan z 1 . This way of visualizing complex functions
was proposed by Frank Farris–an American mathematician working at Santa Clara University–
possibly around 1998.
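A minimal domain-coloring sketch in the spirit of the recipe above is given below; it is my own illustrative code (matplotlib-based, with an arbitrary brightness mapping and the test function $\sin(1/z)$), not the implementation behind Fig. 7.57.

```python
# Domain coloring: hue from arg f(z), brightness from |f(z)|.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import hsv_to_rgb

x = np.linspace(-2.0, 2.0, 800)
X, Y = np.meshgrid(x, x)
Z = X + 1j * Y
W = np.sin(1.0 / Z)                        # the function to visualize

H = (np.angle(W) + np.pi) / (2 * np.pi)    # hue from the argument
S = np.ones_like(H)                        # full saturation
V = 1.0 - 1.0 / (1.0 + np.abs(W)**0.3)     # brightness grows with the magnitude
plt.imshow(hsv_to_rgb(np.dstack((H, S, V))), extent=[-2, 2, -2, 2], origin='lower')
plt.show()
```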

Figure 7.57: Domain coloring based visualization of complex functions using ComplexPortraits.jl: (a) $f(z) = \sin z^{-1}$, (b) $f(z) = \tan z^{-1}$.

Two plane approach. Instead of using only one plane, we can visualize complex functions using
two planes: the $xy$ plane and the $uv$ plane. To demonstrate the idea, consider $f(z) = z^2 + 1$;
we have
$$u(x,y) = x^2 - y^2 + 1,\qquad v(x,y) = 2xy$$
Now, considering the entire complex plane and two grid lines, first $x = 1$: it is mapped to
$$u(1,y) = 2 - y^2,\qquad v(1,y) = 2y$$
which can be combined to get $u = 2 - v^2/4$, which is a parabola. Similarly, consider the grid
line $y = 1$: it is mapped to
$$u(x,1) = x^2,\qquad v(x,1) = 2x$$
which is also a parabola. It can be shown that these two parabolas are orthogonal. We can repeat
this process for other grid lines, and the result is shown in Fig. 7.58, where the grid lines $x = a$
are colored red and the lines $y = b$ are blue. The plane in Fig. 7.58a is mapped or transformed
to the one in Fig. 7.58b.

7.12.3 Derivative of complex functions


Having now complex functions, no doubt mathematicians are going to differentiate them.
Let's do that. Recall first that for a real function $y = f(x)$, the derivative of $f$ at $x_0$ is
$$f'(x_0) = \lim_{h\to 0}\frac{f(x_0 + h) - f(x_0)}{h}$$

Figure 7.58: Visualization of complex functions as a mapping from the $xy$ plane (a) to the $uv$ plane (b) (using
desmos). Note that the mapping preserves the angle between the grid lines: the grid lines in the $uv$ plane
are still perpendicular to each other. Such a mapping is called a conformal mapping.

when this limit exists. We mimic this for complex functions: the complex function $f(z) =
u(x,y) + iv(x,y)$ with $z = x + iy$ has a derivative at $z_0 = x_0 + iy_0$ defined as
$$f'(z_0) = \lim_{\Delta z\to 0}\frac{f(z_0 + \Delta z) - f(z_0)}{\Delta z}$$
when this limit exists. The thing is that while for real functions there are only two ways for $h$
to approach zero, either from the left or from the right of $x_0$ (the only street is the number line),
complex numbers live on the complex plane, so $\Delta z$ can approach $0$ in an infinite number
of ways. The above limit only exists (i.e., has a finite value) when it takes the same value
no matter from what direction $\Delta z$ approaches $0$. There are, however, two special directions:
Case 1: $\Delta z = \Delta x$.
$$f'(z_0) = \lim_{\Delta x\to 0}\frac{f(x_0 + \Delta x + iy_0) - f(x_0 + iy_0)}{\Delta x} = \lim_{\Delta x\to 0}\frac{u(x_0+\Delta x, y_0) + iv(x_0+\Delta x, y_0) - u(x_0,y_0) - iv(x_0,y_0)}{\Delta x} = \frac{\partial u}{\partial x}(x_0,y_0) + i\frac{\partial v}{\partial x}(x_0,y_0) \quad (7.12.5)$$
Case 2: $\Delta z = i\Delta y$. Following the same calculation, we get
$$f'(z_0) = \frac{\partial v}{\partial y}(x_0,y_0) - i\frac{\partial u}{\partial y}(x_0,y_0) \quad (7.12.6)$$
In order to have $f'(z_0)$, at the very least the two values given in Eqs. (7.12.5) and (7.12.6) must
be equal, because if they are not equal we definitely do not have $f'(z_0)$. And this leads to the
following equations
$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y},\qquad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y} \quad (7.12.7)$$


which are now known as the Cauchy-Riemann equations.
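For instance, for $f(z) = z^2$ we found $u = x^2 - y^2$ and $v = 2xy$ above; a short symbolic check (my own illustration) confirms that this pair satisfies Eq. (7.12.7):

```python
# Cauchy-Riemann check for f(z) = z**2.
import sympy as sp

x, y = sp.symbols('x y', real=True)
u, v = x**2 - y**2, 2*x*y
print(sp.diff(u, x) == sp.diff(v, y))    # True: u_x = v_y = 2x
print(sp.diff(v, x) == -sp.diff(u, y))   # True: v_x = -u_y = 2y
```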

Geometric meaning of the complex derivative.

History note 7.4: Georg Bernhard Riemann (17 September 1826 – 20 July 1866)
Georg Friedrich Bernhard Riemann was a German mathematician
who made significant contributions to analysis, number theory, and
differential geometry. In the field of real analysis, he is mostly known
for the first rigorous formulation of the integral, the Riemann integral,
and his work on Fourier series. His contributions to complex analysis
include most notably the introduction of Riemann surfaces, breaking
new ground in a natural, geometric treatment of complex analysis. His
1859 paper on the prime-counting function, containing the original statement of the Rie-
mann hypothesis, is regarded as a foundational paper of analytic number theory. Through
his pioneering contributions to differential geometry, Riemann laid the foundations of
the mathematics of general relativity. He is considered by many to be one of the greatest
mathematicians of all time.

7.12.4 Complex integrals



Chapter 8
Tensor analysis

Contents
8.1 Index notation and Einstein summation convention . . . . . . . . . . . . 732
8.2 Why tensors are facts of the universe? . . . . . . . . . . . . . . . . . . . 733
8.3 What is a tensor: some examples . . . . . . . . . . . . . . . . . . . . . . . 733
8.4 What is a tensor: more examples . . . . . . . . . . . . . . . . . . . . . . . 734
8.5 What is a tensor: definitions . . . . . . . . . . . . . . . . . . . . . . . . . 734

Vector analysis is about the calculus of vector fields. Similarly, tensor analysis is about the
calculus of tensor fields. That’s it. The problem is, whereas we all can get the idea of what a
vector is, it is much harder to grasp what a tensor is. So, what are tensor fields and why we need
to study them?
We study tensor fields–tensors that vary in space–
because they are The Facts Of The Universe as Lillian
LieberŽŽ once said. Of course Lieber was referring to Ein-
stein’s theory of general relativity. Presented to the Prussian
Academy of Sciences in Berlin in a series of lectures in
November 1915, the general theory of relativity is at its heart
a theory of gravity. It states that gravity is a result of spacetime being curved by mass and en-
ergy. Gravity is no longer a force as Newton told us. So, the sun keeps the earth in orbit not by
exerting a physical force on it, but because its mass distorts the surrounding space and forces the
earth to move that way. In the words of American theoretical physicist John Archibald Wheeler
(1911-2008), “space tells matter how to move and matter tells space how to curve”. What a
theory!
Of Einstein’s mind blowing theory, Carlo RovelliŽ wrote the following lines:
ŽŽ
Lillian Rosanoff Lieber (1886-1986) was a Russian-American mathematician and popular author. Her highly
accessible writings were praised by no less than Albert Einstein, Cassius Jackson Keyser, and Eric Temple Bell.
Ž
Rovelli (born May 3, 1956) is an Italian theoretical physicist and writer. His popular science book, Seven Brief


There are absolute masterpieces which move us intensely: Mozart’s Requiem,


Homer’s Odyssey, the Sistine Chapel. . . Einstein’s jewel, the general theory of rela-
tivity, is a masterpiece of this order.
Surprisingly, mathematically, that theory is written as an equation as simple as this (at least
at first glance):
$$G_{\mu\nu} + \Lambda g_{\mu\nu} = \frac{8\pi G}{c^4}T_{\mu\nu} \quad (8.0.1)$$
where $G$ is Newton's gravitational constant (yes, Newton is there because Einstein's theory is
an improved version of Newton's gravitation), $c$ is the speed of light, and $\Lambda$ is the cosmological
constant. And $\pi$ is also playing a part! These are all scalars. The other terms are of a
different nature: they are what we call tensors. $T_{\mu\nu}$ is the energy-momentum tensor, $g_{\mu\nu}$ is the
metric tensor, and $G_{\mu\nu}$ is Einstein's tensor, which is defined as $R_{\mu\nu} - \frac{1}{2}Rg_{\mu\nu}$ with $R_{\mu\nu}$ being
Ricci's curvature tensor and $R$ a scalar curvature.
So, the left hand side of Einstein’s field equations is about the curvature of the four dimen-
sional curved spacetime, and the right hand side is about the mass and energy. And since they
are equal, we’re now able to understand what Wheeler said: “space (LHS) tells matter how to
move and matter (RHS) tells space how to curve”.
Newton’s second law of motion, written in terms of vectors, is given by F D ma, which is
essentially three equations in disguised:
Fx D max ; Fy D may ; Fz D maz
because a vector in our 3D space has three components. And in the same manner, Einstein’s field
equations i.e., Eq. (8.0.1) encode ten equations. Yes, ten equations, I was not mistaken. That’s
why Eq. (8.0.1) is referred to as Einstein’s field equations (plural). Why ten? The subscripts 
and , in e.g. g , run over the four coordinates of spacetime, so each tensor is a 4  4 table of
16 numbers. But these tensors are symmetric (think of a symmetric matrix), meaning that they
do not change when  and  are swapped, reduces them to 10 numbersŽ .
How long did it take Einstein to develop his general relativity? The
answer is ten years starting from 1905 when he published his theory of
special relativity while working as a patent clerk. To express his ideas
mathematically Einstein first had to learn tensors and differential geometry.
And he had learnt them with great difficulty, from his close friend–the
Swiss geometer Marcel Grossmann. Einstein told Grossmann: “You must
help me, or else I’ll go crazy.” Levi-Civita, who in 1900, had published
with Curbastro the theory of tensors in Meéthodes de calcul diffeérentiel
absolu et leurs applications, then initiated a correspondence with Einstein
to correct mistakes Einstein had made in his use of tensor analysis. The correspondence lasted
1915–1917, and was characterized by mutual respect. Einstein wrote to Levi-CivitaŽŽ :
Lessons on Physics, was originally published in Italian in 2014. It has been translated into 41 languages and has
sold over a million copies worldwide. In 2019, he was included by Foreign Policy magazine in a list of 100 most
influential global thinkers.
Ž
Only numbers on and above the diagonal are independent, which are 10.
ŽŽ
Later on, when asked what he liked best about Italy, Einstein said "spaghetti and Levi-Civita".


I admire the elegance of your method of computation; it must be nice to ride through
these fields upon the horse of true mathematics while the like of us have to make
our way laboriously on foot.

It is no exaggeration to say that our understanding of the universe was changed forever when
Albert Einstein succeeded in expressing his theory of gravity in terms of tensors. His theory
gives us black holes, gravitational waves, the expansion of the universe. It is all what physics is
about: finding the secrets of nature. If you look for a daily application of Einstein’s theory, you
might be disappointed as I can list only GPS. As a by product, it was the success of Einstein’s
general theory of relativity that gave rise to the current widespread interest of mathematicians
and physicists in tensors and their applications. Apart from the vital role in Einstein’s theory,
some applications of tensors, the list is by no means exhaustive:
• Continuum mechanics: stress tensor, strain tensor, deformation gradient tensor, etc. Civil engineers, mechanical engineers and aerospace engineers are among those who use these tensors frequently.

• Electromagnetic tensor (or Faraday tensor) in electromagnetism.

• Quantum mechanics and quantum computing utilize tensor products for combining quantum states.
With this introduction let's study these facts of the universe. Even though the ultimate goal is to somehow understand Einstein's beautiful field equations, this chapter is not about the theory of general relativity. Instead, it is my attempt to present tensors to students of engineering and science. The plan is to start simple. So, we first start with the conventional 3D space and the familiar Cartesian coordinate system in which the three axes are orthogonal. Within that framework, I shall present some examples of the so-called rank-2 tensors in Section 8.3.
To read this chapter you need to know linear algebra. For those who need a refresher on this topic, check out Chapter 11. The most important thing to note from linear algebra is that a vector is not a list of numbers. A vector is a geometrical object, and we associate with it a list of numbers (called its coordinates/components) only for computational purposes when we artificially choose a coordinate system. When we use another coordinate system, the coordinates of the vector change, but the vector is still itselfŽ.
I have consulted the following sources for the material put in this chapter:
 A student’s guide to vectors and tensors by Daniel A Fleisch, [23]

 The Einstein theory of relativity by Lillian Lieber et al., [41]

 dd

 dd
Ž
One example makes things clear. My car velocity is a vector in the geometric sense: it has a magnitude and
a direction. But whether its magnitude is measured as 60 miles per hour, 96 kilometers per hour, 27 meters per
second depends on my choice of units.
8.1 Index notation and Einstein summation convention


Let’s begin with the matter of notation. In tensor analysis one makes extensive use of indices.
For example, instead of using x; y; z to denote the three coordinates of a point in 3D space, we
use x1 ; x2 ; x3 . Einstein has introduced what we now call Einstein’s summation convention to
facilitate working with tensors. For example, for x being a vector and vi ; i D 1; : : : ; n are n
vectors, we can write

X
n
x D ˛1 v1 C ˛2 v2 C    C ˛n vn D ˛i v i
iD1

But Einstein wanted to save him some time by simply writing the above as

x D ˛1 v1 C ˛2 v2 C    C ˛n vn D ˛i vi (8.1.1)

And that is the Einstein Summation ConventionŽ . This convention can be summarized in the
following rules:

• If an index is repeated (twice) on the same side of an equation, this index is summed over, i.e., the index i in Eq. (8.1.1);

• Indices that are summed over (called dummy indices) can be changed to another index symbol. For instance, in the expression $\alpha_i v_i$, i can be changed to l (or whatever), giving us $\alpha_l v_l$, which is an entirely equivalent expression. This often needs to be done to prevent using the same symbol for multiple repeated indices.

• The free indices (the ones that are not summed over) have to match in each term of a tensor equation. So, to write $Ax = b$ as

$$x_i = A_{ij} b_j, \quad \text{or} \quad x_i = A_{il} b_l$$

is fine. But the following

$$x_i = A_{lj} b_j$$

is not a valid tensor equation. This is so because the free index i on the LHS does not show up on the RHS.

• For general relativity, the convention is to use Latin indices $i, j, k, l, \ldots$ to denote purely spatial indices. These take the values 1, 2, 3, denoting the three spatial dimensions. Greek indices $\mu, \nu, \rho, \ldots$ denote spacetime indices; these take the values 0, 1, 2, 3, where 0 denotes the time-like dimension and 1, 2, 3 the spatial dimensions.

Ž
Of this, he said "I have made a great discovery in mathematics; I have suppressed the summation sign every
time that the summation must be made over an index which occurs twice".
Example 8.1
Some examples of usage of the summation convention in 3D:

(a) $a_{ij} b_{jk}$ stands for $a_{i1} b_{1k} + a_{i2} b_{2k} + a_{i3} b_{3k}$ (sum over j)

(b) $\dfrac{\partial f_i}{\partial x_i}$ stands for $\dfrac{\partial f_1}{\partial x_1} + \dfrac{\partial f_2}{\partial x_2} + \dfrac{\partial f_3}{\partial x_3}$ (sum over i)

(c) $\dfrac{\partial^2 \phi}{\partial x_i \partial x_i}$ stands for $\dfrac{\partial^2 \phi}{\partial x_1^2} + \dfrac{\partial^2 \phi}{\partial x_2^2} + \dfrac{\partial^2 \phi}{\partial x_3^2}$

(d) $A_{ij} x_i x_j$ stands for $x^\top A x$ (sum over i and j)

(8.1.2)
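For readers who want to check the index gymnastics numerically, here is a minimal sketch (not part of the original text) using numpy.einsum, which implements exactly this sum-over-repeated-indices rule; the arrays below are arbitrary example data.

```python
import numpy as np

# Arbitrary example data (any shapes consistent with the indices would do).
A = np.random.rand(3, 3)
B = np.random.rand(3, 3)
x = np.random.rand(3)

# (a) a_ij b_jk : sum over the repeated index j -> the ordinary matrix product
C = np.einsum('ij,jk->ik', A, B)
assert np.allclose(C, A @ B)

# (d) A_ij x_i x_j : sum over both i and j -> the scalar x^T A x
q = np.einsum('ij,i,j->', A, x, x)
assert np.isclose(q, x @ A @ x)
```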

8.2 Why are tensors facts of the universe?

This section tries to understand Lieber's saying that tensors are the facts of the universe. But we do not know what tensors are yet! So let's first turn our attention to vectors and the familiar equation F = ma.

8.3 What is a tensor: some examples


The reader is no doubt familiar with the words "scalar" and "vector." A scalar is a quantity which has magnitude only, whereas a vector has both magnitude and direction. For example, consider a cube of side 2 cm; its volume is 8 cm³. Now if we rotate this cube, whatever the rotation angle is, its volume is always 8 cm³. We say that volume is a direction-independent quantity. Mass, volume, density and temperature are such quantities. The formal term for them is scalar quantities. On the other hand, it is not hard to see that velocity is a vector quantity. We need to specify the magnitude (or speed) and a direction when speaking of a velocity. After all, your car running at 50 km/h north-west is completely different from it running at 50 km/h south-east.
We shall now discuss some quantities which come up in our experience and which are
neither scalars nor vectors, but which are called TENSORS.

8.3.1 Tensor of inertia


We have derived, in Section 11.10.1, an expression for the angular momentum of a 3D rigid body. Consider a rigid body rotating with fixed angular velocity $\omega = (\omega_x, \omega_y, \omega_z)$ about an axis which passes through the origin. The total angular momentum of the body (about the origin), L, is written as

$$L_x = I_{xx}\omega_x + I_{xy}\omega_y + I_{xz}\omega_z$$
$$L_y = I_{yx}\omega_x + I_{yy}\omega_y + I_{yz}\omega_z \qquad (8.3.1)$$
$$L_z = I_{zx}\omega_x + I_{zy}\omega_y + I_{zz}\omega_z$$
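Eq. (8.3.1) is nothing but the matrix-vector product L = Iω. Here is a minimal numerical sketch of that reading, with made-up inertia values (assuming numpy is available):

```python
import numpy as np

# Inertia tensor I (symmetric) and angular velocity w; the numbers are
# invented purely for illustration.
I = np.array([[ 2.0, -0.1,  0.0],
              [-0.1,  3.0,  0.2],
              [ 0.0,  0.2,  1.5]])
w = np.array([0.5, 1.0, -0.3])     # (wx, wy, wz)

L = I @ w                          # the three equations of Eq. (8.3.1) at once
print(L)                           # (Lx, Ly, Lz)
```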
8.3.2 Stress tensor


Suppose that we have a piece of a solid material, e.g. a block of jello, and we pull it. What do we observe? We see that the solid is deformed, i.e., it changes shape. What is happening inside the solid? Photoelasticity is an experimental method to determine the stress distribution in a material.

Figure 8.1

This is how the stress at a point inside a solid is defined. Suppose that at that point a force F is applied. First, we consider a surface that is perpendicular to the x-axis and of area $\Delta y\,\Delta z$ (Fig. 8.1). The force F is resolved into three components: $F_x$ along the x-axis, $F_y$ along the y-axis, and $F_z$ along the z-axis. Then, learning from the concept of pressure (which is force divided by area), we define the following quantities:

$$S_{xx} = \frac{F_x}{\Delta y\,\Delta z}, \quad S_{yx} = \frac{F_y}{\Delta y\,\Delta z}, \quad S_{zx} = \frac{F_z}{\Delta y\,\Delta z} \qquad (8.3.2)$$

The first index refers to the direction of the force component and the second index (here x) refers to the normal of the area. So, $S_{yx}$ is the stress produced by the y component of the force acting on the face whose normal is along x. Next, we consider a surface perpendicular to the y-axis. Repeating the above steps, we define the following quantities:

$$S_{xy} = \frac{F_x}{\Delta x\,\Delta z}, \quad S_{yy} = \frac{F_y}{\Delta x\,\Delta z}, \quad S_{zy} = \frac{F_z}{\Delta x\,\Delta z} \qquad (8.3.3)$$

Finally, we consider a surface which is perpendicular to the z-axis, and we also obtain three quantities. So we have nine numbers:

$$\sigma = \begin{bmatrix} S_{xx} & S_{xy} & S_{xz} \\ S_{yx} & S_{yy} & S_{yz} \\ S_{zx} & S_{zy} & S_{zz} \end{bmatrix} \qquad (8.3.4)$$
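As a small illustration (with invented numbers, not taken from the text), the nine components of Eq. (8.3.4) can be assembled directly from the defining ratios of force components to face areas:

```python
import numpy as np

# Edge lengths of a small box and the force components acting on the three
# coordinate faces; all values are made up for illustration only.
dx, dy, dz = 0.01, 0.01, 0.01
F_on_x_face = np.array([10.0,  2.0, -1.0])   # face whose normal is along x
F_on_y_face = np.array([ 2.0, 12.0,  0.5])   # face whose normal is along y
F_on_z_face = np.array([-1.0,  0.5,  8.0])   # face whose normal is along z

# Column j holds (S_xj, S_yj, S_zj): first index = force direction,
# second index = normal of the face, as in Eq. (8.3.4).
S = np.column_stack([F_on_x_face / (dy * dz),
                     F_on_y_face / (dx * dz),
                     F_on_z_face / (dx * dy)])
print(S)
```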

8.4 What is a tensor: more examples


8.5 What is a tensor: definitions
Chapter 9
Differential equations

Contents
9.1 Mathematical models and differential equations . . . . . . . . . . . . . . 736
9.2 Models of population growth . . . . . . . . . . . . . . . . . . . . . . . . . 738
9.3 Ordinary differential equations . . . . . . . . . . . . . . . . . . . . . . . 740
9.4 Partial differential equations: a classification . . . . . . . . . . . . . . . . 747
9.5 Derivation of common PDEs . . . . . . . . . . . . . . . . . . . . . . . . . 747
9.6 Linear partial differential equations . . . . . . . . . . . . . . . . . . . . . 755
9.7 Dimensionless problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 756
9.8 Harmonic oscillation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 765
9.9 Solving the diffusion equation . . . . . . . . . . . . . . . . . . . . . . . . 784
9.10 Solving the wave equation: d’Alembert’s solution . . . . . . . . . . . . . 786
9.11 Solving the wave equation . . . . . . . . . . . . . . . . . . . . . . . . . . 790
9.12 Fourier series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 793
9.13 Classification of second order linear PDEs . . . . . . . . . . . . . . . . . 795
9.14 Fluid mechanics: Navier Stokes equation . . . . . . . . . . . . . . . . . . 795

In this chapter we discuss what is probably the most important application of calculus: differential equations. These equations are those that describe many laws of nature. In classical physics, we have to mention Newton's second law $F = m\ddot{x}$ that describes motion, Fourier's heat equation $\dot{\theta} = \alpha^2\,\partial^2\theta/\partial x^2$ that describes how heat is transferred in a medium, Maxwell's equations describing electromagnetism, and the Navier-Stokes equations that calculate how fluids move. In quantum mechanics, we have the Schrödinger equation. In biology, we can cite the Lotka-Volterra equations, also known as the predator-prey equations, a pair of first-order nonlinear differential equations used to describe the dynamics of biological systems in which two species interact, one as a predator and the other as prey. In finance there is the Black-Scholes equation.
The German mathematician Bernhard Riemann once said:

...partial differential equations are the basis of all physical theorems. In the theory of sound in gases, liquids and solids, in the investigation of elasticity, in optics, everywhere partial differential equations formulate basic laws of nature which can be checked against experiments.

The chapter introduces the mathematics used to model the real world. The attention is on how to derive these equations more than on how to solve them. Yet, some exact solutions are presented. Numerical solutions to differential equations are treated in Chapter 12. Topics which are too mathematical, such as uniqueness, are omitted. Also discussed is the problem of mechanical vibrations: simple harmonic motion and waves.
The following excellent books were consulted for the materials presented in this chapter:

• Partial differential equations for scientists and engineers by Stanley FarlowŽŽ [21];

• Classical Mechanics by John Taylor [70];

• Modelling with Differential Equations by Burghes and Borrie [12]

The plan of this chapter is as follows. We start with a toy problem in Section 9.1 to get a feeling of what mathematical modeling looks like. Then, we become a bit more serious with a real differential equation describing population growth (Section 9.2). In Section 9.3, we discuss ordinary differential equations. Next, we move to partial differential equations (such as the wave equation $u_{tt} = c^2 u_{xx}$). We start with Section 9.4, in which we get familiar with partial differential equations and discuss some terminology. The derivation of common partial differential equations (e.g. the heat equation, the wave equation and so on) is treated in Section 9.5. Section 9.7 deals with dimensionless problems. Harmonic oscillation is given in Section 9.8. How to solve the heat (diffusion) equation is presented in Section 9.9. Solutions of the wave equation are given in Sections 9.10 and 9.11. It was when solving these two equations that the idea of Fourier series was born.

9.1 Mathematical models and differential equations


To introduce mathematical modeling and differential equations, let us consider a simple problem as follows. Assume that we want to know how long it takes for a snowball to melt completely. How can we start? We begin with experiments or observations. Assume that experimental data show us that the rate of change of the mass of the ball is proportional to the surface area of the ball. This data alone is not sufficient. We need to make some assumptions, i.e., we are building a
ŽŽ
Stanley Jerome Farlow (born 1937) is an American mathematician specializing in differential equations. For
many years he has been a professor at the University of Maine. Farlow is the author of several books in mathematics.

John R. Taylor (born 2 February 1939 in London) got a BA in Mathematics from Cambridge University in 1960 and a PhD in Theoretical Physics from the University of California, Berkeley, in 1963. Taylor is an emeritus professor of physics at the University of Colorado, Boulder.

simplified model of the reality. The first assumption is that the ball is always a sphere. The second assumption is that the density of snow does not change in time. These assumptions might not be enough to give a very good model, but we have to start with something anyway. In summary, we have the following set of facts to build our model:

• The rate of change of the mass of the ball is proportional to the surface area of the ball;

• At any time the ball is a sphere;

• The density of the snow is constant.

All we have to do is to translate the above facts (written in English) into the language of mathematics. The assumption is that all variables are continuous. Thus, we can use differential calculus to differentiate them as we want, even though for some problems, such as population growth, the quantity of interest is not really continuous! Remember that we're building a model. As the mass is density times volume, we can determine the mass with r(t) representing the radius of the snowball at time tŽŽ. We also compute its derivative w.r.t. t (because the derivative captures changes):

$$M = \frac{4}{3}\pi \rho\, r^3 \implies \frac{dM}{dt} = 4\pi \rho\, r^2 \frac{dr}{dt} \qquad (9.1.1)$$

Using the experimental data on the rate of change of M, we can write

$$\frac{dM}{dt} = -k(4\pi r^2) \implies 4\pi \rho\, r^2 \frac{dr}{dt} = -k(4\pi r^2) \implies \boxed{\frac{dr}{dt} = -\frac{k}{\rho}} \qquad (9.1.2)$$

where k is a constant that can only be experimentally determined. The minus sign reflects the fact that the mass is decreasing. Quantities such as ρ and k whose values do not change in time are called parameters.
The equation in the box is a differential equation, an equation that contains derivatives. In fact, it is an ordinary differential equation, as there also exist partial differential equations that involve partial derivatives. In this example, t is the only independent variable and r(t) is the dependent variable. An ordinary differential equation expresses a relation between a dependent variable (a function), its derivatives (first, second derivatives etc.) and the independent variable: $F(r(t), r', r'', \ldots, r^{(n)}, t) = 0$. If there is more than one independent variable, we have a partial differential equation, as the derivatives are then partial derivatives.

Now we have an equation. The next step is to solve it to find the solutionŽ. For what purpose? For the prediction of the radius of the snowball at any time instance. It is the prediction of future events that is the ultimate goal of mathematical modeling of either natural phenomena or engineering systems.
ŽŽ The notation r(t) is read "r at time t", and the parentheses tell us that our variable is a function of time.
Ž A solution to a differential equation is a function that, when substituted (together with all involved derivatives) into the equation, results in an identity. For example, y = sin x is a solution to the differential equation y' = cos x.

For this particular problem, it is easy to find the solution: by integrating both sides of the boxed equation in Eq. (9.1.2):

$$\frac{dr}{dt} = -c, \quad c := \frac{k}{\rho} \implies r(t) = -ct + A \qquad (9.1.3)$$

where A is a real number. But why do we get not one but many solutions? That is because the radius at time t depends, of course, on the initial radius of the ball. So, we must know this initial radius (denoted by R); then by substituting t = 0 into Eq. (9.1.3), we get A = R. Thus, $r(t) = R - ct$. Now we can predict when the ball has completely melted: it is when $r(t_m) = 0$, which gives $t_m = R/c$. And we need to check this against observations. If the prediction and the observation are in good agreement, we have discovered a law. If not, our assumptions are too strict and we need to refine them and refine our model.
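As a quick symbolic check (not in the original text), the boxed ODE and its solution can be reproduced with sympy, using the same symbols c and R as above:

```python
import sympy as sp

# Solve dr/dt = -c with r(0) = R and recover the melting time t_m = R/c.
t, c, R = sp.symbols('t c R', positive=True)
r = sp.Function('r')

sol = sp.dsolve(sp.Eq(r(t).diff(t), -c), r(t), ics={r(0): R})
print(sol)                      # r(t) = R - c*t
print(sp.solve(sol.rhs, t))     # [R/c], the time at which the ball has melted
```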

9.2 Models of population growth


Populations are groups of organisms of the same species living in the same area at the same time.
They are described by characteristics that include:

• population size: the number of individuals in the population

• population density: how many individuals are in a particular area

• population growth: how the size of the population is changing over time.

If population growth is just one of many population characteristics, what makes studying it so
important? First, studying how and why populations grow (or shrink!) helps scientists make
better predictions about future changes in population sizes and growth rates. This is essential for
answering questions in areas such as biodiversity conservation (e.g., the polar bear population
is declining, but how quickly, and when will it be so small that the population is at risk for
extinction?) and human population growth (e.g., how fast will the human population grow, and
what does that mean for climate change, resource use, and biodiversity?).
In what follows a simple population growth model is presented. It is based on the ideas put forward by Thomas Robert MalthusŽŽ in his 1798 book An Essay on the Principle of Population. The basic assumption of the model is that the birth rate and death rate are proportional to the population size. Now, again, we just have to translate that assumption into mathematics. Let N(t) be the population size at time t. Then, within a short time interval Δt, the births and deaths are

$$\text{births} = \alpha N(t)\Delta t, \qquad \text{deaths} = \beta N(t)\Delta t$$

where α and β are real positive constants; they are similar to k in the toy model in Section 9.1.
ŽŽ
Thomas Robert Malthus (13/14 February 1766 – 23 December 1834) was an English cleric, scholar and
influential economist in the fields of political economy and demography.

With that we can determine the increase (or decrease) of the population within Δt, labeled ΔN:

$$\text{births} - \text{deaths} = \Delta N = \lambda N(t)\Delta t, \qquad \lambda := \alpha - \beta$$

You can guess what we shall do next: divide the above equation by Δt (so that a rate of population change appears) and let Δt → 0:

$$\frac{\Delta N}{\Delta t} = \lambda N(t) \implies \lim_{\Delta t \to 0}\frac{\Delta N}{\Delta t} = \lambda N(t) \implies \boxed{\dot{N} = \frac{dN}{dt} = \lambda N} \qquad (9.2.1)$$

Here the overdot denotes differentiation with respect to time, following Newton. Now, we have to solve the boxed ordinary differential equation. Luckily for us, we can solve this equation. The solution, i.e., N(t), should involve the exponential function $e^{ct}$ (why?). Here is how:

$$\frac{dN}{dt} = \lambda N \implies \frac{dN}{N} = \lambda\,dt \implies \int_0^t \frac{dN}{N} = \int_0^t \lambda\,dt \implies N(t) = N_0 e^{\lambda t} \qquad (9.2.2)$$

where we've assumed that the starting time is t = 0. Looking at the solution we can understand why this model is called an exponential growth model.

How good is this model? To answer that (pure mathematicians do not care), scientists use real data. For example, Table 9.1 shows USA population statistics taken from [12]. Of course there is much more data, but we need only a small portion of it to calibrate the model. Calibrating a model means finding values for the parameters (or constants) in the model. In the context here, we need to find $N_0$ and λ using the data in Table 9.1.

Table 9.1: USA population data.

Year    USA Population (10⁶)
1790    3.9
1800    5.3
1810    7.2

We have data starting from the year 1790, thus t = 0 is that year and then $N_0 = 3.9$ million. For λ, use the data for 1800, noting that t in the model is measured in units of 10 years, thus 1800 corresponds to t = 1:

$$5.3 = N(1) = N_0 e^{\lambda} \implies \lambda = \ln\left(\frac{5.3}{3.9}\right) = 0.307$$

Now it is time for prediction. The calibrated model is used to predict the population up to 1870. The results given in Table 9.2 indicate that the model agrees well with the data for several decades, but by 1870 the error is nearly 20%. It's time for an improved model.

Table 9.2: USA population data vs prediction.

Year    USA Population (10⁶)    Prediction (10⁶)    Error (%)
1810    7.2                     7.2                 0.0
1820    9.6                     9.80                2.1
...     ...                     ...                 ...
1870    38.6                    45.47               17.8
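The calibration and the predictions of Table 9.2 are easy to reproduce; the short sketch below (assuming numpy) recovers λ ≈ 0.307 and the errors quoted above.

```python
import numpy as np

# Calibrate N(t) = N0 * exp(lam * t), with t measured in decades since 1790.
N0 = 3.9                        # population in 1790, in millions
lam = np.log(5.3 / 3.9)         # from N(1) = 5.3  ->  lam ~ 0.307

def N(t_decades):
    return N0 * np.exp(lam * t_decades)

for year, observed in [(1810, 7.2), (1820, 9.6), (1870, 38.6)]:
    t = (year - 1790) / 10
    predicted = N(t)
    error = 100 * abs(predicted - observed) / observed
    print(year, round(predicted, 2), f"error {error:.1f}%")
```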

For students who would like to become scientists trying to understand our world, no one describes better how we, human beings, unravel the mysteries of the world than Richard Feynman in his interesting book The Pleasure of Finding Things OutŽŽ:

. . . a fun analogy in trying to get some idea of what we’re doing in trying to under-
stand nature, is to imagine that the gods are playing some great game like chess. . .
and you don’t know the rules of the game, but you’re allowed to look at the board, at
least from time to time. . . and from these observations you try to figure out what the
rules of the game are, what the rules of the pieces moving are. You might discover
after a bit, for example, that when there’s only one bishop around on the board that
the bishop maintains its color. Later on you might discover the law for the bishop as
it moves on the diagonal, which would explain the law that you understood before
– that it maintained its color – and that would be analogous to discovering one
law and then later finding a deeper understanding of it. Then things can happen,
everything’s going good, and then all of a sudden some strange phenomenon oc-
curs in some corner, so you begin to investigate that – it’s castling, something you
didn’t expect. We’re always, by the way, in fundamental physics, always trying to
investigate those things in which we don’t understand the conclusions. After we’ve
checked them enough, we’re okay.

9.3 Ordinary differential equations


So far we have met two ordinary differential equations, and they're both of this form:

$$\dot{x} = f(x, t) \qquad (9.3.1)$$

In the problem of population growth, x(t) is N(t), the population size. As the highest derivative in the equation is the first, it is called a first order ODE. Now, we show that we can always convert a high order ODE to a system of first order ODEs. For example, the equation for a damped harmonic oscillator is (Section 9.8)
ŽŽ
You can watch the great man here.

$$m\ddot{x} + b\dot{x} + kx = 0 \iff \ddot{x} = -\frac{b}{m}\dot{x} - \frac{k}{m}x \qquad (9.3.2)$$

Now, to remove the second derivative, we introduce a variable $x_2 = \dot{x}$; this leads to $\ddot{x} = \dot{x}_2$, and voilà, we have removed the second derivative. And of course instead of x we use $x_1 = x$. Then, $\dot{x}_1 = \dot{x} = x_2$, and we can write $\dot{x}_2 = \ddot{x} = -(b/m)x_2 - (k/m)x_1$ from Eq. (9.3.2). Now, using matrix notation, we write

$$\begin{cases} x_1 = x \\ x_2 = \dot{x} \end{cases} \implies \begin{bmatrix}\dot{x}_1\\ \dot{x}_2\end{bmatrix} = \begin{bmatrix}0 & 1\\ -k/m & -b/m\end{bmatrix}\begin{bmatrix}x_1\\ x_2\end{bmatrix} \qquad (9.3.3)$$

This is a system of two first order linear ODEs with a constant coefficient matrix (the matrix does not vary with time). How about a problem with a time dependent term, like the forced oscillator whose equation is $m\ddot{x} + b\dot{x} + kx = F\sin t$? The idea is the same: introduce another variable to get rid of t:

$$\begin{cases} x_1 = x \\ x_2 = \dot{x} \\ x_3 = t \end{cases} \implies \begin{cases} \dot{x}_1 = x_2 \\ \dot{x}_2 = -\dfrac{k}{m}x_1 - \dfrac{b}{m}x_2 + \dfrac{F}{m}\sin(x_3) \\ \dot{x}_3 = 1 \end{cases} \qquad (9.3.4)$$

So, we can now just focus on the following system of equations, which provides a general framework to study ODEs:

$$\begin{aligned} \dot{x}_1 &= f_1(x_1, \ldots, x_n)\\ &\;\;\vdots\\ \dot{x}_n &= f_n(x_1, \ldots, x_n)\end{aligned} \qquad (9.3.5)$$

This general equation covers both linear systems, such as the one in Eq. (9.3.3), and nonlinear ones, e.g. Eq. (9.3.4). However, it is hard to solve nonlinear systems analytically, so in the next section we just focus on systems of linear ODEs.
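Although numerical methods are the subject of Chapter 12, it may help to see that, once an ODE is written as a first-order system, it can be handed to a standard integrator. Below is a minimal sketch using scipy's solve_ivp for the damped oscillator of Eq. (9.3.3); the values of m, b, k and the initial state are illustrative only.

```python
import numpy as np
from scipy.integrate import solve_ivp

m, b, k = 1.0, 0.2, 4.0            # illustrative parameters

def rhs(t, x):
    x1, x2 = x                     # x1 = displacement, x2 = velocity
    return [x2, -(k / m) * x1 - (b / m) * x2]

sol = solve_ivp(rhs, t_span=(0.0, 20.0), y0=[1.0, 0.0])
print(sol.y[0, -1])                # displacement at the final time
```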

9.3.1 System of linear first order equations

If we have the equation $\dot{x} = \lambda x$, we now know (from Section 9.2) that the solution is $x(t) = C_0 e^{\lambda t}$ where $C_0 = x(0)$. The next problem we're interested in is a system of similar equations. For example,

$$\begin{aligned}\dot{x}_1 &= 2x_1\\ \dot{x}_2 &= 5x_2\end{aligned} \qquad\qquad \begin{aligned}\dot{x}_1 &= 1x_1 + 2x_2\\ \dot{x}_2 &= 3x_1 + 2x_2\end{aligned}$$

We use linear algebra (matrices) to solve them, so we re-write the above as (with $\dot{\mathbf{x}} = (\dot{x}_1, \dot{x}_2)$)

$$\dot{\mathbf{x}} = \underbrace{\begin{bmatrix}2 & 0\\ 0 & 5\end{bmatrix}}_{A_1}\begin{bmatrix}x_1\\ x_2\end{bmatrix}, \qquad \dot{\mathbf{x}} = \underbrace{\begin{bmatrix}1 & 2\\ 3 & 2\end{bmatrix}}_{A_2}\begin{bmatrix}x_1\\ x_2\end{bmatrix}$$

Before solving them, let's make one observation: it is easier to solve the first system than the second one, because the two equations in the former are uncoupled. This is reflected in the diagonal matrix $A_1$ with two zeros (red terms). The solution to the first system is simply $\mathbf{x} = (C_1e^{2t}, C_2e^{5t})$. But we can also write this as

$$\mathbf{x} = C_1 e^{2t}\begin{bmatrix}1\\0\end{bmatrix} + C_2 e^{5t}\begin{bmatrix}0\\1\end{bmatrix}$$

Note that 2 and 5 are the eigenvalues of the matrix $A_1$, and the two unit vectors are the eigenvectors of $A_1$. Thus, the solution to a system of linear first order differential equations can be expressed in terms of the eigenvalues and eigenvectors of the coefficient matrix, at least when that matrix is diagonal and the two eigenvalues are different.

For the second system $\dot{\mathbf{x}} = A_2\mathbf{x}$, the matrix is not diagonal. But there is a way to diagonalize a matrix (check Section 11.11.4 for matrix diagonalization) using its eigenvalues λ and eigenvectors v. So, we put this information below:

$$A_2 = \begin{bmatrix}1 & 2\\3 & 2\end{bmatrix}: \quad \lambda_1 = 4,\;\lambda_2 = -1, \quad \mathbf{v}_1 = \begin{bmatrix}2\\3\end{bmatrix},\;\mathbf{v}_2 = \begin{bmatrix}-1\\+1\end{bmatrix}, \quad P = \begin{bmatrix}2 & -1\\3 & +1\end{bmatrix}$$

Let $\mathbf{x} = P\mathbf{y}$ and substitute that into the original system; we get (don't forget that $\dot{\mathbf{x}} = A_2\mathbf{x}$)

$$\mathbf{x} = P\mathbf{y} \implies \dot{\mathbf{x}} = P\dot{\mathbf{y}} \implies P\dot{\mathbf{y}} = A_2\mathbf{x} = A_2P\mathbf{y} \implies \dot{\mathbf{y}} = P^{-1}A_2P\mathbf{y}$$

But $P^{-1}A_2P$ is simply a diagonal matrix with the eigenvalues (of $A_2$) on the diagonal, thus we can easily solve for y, and from that we obtain x:

$$\dot{\mathbf{y}} = \begin{bmatrix}4 & 0\\0 & -1\end{bmatrix}\mathbf{y} \implies \mathbf{y} = \begin{bmatrix}C_1e^{4t}\\C_2e^{-t}\end{bmatrix} \implies \boxed{\mathbf{x} = C_1e^{4t}\begin{bmatrix}2\\3\end{bmatrix} + C_2e^{-t}\begin{bmatrix}-1\\+1\end{bmatrix}} \qquad (9.3.6)$$

Again, we can write the solution in terms of the eigenvalues and eigenvectors of the coefficient matrix. To determine $C_{1,2}$ we need the initial condition $\mathbf{x}_0 = \mathbf{x}(0)$; substituting t = 0 into the boxed equation in Eq. (9.3.6) we can determine $C_{1,2}$ in terms of $\mathbf{x}_0$:

$$\mathbf{x}_0 = C_1\begin{bmatrix}2\\3\end{bmatrix} + C_2\begin{bmatrix}-1\\+1\end{bmatrix} = \begin{bmatrix}2 & -1\\3 & +1\end{bmatrix}\begin{bmatrix}C_1\\C_2\end{bmatrix} \implies \begin{bmatrix}C_1\\C_2\end{bmatrix} = \begin{bmatrix}2 & -1\\3 & +1\end{bmatrix}^{-1}\mathbf{x}_0$$

With a given $\mathbf{x}_0$, this equation gives us $C_{1,2}$; put them into the boxed equation in Eq. (9.3.6), and we're finished. Usually, as a scientist or engineer, we stop here, but mathematicians go further. They see that

$$\mathbf{x} = C_1e^{4t}\begin{bmatrix}2\\3\end{bmatrix} + C_2e^{-t}\begin{bmatrix}-1\\+1\end{bmatrix} = \begin{bmatrix}2 & -1\\3 & +1\end{bmatrix}\begin{bmatrix}e^{4t} & 0\\0 & e^{-t}\end{bmatrix}\begin{bmatrix}2 & -1\\3 & +1\end{bmatrix}^{-1}\mathbf{x}_0 \qquad (9.3.7)$$

Only if we know linear algebra we can appreciate why this form is better. So refresh your linear algebra before
continuing.

Is there something useful in this new way of looking at the solution? Yes, the red matrix! It is a matrix of exponentials. What would you do next, having seen this?

For ease of presentation, we discussed systems of only two equations, but as can be seen, the method and thus the result extend to systems of n equations (n can be 1000):

$$\begin{bmatrix}\dot{x}_1\\\dot{x}_2\\\vdots\\\dot{x}_n\end{bmatrix} = \begin{bmatrix}A_{11} & A_{12} & \cdots & A_{1n}\\ A_{21} & A_{22} & \cdots & A_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ A_{n1} & A_{n2} & \cdots & A_{nn}\end{bmatrix}\begin{bmatrix}x_1\\x_2\\\vdots\\x_n\end{bmatrix} \implies \mathbf{x} = \sum_{i=1}^{n} C_i e^{\lambda_i t}\mathbf{x}_i$$

where the eigenvalues of A are $\lambda_i$ and the eigenvectors are $\mathbf{x}_i$. Note that this solution is only possible when A is diagonalizable, i.e., when the eigenvectors are linearly independent.

It is remarkable to look back on the long journey from the simple equation $\dot{x} = \lambda x$ with the solution $x(t) = C_0e^{\lambda t}$ to a system of as many equations as you want, with the solution still of the same form $\sum_{i=1}^{n} C_ie^{\lambda_i t}\mathbf{x}_i$. It is simply remarkable!
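The eigenvalue recipe is also easy to try numerically. Here is a small sketch (assuming numpy) using the matrix A₂ of this section and an arbitrary initial condition; it builds x(t) = Σᵢ Cᵢ e^{λᵢt} xᵢ and checks that it satisfies ẋ = Ax.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 2.0]])           # the matrix A2 of this section
x0 = np.array([1.0, 0.0])            # arbitrary initial condition

lam, V = np.linalg.eig(A)            # eigenvalues (4 and -1) and eigenvectors
C = np.linalg.solve(V, x0)           # coefficients C_i from x0 = V C

def x(t):
    # x(t) = sum_i C_i * exp(lam_i * t) * v_i, written as a matrix product
    return V @ (C * np.exp(lam * t))

# Compare a numerical derivative of x with A x at some time t:
t, h = 0.3, 1e-6
print(np.allclose((x(t + h) - x(t)) / h, A @ x(t), rtol=1e-4))   # True
```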
But wait: how about non-diagonalizable matrices? The next section answers that question.

9.3.2 Exponential of a matrix


Now we do something extraordinary. Start with $\dot{x} = ax$, whose solution is $x(t) = ce^{at}$. Then, consider the linear system $\dot{\mathbf{x}} = A\mathbf{x}$: can we write the solution as $\mathbf{x} = e^{At}\mathbf{x}_0$, with $\mathbf{x}_0$ being a vector? To answer that question, we need to know what the exponential of a matrix means. And mathematicians define $e^A$ by analogy with $e^x$:

$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots \implies \boxed{e^{A} = I + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots} \qquad (9.3.8)$$

On the RHS (of the boxed eqn) we have a sum of a bunch of matrices, thus $e^A$ is a matrix. If we can compute the powers of a matrix (e.g. $A^2, A^3, \ldots$) we can compute the exponential of a matrix! Let's use the matrix $A_2$ and compute $e^{At}$. For simplicity, I drop the subscript 2. The key step is to diagonalize AŽ:

$$A = PDP^{-1}, \qquad P = \begin{bmatrix}2 & -1\\3 & +1\end{bmatrix}, \qquad D = \begin{bmatrix}4 & 0\\0 & -1\end{bmatrix}$$
Ž Hey, isn't this section only for non-diagonalizable matrices? We're now testing the idea of $e^{At}$ for the case where we know the solution first. If it does not work for this case then forget the idea.

Then, using the definition of $e^A$, we can compute $e^{At}$ as follows (with $A^k = PD^kP^{-1}$, $k = 1, 2, \ldots$):

$$\begin{aligned} e^{At} &= I + At + \frac{A^2}{2!}t^2 + \frac{A^3}{3!}t^3 + \cdots\\ &= PIP^{-1} + PDP^{-1}t + \frac{1}{2!}PD^2P^{-1}t^2 + \frac{1}{3!}PD^3P^{-1}t^3 + \cdots\\ &= P\left(I + Dt + \frac{1}{2!}D^2t^2 + \frac{1}{3!}D^3t^3 + \cdots\right)P^{-1}\\ &= Pe^{Dt}P^{-1} \quad \text{(the red term is $e^{Dt}$ due to Eq. (9.3.8))}\end{aligned}$$

Can we compute $e^{Dt}$? Because if we can then we're done. Using Eq. (9.3.8), it can be shown that

$$e^{Dt} = \begin{bmatrix}e^{4t} & 0\\ 0 & e^{-t}\end{bmatrix}$$

Have we seen this matrix before? Yes, it is exactly the red matrix in Eq. (9.3.7)! Now we have $e^{At}$ as

$$e^{At} = \begin{bmatrix}2 & -1\\3 & +1\end{bmatrix}\begin{bmatrix}e^{4t} & 0\\0 & e^{-t}\end{bmatrix}\begin{bmatrix}2 & -1\\3 & +1\end{bmatrix}^{-1}$$

Multiplying by $\mathbf{x}_0$ we get $e^{At}\mathbf{x}_0 = \mathbf{x}$, the solution we're looking for (compare with Eq. (9.3.7)). Now we have reason to believe that the exponential of a matrix, as we have defined it, works.

Is there an easier way to see that $\mathbf{x} = e^{At}\mathbf{x}_0$ is the solution of $\dot{\mathbf{x}} = A\mathbf{x}$? Yes, differentiate x! But only if we're willing to compute the derivative of $e^{At}$. It turns out this is not hard at allŽ:

$$\frac{d e^{At}}{dt} = \frac{d}{dt}\left(I + At + \frac{(At)^2}{2!} + \frac{(At)^3}{3!} + \cdots\right) = 0 + A + A^2t + \frac{1}{2}A^3t^2 + \cdots = A\left(I + At + \frac{1}{2}A^2t^2 + \cdots\right) = Ae^{At}$$

With that, we can verify whether $\mathbf{x} = e^{At}\mathbf{x}_0$ is the solution to $\dot{\mathbf{x}} = A\mathbf{x}$:

$$\dot{\mathbf{x}} = \frac{d(e^{At}\mathbf{x}_0)}{dt} = \frac{de^{At}}{dt}\mathbf{x}_0 = A(e^{At}\mathbf{x}_0) = A\mathbf{x}$$

We can also prove that this is the only solutionŽŽ.

Ž So the differentiation rule $\frac{d}{dt}(e^{\alpha t}) = \alpha e^{\alpha t}$ still holds if α is a matrix.
ŽŽ We assume that x(t) is one solution and that there is another solution y(t); then we build z = x − y. Now, letting $v(t) = e^{-At}z(t)$, it can be shown that $\dot{v} = 0$, so v(t) must be constant. But v(0) = 0, thus v(t) = z(t) = 0. Therefore, y = x: the solution is unique.
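scipy ships a matrix exponential, scipy.linalg.expm, so the claim that x(t) = e^{At}x₀ solves ẋ = Ax can also be checked numerically. A minimal sketch with the same matrix A₂ as before:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 2.0],
              [3.0, 2.0]])
x0 = np.array([1.0, -2.0])          # arbitrary initial condition

def x(t):
    return expm(A * t) @ x0         # the candidate solution e^{At} x0

# A numerical derivative of x(t) should match A x(t):
t, h = 0.5, 1e-6
print(np.allclose((x(t + h) - x(t)) / h, A @ x(t), rtol=1e-4))   # True
```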

What if the matrix is non-diagonalizable? We have computed $e^{At}$ by diagonalizing the matrix and taking advantage of the fact that the exponential of a diagonal matrix is easy to get. But not all matrices are diagonalizable! For example, consider solving this equation: $y'' - 2y' + y = 0$. First, we convert it into a system of two first order DEs: with $\mathbf{x} = (y, y')$,

$$\frac{d}{dt}\begin{bmatrix}y\\y'\end{bmatrix} = \begin{bmatrix}y'\\2y' - y\end{bmatrix} \implies \dot{\mathbf{x}} = A\mathbf{x}, \qquad A = \begin{bmatrix}0 & 1\\-1 & 2\end{bmatrix}$$

The matrix A is non-diagonalizable because it has repeated eigenvalues and thus linearly dependent eigenvectors:

$$\lambda_1 = \lambda_2 = 1, \qquad \mathbf{x}_1 = \begin{bmatrix}1\\1\end{bmatrix}, \qquad \mathbf{x}_2 = \alpha\begin{bmatrix}1\\1\end{bmatrix}$$
We have to rely on the infinite series in Eq. (9.3.8) to compute $e^{At}$. First, massaging A a bit||:

$$A = I + A - I \implies e^{At} = e^{(I + A - I)t} = e^{It}e^{(A-I)t}$$

Using Eq. (9.3.8) we can compute $e^{It}$ and $e^{(A-I)t}$ (for the latter, note that $(A - I)^2 = 0$):

$$e^{It} = Ie^{t}, \qquad e^{(A-I)t} = I + (A - I)t$$

With these results, we can write $e^{At}$:

$$e^{At} = Ie^t\left[I + (A - I)t\right] = e^t\left[I + (A - I)t\right]$$

Therefore, the solution in the form $e^{At}\mathbf{x}_0$ is

$$\mathbf{x} = e^{At}\mathbf{x}_0 = e^t\left[I + (A - I)t\right]\mathbf{x}_0 \implies y(t) = e^t y(0) - te^t y(0) + te^t y'(0)$$

Is this solution correct? We can check! It is easy to see that $y = e^t$ and $y = te^t$ are two solutions to $y'' - 2y' + y = 0$. Thus, the solution is a linear combination of them. Hence, the solution obtained using the exponential of a matrix is correct.

This method was based on the trick $A = I + A - I$ and the fact that $(A - I)^2 = 0$. How could we know all of this in advance? It's better to have a method that depends less on tricks.

Schur factorization. Assume a 2 × 2 matrix A with one eigenvalue λ and the associated eigenvector v, i.e., $A\mathbf{v} = \lambda\mathbf{v}$. Now we select a vector w such that v, w are linearly independent; thus we can write $A\mathbf{w} = c\mathbf{v} + d\mathbf{w}$ for some $c, d \in \mathbb{R}$. Now, we have

$$\begin{cases}A\mathbf{v} = \lambda\mathbf{v}\\ A\mathbf{w} = c\mathbf{v} + d\mathbf{w}\end{cases} \implies A\begin{bmatrix}\mathbf{v} & \mathbf{w}\end{bmatrix} = \begin{bmatrix}\mathbf{v} & \mathbf{w}\end{bmatrix}\begin{bmatrix}\lambda & c\\0 & d\end{bmatrix} \implies A = \begin{bmatrix}\mathbf{v} & \mathbf{w}\end{bmatrix}\begin{bmatrix}\lambda & c\\0 & d\end{bmatrix}\begin{bmatrix}\mathbf{v} & \mathbf{w}\end{bmatrix}^{-1}$$

|| We accepted that $e^{At}e^{Bt} = e^{(A+B)t}$ if AB = BA, the proof of which is skipped.
 Actually there is a theorem, called the Cayley-Hamilton theorem, that reveals this. The characteristic equation of A is $(\lambda - 1)^2 = 0$. That theorem, stating that the matrix also satisfies its characteristic equation, then gives us $(A - I)^2 = 0$.

So, we have proved that for any 2 × 2 matrix, it is always possible to factor A into the form $PTP^{-1}$ where T is an upper triangular matrix. Now, we're interested in the case where A is defective, i.e., it has a double eigenvalue $\lambda_1 = \lambda_2 = \lambda$; thus we haveŽŽ

$$A = \begin{bmatrix}\mathbf{v} & \mathbf{w}\end{bmatrix}\begin{bmatrix}\lambda & c\\0 & \lambda\end{bmatrix}\begin{bmatrix}\mathbf{v} & \mathbf{w}\end{bmatrix}^{-1} \implies A^k = \begin{bmatrix}\mathbf{v} & \mathbf{w}\end{bmatrix}\begin{bmatrix}\lambda & c\\0 & \lambda\end{bmatrix}^k\begin{bmatrix}\mathbf{v} & \mathbf{w}\end{bmatrix}^{-1}$$

It turns out that it is easy to compute the blue term: a triangular matrix is also nice to work with. Indeed, we can decompose the blue matrix, now denoted by Λ, into the sum of a diagonal matrix and a nilpotent matrix. A nilpotent matrix is a square matrix N such that $N^p = 0$ for some positive integer p; the smallest such p is called the index of N. Using the binomial theorem and the nice property of nilpotent matrices (below, the red matrix is N with p = 2), we get

$$\Lambda^k = \left(\begin{bmatrix}\lambda & 0\\0 & \lambda\end{bmatrix} + \begin{bmatrix}0 & c\\0 & 0\end{bmatrix}\right)^k = \begin{bmatrix}\lambda & 0\\0 & \lambda\end{bmatrix}^k + k\begin{bmatrix}\lambda & 0\\0 & \lambda\end{bmatrix}^{k-1}\begin{bmatrix}0 & c\\0 & 0\end{bmatrix} = \begin{bmatrix}\lambda^k & kc\lambda^{k-1}\\0 & \lambda^k\end{bmatrix}$$

Thus, using Eq. (9.3.8) we can determine $e^{\Lambda t}$:

$$e^{\Lambda t} = \begin{bmatrix}e^{\lambda t} & cte^{\lambda t}\\0 & e^{\lambda t}\end{bmatrix}$$

And the solution to $\dot{\mathbf{x}} = A\mathbf{x}$ is $\mathbf{x}(t) = e^{At}\mathbf{x}_0$, which can be written as (with $(a, b)^\top := P^{-1}\mathbf{x}_0$)

$$\mathbf{x}(t) = e^{At}\mathbf{x}_0 = Pe^{\Lambda t}P^{-1}\mathbf{x}_0 = \begin{bmatrix}\mathbf{v} & \mathbf{w}\end{bmatrix}\begin{bmatrix}e^{\lambda t} & cte^{\lambda t}\\0 & e^{\lambda t}\end{bmatrix}\begin{bmatrix}a\\b\end{bmatrix} = ae^{\lambda t}\mathbf{v} + be^{\lambda t}(ct\mathbf{v} + \mathbf{w})$$

The final step is to find w and we're done. Recall that $A\mathbf{w} = c\mathbf{v} + d\mathbf{w}$, but d = λ; thus (redefining w as (1/c)w) we obtain

$$A\mathbf{w} = c\mathbf{v} + \lambda\mathbf{w} \iff A\mathbf{w} = \mathbf{v} + \lambda\mathbf{w} \iff (A - \lambda I)\mathbf{w} = \mathbf{v} \implies (A - \lambda I)^2\mathbf{w} = \mathbf{0}$$

We call v the eigenvector of A; how about w? Let's put the equations of these two vectors together:

$$(A - \lambda I)^1\mathbf{v} = \mathbf{0}, \qquad (A - \lambda I)^2\mathbf{w} = \mathbf{0} \qquad (9.3.9)$$

With this, it is no surprise that mathematicians call w a generalized eigenvector (of order 2) of A. Generalized eigenvectors play a similar role for defective matrices that eigenvectors play for diagonalizable matrices. The eigenvectors of a diagonalizable matrix span the whole vector space. The eigenvectors of a defective matrix do not, but the generalized eigenvectors of that matrix do.
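The defective example from the start of this discussion can be checked numerically as well: the closed-form solution y(t) = e^t y(0) − te^t y(0) + te^t y'(0) agrees with what expm(At) produces. A small sketch with arbitrary initial values:

```python
import numpy as np
from scipy.linalg import expm

# y'' - 2y' + y = 0 written as xdot = A x with x = (y, y').
A = np.array([[ 0.0, 1.0],
              [-1.0, 2.0]])
y0, v0 = 2.0, -1.0                  # arbitrary y(0) and y'(0)
x0 = np.array([y0, v0])

t = 0.7
y_expm = (expm(A * t) @ x0)[0]
y_formula = np.exp(t) * y0 - t * np.exp(t) * y0 + t * np.exp(t) * v0
print(np.isclose(y_expm, y_formula))    # True
```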
ŽŽ We must have d = λ, as A and the red matrix are similar: they have the same eigenvalues.

9.4 Partial differential equations: a classification


More often than not, a quantity varies from point to point and from time to time; such a quantity is a function of two variables u(x, t) in the simplest setting, or it can be u(x, y, z, t). Thus, we have changes of u in space when t is fixed, and we also have changes of u in time at a fixed point x. Let's first introduce some short notations for the partial derivatives of u(x, t):

$$u_x = \frac{\partial u}{\partial x}, \quad u_t = \frac{\partial u}{\partial t}, \quad u_{xx} = \frac{\partial^2 u}{\partial x^2}, \quad u_{tt} = \frac{\partial^2 u}{\partial t^2} \qquad (9.4.1)$$

Then, a partial differential equation (PDE) in terms of u(x, t) is the following equation:

$$F(u, u_x, u_t, u_{xx}, u_{tt}) = 0 \qquad (9.4.2)$$

Note that partial derivatives of order higher than 2 are not discussed. This is because in physics and engineering we rarely see them present in differential equations.

To classify different PDEs, the concepts of order, dimension and linearity of a PDE are introduced:

Order The order of a PDE is the order of its highest partial derivative; $u_t = u_{xx}$ is a second-order PDE;

Dimension The dimension of a PDE is the number of independent variables; $u_{tt} = u_{xx} + u_{yy}$ is a 3D PDE as it involves x, y and t;

Linearity A PDE is said to be linear if the function u and all its partial derivatives appear in a linear fashion, i.e., they are not multiplied together, they are not squared, etc.

Table 9.3 presents some examples to demonstrate these concepts.

Table 9.3: Some PDEs with associated order, dimension and linearity.

equation                      order   linear   dim.
u_t = u_xx                    2       yes      2
u_tt = u_xx + u_yy            2       yes      3
x u_x + y u_y = u^2           1       no       2

9.5 Derivation of common PDEs


This section presents the derivation of common PDEs; those PDEs show up quite frequently in
many science and engineering fields. Knowing how to get the equations is important particularly
if you want to be a scientist. Mathematicians are more interested in solving the equations and

determining the behavior of the solutions; for example mathematicians are interested in questions
such as whether the solutions are unique or when the solutions exist.
We start with the wave equation in Section 9.5.1, derived centuries ago by d’Alembert in
1746. We live in a world of waves. Whenever we throw a pebble into a pond, we see circular ripples form on its surface and gradually disappear. The water moves up and down, and the effect visible to us, the ripple, looks like an outwardly moving wave. When you pluck the string of a guitar, the string moves up and down, exhibiting a transverse wave: the particles in the string move perpendicular to the direction of the wave propagation. The bump or rattle that we feel during an earthquake is due to seismic S-waves, which move rock particles up and down, perpendicular to the direction of the wave propagation.
We continue in Section 9.5.2 with the heat equation (or diffusion equation) derived by Fourier
in 1807.

9.5.1 Wave equation


This section presents the derivation of the wave equation. It all started with the aim to understand
the vibration of a violin string. Why this object? This is because a string can be modeled as an
infinitely thin line, and its motion is constrained in a plane. So, it is simple from a mathematics
point of view.

Figure 9.1: Derivation of the wave equation: a vibrating string.

So, we consider a string fixed at two ends. At time t = 0, the string is horizontal and unstretched (Fig. 9.1). As the string undergoes only transverse motion, i.e., motion perpendicular to the original string, we use u(x, t) to designate the transverse displacement of point x at time t. Our task is to find the equation relating u(x, t) to the physics of the string.

The key idea is to use Newton's 2nd law (what else?) for a small segment of the string. Such a segment, of length Δx, is shown in Fig. 9.1. What are the forces in the system? First, we have f(x, t) in the vertical direction, which can be gravity or any external force. This is

a distributed force, that is, force per unit length (i.e., the total force acting on the segment is fΔx). Second, we have the tension force T(x, t) inside the string. We use Newton's 2nd law F = ma in the vertical direction; with $a = \partial^2 u/\partial t^2$ and the mass being density times length, that is $m = \rho\sqrt{(\Delta x)^2 + (\Delta u)^2}$, we write

$$\rho\sqrt{(\Delta x)^2 + (\Delta u)^2}\;\frac{\partial^2 u}{\partial t^2} = T(x + \Delta x, t)\sin\theta(x + \Delta x, t) - T(x, t)\sin\theta(x, t) + f(x, t)\Delta x \qquad (9.5.1)$$

Dividing this equation by Δx and considering Δx → 0, we get

$$\rho\sqrt{1 + \left(\frac{\partial u}{\partial x}\right)^2}\;\frac{\partial^2 u}{\partial t^2} = \frac{\partial}{\partial x}\big(T(x,t)\sin\theta(x,t)\big) + f(x,t) = \frac{\partial T}{\partial x}\sin\theta(x,t) + T(x,t)\frac{\partial\theta}{\partial x}\cos\theta(x,t) + f(x,t) \qquad (9.5.2)$$

We know that the slope of u(x, t) is tan θ(x, t), so we can write

$$\tan\theta(x,t) = \frac{\partial u}{\partial x}(x,t), \qquad \theta(x,t) = \arctan\left(\frac{\partial u}{\partial x}\right) \qquad (9.5.3)$$

where we also need an expression for θ(x, t). From tan θ(x, t) we can compute sin θ(x, t) and cos θ(x, t), and from the expression for θ we can compute the derivative of θ:

$$\sin\theta(x,t) = \frac{\partial u/\partial x}{\sqrt{1 + (\partial u/\partial x)^2}}, \qquad \cos\theta(x,t) = \frac{1}{\sqrt{1 + (\partial u/\partial x)^2}}, \qquad \frac{\partial\theta}{\partial x} = \frac{\partial^2 u/\partial x^2}{1 + (\partial u/\partial x)^2} \qquad (9.5.4)$$

Now comes the art of approximation (otherwise the problem would be too complex). We consider only small vibrations, that is when $|\partial u/\partial x| \ll 1$ŽŽ, and with this simplifying condition the above expressions become

$$\sin\theta(x,t) = \frac{\partial u}{\partial x}, \qquad \cos\theta(x,t) = 1, \qquad \frac{\partial\theta}{\partial x} = \frac{\partial^2 u}{\partial x^2} \qquad (9.5.5)$$

With all this, Eq. (9.5.2) is simplified to

$$\rho\frac{\partial^2 u}{\partial t^2} = \frac{\partial T}{\partial x}\frac{\partial u}{\partial x} + T(x,t)\frac{\partial^2 u}{\partial x^2} + f(x,t) \qquad (9.5.6)$$

The equation looks much simpler. But it is still unsolvable. Why? Because we have one equation but two unknowns, u(x, t) and T(x, t). But wait, we have another Newton's 2nd law in the horizontal direction:

$$T(x + \Delta x, t)\cos\theta(x + \Delta x, t) - T(x, t)\cos\theta(x, t) = 0 \qquad (9.5.7)$$


ŽŽ The symbol ≪ means "much smaller than".

Dividing it by Δx and letting Δx → 0, we get

$$\frac{\partial}{\partial x}\big[T(x,t)\cos\theta(x,t)\big] = 0 \qquad (9.5.8)$$

But note that cos θ(x, t) ≈ 1, thus the above equation indicates that T(x, t) is constant. Let's use T to designate the tension in the string; Eq. (9.5.6) then becomes

$$\rho\frac{\partial^2 u}{\partial t^2} = T\frac{\partial^2 u}{\partial x^2} + f(x,t)$$

Ignoring f(x, t) and letting $c^2 = T/\rho$, we get the wave equation:

$$\frac{\partial^2 u}{\partial t^2} = c^2\frac{\partial^2 u}{\partial x^2} \qquad (9.5.9)$$

What does this equation mean? On the LHS we have the acceleration term and on the RHS we have the second spatial derivative of u(x, t). The second spatial derivative of u measures the concavity of the curve u(x, t). Thus, when the curve is concave downward, this term is negative, and the wave equation tells us that the acceleration is also negative, so the string accelerates downwards (Fig. 9.2).

Figure 9.2

We do not discuss the solution to the wave equation here. But even without it, we can still say something about its solutions. The first thing is that this equation is linear, due to the linearity of the differentiation operator. What does this entail? Let u(x, t) and v(x, t) be two solutionsŽŽ to the wave equation, that is

$$\frac{\partial^2 u}{\partial t^2} = c^2\frac{\partial^2 u}{\partial x^2}, \qquad \frac{\partial^2 v}{\partial t^2} = c^2\frac{\partial^2 v}{\partial x^2}$$

Then any linear combination of these two, i.e., αu + βv, where α and β are two constants, is also a solution:

$$\frac{\partial^2(\alpha u + \beta v)}{\partial t^2} = c^2\frac{\partial^2(\alpha u + \beta v)}{\partial x^2}$$
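Both claims (that travelling waves solve Eq. (9.5.9) and that any linear combination of solutions is again a solution) can be verified symbolically in a few lines; the sketch below uses sympy with arbitrary functions f and g and constants α, β (this anticipates d'Alembert's solution of Section 9.10).

```python
import sympy as sp

x, t, c, alpha, beta = sp.symbols('x t c alpha beta')
f, g = sp.Function('f'), sp.Function('g')

# A linear combination of the two travelling waves f(x - ct) and g(x + ct)
u = alpha * f(x - c * t) + beta * g(x + c * t)

# Residual of the wave equation u_tt - c^2 u_xx; it should vanish identically
residual = sp.diff(u, t, 2) - c**2 * sp.diff(u, x, 2)
print(sp.simplify(residual))    # 0
```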
ŽŽ Why can the wave equation have more than one solution? Actually any PDE has infinitely many solutions. Think of it this way: the violin string can be bent into any shape you like before it is released and the wave equation takes over. In other words, each initial condition leads to a distinct solution.

3D wave equation. Having derived the 1D wave equation, the question is: what is the 3D version? Let's try to guess what it would be. It should be of the same form as the 1D equation but have terms relating to the other dimensions (red terms below):

$$\frac{\partial^2 u}{\partial t^2} = c^2\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}\right) \qquad (9.5.10)$$
It is remarkable that a model born in attempts to understand how a string vibrates now has a
wide spectrum of applications. Here are some applications of the wave equation:

(a) Vibrations of a stretched membrane, such as a drumhead

(b) Sound waves in air or other media

(c) Waves in an incompressible fluid, such as water

(d) Electromagnetic waves, such as light waves and radio waves.

9.5.2 Diffusion equation


If we hold a metal rod at one end and the other end of the rod is heated, after a while our hand feels the heat. This phenomenon is called heat conduction. It can be explained roughly as follows. Initially, before the rod is heated, the atoms (or electrons) in the rod vibrate around their equilibrium positions (you can imagine tiny particles jiggling). When one end of the rod is heated, the atoms in this part vibrate more quickly and they collide with nearby atoms (that have a lower temperature). Through these collisions heat is transferred from the hotter atoms to the colder ones and eventually to the other end of the rod.
Herein, we want to make a mathematical model to describe heat conduction at a macroscopic
scale (i.e., without worrying about atoms, molecules etc.). The most distinct advantage of such
a continuum model is that it is possible to investigate the heat conduction in a large piece of
material e.g. automotive Diesel piston. Obviously it is impossible with current computers to do
so with an atomistic model.
When energy is added to a system and there is no change in the kinetic or potential energy, the
temperature of the system usually rises. The quantity of energy required to raise the temperature
of a given mass of a substance by some amount varies from one substance to another. To quantify
this, the specific heat c is used, which is the amount of energy required to raise the temperature
per unit mass by 1 °C.
We are deriving the equation of heat conduction inside a long thin bar of length L (to minimize the mathematical complexity). The outer surface of the bar is insulated and the left end is heated up while the right end is cooled down. Therefore, there is heat moving along the bar to the right (Fig. 9.3). Let's denote by θ(x, t) the temperature in the bar at a distance x from the left end at time t. Furthermore, let A be the cross sectional area of the bar and ρ the density of the material making up the bar. These two quantities (A and ρ) can vary along the bar or they can be constant. For simplicity, they are considered constant in what follows. We should always start with the simplest model that is possible.

Figure 9.3: Heat conduction in a long bar.

The idea is to consider a segment of the bar, e.g. the part of the bar between x = a and x = b, and to apply the principle of conservation of energy to this segment. The conservation of energy is simple: the rate of change of heat inside the segment is equal to the heat flux entering the left end minus the heat flux going out of the right end. The rate of change of heat is given by

$$\text{rate of change of heat} = \frac{\partial}{\partial t}\int_a^b c\rho A\,\theta(x,t)\,dx \qquad (9.5.11)$$

while the heat fluxes are

$$\text{heat fluxes} = AJ(a,t) - AJ(b,t) \qquad (9.5.12)$$

where J is the heat flux density. Now, we can write the equation of conservation of heat as

$$\frac{\partial}{\partial t}\int_a^b c\rho A\,\theta(x,t)\,dx = AJ(a,t) - AJ(b,t) \qquad (9.5.13)$$

Using Leibniz's rule and the fundamental theorem of calculus, we can elaborate this equation as

$$\int_a^b c\rho A\,\frac{\partial\theta(x,t)}{\partial t}\,dx = -A\int_a^b \frac{\partial J}{\partial x}\,dx \qquad (9.5.14)$$
$$\implies \int_a^b \left(c\rho\frac{\partial\theta(x,t)}{\partial t} + \frac{\partial J}{\partial x}\right)dx = 0 \qquad (9.5.15)$$
$$\implies c\rho\frac{\partial\theta(x,t)}{\partial t} + \frac{\partial J}{\partial x} = 0 \qquad (9.5.16)$$

In the third equation, we moved from an integral equation to a partial differential equation. This is because the segment [a, b] is arbitrary, so the integrand must be identically zero.

You might guess that we still miss a connection between J and θ(x, t) (one equation with two unknown variables is unsolvable). Indeed, and Fourier carried out experiments to give us just that relation (known as a constitutive equation):

$$J = -k\frac{\partial\theta}{\partial x} \qquad (9.5.17)$$

where k is known as the coefficient of thermal conductivity. The thermal conductivity provides an indication of the rate at which heat energy is transferred through a medium by the diffusion process.

With Eq. (9.5.17), our equation Eq. (9.5.16) becomes (note that k is constant):

$$c\rho\frac{\partial\theta(x,t)}{\partial t} + \frac{\partial}{\partial x}\left(-k\frac{\partial\theta}{\partial x}\right) = 0 \implies \frac{\partial\theta}{\partial t} = \alpha^2\frac{\partial^2\theta}{\partial x^2}, \qquad \alpha^2 = \frac{k}{c\rho} \qquad (9.5.18)$$

which is a linear second order (in space) partial differential equation. As it involves the second derivative of θ with respect to x, we need two boundary conditions on θ: θ(0, t) = θ₁ and θ(L, t) = θ₂, where θ₁,₂ are real numbers. Furthermore, we need one initial condition (as we have the first derivative of θ w.r.t. time): θ(x, 0) = g(x) for some function g(x) which represents the initial temperature in the bar at t = 0. Altogether, the PDE, the boundary conditions and the initial condition make up an initial-boundary value problem:

$$\frac{\partial\theta}{\partial t} = \alpha^2\frac{\partial^2\theta}{\partial x^2}, \qquad 0 < x < L \qquad (9.5.19)$$
$$\theta(x,0) = g(x), \qquad 0 \le x \le L \qquad (9.5.20)$$
$$\theta(0,t) = \theta_1, \quad \theta(L,t) = \theta_2, \qquad t > 0 \qquad (9.5.21)$$
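For the special case of zero boundary temperatures (θ₁ = θ₂ = 0), a classical separated solution of this problem can be verified symbolically; a minimal sympy sketch (anticipating Section 9.9):

```python
import sympy as sp

x, t, L, alpha = sp.symbols('x t L alpha', positive=True)
n = sp.symbols('n', integer=True, positive=True)

# Candidate solution: decaying exponential in time times a sine in space
theta = sp.exp(-(alpha * n * sp.pi / L)**2 * t) * sp.sin(n * sp.pi * x / L)

# It satisfies theta_t = alpha^2 theta_xx ...
residual = sp.diff(theta, t) - alpha**2 * sp.diff(theta, x, 2)
print(sp.simplify(residual))                              # 0

# ... and vanishes at both ends x = 0 and x = L
print(theta.subs(x, 0), sp.simplify(theta.subs(x, L)))    # 0 0
```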

Another derivation of Eq. (9.5.16).

We consider a segment of the bar $[x_0, x_0 + \Delta x]$; then we can write

$$\frac{d}{dt}\int_{x_0}^{x_0+\Delta x} c\rho A\,\theta(x,t)\,dx = AJ(x_0,t) - AJ(x_0+\Delta x,t) \qquad (9.5.22)$$

Applying the intermediate value theorem of integral calculus (Eq. (4.11.3)) to the integral on the LHS,

$$c\rho A\,\frac{\partial\theta(x_1,t)}{\partial t}\,\Delta x = AJ(x_0,t) - AJ(x_0+\Delta x,t) \qquad (9.5.23)$$

where $x_1 \in [x_0, x_0 + \Delta x]$. Dividing both sides by Δx, we obtain

$$c\rho A\,\frac{\partial\theta(x_1,t)}{\partial t} = -A\left(\frac{J(x_0+\Delta x,t) - J(x_0,t)}{\Delta x}\right) \qquad (9.5.24)$$

The final step is to let Δx go to zero; then $x_1$ becomes $x_0$ and on the RHS we have the derivative of J evaluated at $x_0$:

$$c\rho\,\frac{\partial\theta(x_0,t)}{\partial t} = -J_x(x_0,t) \qquad (9.5.25)$$

This equation holds for any $x_0$, so we can replace $x_0$ by x. And we get the 1D heat diffusion equation.

3D diffusion equation. Having derived the 1D heat equation, it is not hard to derive the 3D equation. Before doing so, let's try to guess what it would be. It should be of the same form as the 1D equation but have terms relating to the other dimensions (red terms below):

$$\frac{\partial\theta}{\partial t} = \alpha^2\left(\frac{\partial^2\theta}{\partial x^2} + \frac{\partial^2\theta}{\partial y^2} + \frac{\partial^2\theta}{\partial z^2}\right) \qquad (9.5.26)$$

We use Gauss's theorem, see Section 7.11.6, for the derivationŽŽ. We consider an arbitrary domain V with surface S. The temperature is now given by θ(x, t) where $\mathbf{x} = (x_1, x_2, x_3)$ is the position vector. The conservation of energy equation is

$$\frac{\partial}{\partial t}\int_V c\rho\theta\,dV = -\int_S \mathbf{J}\cdot\mathbf{n}\,dA$$
$$\frac{\partial}{\partial t}\int_V c\rho\theta\,dV = -\int_V \nabla\cdot\mathbf{J}\,dV \quad \text{(Gauss's theorem)} \qquad (9.5.27)$$
$$\int_V \left(c\rho\frac{\partial\theta}{\partial t} + \nabla\cdot(-k\nabla\theta)\right)dV = 0 \quad (\mathbf{J} = -k\nabla\theta)$$

As the volume domain V is arbitrary, we get the well known 3D heat equation (assuming k is constant):

$$\frac{\partial\theta(\mathbf{x},t)}{\partial t} = \alpha^2\,\Delta\theta(\mathbf{x},t), \qquad \Delta\theta := \nabla\cdot(\nabla\theta) = \sum_{i=1}^{3}\frac{\partial^2\theta}{\partial x_i^2} \qquad (9.5.28)$$

where Δ is the Laplacian operator, named after the French mathematician Pierre-Simon Laplace (1749-1827). We see this operator again and again in physics. Some people say that it is the most important operator in mathematical physics.

In the above derivation, we have used the 3D version of Eq. (9.5.17):

$$\mathbf{J} = -k\nabla\theta \quad\text{or}\quad \begin{bmatrix}J_x\\J_y\\J_z\end{bmatrix} = -\begin{bmatrix}k & 0 & 0\\0 & k & 0\\0 & 0 & k\end{bmatrix}\begin{bmatrix}\theta_{,x}\\\theta_{,y}\\\theta_{,z}\end{bmatrix} \qquad (9.5.29)$$

The matrix form is convenient when the conductivity is not the same in all directions. In that case we say the heat conduction is not isotropic but anisotropic, and we use three different values for the diagonal terms.

Eq. (9.5.26) can also be used to model other diffusion processes (that's why it is referred to as the diffusion equation rather than the more restrictive term heat equation). For example, if a drop of red dye is placed in a body of water, the dye will gradually spread out and permeate the entire body. If convection effects are negligible, Eq. (9.5.26) will describe the diffusion of the dye through the water; θ(x, t) is now the concentration of dye at x and time t!
ŽŽ Of course it is possible to consider an infinitesimal cube and follow the same steps done for the long bar. But the divergence theorem provides a shorter way.

9.5.3 Poisson’s equation

9.6 Linear partial differential equations


We have seen a few partial differential equations; it is time to sit back and study their common features. That's what mathematicians do. For example, they studied quadratic equations, then cubic equations, and then nth-order polynomial equations of the form $a_nx^n + a_{n-1}x^{n-1} + \cdots + a_2x^2 + a_1x + a_0 = 0$. Doing the same thing here, we can see that the wave equation, the diffusion equation etc. can all be written in the form L(u) = F, where L is a linear differential operator. Fed a number x, the square-root operator gives another number $\sqrt{x}$. Similarly, fed a function $u(x_1, x_2, t)$, the operator L gives another function L(u). In the case of the wave equation, L is

$$L = \frac{\partial^2}{\partial t^2} - c^2\left(\frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2}\right)$$

Thus, fed a function u we get L(u):

$$L(u) = \frac{\partial^2 u}{\partial t^2} - c^2\left(\frac{\partial^2 u}{\partial x_1^2} + \frac{\partial^2 u}{\partial x_2^2}\right)$$

Now it is time to have a general expression for L, which generalizes the concrete instances we have met:

$$L = a(\mathbf{x}) + \sum_{i=1}^{n} b_i(\mathbf{x})\frac{\partial}{\partial x_i} + \sum_{i,j=1}^{n} c_{ij}(\mathbf{x})\frac{\partial^2}{\partial x_i\partial x_j} + \cdots \qquad (9.6.1)$$

where $a, b_i, c_{ij}$ are some coefficients, which can be constants or functions of x.

where a; bi ; cij are some coefficients, which can be constants or functions of x.


A linear partial differential equation is simply an equation of the form L(u) = F where L is a linear partial differential operator and F is a function of x. Such an equation is called homogeneous if F = 0 and inhomogeneous if F ≠ 0.

As a linear operator consists of partial derivatives, and the derivative of af(x) is a times the derivative of f, and the derivative of a sum is the sum of the derivatives, it readily follows that a linear operator L satisfies the following properties for functions u, v and a scalar c:

$$\text{constant taken outside the operator:}\quad L(cu) = cL(u)$$
$$\text{operator of a sum is the sum of the operators:}\quad L(u + v) = L(u) + L(v) \qquad (9.6.2)$$

If we combine the above two properties, we then get

$$L(\alpha u + \beta v) = L(\alpha u) + L(\beta v) = \alpha L(u) + \beta L(v)$$

But this is not enough for mathematicians: why just two functions u, v? So they go for n functions $u_1, u_2, \ldots, u_n$, and write $L(a_1u_1 + \cdots + a_nu_n) = a_1L(u_1) + \cdots + a_nL(u_n)$.

Principle of linear superposition.



 For the wave equation, a = 0, $b_i = 0$, $c_{11} = c_{22} = -c^2$ and $c_{33} = 1$, noting that $x_3 = t$.

9.7 Dimensionless problems


The transient heat conduction equation needs values for the heat capacity c, the density ρ, and the heat conduction coefficient k of the material. In addition, relevant values must be chosen for the initial and boundary temperatures. With a dimensionless mathematical model, as explained in this section, no physical quantities need to be assigned. Not only is this a simplification of great convenience, as one simulation is valid for any type of material, but it also actually increases the understanding of the physical problem.

This section is organized as follows. Section 9.7.1 discusses dimensions and units. Then the scaling of ordinary differential equations is treated in Section 9.7.4.

9.7.1 Dimensions and units


A physical dimension is a property we associate with physical quantities for purposes of clas-
sification or differentiation. Mass, length, time, and force are examples of physical dimensions.
There are fundamental and derived dimensions. Fundamental dimensions include mass, length,
time, and perhaps charge and temperatureŽŽ .
We need some suitable mathematical notation to calculate with dimensions. The dimension of length is written as [L], the dimension of mass as [M], and the dimension of time as [T]. We then express the dimensions of other quantities (e.g. speed) in terms of the fundamental dimensions. For instance, the dimension of speed is [L/T] or [LT⁻¹]. The dimension of force, another derived quantity, is the same as the dimension of mass times acceleration, and hence the dimension of force is [MLT⁻²]. The point is that every quantity which is not explicitly dimensionless, like a pure number e.g. π or e, has characteristic dimensions which are not affected by the way we measure it. As we will see shortly, this provides a useful check on any calculations we do.
Units give the magnitude of some dimension relative to an arbitrary standard. For example,
when we say that a person is six feet tall, we mean that person is six times as long as an object
whose length is defined to be one foot. The standard size chosen is, of course, entirely arbitrary,
but becomes very useful for comparing measurements made in different places and times. Several
national laboratories are devoted to maintaining sets of standards, and using them to calibrate
instruments.
In contrast to dimensions, of which only a few are needed, there is a multitude of units for
measuring most quantities: lengths measured in inches, meters, centimeters and kilometers. It is,
therefore, always necessary to attach a unit to a number, as when giving a person’s height as 175
cm or as 5 feet 9 inches. Without units, a number is at best meaningless and at worst misleading
to the reader.
The International System of Units (SI, abbreviated from the French Système international
(d’unités)) is the modern form of the metric system. It comprises a coherent system of units
of measurement starting with seven base units, which are the second (the unit of time with the
symbol s), metre (length, m), kilogram (mass, kg), ampere (electric current, A), kelvin (thermo-

ŽŽ Noting that this choice is arbitrary, it is fine to use force as a fundamental dimension instead of mass, for example.

dynamic temperature, K), mole (amount of substance, mol), and candela (luminous intensity,
cd).
From the seven base (or fundamental) units, we can derive many more derived units. For example, what is the unit of force in SI? Using Newton's 2nd law, we write

$$[F] = \text{kg}\cdot\frac{\text{m}}{\text{s}^2} \qquad (9.7.1)$$

And to honour Newton, we invented a new unit called the newton (N); thus 1 N = 1 kg·m/s². Similarly, we have 1 Pa = 1 N/m² as the SI unit of pressure and stress, in honour of Blaise Pascal.
Some common consistent SI units are given in Table 9.4.

Quantity            Relation              SI (m, s, N)       Dimension
length              -                     m                  [L]
time                -                     s                  [T]
mass                -                     kg                 [M]
force               mass × acceleration   N = 1 kg·m/s²      [MLT⁻²]
pressure/stress     force / area          Pa = 1 N/m²        [ML⁻¹T⁻²]

Table 9.4: Some physical quantities with corresponding dimensions and SI units.

That is not the end of the story about units. Why do we have metres and still need kilometres? The reason is simple: we are not good at handling very large or very small numbers. If the metre were the only unit of length, then for lengths smaller than 1 metre we would have to use decimals, e.g. 0.05 m. To avoid that, sub-units were developed. Instead of 0.05 m we say 5 cm. Similarly, for 20 000 m we write 20 km, which is much easier to comprehend. In conclusion, larger and smaller quantities are expressed by using appropriate prefixes with the base unit. Table 9.5 presents all prefixes in SI. One example: the mass of the Earth is 5 972 Yg (yottagrams), which is 5.972 × 10^24 kg.

9.7.2 Power laws


If we stop doing calculations and make one observation about dimensions, we see that dimensional quantities always appear in power laws. For example, we have [L][T]^-1 (or [L^1 T^-1]) for speed, [MLT^-2] for force, [L^3] for volume and [M][L]^-3 for density. We can prove this as follows.
Assume that x is a quantity with the dimension of length, and y is a quantity that depends on x via y = f(x). You can think of y as a volume, for example; then f(x) = x^3. Suppose now that x takes two values x_1 and x_2; then we get two y's: y_1 = f(x_1) and y_2 = f(x_2). The crucial point is that the ratio y_1/y_2 is a dimensionless number. Obviously, a dimensionless number exists independently of any system of units we create. That is, if we now measure x_1 and x_2 using a different system of units, their values are αx_1 and αx_2; e.g. α = 1000 if x_1 was measured in metres and is now measured in mm. Then, we have

Table 9.5: Prefixes in SI. Prefix names have been mostly chosen from Greek words (positive powers of 10) or Latin words (negative powers of 10), although recent extensions of the range of powers of 10 have resulted in the use of words from other languages. ‘Kilo’ comes from the Greek word for 1000 (10^3), and ‘milli’ comes from the Latin word for one thousandth (10^-3).

Large measurements                Small measurements
Prefix   Symbol   Multiple        Prefix   Symbol   Sub-multiple
yotta    Y        10^24           deci     d        10^-1
zetta    Z        10^21           centi    c        10^-2
exa      E        10^18           milli    m        10^-3
peta     P        10^15           micro    μ        10^-6
tera     T        10^12           nano     n        10^-9
giga     G        10^9            pico     p        10^-12
mega     M        10^6            femto    f        10^-15
kilo     k        10^3            atto     a        10^-18
hecto    h        10^2            zepto    z        10^-21
deka     da       10^1            yocto    y        10^-24

\[ \frac{y_1}{y_2} = \frac{f(x_1)}{f(x_2)} = \text{a dimensionless number} \;\Longrightarrow\; \boxed{\frac{f(x_1)}{f(x_2)} = \frac{f(\alpha x_1)}{f(\alpha x_2)}} \]
Now, our goal is to solve the boxed equation and hope that its solution is of the form f(x) = C x^β, a power functionŽŽ.
Now, we rearrange the boxed equation a bit and take the first derivative of both sides of the resulting equation with respect to α:
\[ f(\alpha x_1) = \frac{f(x_1)}{f(x_2)} f(\alpha x_2) \;\Longrightarrow\; f_1(\alpha x_1)\,x_1 = \frac{f(x_1)}{f(x_2)}\, f_1(\alpha x_2)\,x_2, \qquad f_1 := \frac{df}{dx} \]
The above equation holds for any value of x_1, x_2 and α. Now, setting α = 1,
\[ \frac{f_1(x_1)}{f(x_1)}\,x_1 = \frac{f_1(x_2)}{f(x_2)}\,x_2 \;\Longrightarrow\; \frac{f'(x)}{f(x)}\,x = k \;\Longleftrightarrow\; \frac{f'(x)}{f(x)} = \frac{k}{x} \]
ŽŽ It is easy to check that if f(x) is a power function then it satisfies the boxed equation. So we're on the right track.

Now, integrating both sides of the above equation we obtain
\[ \int \frac{f'(x)}{f(x)}\,dx = k \int \frac{1}{x}\,dx \;\Longrightarrow\; \ln f(x) = k \ln x + A \]
And that leads to, indeed, a power function for f(x):
\[ f(x) = C x^k \]
That is good, but why power functions and not the other functions that we have spent so much time studying in calculus? The reason is simple: we can never have more complicated functions. One simple way to see this is to use Taylor series. For example, the exponential function has the Taylor series
\[ e^x = 1 + x + \frac{x^2}{2} + \cdots \]
If x were a certain length, then e^x would require adding a length to an area to a volume, which is nonsense. So, if we see in an equation e^x or sin x or any function other than x^k, then x must be a dimensionless number; otherwise the equation is physically wrong.
The next step is to consider physical quantities that depend on more than one quantity. For simplicity, I just consider a quantity z that depends on two other quantities x, y: z = f(x, y). Doing the same thing, we will have
\[ \frac{f(x_1,y_1)}{f(x_2,y_2)} = \text{dimensionless number} \;\Longrightarrow\; \frac{f(x_1,y_1)}{f(x_2,y_2)} = \frac{f(\alpha x_1, \beta y_1)}{f(\alpha x_2, \beta y_2)} \]
And we get
\[ \left.\begin{aligned} f(x,y) &= C_1 x^a \\ f(x,y) &= C_2 y^b \end{aligned}\right\} \;\Longrightarrow\; f(x,y) = C x^a y^b \]

9.7.3 Dimensional analysis


As all students of science and engineering know, equations must be dimensionally homogeneous;
that is, all terms in an equation must have the same dimensions–one cannot add apples and
oranges. This simple observation forms the basis of what is called dimensional analysis.

Example 9.1
The spring-mass system has only two quantities: the spring stiffness k with dimension [FL^-1] and the mass m with dimension [M]. We know that the dimension of force is [F] = [MLT^-2]. Thus, k has dimension [MT^-2]. We also know that the dimension of ω₀ is [T^-1]. As this quantity is a function of m and k, we have (from the power law above)
\[ \omega_0 = C\, m^a k^b \]
where a, b are determined so that the dimensions of both sides are the same:

\[ [\omega_0] = [M]^a [M T^{-2}]^b \;\Longrightarrow\; [T^{-1}] = [M^{a+b} T^{-2b}] \]
And this gives us the following system of two linear equations to solve for a and b:
\[ \left.\begin{aligned} a + b &= 0 \\ -2b &= -1 \end{aligned}\right\} \;\Longrightarrow\; a = -1/2, \quad b = 1/2 \]
Thus, we obtain the formula for the angular frequency without actually solving the equation,
\[ \omega_0 = C\sqrt{k/m} \]
But dimensional analysis cannot give us the value of C. For that we can either solve the problem (which is usually hard) or do an experiment. It is interesting to rewrite the above equation as
\[ C = \omega_0\sqrt{m/k} \]
The number ω₀√(m/k) is called a dimensionless group. Furthermore, as it is a dimensionless number, its value is invariant under a change of units. Thus, it is called a universal constant.
In summary, this example has three independent dimensional quantities and they need two fundamental dimensions ([M] and [T]). The solution shows that there exists one dimensionless group.
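To make the exponent bookkeeping concrete, here is a minimal Python sketch (NumPy assumed; the variable names are mine, not from the text) that solves the small linear system for a and b:

```python
import numpy as np

# Exponent-matching equations from [T^-1] = [M^(a+b) T^(-2b)]:
#   a + b = 0    (powers of M)
#  -2b    = -1   (powers of T)
A = np.array([[1.0, 1.0],
              [0.0, -2.0]])
rhs = np.array([0.0, -1.0])

a, b = np.linalg.solve(A, rhs)
print(a, b)   # -0.5 0.5  ->  omega_0 = C * m**(-1/2) * k**(1/2) = C*sqrt(k/m)
```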

Example 9.2
For example, suppose we want to work out how the flow Q of an ideal fluid through a hole of diameter D depends on the pressure difference Δp. It seems plausible that Q might also depend on the density of the fluid ρ, so we look for a relationship of the form
\[ Q = k D^a (\Delta p)^b \rho^c \]
Now, we write the dimensions of all quantities involved:
\[ [\rho] = [ML^{-3}], \quad [D] = [L], \quad [\Delta p] = [ML^{-1}T^{-2}], \quad [Q] = [L^3 T^{-1}] \]

Hence, the equations are
\[ [L^3 T^{-1}] = [L^a\, M^b L^{-b} T^{-2b}\, M^c L^{-3c}] \;\Longrightarrow\; \begin{cases} a - b - 3c = 3 \\ b + c = 0 \\ -2b = -1 \end{cases} \;\Longrightarrow\; \begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} 2 \\ 0.5 \\ -0.5 \end{bmatrix} \]
Thus, we obtain Q without having to actually solve the problem,
\[ Q = k D^2 \sqrt{\frac{\Delta p}{\rho}} \;\Longrightarrow\; \frac{Q}{D^2}\sqrt{\frac{\rho}{\Delta p}} = k \]

In summary, this example has four independent dimensional quantities and they need three fundamental dimensions ([M], [L] and [T]). The solution shows that there exists one dimensionless group.

Example 9.3
In the previous example, we considered only an ideal fluid, i.e. a fluid with zero viscosity. Now suppose that we're dealing with a viscous fluid, with viscosity ν of dimension [L^2 T^-1]. Now Q is given by
\[ Q = k D^a (\Delta p)^b \rho^c \nu^d \]
Hence, the equations are
\[ [L^3 T^{-1}] = [L^a\, M^b L^{-b} T^{-2b}\, M^c L^{-3c}\, L^{2d} T^{-d}] \]
which results in the following system of linear equations (three equations for four unknowns):
\[ \begin{cases} a - b - 3c + 2d = 3 \\ b + c = 0 \\ -2b - d = -1 \end{cases} \;\Longleftrightarrow\; \begin{bmatrix} 1 & -1 & -3 & 2 \\ 0 & 1 & 1 & 0 \\ 0 & -2 & 0 & -1 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} = \begin{bmatrix} 3 \\ 0 \\ -1 \end{bmatrix} \]

Using linear algebra from Chapter 11, the rank of the matrix associated with the above system is three, and the system has one free variable. Choosing b as the free variable, we can solve for a, c, d in terms of b:
\[ c = -b, \qquad d = 1 - 2b, \qquad a = 1 + 2b \]
Thus, Q is written as
\[ Q = k D^{1+2b} (\Delta p)^b \rho^{-b} \nu^{1-2b} \tag{9.7.2} \]
If the pattern we observed in the previous two examples still works, we should have two dimensionless groups. This is so because there are five independent dimensional quantities and they need three fundamental dimensions ([M], [L] and [T]). Indeed, we have two dimensionless groups, Q/(Dν) and D²Δp/(ρν²):
\[ \frac{Q}{D\nu} = k\left( \frac{D^2\,\Delta p}{\rho\,\nu^2} \right)^b \tag{9.7.3} \]

The three examples presented suggest a relationship between the number of quantities, the number of fundamental dimensions and the number of dimensionless groups. Now, we need to prove it. Instead of a general proof, we consider Example 9.3 and prove that there must be two dimensionless numbers in this example. First, we write the dimensions of all quantities involved, but we have to explicitly write the powers of [M], [L] and [T]. For example, for

Œ D ŒM 1 L 3  (with no time), we write Œ D ŒM 1 L 3 T 0 :

Œ D ŒM 1 L 3 T 0 ; ŒD D ŒM 0 L1 T 0 ; Œp D ŒM 1 L 1 T 2



(9.7.4)
ŒQ D ŒM 0 L3 T 1
; Œ D ŒM 0 L2 T 1


Now, suppose we can build a dimensionless number C of the form (power law)
\[ C = \rho^{x_1} D^{x_2} (\Delta p)^{x_3} Q^{x_4} \nu^{x_5} \]

Noting that C is dimensionless, its dimension is [C] = [M^0 L^0 T^0] (or [C] = 1). This leads to the following homogeneous system of equations:
\[ \underbrace{\begin{bmatrix} 1 & 0 & 1 & 0 & 0 \\ -3 & 1 & -1 & 3 & 2 \\ 0 & 0 & -2 & -1 & -1 \end{bmatrix}}_{A} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \tag{9.7.5} \]
where the first row of A holds the powers of [M] in Eq. (9.7.4), the second row the powers of [L], and so on. This matrix is therefore called the dimension matrix. Are we going to solve the system in Eq. (9.7.5)? No. That is the power of mathematics. Using the rank theorem, Theorem 11.5.4, from linear algebra, which says rank(A) + nullity(A) = 5, and the fact that rank(A) = 3, we deduce that nullity(A) = 2; hence the system has two independent solutions x ≠ 0. Therefore, we have two dimensionless numbers.
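As a quick check, here is a small Python sketch (SymPy assumed; my own variable names) that builds the dimension matrix of Eq. (9.7.5) and reads off its rank and null space:

```python
from sympy import Matrix

# Rows: powers of M, L, T; columns: rho, D, dp, Q, nu  (Eq. (9.7.4))
A = Matrix([[ 1, 0,  1,  0,  0],
            [-3, 1, -1,  3,  2],
            [ 0, 0, -2, -1, -1]])

print(A.rank())          # 3
for v in A.nullspace():  # two basis vectors -> two dimensionless groups
    print(v.T)
```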
Hey, but why is the rank of the dimension matrix A three and not less? If we use Gauss-Jordan elimination to get the reduced row echelon form of A, we get as the first three columns the three unit vectors of R³: (1, 0, 0), (0, 1, 0), (0, 0, 1). This makes us think of a three-dimensional space. Indeed, the three independent dimensions [M], [L] and [T] make a three-dimensional vector space. In this vector space, a dimensional quantity has the coordinate vector (x_1, x_2, x_3), because we can always write
\[ [x] = [M^{x_1} L^{x_2} T^{x_3}] \tag{9.7.6} \]


Now, we make another observation. In Example 9.3 the relation between the different quantities can be written in the form of Eq. (9.7.2) in terms of dimensional variables, or in the equivalent form of Eq. (9.7.3) that involves only dimensionless variables. We now try to prove this. First, we need to solve Eq. (9.7.5):
\[ \mathbf{x} = u \begin{bmatrix} 1/2 \\ -2 \\ -1/2 \\ 1 \\ 0 \end{bmatrix} + v \begin{bmatrix} 1/2 \\ -1 \\ -1/2 \\ 0 \\ 1 \end{bmatrix} \]

which allows us to find the two dimensionless variables, denoted by π₁ and π₂ŽŽ:
\[ \begin{cases} \pi_1 = \rho^{1/2} D^{-2} \Delta p^{-1/2}\, Q \\ \pi_2 = \rho^{1/2} D^{-1} \Delta p^{-1/2}\, \nu \end{cases} \;\Longrightarrow\; \begin{cases} Q = \rho^{-1/2} D^{2} \Delta p^{1/2}\, \pi_1 \\ \nu = \rho^{-1/2} D\, \Delta p^{1/2}\, \pi_2 \end{cases} \]
Now, suppose that the physical law we're seeking is given by
\[ f(\rho, D, \Delta p, Q, \nu) = 0 \tag{9.7.7} \]
which can be rewritten, as Q and ν can be replaced by π₁ and π₂:
\[ f(\rho, D, \Delta p, Q, \nu) = f(\rho, D, \Delta p,\; \rho^{-1/2} D^{2} \Delta p^{1/2}\pi_1,\; \rho^{-1/2} D\, \Delta p^{1/2}\pi_2) = 0 \]
Now we introduce a new function G depending on ρ, D, Δp, π₁, π₂:
\[ G(\rho, D, \Delta p, \pi_1, \pi_2) := f(\rho, D, \Delta p,\; \rho^{-1/2} D^{2} \Delta p^{1/2}\pi_1,\; \rho^{-1/2} D\, \Delta p^{1/2}\pi_2) = 0 \tag{9.7.8} \]
Now we can choose a particular set of units such that ρ, D, Δp have unit values; then we have G(1, 1, 1, π₁, π₂) = 0, which can be rewritten as F(π₁, π₂) = 0. Thus, we have rediscovered the following theorem.
Theorem 9.7.1: Buckingham's Pi theorem
Let
\[ f(q_1, q_2, \ldots, q_m) = 0 \]
be a unit-free physical law that relates the dimensional quantities q_1, ..., q_m. Let L_1, ..., L_n, n < m, be fundamental dimensions with
\[ [q_i] = [L_1^{a_{1i}} L_2^{a_{2i}} \cdots L_n^{a_{ni}}], \qquad i = 1, 2, \ldots, m \]
and let r = rank(A), where A is the dimension matrix given by
\[ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{bmatrix} \]
Then there exist m - r independent dimensionless quantities π₁, π₂, ..., π_{m-r} that can be formed from q_1, ..., q_m, and the physical law f(q_i) = 0 is equivalent to an equation
\[ F(\pi_1, \pi_2, \ldots, \pi_{m-r}) = 0 \]
expressed only in terms of the dimensionless quantities.

Note that if the chosen fundamental dimensions are independent, then r is simply the number of these fundamental dimensions.
ŽŽ The dimensionless combinations that we can make in a given problem are not unique: if π₁ and π₂ are both dimensionless, then so are π₁π₂ and π₁ + π₂ and, indeed, any function that we want to make out of these two variables.

9.7.4 Scaling of ODEs


First order ODEs. We consider the following ODE:
\[ a\dot{x} + bx = A f(t) \tag{9.7.9} \]


In the first step, we replace all dependent and independent variables by dimensionless counterparts. In this problem we have two variables, namely x(t) and t; they are replaced by
\[ \tilde{x} = \frac{x}{x_c}, \qquad \tilde{t} = \frac{t}{t_c} \tag{9.7.10} \]
Of course, x_c has the same dimension as x and t_c has the same dimension as t. That's why x̃ and t̃ are dimensionless variables.
Now, we write the old variables in terms of the new ones,
\[ x = x_c\tilde{x}, \qquad t = t_c\tilde{t} \tag{9.7.11} \]
and also the first derivative,
\[ \dot{x} = \frac{dx}{dt} = \frac{dx}{d\tilde{t}}\frac{d\tilde{t}}{dt} = x_c\frac{d\tilde{x}}{d\tilde{t}}\frac{1}{t_c} \tag{9.7.12} \]
Thus, the original ODE Eq. (9.7.9) becomes
\[ \frac{a x_c}{t_c}\frac{d\tilde{x}}{d\tilde{t}} + b x_c\tilde{x} = A f(t_c\tilde{t}) \tag{9.7.13} \]
Next, we divide the equation by the coefficient of the highest derivative term:
\[ \frac{d\tilde{x}}{d\tilde{t}} + \frac{b t_c}{a}\tilde{x} = \frac{A t_c}{a x_c} f(t_c\tilde{t}) \]
It is time to select x_c and t_c so that there are as few parameters as possible. So, we select them so that the two coefficients in the above equation are unity:
\[ \frac{b t_c}{a} = 1 \;\Longrightarrow\; t_c = \frac{a}{b}, \qquad \frac{A t_c}{a x_c} = 1 \;\Longrightarrow\; x_c = \frac{A}{b} \tag{9.7.14} \]
Finally, the original ODE Eq. (9.7.9), containing three parameters, now becomes
\[ \frac{d\tilde{x}}{d\tilde{t}} + \tilde{x} = F(\tilde{t}) \tag{9.7.15} \]
which is a dimensionless ODE with no parameter!
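As a sanity check, the following Python sketch (SciPy/NumPy assumed; the parameter values a, b, A and the forcing f are arbitrary choices of mine) integrates Eq. (9.7.9) and its scaled form Eq. (9.7.15) and confirms that x(t) = x_c x̃(t/t_c):

```python
import numpy as np
from scipy.integrate import solve_ivp

a, b, A = 2.0, 0.5, 3.0          # arbitrary dimensional parameters
f = lambda t: np.cos(t)          # arbitrary forcing

tc, xc = a / b, A / b            # characteristic scales, Eq. (9.7.14)

# dimensional ODE: a*x' + b*x = A*f(t)
sol = solve_ivp(lambda t, x: (A * f(t) - b * x) / a, [0, 20], [0.0], dense_output=True)

# scaled ODE: dX/dT + X = f(tc*T)   (Eq. (9.7.15) with F(T) = f(tc*T))
sol_s = solve_ivp(lambda T, X: f(tc * T) - X, [0, 20 / tc], [0.0], dense_output=True)

t = np.linspace(0, 20, 200)
print(np.max(np.abs(sol.sol(t)[0] - xc * sol_s.sol(t / tc)[0])))  # ~0 (solver tolerance)
```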

Differential operator. As a preparation for the discussion of 2nd order ODEs, in which we need to compute ẍ, we introduce the differential operator d/dt, to which we supply a function in order to compute its time derivative:

\[ \frac{d}{dt} = \frac{d}{d\tilde{t}}\frac{d\tilde{t}}{dt} = \frac{1}{t_c}\frac{d}{d\tilde{t}} \tag{9.7.16} \]
The usefulness of this operator comes in when we compute the second derivative operator:
\[ \frac{d^2}{dt^2} = \frac{d}{dt}\left(\frac{d}{dt}\right) = \frac{d}{dt}\left(\frac{1}{t_c}\frac{d}{d\tilde{t}}\right) = \frac{1}{t_c}\frac{d}{d\tilde{t}}\left(\frac{1}{t_c}\frac{d}{d\tilde{t}}\right) = \frac{1}{t_c^2}\frac{d^2}{d\tilde{t}^2} \tag{9.7.17} \]
where we applied Eq. (9.7.16) to the function (1/t_c)(d/dt̃) in the third equality.

Second order ODEs. Consider this 2nd order ODE:
\[ a\ddot{x} + b\dot{x} + cx = A f(t), \qquad x(0) = x_0, \quad \dot{x}(0) = v_0 \tag{9.7.18} \]
Using the scaled quantities defined in Eq. (9.7.11) and the differential operators introduced previously, this equation becomes
\[ \frac{a x_c}{t_c^2}\frac{d^2\tilde{x}}{d\tilde{t}^2} + \frac{b x_c}{t_c}\frac{d\tilde{x}}{d\tilde{t}} + c x_c\tilde{x} = A f(t_c\tilde{t}) \]
Dividing it by the coefficient of the 2nd derivative, we get this equation:
\[ \frac{d^2\tilde{x}}{d\tilde{t}^2} + \frac{b t_c}{a}\frac{d\tilde{x}}{d\tilde{t}} + \frac{c t_c^2}{a}\tilde{x} = \frac{A t_c^2}{a x_c} f(t_c\tilde{t}) \]
We have two choices here: selecting t_c so that the coefficient of either the second term or the third term is unity. We choose the latterŽŽ:
\[ \frac{c t_c^2}{a} = 1 \;\Longrightarrow\; t_c = \sqrt{\frac{a}{c}} \tag{9.7.19} \]
And making A t_c²/(a x_c) = 1 gives us x_c = A/c. The scaled equation is then
\[ \frac{d^2\tilde{x}}{d\tilde{t}^2} + \frac{b}{\sqrt{ac}}\frac{d\tilde{x}}{d\tilde{t}} + \tilde{x} = F(\tilde{t}) \]

9.8 Harmonic oscillation


Many kinds of motion repeat themselves over and over: the swinging pendulum of a grandfather
clock and the back-and-forth motion of the pistons in a car engine. This kind of motion, called
periodic motion or oscillation, is the subject of this section. Understanding periodic motion will
be essential for the study of waves, sound and light.
ŽŽ Only with hindsight can we do this. For a spring-mass system, this leads to t_c = √(m/k), which is proportional to the period of the oscillation of the mass.

Observing a ball rolling back and forth in a round bowl, or a pendulum that swings back and forth past its straight-down position (Fig. 9.4), we can see that a body that undergoes periodic motion always has a stable equilibrium position. When it is moved away from this position and released, a force or torque comes into play to pull it back toward equilibrium (such a force is called a restoring force). But by the time it gets there, it has picked up some kinetic energy, so it overshoots, stopping somewhere on the other side, and is again pulled back (by the restoring force) toward equilibrium.

Figure 9.4: A pendulum.

When the restoring force is directly proportional to the displacement from equilibrium, the oscillation is called simple harmonic motion, abbreviated SHM, or simple harmonic oscillation (SHO). This section is confined to such oscillations.
We start with simple harmonic oscillation in Section 9.8.1, where we discuss the equation of motion of a spring-mass system, its solutions and its natural frequency and period. Damped oscillations, i.e. oscillations that die out due to resistive forces, are discussed in Section 9.8.2. Then we present forced oscillations (oscillations that require driving forces to maintain their motion) in Section 9.8.3. The discussion is confined to sinusoidal driving forces only. The phenomenon of resonance appears naturally in this context (Section 9.8.4). Forced oscillations with any periodic driving force are treated in Section 9.8.5, where Fourier series are used. Section 9.8.6 discusses the oscillation of a pendulum.

9.8.1 Simple harmonic oscillation


Consider a mass m attached to a weightless spring of stiffness k on a frictionless horizontal plane. Let's denote by O the equilibrium position of the mass; this is the position in which the spring is neither stretched nor compressed. Now, if we displace the mass to the right of O a distance x, the spring will try to pull it back by applying a force −kx to the massŽŽ. The minus sign is here to express the effect of pulling back: the force is always opposite to the displacement vector. Thus, when the mass is on the left side of O the force points to the right, and the spring pushes the mass back to O. In this way we get harmonic oscillation.
Using Newton's 2nd law we can write
\[ m\ddot{x} = -kx \;\Longrightarrow\; \ddot{x} + \omega_0^2 x = 0, \qquad \omega_0^2 = \frac{k}{m} \tag{9.8.1} \]
where ẍ = d²x/dt². The notation ω₀² was introduced instead of ω₀ so that the maths (to be discussed) will be in a simple form. At this stage we do not know its meaning; its role is for notational convenience.
ŽŽ This is Hooke's law, which is named after the British physicist Robert Hooke (1635–1703). He first stated the law in 1676 as a Latin anagram. He published the solution of his anagram in 1678 as: ut tensio, sic vis ("as the extension, so the force" or "the extension is proportional to the force").
Assume that x(t) is a solution of Eq. (9.8.1); then it is easy to see that Ax(t) is also a solution for any constant A > 0. Now assume that we have two solutions to this equation, namely x₁(t) and x₂(t), which are independent of each otherŽŽ; then Ax₁(t) + Bx₂(t) is also a solution. Actually, as it contains two constants A, B, it is the general solution to Eq. (9.8.1). Now we need to find two particular solutions and we are done. They are cos(ω₀t) and sin(ω₀t), which are the only functions whose second derivatives equal minus the functions themselves. Therefore, the general solution is||
\[ x = A_1\cos(\omega_0 t) + A_2\sin(\omega_0 t) \tag{9.8.2} \]

with the two constants A₁ and A₂ being real numbers. They are determined using the so-called initial conditions. The initial conditions specify the conditions of the system when we start the system. They include the initial position of the mass x₀ (which is x(t) evaluated at t = 0, i.e. x(0)) and the initial velocity v(0):
\[ x(0) = A_1, \qquad v(0) = \dot{x}(0) = \omega_0 A_2 \tag{9.8.3} \]

While the solution in Eq. (9.8.2) is perfectly fine, it does not immediately reveal the amplitude of the oscillation. Using the trigonometric identity cos(a − b) = cos a cos b + sin a sin b, we can rewrite that equation in the following form
\[ x = \sqrt{A_1^2 + A_2^2}\left( \frac{A_1}{\sqrt{A_1^2 + A_2^2}}\cos(\omega_0 t) + \frac{A_2}{\sqrt{A_1^2 + A_2^2}}\sin(\omega_0 t) \right) = A\cos(\omega_0 t - \phi) \tag{9.8.4} \]
where A is the amplitude of the oscillation, i.e. the maximum displacement of the mass from equilibrium, either in the positive or negative direction. If needed, we can relate A and φ to A₁ and A₂: A = √(A₁² + A₂²) and cos φ = A₁/A. φ is called the phase-shift angle, see Fig. 9.5.

Figure 9.5: The phase-shift angle φ.

Simple harmonic motion is repetitive. The period T is the time it takes the mass to complete
one oscillation and return to the starting position. Everyone should be familiar with the period
ŽŽ For example, x₁(t) = sin t and x₂(t) = 5 sin t are not independent. Refer to Chapter 11 for detail.
 You should verify this claim.
|| We should ask why there can't be other solutions. To answer this question we need to use

of orbit of the Earth around the Sun, which is approximately 365 days; it takes 365 days for the Earth to complete a cycle. We can find the formula for T based on this definition: the position of the mass at time t is exactly the position at time t + T; that is, A cos(ω₀(t + T) − φ) = A cos(ω₀t − φ). So,
\[ \omega_0(t + T) = \omega_0 t + 2\pi \;\Longrightarrow\; T = \frac{2\pi}{\omega_0} = 2\pi\sqrt{\frac{m}{k}} \tag{9.8.5} \]

The unit of T is the second in the SI system.

Next, we mention a related quantity named frequency, usually denoted by f. Frequency answers how often something happens (e.g. how many visits per day). In the case of SHO, it measures how many cycles occur per unit time. There is a relation between the period T and the frequency f. To derive this relation, one example suffices. If it takes 0.1 s for one cycle (i.e. T = 0.1 s), there will then be 10 cycles per second. Thus,
\[ f = \frac{1}{T} = \frac{\omega_0}{2\pi} \tag{9.8.6} \]

In the SI system, the unit of f is cycles per second, or hertz, in honor of the first experimenterŽŽ with radio waves (which are electric vibrations). While f is referred to as the frequency, ω₀ is called the angular frequency. It is so called because ω₀ = 2πf, with the unit of radians per second. There is no circle, so why "angular" frequency? There is a circle hidden here. Whenever we deal with sine and cosine we are dealing with the complex exponential, which in turn involves the unit circle. See Fig. 9.6 for detail. Later on, we will call ω₀ the natural frequency of the system when the mass is driven by a cyclic force with yet another frequency ω.

Solution using a complex exponential. As it is more convenient to work with the exponential function than with the sine/cosine functions, we use a complex exponential to solve the SHO problem. But as x(t) is real, not complex, we use complex numbers only to simplify the mathematics, and we will take x(t) as the real part of the complex solution. Using complex exponentials, we write x(t) asŽ
\[ x(t) = C_1 e^{i\omega_0 t} + C_2 e^{-i\omega_0 t}, \qquad C_1, C_2 \in \mathbb{C} \tag{9.8.7} \]
Using e^{iθ} = cos θ + i sin θ in Eq. (9.8.7) and comparing with Eq. (9.8.2), we can relate C₁, C₂ with A₁, A₂:
\[ C_1 + C_2 = A_1, \qquad i(C_1 - C_2) = A_2 \tag{9.8.8} \]
Solving Eq. (9.8.8) for C₁ and C₂, we get
\[ C_1 = \tfrac{1}{2}(A_1 - iA_2), \qquad C_2 = \tfrac{1}{2}(A_1 + iA_2) \tag{9.8.9} \]


ŽŽ
Heinrich Rudolf Hertz (22 February 1857 – 1 January 1894) was a German physicist who first conclusively
proved the existence of the electromagnetic waves predicted by James Clerk Maxwell’s equations of electromag-
netism.
Ž
This is so because e i !0 t and e i !0 t are two solutions of Eq. (9.8.1), thus any linear combinations of them is
also a solution.

Phu Nguyen, Monash University © Draft version


Chapter 9. Differential equations 769

which indicates that C₂ is simply the complex conjugate of C₁: C₂ = C̄₁. Now we can proceed with Eq. (9.8.7) where C₂ is replaced by C̄₁Ž:
\[ \begin{aligned} x(t) &= C_1 e^{i\omega_0 t} + \bar{C}_1 e^{-i\omega_0 t} \\ &= 2\,\mathrm{Re}[C_1 e^{i\omega_0 t}] \quad (\bar{C}_1 e^{-i\omega_0 t}\ \text{is the conjugate of}\ C_1 e^{i\omega_0 t}) \\ &= \mathrm{Re}[2C_1 e^{i\omega_0 t}] \quad (\text{with}\ 2C_1 = A_1 - iA_2 = A e^{-i\phi},\ \text{Fig. 9.6}) \\ &= \mathrm{Re}[A e^{-i\phi} e^{i\omega_0 t}] = A\cos(\omega_0 t - \phi) \end{aligned} \tag{9.8.10} \]
And we got the same result, of course.

Figure 9.6: Solving the SHO using a complex exponential: the complex number Ae^{i(ω₀t − φ)} moves counterclockwise with angular velocity ω₀ around a circle of radius A. Its real part, x(t), is the projection of the complex number onto the real axis. While the complex number goes around the circle, this projection oscillates back and forth on the x axis.

Geometric meaning of Euler's identity. Recall that we derived Euler's identity e^{iπ} + 1 = 0 in Eq. (2.25.19). Now we can give it a geometric meaning. Refer to Fig. 9.6 but with A = 1 (unit circle) and φ = 0. The complex number e^{iω₀t} circulates around the unit circle. When ω₀t = π, it has traveled half of the circle and arrived at the point (−1, 0), or −1. And thus e^{iπ} = −1.

Plots of displacement, velocity and acceleration. To verify whether our solutions agree with our intuitive understanding of a SHO, we analyze the displacement x(t), the velocity ẋ and the acceleration ẍ for A₁ = 1.0 and A₂ = 0.0. That is, we displace the mass (from equilibrium) to the right a distance A₁ and release it. The plots of x, ẋ and ẍ are shown in Fig. 9.7. The mass goes to the left with an increasing velocity (and acceleration). When it reaches the equilibrium point, the velocity is maximum (and so is the kinetic energy). It continues moving to the left until it reaches −A at t = 0.5; at that point the velocity is zero (and the potential energy is maximum).

Ž If not clear, check Section 2.25 on complex conjugate rules, particularly ū w̄ = (uw)‾.

Figure 9.7: SHO with x = A cos(ω₀t): plots of displacement, velocity and acceleration. The frequency is ω₀ = 2π so that T = 1. The amplitude is A = 1.

Energy conservation. Let's now compute the kinetic and potential energy of the SHO and check energy conservation. From Eq. (9.8.4), we have x and thus ẋ as
\[ x = A\cos(\omega_0 t - \phi) \;\Longrightarrow\; \dot{x} = -A\omega_0\sin(\omega_0 t - \phi) \]
Using them, we can determine the kinetic energy T and potential energy U as
\[ T = \frac{1}{2}m\dot{x}^2 = \frac{1}{2}kA^2\sin^2(\omega_0 t - \phi), \qquad U = \frac{1}{2}kx^2 = \frac{1}{2}kA^2\cos^2(\omega_0 t - \phi) \tag{9.8.11} \]
From that, energy conservation is easily seen: T + U = ½kA². It's useful to plot the evolution of the energies in time (Fig. 9.8a) to see the exchange between kinetic and potential energies. In that plot, I used A = 0.5, φ = 0, m = k = 1 (thus ω₀ = 1 and T = 2π).
This energy conservation also gives us one more thingŽ:
\[ \frac{1}{2}m\dot{x}^2 + \frac{1}{2}kx^2 = \frac{1}{2}kA^2 \;\Longrightarrow\; \boxed{\frac{\dot{x}^2}{(\omega_0 A)^2} + \frac{x^2}{A^2} = 1} \]
What is the boxed equation? It is an ellipse! So, on the x–ẋ plane, which is called the phase plane, the trajectory of the mass is an ellipse (Fig. 9.8b). Think about it: we are dealing with a mass moving on a line, but we get a circle if we use complex numbers to study this problem, and we also meet an ellipse if we use the phase plane. That's remarkable.
Ž Don't forget that ω₀² = k/m.

Figure 9.8: Simple harmonic oscillator: (a) kinetic and potential energies versus time; (b) phase portrait.
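A small Python sketch (NumPy assumed; the values A = 0.5, m = k = 1 mirror the plot above) that checks T + U = ½kA² numerically and the phase-plane ellipse:

```python
import numpy as np

m = k = 1.0
A, phi = 0.5, 0.0
w0 = np.sqrt(k / m)

t = np.linspace(0.0, 2 * np.pi / w0, 400)   # one period
x = A * np.cos(w0 * t - phi)
v = -A * w0 * np.sin(w0 * t - phi)

T = 0.5 * m * v**2                          # kinetic energy
U = 0.5 * k * x**2                          # potential energy
print(np.allclose(T + U, 0.5 * k * A**2))   # True: total energy is constant

# phase-plane check: (v/(w0*A))**2 + (x/A)**2 = 1 (an ellipse)
print(np.allclose((v / (w0 * A))**2 + (x / A)**2, 1.0))  # True
```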

9.8.2 Damped oscillator


We know that in reality a spring won't oscillate forever. Frictional forces will diminish the amplitude of oscillation until eventually the system is at rest. We will now add frictional forces to the mass and spring. Imagine that the mass is put in a liquid like molasses. Only friction that is proportional to the velocity is considered. This is a pretty good approximation for a body moving at low velocity in air, or in a liquid. So we say the frictional force is −bẋ. The constant b > 0 depends on the kind of liquid the mass is in. The negative sign just says that the force is in the opposite direction to the body's motion. Now Newton's 2nd law gives us the equation of motion with friction:
\[ m\ddot{x} = -b\dot{x} - kx \;\Longrightarrow\; \ddot{x} + 2\beta\dot{x} + \omega_0^2 x = 0, \qquad \omega_0^2 = \frac{k}{m}, \quad 2\beta = \frac{b}{m} \tag{9.8.12} \]
We are going to use complex numbers to solve Eq. (9.8.12). Let's consider z(t) = e^{iωt} as a trial solution of the complex version of the equation:
\[ \ddot{z} + 2\beta\dot{z} + \omega_0^2 z = 0, \qquad z = e^{i\omega t} \tag{9.8.13} \]

Now comes the reason why we used complex numbers: the derivative of an exponential function is the product of the function and a constant! Indeed,
\[ z = e^{i\omega t}, \qquad \dot{z} = i\omega e^{i\omega t} = i\omega z, \qquad \ddot{z} = -\omega^2 e^{i\omega t} = -\omega^2 z \tag{9.8.14} \]
Substituting z, ż and z̈ into Eq. (9.8.13), we get the following equation
\[ z(-\omega^2 + 2\beta i\omega + \omega_0^2) = 0 \tag{9.8.15} \]

which is valid for all t. Thus,
\[ -\omega^2 + 2\beta i\omega + \omega_0^2 = 0 \tag{9.8.16} \]

which is a quadratic equation for ω. Solving this equation, we get
\[ \omega = i\beta \pm \sqrt{\omega_0^2 - \beta^2} \tag{9.8.17} \]
Now, we get different solutions depending on the sign of the term under the square root. In what follows, we discuss these solutions.
Weakly damped is the case when ω₀ > β. Setting ω₀ᵈ = √(ω₀² − β²), we have ω = iβ ± ω₀ᵈ. So, z = e^{iωt} is written as
\[ z = e^{i\omega t} = e^{i(i\beta \pm \omega_0^d)t} = e^{(-\beta \pm i\omega_0^d)t} = e^{-\beta t} e^{\pm i\omega_0^d t} \tag{9.8.18} \]
These are two particular solutions of Eq. (9.8.13): z₁ = e^{-βt}e^{iω₀ᵈt} and z₂ = e^{-βt}e^{-iω₀ᵈt}. Thus, the general complex solution is
\[ z = C_1 e^{-\beta t} e^{i\omega_0^d t} + C_2 e^{-\beta t} e^{-i\omega_0^d t} = e^{-\beta t}\underbrace{\left(C_1 e^{i\omega_0^d t} + C_2 e^{-i\omega_0^d t}\right)}_{z_0} \tag{9.8.19} \]

where C₁ and C₂ are two complex numbers. Now we have to express z in the form x + iy, so that we can get its real part, which is the solution we are seeking. We write z₀ as
\[ \begin{aligned} z_0 &= [\mathrm{Re}(C_1) + i\,\mathrm{Im}(C_1)]\big[\cos(\omega_0^d t) + i\sin(\omega_0^d t)\big] + [\mathrm{Re}(C_2) + i\,\mathrm{Im}(C_2)]\big[\cos(\omega_0^d t) - i\sin(\omega_0^d t)\big] \\ &= \underbrace{\big(\mathrm{Re}(C_1) + \mathrm{Re}(C_2)\big)}_{A}\cos(\omega_0^d t) + \underbrace{\big(\mathrm{Im}(C_2) - \mathrm{Im}(C_1)\big)}_{B}\sin(\omega_0^d t) + i(\ldots) \end{aligned} \tag{9.8.20} \]
The solution x(t) is the real part of z; thus it is given by
\[ x(t) = \mathrm{Re}\,z(t) = e^{-\beta t}\big[A\cos(\omega_0^d t) + B\sin(\omega_0^d t)\big] = C e^{-\beta t}\cos(\omega_0^d t - \phi) \tag{9.8.21} \]
where A, B, C ∈ ℝ. (We could equally use a plus sign in front of φ, with a redefined φ.) Is this solution correct, or at least plausibly correct? Answering that question is simple: put β = 0 (which is equivalent to b = 0) into x(t); if that x(t) has the same form as the undamped solution, then x(t) is ok. This can be checked to be the case; furthermore the term e^{-βt} is indeed a decay term: the oscillation has to come to a stop due to friction.

Example. Let's consider one example with ω₀ = 1, β = 0.05, x₀ = 1.0, v₀ = 3.0. We need to compute C and φ using the initial conditions. Using Eq. (9.8.21), we have
\[ \begin{cases} x_0 = x(0) = C\cos\phi \\ v_0 = \dot{x}(0) = C\big(-\beta\cos\phi + \omega_0^d\sin\phi\big) \end{cases} \;\Longrightarrow\; \begin{cases} C = \dfrac{x_0}{\cos\phi} \\ \phi = \arctan\dfrac{v_0 + \beta x_0}{\omega_0^d x_0} \end{cases} \]

Now, we can plot x(t) using Eq. (9.8.21) (Fig. 9.9). The code is given in Listing A.11.

Figure 9.9: Weakly damped oscillation can be seen as a simple harmonic oscillation with an exponentially decaying amplitude Ce^{-βt}. The dashed curves are the maximum-amplitude envelope ±Ce^{-βt}.
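The book's full plotting script is in Listing A.11; the following is only a minimal sketch of the same computation (NumPy/Matplotlib assumed), using the numbers of the example above:

```python
import numpy as np
import matplotlib.pyplot as plt

w0, beta, x0, v0 = 1.0, 0.05, 1.0, 3.0
w0d = np.sqrt(w0**2 - beta**2)                  # damped frequency

phi = np.arctan((v0 + beta * x0) / (w0d * x0))  # phase from the initial conditions
C = x0 / np.cos(phi)                            # amplitude from the initial conditions

t = np.linspace(0.0, 50.0, 1000)
x = C * np.exp(-beta * t) * np.cos(w0d * t - phi)

plt.plot(t, x)
plt.plot(t,  C * np.exp(-beta * t), 'k--')      # decaying envelope
plt.plot(t, -C * np.exp(-beta * t), 'k--')
plt.xlabel('t'); plt.ylabel('x(t)')
plt.show()
```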

Overdamped is the case when ω₀ < β. In this case, ω = iβ ± i√(β² − ω₀²) = i(β ± ω̄), with ω̄ = √(β² − ω₀²). Hence
\[ z_1 = e^{(-\beta - \bar{\omega})t}, \quad z_2 = e^{(-\beta + \bar{\omega})t} \;\Longrightarrow\; z(t) = C_1 e^{(-\beta - \bar{\omega})t} + C_2 e^{(-\beta + \bar{\omega})t} \tag{9.8.22} \]

9.8.3 Driven damped oscillation


Everyone knows that a swing will stop after a while unless its motion is maintained by a parent who keeps pushing it. This section studies such forced or driven oscillations. If we consider a sinusoidal driving force f(t) = F₀ cos(ωt), the equation is given by
\[ m\ddot{x} + b\dot{x} + kx = F_0\cos(\omega t) \tag{9.8.23} \]

There are two main reasons for the importance of sinusoidal driving forces. First, there are many important systems in which the driving force is sinusoidal. The second reason is subtler: it turns out that any periodic force can be built up from sinusoidal forces using Fourier series.
Eq. (9.8.23) can be rewritten as follows:
\[ \ddot{x} + 2\beta\dot{x} + \omega_0^2 x = f_0\cos(\omega t), \qquad \omega_0^2 = \frac{k}{m}, \quad 2\beta = \frac{b}{m}, \quad f_0 = \frac{F_0}{m} \tag{9.8.24} \]
We are going to solve this equation using a complex function z(t) = x(t) + iy(t) satisfying the complex counterpart of Eq. (9.8.24):
\[ \ddot{z} + 2\beta\dot{z} + \omega_0^2 z = f_0 e^{i\omega t} \tag{9.8.25} \]

It can be seen that the real part of z(t), i.e. x(t), is actually the solution of Eq. (9.8.24). With z = Ce^{iωt}, we compute ż and z̈:
\[ z = Ce^{i\omega t}, \qquad \dot{z} = i\omega Ce^{i\omega t}, \qquad \ddot{z} = -\omega^2 Ce^{i\omega t} \tag{9.8.26} \]
Substituting them into Eq. (9.8.25) gives
\[ -\omega^2 C + 2\beta i\omega C + \omega_0^2 C = f_0 \tag{9.8.27} \]
which gives us C as follows (writing f̃₀ for the real constant that appears):
\[ C = \frac{f_0}{\omega_0^2 - \omega^2 + 2i\omega\beta} = \frac{f_0(\omega_0^2 - \omega^2 - 2i\omega\beta)}{(\omega_0^2 - \omega^2)^2 + 4\omega^2\beta^2} = \tilde{f}_0(\omega_0^2 - \omega^2 - 2i\omega\beta), \qquad \tilde{f}_0 = \frac{f_0}{(\omega_0^2 - \omega^2)^2 + 4\omega^2\beta^2} \tag{9.8.28} \]
Now, we write z = Ce^{iωt} explicitly in the form x(t) + iy(t) to find its real part:
\[ \begin{aligned} z = Ce^{i\omega t} &= C(\cos\omega t + i\sin\omega t) = \tilde{f}_0(\omega_0^2 - \omega^2 - 2i\omega\beta)(\cos\omega t + i\sin\omega t) \\ &= \tilde{f}_0\big[(\omega_0^2 - \omega^2)\cos\omega t + 2\omega\beta\sin\omega t\big] + i\tilde{f}_0\big[(\omega_0^2 - \omega^2)\sin\omega t - 2\omega\beta\cos\omega t\big] \end{aligned} \tag{9.8.29} \]
Thus, the solution to Eq. (9.8.24), which is the real part of z(t), is given by
\[ x(t) = \mathrm{Re}(z) = \frac{f_0(\omega_0^2 - \omega^2)}{(\omega_0^2 - \omega^2)^2 + 4\omega^2\beta^2}\cos\omega t + \frac{2f_0\omega\beta}{(\omega_0^2 - \omega^2)^2 + 4\omega^2\beta^2}\sin\omega t \tag{9.8.30} \]
Now, we use the trigonometric identity cos(a − b) = cos a cos b + sin a sin b to rewrite x(t). First, we rearrange x(t) in the form cos·cos + sin·sin; then we have a compact form for x(t):
\[ \begin{aligned} x(t) &= \frac{f_0}{\sqrt{(\omega_0^2 - \omega^2)^2 + 4\omega^2\beta^2}}\left[ \frac{(\omega_0^2 - \omega^2)\cos\omega t}{\sqrt{(\omega_0^2 - \omega^2)^2 + 4\omega^2\beta^2}} + \frac{2\omega\beta\sin\omega t}{\sqrt{(\omega_0^2 - \omega^2)^2 + 4\omega^2\beta^2}} \right] \\ &= A\cos(\omega t - \delta), \qquad A = \frac{f_0}{\sqrt{(\omega_0^2 - \omega^2)^2 + 4\omega^2\beta^2}}, \quad \tan\delta = \frac{2\omega\beta}{\omega_0^2 - \omega^2} \end{aligned} \tag{9.8.31} \]
We have just computed the response of the system to the driving force: a sinusoidal driving force results in a sinusoidal oscillation with an amplitude proportional to the amplitude of the force. All looks reasonable. Do not forget the natural oscillation response; we are interested in the weakly damped case only. The total solution is thus given by
\[ x(t) = A\cos(\omega t - \delta) + Be^{-\beta t}\cos(\omega_0^d t - \phi) \tag{9.8.32} \]

Example. A mass is released from rest at t = 0 and x = 0. The driving force is f = f₀ cos ωt with f₀ = 1000 and ω = 2π. Assume that the natural frequency is ω₀ = 5ω = 10π, and the damping is β = ω₀/20 = π/2, i.e. a weakly damped oscillation.
We determine B and φ from the given initial conditions, noting that A and δ are known: A = 1.06 and δ = 0.0208.
\[ \begin{cases} x_0 = A\cos\delta + B\cos\phi \\ v_0 = \omega A\sin\delta + B\big(-\beta\cos\phi + \omega_0^d\sin\phi\big) \end{cases} \;\Longrightarrow\; \begin{cases} B\cos\phi = x_0 - A\cos\delta \\ \beta B\cos\phi - B\omega_0^d\sin\phi = \omega A\sin\delta - v_0 \end{cases} \]
which yields B = −1.056 and φ = 0.054. Using all these numbers in Eq. (9.8.32) we can plot the solution as shown in Fig. 9.10. We provide the plot of the driving force, the transient solution and the total solution x(t). Codes to produce these plots are given in Appendix A.4.
Figure 9.10: Driven oscillation of a weakly damped spring-mass system (panels: driving force f(t), transient solution x_h(t), total solution x(t)): the frequency of the force is 2π, and the natural frequency is 10π. After about 3 cycles, the motion is indistinguishable from a pure cosine, oscillating at exactly the drive frequency. The free oscillation has died out and only the long term motion remains. In the beginning, t ≲ 3, the effects of the transients are clearly visible: as they oscillate at a faster frequency they show up as a rapid succession of bumps and dips. In fact, you can see that there are five such bumps within the first cycle, indicating that ω₀ = 5ω.

9.8.4 Resonance
By looking at the formula of the oscillation amplitude A, we can explain the phenomenon of resonance. Recall that A is given by
\[ A = \frac{f_0}{\sqrt{(\omega_0^2 - \omega^2)^2 + 4\omega^2\beta^2}} \tag{9.8.33} \]

which has a maximum value when the denominator is a minimum. Note that we are not interested in using a big force to get a large amplitude. With only a relatively small force, but at the correct frequency, we can get a large oscillation anyway. Moreover, we are only interested in the case where β is small, i.e. weak damping. It can be seen that A is maximum when ω ≈ ω₀, see Fig. 9.11a, and the maximum value is
\[ A_{\max} \approx \frac{f_0}{2\omega_0\beta} \tag{9.8.34} \]

Figure 9.11: A² versus the driving frequency ω for β = 0.1ω₀, 0.2ω₀, 0.3ω₀: A is maximum when ω ≈ ω₀.

Phase at resonance. We're now interested in the phase δ when resonance occurs. Recall that the phase difference δ by which the oscillator's motion lags behind the driving force is
\[ \delta = \arctan\left(\frac{2\omega\beta}{\omega_0^2 - \omega^2}\right) \tag{9.8.35} \]
If ω ≪ ω₀, δ ≈ 0, and thus the oscillations are almost perfectly in phase with the driving force (we can see this clearly in Fig. 9.10). At resonance, ω = ω₀ and δ = π/2: the oscillations are 90° behind the driving force.
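A short Python sketch (NumPy/Matplotlib assumed; f₀ = 1 and ω₀ = 10 are arbitrary choices of mine) that plots the amplitude of Eq. (9.8.33) for several damping values and shows the peak near ω = ω₀:

```python
import numpy as np
import matplotlib.pyplot as plt

f0, w0 = 1.0, 10.0
w = np.linspace(0.01, 20.0, 500)

for beta in (0.1 * w0, 0.2 * w0, 0.3 * w0):
    A = f0 / np.sqrt((w0**2 - w**2)**2 + 4 * w**2 * beta**2)   # Eq. (9.8.33)
    plt.plot(w, A**2, label=f'beta = {beta/w0:.1f} w0')

plt.axvline(w0, ls=':', color='k')   # resonance occurs near w = w0
plt.xlabel('w'); plt.ylabel('A^2'); plt.legend()
plt.show()
```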

9.8.5 Driven damped oscillators with any periodic forces


After damped oscillation with a sinusoidal driving force, it is just one more small step to tackle damped oscillation with any periodic force f(t), thanks to the genius of Fourier. The equation is now given by
\[ m\ddot{x} + b\dot{x} + kx = f(t) \tag{9.8.36} \]
And we replace f(t) by its Fourier series (Section 4.19)
\[ f(t) = \sum_{n=0}^{\infty}\big[a_n\cos(n\omega t) + b_n\sin(n\omega t)\big] \tag{9.8.37} \]

with the Fourier coefficients given by (and b₀ = 0)
\[ a_0 = \frac{1}{\tau}\int_{-\tau/2}^{\tau/2} f(t)\,dt, \qquad a_n = \frac{2}{\tau}\int_{-\tau/2}^{\tau/2} f(t)\cos(n\omega t)\,dt, \qquad b_n = \frac{2}{\tau}\int_{-\tau/2}^{\tau/2} f(t)\sin(n\omega t)\,dt \tag{9.8.38} \]
With this replacement of f(t), the equation of motion becomes
\[ m\ddot{x} + b\dot{x} + kx = \sum_{n=0}^{\infty}\big[a_n\cos(n\omega t) + b_n\sin(n\omega t)\big] \tag{9.8.39} \]
How is this new form different from the original problem, Eq. (9.8.36)? Now we have a damped SHO with infinitely many driving forces f₀(t), f₁(t), ... But for each of these forces we are able to solve for the solution xₙ(t), with n = 0, 1, ... (we have assumed that the Fourier series contains only the cosine terms for simplicity):
\[ x_n(t) = A_n\cos(n\omega t - \delta_n), \qquad A_n = \frac{a_n}{\sqrt{(\omega_0^2 - n^2\omega^2)^2 + 4n^2\omega^2\beta^2}}, \quad \tan\delta_n = \frac{2n\omega\beta}{\omega_0^2 - n^2\omega^2} \tag{9.8.40} \]

And what is the final solution? It is simply the sum of all xₙ(t). Why? Because our equation is linear! To see this, let's assume there are only two forces: with f₁(t) we have the solution x₁(t), and similarly for f₂(t), so we can write
\[ m\ddot{x}_1 + b\dot{x}_1 + kx_1 = f_1(t), \qquad m\ddot{x}_2 + b\dot{x}_2 + kx_2 = f_2(t) \tag{9.8.41} \]
Adding these two equations, we get
\[ m(\ddot{x}_1 + \ddot{x}_2) + b(\dot{x}_1 + \dot{x}_2) + k(x_1 + x_2) = f_1(t) + f_2(t) \tag{9.8.42} \]
which indicates that x(t) = x₁(t) + x₂(t) is indeed the solution. This is known as the principle of superposition, which we discussed in Section 9.6. There, the discussion was abstract.
In summary, we had a hard problem (due to f(t)), and we replaced this f(t) by many, many easier sinusoidal forces. For each force we solved an easier problem, and we added these solutions together to get the final solution. It is indeed the spirit of calculus!
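To make the recipe concrete, here is a minimal Python sketch (NumPy assumed; the square-wave-like force, the truncation at N terms and all parameter values are my own illustrative choices) that sums the modal responses of Eq. (9.8.40):

```python
import numpy as np

m, b, k = 1.0, 0.2, 25.0
w0, beta = np.sqrt(k / m), b / (2 * m)
w = 2.0                     # fundamental frequency of the periodic force
N = 50                      # number of Fourier terms kept

t = np.linspace(0.0, 10.0, 2000)
x = np.zeros_like(t)

# odd-harmonic cosine coefficients of a square-wave-like force (illustrative)
for n in range(1, N, 2):
    a_n = 4.0 / (np.pi * n)
    A_n = a_n / np.sqrt((w0**2 - (n * w)**2)**2 + 4 * (n * w)**2 * beta**2)
    d_n = np.arctan2(2 * n * w * beta, w0**2 - (n * w)**2)
    x += A_n * np.cos(n * w * t - d_n)     # Eq. (9.8.40), summed by superposition

print(x[:5])   # steady-state response sampled at the first few times
```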

9.8.6 The pendulum


Consider a pendulum as shown in Fig. 9.4b. Newton's 2nd law in polar coordinates in the θ direction gives usŽ
\[ F_\theta = m(2\dot{r}\dot{\theta} + r\ddot{\theta}) \tag{9.8.43} \]
Ž Check Eq. (7.10.17) if this is not clear.

And note that F_θ = −mg sin θ, and r = l is constant; thus Eq. (9.8.43) simplifies to
\[ \ddot{\theta} + \frac{g}{l}\sin\theta = 0, \quad\text{or}\quad \frac{d^2\theta}{dt^2} + \frac{g}{l}\sin\theta = 0 \tag{9.8.44} \]
For small vibrations, we have sin θ ≈ θ (remember the Taylor series for sine?). Thus, our equation is further simplified to
\[ \ddot{\theta} + \omega^2\theta = 0, \qquad \omega = \sqrt{\frac{g}{l}} \;\Longrightarrow\; T = 2\pi\sqrt{\frac{l}{g}} \tag{9.8.45} \]
And voilà, we see again the simple harmonic oscillation equation! The natural frequency (and the period) of a pendulum does not depend on the mass of the bob. And of course it does not depend on how far it swings, i.e. the initial conditions have no say on this. This fact was first observed by Galileo Galilei when he was a student of medicine watching a swinging chandelier. A historical note: it was the Dutch mathematician Christiaan Huygens (1629–1695) who first derived the formula for the period of a pendulum. Note that we can also use dimensional analysis to come up with ω ∼ √(g/l).

Pendulum and elliptic integral of the first kind. Herein I demonstrate how an elliptic integral of the first kind shows up in the formula for the period of a pendulum when its amplitude is large. The idea is to start with Eq. (9.8.44) and massage it so that we have dt as a function of θ. Then we integrate dt to get the period T.
We rewrite Eq. (9.8.44) using ω:
\[ \frac{d^2\theta}{dt^2} + \omega^2\sin\theta = 0 \tag{9.8.46} \]
Multiplying both sides of this equation by θ̇, we get
\[ \frac{d^2\theta}{dt^2}\frac{d\theta}{dt} + \omega^2\sin\theta\,\frac{d\theta}{dt} = 0 \tag{9.8.47} \]
Now, integrating this equation w.r.t. t, we obtain
\[ \frac{1}{2}\left(\frac{d\theta}{dt}\right)^2 - \omega^2\cos\theta = k \tag{9.8.48} \]
where k is an integration constant. To find k, let θ* be the maximum angular displacement. At θ = θ* (a maximum), dθ/dt vanishes, so k = −ω²cos θ*. Now, we can solve Eq. (9.8.48) for θ̇, noting that θ decreases as t increasesŽ:
\[ \frac{d\theta}{dt} = -\omega\sqrt{2(\cos\theta - \cos\theta^*)}, \quad\text{or}\quad dt = -\frac{d\theta}{\omega\sqrt{2(\cos\theta - \cos\theta^*)}} \tag{9.8.49} \]
Ž Picture moving the bob to a certain height and releasing it. Note that it is much easier to derive this equation using the principle of conservation of energy: ½mv² = mgl(cos θ − cos θ*), with v = ds/dt and s = lθ.

It's now possible to determine the period T:
\[ T = 4\int_{\theta^*}^{0} dt = 4\int_{0}^{\theta^*} \frac{d\theta}{\omega\sqrt{2(\cos\theta - \cos\theta^*)}} \]
Now, a bit of massaging using the trigonometric identity cos α = 1 − 2 sin²(α/2) leads to
\[ T = 2\sqrt{\frac{l}{g}}\int_{0}^{\theta^*} \frac{d\theta}{\sqrt{k^2 - \sin^2(\theta/2)}}, \qquad k = \sin(\theta^*/2) \tag{9.8.50} \]

We still need to massage this equation a bit more. Using the following change of variableŽŽ
\[ \sin\frac{\theta}{2} = k\sin\psi \;\Longrightarrow\; \frac{1}{2}\cos\frac{\theta}{2}\,d\theta = k\cos\psi\,d\psi \;\Longrightarrow\; d\theta = \frac{2k\cos\psi\,d\psi}{\cos\frac{\theta}{2}} = \frac{2k\cos\psi\,d\psi}{\sqrt{1 - k^2\sin^2\psi}} \]
we haveŽ
\[ T = 2\sqrt{\frac{l}{g}}\int_0^{\theta^*} \frac{d\theta}{\sqrt{k^2(1 - \sin^2\psi)}} = 2\sqrt{\frac{l}{g}}\int_0^{\pi/2} \frac{2k\cos\psi\,d\psi}{\sqrt{k^2(1 - \sin^2\psi)}\sqrt{1 - k^2\sin^2\psi}} = 4\sqrt{\frac{l}{g}}\int_0^{\pi/2} \frac{d\psi}{\sqrt{1 - k^2\sin^2\psi}} \tag{9.8.51} \]
And the last integral is exactly the integral we met in Section 4.9.1 when computing the length of an ellipse!
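A quick numerical check of Eq. (9.8.51) in Python (SciPy assumed; l = 1 m, g = 9.81 m/s² and the amplitudes are my own choices). Note that scipy.special.ellipk takes the parameter m = k²:

```python
import numpy as np
from scipy.special import ellipk

l, g = 1.0, 9.81
T_small = 2 * np.pi * np.sqrt(l / g)          # small-angle period, Eq. (9.8.45)

for theta_star in (10.0, 45.0, 90.0, 170.0):  # amplitude in degrees
    k = np.sin(np.radians(theta_star) / 2)
    T = 4 * np.sqrt(l / g) * ellipk(k**2)     # exact period, Eq. (9.8.51)
    print(theta_star, T / T_small)            # the ratio grows with the amplitude
```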

9.8.7 RLC circuits


An RLC circuit is an electrical circuit consisting of a resistor (R), an inductor (L), and a capacitor (C), connected in series or in parallel. The name of the circuit is derived from the letters used to denote its constituent components, and the order of the letters may vary (RLC, LRC, etc.).

ŽŽ Why this change of variable? Look at the square-root term in Eq. (9.8.50): this change of variable removes the square root.
Ž Note also that k = sin(θ*/2); thus when θ = θ*, ψ = π/2.

Assume the positive direction for the current to be clockwise, and the charge q(t) to be the charge on the bottom plate of the capacitor. If we follow around the circuit in the positive direction, the electric potential drops by Lİ = Lq̈ across the inductor, by RI = Rq̇ across the resistor, and by q/C across the capacitor. Applying Kirchhoff's second rule for circuits, we conclude that
\[ L\ddot{q} + R\dot{q} + \frac{1}{C}q = 0 \tag{9.8.52} \]
This has exactly the form of Eq. (9.8.12) for the damped oscillator, and anything we know about the damped oscillator is immediately applicable to the RLC circuit. In other words, the RLC circuit is an electrical analog of a spring-mass system with damping.
Mathematicians do not care about physics or applications; what matters to them is the following nice equation with a, b, c ∈ ℝ:
\[ a\ddot{y} + b\dot{y} + cy = 0 \tag{9.8.53} \]
which they call a second order ordinary differential equation. But now you understand why university students have to study it and similar equations: because they describe our world quite nicely.

9.8.8 Coupled oscillators


In the previous section we discussed the oscillation of a single body, such as a mass connected to a fixed spring. We now move to the study of the oscillation of several bodies that are coupled to each other, such as the atoms making up a molecule. For this problem, in which there are multiple degrees of freedom, matrices appear naturally. If you need a refresher on matrices please refer to Chapter 11.
As a simple example of coupled oscillators, consider the two carts shown in Fig. 9.12. In this context we will meet matrices and determinants. We use Newton's 2nd law to find the equations of motion.

Figure 9.12: A simple system of two coupled oscillators. In the absence of spring 2, the two carts would oscillate independently of each other. It is spring 2 that couples the two carts.

Using Newton's 2nd law we write
\[ \begin{aligned} m_1\ddot{x}_1 &= -k_1 x_1 + k_2(x_2 - x_1) = -(k_1 + k_2)x_1 + k_2 x_2 \\ m_2\ddot{x}_2 &= -k_2(x_2 - x_1) - k_3 x_2 = k_2 x_1 - (k_2 + k_3)x_2 \end{aligned} \tag{9.8.54} \]
And we can write these two equations in a compact matrix formŽŽ as
\[ \underbrace{\begin{bmatrix} m_1 & 0 \\ 0 & m_2 \end{bmatrix}}_{M} \begin{bmatrix} \ddot{x}_1 \\ \ddot{x}_2 \end{bmatrix} = -\underbrace{\begin{bmatrix} k_1 + k_2 & -k_2 \\ -k_2 & k_2 + k_3 \end{bmatrix}}_{K} \underbrace{\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}}_{\mathbf{x}}, \quad\text{or}\quad M\ddot{\mathbf{x}} = -K\mathbf{x} \tag{9.8.55} \]
where M is the mass matrix and K is the spring-constant matrix or stiffness matrix. Note that these two matrices are symmetric. Also note that, using matrix notation, the equation of motion of coupled oscillators, Mẍ = −Kx, is a very natural generalization of that of a single oscillator: with just one degree of freedom, all three matrices x, K and M are just ordinary numbers and we had mẍ = −kx.
We use complex exponentials to solve Eq. (9.8.55):
\[ \mathbf{z} = \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} = \begin{bmatrix} A_1 \\ A_2 \end{bmatrix} e^{i\omega t} = \mathbf{a}\,e^{i\omega t} \;\Longrightarrow\; \ddot{\mathbf{z}} = -\omega^2\mathbf{a}\,e^{i\omega t} \tag{9.8.56} \]
Introducing z and z̈ into Eq. (9.8.55) we obtain
\[ (K - \omega^2 M)\mathbf{a} = \mathbf{0} \tag{9.8.57} \]
which leads to the determinant of the matrix being zero:
\[ \det\left(K - \omega^2 M\right) = 0 \tag{9.8.58} \]

This is a quadratic equation for ω² and has two solutions for ω² (in general). This implies that there are two frequencies ω₁,₂ at which the carts oscillate in pure sinusoidal motion. These frequencies are called normal frequencies. The two sinusoidal motions associated with these normal frequencies are known as normal modes. The normal modes are determined by solving Eq. (9.8.57). If you know linear algebra, what we are doing here is essentially a generalized eigenvalue problem in which ω² plays the role of the eigenvalue and a plays the role of the eigenvector; refer to Section 11.10 for more detail on eigenvalue problems.
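In Python, this generalized eigenvalue problem can be handed to scipy.linalg.eigh directly. A minimal sketch (SciPy/NumPy assumed; the unit masses and stiffnesses are my own choice, matching Example 1 below):

```python
import numpy as np
from scipy.linalg import eigh

m1 = m2 = 1.0
k1 = k2 = k3 = 1.0

M = np.diag([m1, m2])
K = np.array([[k1 + k2, -k2],
              [-k2, k2 + k3]])

# solve K a = w^2 M a  (generalized symmetric eigenvalue problem)
w2, a = eigh(K, M)
print(np.sqrt(w2))   # normal frequencies: sqrt(k/m) and sqrt(3k/m)
print(a)             # columns: normal modes, (1, 1) and (1, -1) up to scaling
```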

Example 1. Let's consider the case of equal stiffness springs and equal masses: k₁ = k₂ = k₃ = k and m₁ = m₂ = m. Using Eq. (9.8.58) we can determine the normal frequencies:
\[ \omega_1 = \sqrt{\frac{k}{m}}, \qquad \omega_2 = \sqrt{\frac{3k}{m}} \tag{9.8.59} \]
ŽŽ Check Chapter 11 for a discussion on matrices.

Did you notice anything special about ω₁? We use Eq. (9.8.57) to compute a:
\[ \left(\begin{bmatrix} 2k & -k \\ -k & 2k \end{bmatrix} - \begin{bmatrix} \omega^2 m & 0 \\ 0 & \omega^2 m \end{bmatrix}\right) \begin{bmatrix} A_1 \\ A_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \tag{9.8.60} \]
With ω₁, we solve Eq. (9.8.60) to get A₁ = A₂ = Ae^{-iδ₁}. So we have z₁(t) and z₂(t), and from their real parts the actual solutions for mode 1:
\[ \begin{cases} z_1 = Ae^{-i\delta_1} e^{i\omega_1 t} \\ z_2 = Ae^{-i\delta_1} e^{i\omega_1 t} \end{cases} \;\Longrightarrow\; \begin{cases} x_1(t) = A\cos(\omega_1 t - \delta_1) \\ x_2(t) = A\cos(\omega_1 t - \delta_1) \end{cases} \tag{9.8.61} \]

As x₁(t) = x₂(t), the two carts oscillate in such a way that spring 2 is always in its unstretched configuration. In other words, spring 2 is irrelevant, and thus the system oscillates with a natural frequency similar to a single oscillator (i.e. ω = √(k/m)).
With ω₂, we solve Eq. (9.8.60) to get A₁ = −A₂ = Be^{-iδ₂}. The mode 2 solutions are
\[ x_1(t) = +B\cos(\omega_2 t - \delta_2), \qquad x_2(t) = -B\cos(\omega_2 t - \delta_2) \tag{9.8.62} \]
These solutions tell us that when cart 1 moves to the left a distance, cart 2 moves to the right the same distance. We say that the two carts oscillate with the same amplitude but out of phase.
Together, the general solution is
\[ \mathbf{x}(t) = A\begin{bmatrix} 1 \\ 1 \end{bmatrix}\cos(\omega_1 t - \delta_1) + B\begin{bmatrix} 1 \\ -1 \end{bmatrix}\cos(\omega_2 t - \delta_2) \tag{9.8.63} \]
with the four constants of integration A, B, δ₁, δ₂ to be determined from four initial conditions.

Example 2. This case involves equal masses, but the second spring is much less stiff: k₁ = k₃ = k, k₂ ≪ k, m₁ = m₂ = m. The two normal frequencies are
\[ \omega_1 = \sqrt{\frac{k}{m}}, \qquad \omega_2 = \sqrt{\frac{k + 2k_2}{m}} \tag{9.8.64} \]
As we have discussed, spring 2 is irrelevant in mode 1, so we get the same mode 1 frequency as in Example 1. As k₂ ≪ k, ω₁ ≈ ω₂, and we can write them in terms of their average ω₀ and half-difference ε (you will see why we did this via Eq. (9.8.67); the basic idea is that we can write the solutions in two separate factors, one involving ω₀ and one involving ε):
\[ \omega_1 = \omega_0 - \epsilon, \quad \omega_2 = \omega_0 + \epsilon, \qquad \omega_0 = \frac{\omega_1 + \omega_2}{2}, \quad \epsilon = \frac{\omega_2 - \omega_1}{2} \tag{9.8.65} \]
Therefore, the normal modes are
\[ \text{(mode 1)}\; \begin{cases} z_1 = C_1 e^{i(\omega_0 - \epsilon)t} = C_1 e^{i\omega_0 t} e^{-i\epsilon t} \\ z_2 = C_1 e^{i(\omega_0 - \epsilon)t} = C_1 e^{i\omega_0 t} e^{-i\epsilon t} \end{cases} \qquad \text{(mode 2)}\; \begin{cases} z_1 = +C_2 e^{i(\omega_0 + \epsilon)t} = +C_2 e^{i\omega_0 t} e^{i\epsilon t} \\ z_2 = -C_2 e^{i(\omega_0 + \epsilon)t} = -C_2 e^{i\omega_0 t} e^{i\epsilon t} \end{cases} \]

where C₁, C₂ ∈ ℂ. To simplify the presentation, we simply let C₁ = C₂ = A/2, where A is real. In this case, the general solution is
\[ \mathbf{z}(t) = \frac{A}{2} e^{i\omega_0 t}\begin{bmatrix} e^{-i\epsilon t} + e^{i\epsilon t} \\ e^{-i\epsilon t} - e^{i\epsilon t} \end{bmatrix} = A e^{i\omega_0 t}\begin{bmatrix} \cos(\epsilon t) \\ -i\sin(\epsilon t) \end{bmatrix} \tag{9.8.66} \]
where in the last step we have used the formulas relating sine/cosine to complex exponentials; see Section 2.25.5 if you do not recall them. The real solutions are thus given by
\[ \mathbf{x}(t) = \begin{bmatrix} A\cos(\epsilon t)\cos(\omega_0 t) \\ A\sin(\epsilon t)\sin(\omega_0 t) \end{bmatrix} \tag{9.8.67} \]

This is the solution for the case when cart 1 is pulled a distance A to the right and released at t = 0 while cart 2 is stationary at its equilibrium positionŽŽ. To illustrate this solution, we use ω₀ = 10, A = 1.0, and ε = 1, and consider a time duration of 2π. First, we try to understand what A sin(εt) sin(ω₀t) means. As ε ≪ ω₀, the εt oscillation is much slower than the ω₀t oscillation. The former then simply acts as an envelope for the latter (the accompanying margin plot shows A sin(εt) sin(ω₀t) together with the envelope ±A sin(εt)). In Fig. 9.13 both x₁(t) and x₂(t) are shown. There we can see that the motion sloshes back and forth between the two masses. At the start only the first mass is moving. But after a time of εt = π/2 (i.e. t = π/(2ε)), the first mass is not moving and the second mass has all the motion. Then after another time of π/(2ε) it switches back, and this continues forever.

Figure 9.13: Plot of x₁(t) and x₂(t) from Eq. (9.8.67) with ω₀ = 10, A = 1.0, and ε = 1.

Later, in Section 9.10, we shall see that this is nothing but the beat phenomenon that occurs when two sound waves of similar frequencies meet.
ŽŽ This is so because at t = 0, x₁ = A while ẋ₁ = x₂ = ẋ₂ = 0.

9.9 Solving the diffusion equation


We are going to discuss Fourier's solution to the following 1D diffusion equation (derived in Section 9.5.2). In the process, we will understand the idea of approximating any periodic function by a trigonometric series, now famously known as a Fourier series. The equations, including BCs and ICs, are:
\[ \frac{\partial\theta}{\partial t} = \alpha^2\frac{\partial^2\theta}{\partial x^2} \quad (0 < x < 1,\ t > 0) \tag{9.9.1} \]
\[ \theta(x, 0) = \phi(x) \quad (0 \le x \le 1) \tag{9.9.2} \]
\[ \theta(0, t) = 0, \quad \theta(1, t) = 0 \quad (t > 0) \tag{9.9.3} \]
Using the method of separation of variablesŽ, the temperature field θ(x, t) is written as a product of two functions: one spatial function h(x) and one temporal function g(t):
\[ \theta(x, t) = h(x)\,g(t) \tag{9.9.4} \]
As we shall see, there is an infinite number of solutions θₙ(x, t) of this type that satisfy the PDE and the BCs. They are called the fundamental solutions. The final solution is found by adding up all these fundamental solutions (as the PDE is linear, a combination of solutions is also a solution) so that it satisfies the initial condition.
Substitution of Eq. (9.9.4) into Eq. (9.9.1) leads us to another equation:
\[ h(x)\,g'(t) = \alpha^2 h''(x)\,g(t) \]
Now comes the trick: we separate temporal functions on one side and spatial functions on the other side:
\[ \frac{g'(t)}{\alpha^2 g(t)} = \frac{h''(x)}{h(x)} \tag{9.9.5} \]
As this equation holds for any x and t, both sides must be a constant. Denoting this constant by k, we thus obtain the following equations (two, not one):
\[ \frac{g'(t)}{\alpha^2 g(t)} = k, \qquad \frac{h''(x)}{h(x)} = k \tag{9.9.6} \]
Even though there are three possibilities, k = 0, k > 0 and k < 0, the two former cases do not lead to meaningful solutionsŽŽ, so k < 0, which can thus be expressed as the negative of a square: k = −λ². With this, our two equations become
\[ g'(t) = -\lambda^2\alpha^2 g(t), \qquad h''(x) + \lambda^2 h(x) = 0 \tag{9.9.7} \]

This was a clever idea of the Swiss mathematician and physicist Daniel Bernoulli (1700–1782). The first question that comes to mind should be: how do we know that this separation of variables would work? We do not know! Daniel probably learned this technique from his father John Bernoulli, who used y(x) = u(x)v(x) to solve the differential equation y' = y(x)f(x) + y''g(x) for y(x) [51]. Another source of motivation for the method of separation of variables is waves. If we study waves, we will observe a phenomenon called standing waves (check this youtube video out). A standing wave can be mathematically described by g(x)h(t) (see also Fig. 9.17).
ŽŽ Why? If k is positive, then g'(t) = α²kg(t) > 0, thus g(t) increases forever. This is physically wrong, as we know from daily experience that the temperature inside the bar goes to zero as time goes by.

So, with the technique of separation of variables, we have converted a single second order PDE into two ODEs. That's the key lesson! What is interesting is that it is straightforward to solve these two ODEs:
\[ g(t) = A_1 e^{-\lambda^2\alpha^2 t}, \qquad h(x) = A_2\cos\lambda x + A_3\sin\lambda x \tag{9.9.8} \]
with A₁, A₂, A₃ arbitrary constants. With these functions substituted into Eq. (9.9.4), the temperature field is given by
\[ \theta(x, t) = e^{-\lambda^2\alpha^2 t}(A\cos\lambda x + B\sin\lambda x) \tag{9.9.9} \]
with A = A₁A₂ and B = A₁A₃. We have to find A, B and λ so that θ(x, t) satisfies the BCs and IC. For the BCs, we have
\[ \begin{aligned} \theta(0, t) = 0:&\quad e^{-\lambda^2\alpha^2 t} A = 0 \;\Longrightarrow\; A = 0 \\ \theta(1, t) = 0:&\quad e^{-\lambda^2\alpha^2 t} B\sin\lambda = 0 \;\Longrightarrow\; \sin\lambda = 0 \;\Longrightarrow\; \lambda = n\pi,\ n = 1, 2, \ldots \end{aligned} \tag{9.9.10} \]
So, we have an infinite number of solutions, written as§
\[ \theta_n(x, t) = B_n e^{-(n\pi\alpha)^2 t}\sin(n\pi x), \qquad n = 1, 2, 3, \ldots \tag{9.9.11} \]
All satisfy the boundary conditions (and of course the PDE). It is now time to work with the initial condition. First, since the PDE is a linear equation, the sum of all the fundamental solutions is also a solution; this is known as the principle of superposition. So, we have
\[ \theta(x, t) = \sum_{n=1}^{\infty}\theta_n(x, t) = \sum_{n=1}^{\infty} B_n e^{-(n\pi\alpha)^2 t}\sin(n\pi x) \tag{9.9.12} \]

Evaluating this solution at t = 0 gives us (noting that the initial condition Eq. (9.9.2) says that at t = 0 the temperature is φ(x)):
\[ \theta(x, 0) = \sum_{n=1}^{\infty} B_n\sin(n\pi x) = \phi(x) \tag{9.9.13} \]
Now the problem becomes this: if we can approximate the initial temperature φ(x) by an infinite trigonometric series Σ_{n=1}^∞ Bₙ sin(nπx), then we have solved the heat equation! Now Fourier had to move away from physics and turn to mathematics: he had to find the coefficients Bₙ in Eq. (9.9.13). We refer to Section 4.19 for a discussion of how Fourier computed Bₙ. The solution to Eq. (9.9.1) is then the following infinite series:
\[ \theta(x, t) = \sum_{n=1}^{\infty} B_n e^{-(n\pi\alpha)^2 t}\sin(n\pi x), \qquad B_n = 2\int_0^1 \phi(x)\sin(n\pi x)\,dx \tag{9.9.14} \]

Should we worry about the infinity involved in this solution? No, we do not have to, thanks to the term e^{-(nπα)²t}, which is a decaying term: for large n and/or large t, this term is small. See Fig. 9.14 for an illustration.
§ B = 0 also satisfies the BCs, but it would result in the boring solution θ(x, t) = 0.
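The series in Eq. (9.9.14) is easy to evaluate numerically. A minimal Python sketch (NumPy assumed; the initial temperature φ(x) = sin(πx) + 0.5 sin(3πx), the diffusivity α = 1 and the truncation at N terms are all my own illustrative choices):

```python
import numpy as np

alpha, N = 1.0, 50
x = np.linspace(0.0, 1.0, 201)
phi = lambda x: np.sin(np.pi * x) + 0.5 * np.sin(3 * np.pi * x)   # initial temperature

# Fourier sine coefficients B_n = 2 * int_0^1 phi(x) sin(n*pi*x) dx  (trapezoid rule)
B = [2 * np.trapz(phi(x) * np.sin(n * np.pi * x), x) for n in range(1, N + 1)]

def theta(x, t):
    """Partial sum of Eq. (9.9.14) with N terms."""
    return sum(B[n - 1] * np.exp(-(n * np.pi * alpha)**2 * t) * np.sin(n * np.pi * x)
               for n in range(1, N + 1))

for t in (0.0, 0.005, 0.05):
    print(t, np.max(np.abs(theta(x, t))))   # the wiggles (high n terms) decay first
```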


Figure 9.14: Solution of the heat equation θ(x, t) at t = 0, t = 0.005 and t = 0.05: high order terms vanish first and thus the wiggles are gone first.

History note 9.1: Joseph Fourier (21 March 1768 – 16 May 1830)
Jean-Baptiste Joseph Fourier was a French mathematician and physi-
cist who is best known for initiating the investigation of Fourier se-
ries, which eventually developed into Fourier analysis and harmonic
analysis, and their applications to problems of heat transfer and vi-
brations. The Fourier transform and Fourier’s law of conduction are
also named in his honor. Fourier is also generally credited with the
discovery of the greenhouse effect.
In 1822, Fourier published his work on heat flow in The Analytical
Theory of Heat. There were three important contributions in this work, one purely math-
ematical, two essentially physical. In mathematics, Fourier claimed that any function of
a variable, whether continuous or discontinuous, can be expanded in a series of sines of
multiples of the variable. Though this result is not correct without additional conditions,
Fourier’s observation that some discontinuous functions are the sum of infinite series
was a breakthrough. One important physical contribution in the book was the concept of
dimensional homogeneity in equations; i.e. an equation can be formally correct only if
the dimensions match on either side of the equality; Fourier made important contributions
to dimensional analysis. The other physical contribution was Fourier’s proposal of his
partial differential equation for conductive diffusion of heat. This equation is now taught
to every student of mathematical physics.

9.10 Solving the wave equation: d’Alembert’s solution


Herein we discuss d’Alembert’s solution to the wave equation. His solutions are written in terms
of traveling waves. It is easy to see what is a traveling wave; it is there, in nature, waiting to be
discovered by a curious mind. For example, consider a rope, which is fixed at the right end, if

Phu Nguyen, Monash University © Draft version


Chapter 9. Differential equations 787

we hold its left end and move our hand up and down, a wave is created and travels to the right.
And that’s a traveling wave. Now, we need to describe it mathematically. And it turns out not so
difficult.
Assume that at time t D 0, we have a wave of which the shape can be described by a
function y D f .x/. Furthermore, assume that the wave travels with a constant velocity c to the
right and its shape does not change in time. Then, at time t  , the wave is given by f .x ct  /.
To introduce some terminologies, let’s consider the simplest traveling wave; a sine wave.

Sinusoidal waves. Now, consider a sine wave (people prefer to call it a sinusoidal wave) traveling
to the right (along the x direction) with a velocity c. As a sine wave is characterized by its height
y which depends on two independent variables x (the position of a point on the wave) and time
t, its equation is determined by a function y.x; t/, which is:
 
2
y.x; t/ D A sin .x ct/ (9.10.1)

The amplitude of the wave is A, the wavelength is . That is, the function y.x; t/ repeats itself
each time x increases by the distance . Thus, the wavelength is the spatial period of a periodic
waveŽŽ . It is the distance between consecutive corresponding points of the same phase on the
wave, such as two adjacent crests, troughs, or zero crossings (Fig. 9.15).

Figure 9.15: Plots of a sine wave at two different times.

So far we have focused on the shape of the entire wave at one particular time instant. Now
we focus on one particular location on the wave, say x  and let time vary. As time goes on, the
wave passes by the point and makes it moves up and down. (Think of a leaf on a pond that bobs
up and down with the motion of the water ripples) The motion of the point is simple harmonic.
Indeed, we can show this mathematically as follows. Replacing x by x  in Eq. (9.10.1), we have
   
 2  2c 2x 
y.x ; t/ D A sin .x ct/ D A sin t (9.10.2)
  
ŽŽ
Note that in Section 9.8 we met another period, which is a temporal period. Waves are more complicated than
harmonic oscillations because we have two independent variables x and t.

Phu Nguyen, Monash University © Draft version


Chapter 9. Differential equations 788

This is indeed the equation of a SHO (Section 9.8.1) with angular frequency ! and phase 

2c 2x 
!D D 2f; D (9.10.3)
 
where f is the frequency. Now, we can understand why the wavelength  is defined as the
distance between consecutive corresponding points of the same phase on the wave. The phase
 is identical for points x  and x  C . As each point in the string (e.g. x  ) oscillates back and
forth in the transverse direction (not along the direction of the string), this is called a transverse
wave.
Now, I present another form of the sinusoidal wave which
introduces the concept of wavenumber, designated by k. Ob-
viously we can write Eq. (9.10.1) in the following form
y.x; t / D A sin.kx !t /, with k WD 2=. Referring to
the figure next to the text, it is obvious that the wavenumber
k tells us how many waves are there in a spatial domain of
length L. More precise k=2 is the number of waves fit inside
L. We can now study what will happen if two waves of the
same frequencies meet. For example if we are listening two
sounds of similar frequencies, what would we hear? Writing the two sounds as

x1 D cos.!1 t/; x2 D cos.!2 t/

And what we hear is the superposition of these two soundsŽŽ :


  !
!1 C !2 1 !2 
x1 C x2 D cos.!1 t / C cos.!2 t / D 2 cos t cos t
2 2

If we plot the waves as in Fig. 9.16 (!1 =!2 D 8 W 10), we see that where the crests coincide we
get a strong wave and where a trough and crest coincide we get practically zero, and then when
the crests coincide again we get a strong wave again.

d’Alembert’s solution. Now, we turn to d’Alembert’s solutions to the wave equation. We have
shown that a traveling wave (to the right) can be written as f .x ct/. Thus, f .x ct/, as a
wave, must satisfy the wave equation. That is obvious (chain rule is what we need to verify this):

@ @
.f .x ct// D c 2 .f .x ct//
@t 2 @x 2
And there is nothing special about a wave traveling to the right, we have another wave traveling
to the left. It is given by g.x C ct/, and it is also a solution to the wave equation. As the wave
equation is linear, f .x ct/ C g.x C ct/ is also a solution to the wave equation. But, we need
a proof.
ŽŽ
Note the similarity with Eq. (9.8.67).

Phu Nguyen, Monash University © Draft version


Chapter 9. Differential equations 789

Figure 9.16

The equation that we want to solve is for an infinitely long string (so that we do not have to
worry about what happens at the boundary):

u t t D c 2 uxx 1 < x < 1; t > 0 (9.10.4)


u.x; 0/ D f .x/; u.x;
P 0/ D g.x/ 1<x<1 (9.10.5)

where f .x/ is the initial shape of the string, and g.x/ is the initial velocity.
We introduce two new variables  and  as

 D x C ct;  D x ct

which transform the PDE from u t t D c 2 uxx to u D 0, which can be solved easily:

u t t D c 2 uxx H) u D 0 H) u.; / D ./ C ./ H) u.x; t/ D .x ct/ C .x C ct/

Now we have to deal with the initial conditions i.e., Eq. (9.10.5).

.x/ C .x/ D f .x/; c 0 .x/ C c 0


.x/ D g.x/
Z xCct
1 1
u.x; t/ D .f .x ct/ C g.x C ct// C g./d  (9.10.6)
2 2c x ct

History note 9.2: Jean-Baptiste le Rond d’Alembert (1717 – 1783)


Jean-Baptiste le Rond d’Alembert (16 November 1717 – 29 October
1783) was a French mathematician who was a pioneer in the study
of differential equations and their use of in physics. He studied the
equilibrium and motion of fluids. Jean d’Alembert’s father was an
artillery officer, Louis-Camus Destouches and his mother was Mme
de Tencin. D’Alembert was the illegitimate son from one of Mme
de Tencin ’amorous liaisons’. His father, Louis-Camus Destouches,
was out of the country at the time of d’Alembert’s birth and his

Phu Nguyen, Monash University © Draft version


Chapter 9. Differential equations 790

mother left the newly born child on the steps of the church of St
Jean Le Rond. The child was quickly found and taken to a home for homeless children.
He was baptised Jean Le Rond, named after the church on whose steps he had been
found. When his father returned to Paris he made contact with his young son and arranged
for him to be cared for by the wife of a glazier, Mme Rousseau. She would always be
d’Alembert’s mother in his own eyes, particularly since his real mother never recognized
him as her son, and he lived in Mme Rousseau’s house until he was middle-aged. Jean Le
Rond d’Alembert was one of the eighteenth century’s preeminent mathematicians. He was
elected to the French Academy of Sciences at the age of only twenty-three. His important
contributions include the d’Alembert formula, describing how strings vibrate, and the
d’Alembert principle, a generalization of one of Newton’s classical laws of motion.

9.11 Solving the wave equation


Herein we solve the problem of a finite vibrating string (of length L) which is fixed at two ends.
The equations governing the motion of the string are (Section 9.5.1)
@2 u 2
2@ u
D c 0 < x < L; t > 0 (9.11.1)
@t 2 @x 2
u.x; 0/ D f .x/; u.x;
P 0/ D g.x/ 0  x  L (9.11.2)
u.0; t/ D 0; u.L; t/ D 0 t > 0 (9.11.3)
where f .x/ is the original position of the string and g.x/ is its velocity at time t D 0.
Using the separation of variables method, we write the solution as
u.x; t/ D X.x/T .t/ (9.11.4)
And proceed in the same manner as for the heat equation, we now need to solve 2 equations:
X 00 X D 0
(9.11.5)
T 00 c 2 T D 0
where 1 <  < 1. It can be shown that only the case  < 0 gives meaningful solutionŽŽ .
For n D 0; 1; 2; : : :, we have
 nx 
Xn D sin
L
nc   nc  (9.11.6)
Tn D An cos t C Bn sin t
L L
And thus the general solution is
X
1 h  nc   nc i  nx 
u.x; t / D un .x; t/; un .x; t/ D An cos t C Bn sin t sin
nD1
L L L
(9.11.7)
ŽŽ
Note that our solution must be non-zero and bounded.

Phu Nguyen, Monash University © Draft version


Chapter 9. Differential equations 791

where the n term un .x; t/ is called the n-th mode of vibration or the n-th harmonic. This solution
satisfies the PDE and the BCs. If we plot these modes of vibration (Fig. 9.17), what we observe
is that the wave doesn’t propagate. It just sits there vibrating up and down in place. Such a wave
is called a standing wave. Points that do not move at any time (zero amplitude of oscillation)
are called nodes. Points where the amplitude is maximum are called antinodes. The simplest
mode of vibration with n D 1 is called the fundamental, and the frequency at which it vibrates
is called the fundamental frequency.
But waves should be traveling, why we have standing waves here? To see why, we need
to use trigonometry, particularly the product identities in Eq. (3.8.6) (e.g. sin ˛ cos ˇ D
sin.˛Cˇ /Csin.˛ ˇ /=2). Using these identities, we can rewrite u .x; t/ as
n

An  n n 
un .x; t/ D sin .x C ct/ C sin .x ct/
2 L L (9.11.8)
Bn  n n 
C cos .x ct/ cos .x C ct/
2 L L
Let’s now focus on the terms with An , we can write
An n An n
un .x; t/ D sin .x ct/ C sin .x C ct/
2 L 2 L   (9.11.9)
An 2x 2ct An 2x 2ct 2L
D sin C sin C ; n D
2 n n 2 n n n
which is obviously the superposition of two traveling waves: the first term is a wave traveling to
the right and the second travels to the left. Both waves have the same amplitude.
All points on the string oscillate at the same frequency but with different amplitudes.
Now we need to consider the initial conditions. By evaluating u.x; t/ and its first time
derivative at t D 0, and using the ICs, we obtain

X
1  nx 
An sin D f .x/
nD1
L
 nx  (9.11.10)
X
1
nc
Bn sin D g.x/
nD1
L L

Again, we meet Fourier series.

Example 9.4
Now, assume that the initial velocity of the string is zero, thus Bn D 0, then the solution is

X
1  nc   nx  Z  nx 
2 L
u.x; t / D An cos t sin ; An D f .x/ sin dx (9.11.11)
nD1
L L L 0 L

Phu Nguyen, Monash University © Draft version


Chapter 9. Differential equations 792

0.75
0.50
0.25
0.00
0.25
0.50
0.75
0.00 0.25 0.50 0.75 1.00
u1(x, t)
0.75
0.50
0.25
0.00
0.25
0.50
0.75
0.00 0.25 0.50 0.75 1.00
u2(x, t)
0.75
0.50
0.25
0.00
0.25
0.50
0.75
0.00 0.25 0.50 0.75 1.00
u3(x, t)
Figure 9.17: Standing waves un .x; t / for n D 1; 2; 3. Different colors are used to denote un .x; t / for
different times.

What does this mean? If we break the initial shape of the string into many small components:

X
1  nx 
f .x/ D An sin
nD1
L
 
And let each component dance with cos nc L
t An sin nx
L
, then adding up all these small
vibrations we will get the solution to the string vibration problem. Now, we consider a plucked
guitar string, it is pulled upward at the position x D d so that it reaches height h. Thus, the
initial position of the string is
8
< hx ; if 0  x  d
f .x/ D d
: h.L x/
; if d  x  L
L d

Then we suddenly release the string and study its motion. As the initial velocity is zero, we
just have An , which are computed as (Eq. (9.11.11))
 
2h L2 d n
An D 2 2 sin
n  d.L d / L

Phu Nguyen, Monash University © Draft version


Chapter 9. Differential equations 793

9.12 Fourier series


In this section we study Fourier series deeper than what we had done in Section 4.19.

9.12.1 Bessel’s inequality and Parseval’s theorem


To motivate the mathematics, let’s assume that we have a SHO and we want to compute its
average displacement x.t/. N One way to go is to use the root mean square (RMS). Recall that for
n numbers a1 ; a2 ; : : : ; an , the RMS is defined as
s
a12 C a22 C    C an2
RMS D (9.12.1)
n

Now, we extend this definition to a continuous function f .x/. Following the same procedure in
Section 4.11.3 when we computed the average of a function, we get
Z !1=2
L
1
fN.x/ D Œf .x/2 dx (9.12.2)
L L

Recall also that the Fourier series of a periodic function f .x/ in Œ L; L is given by
1 
X nx nx 
f .x/ D a0 C an cos C bn sin (9.12.3)
nD1
L L

where the coefficients are


Z L Z L Z L
1 1 nx 1 nx
a0 D f .x/dx; an D f .x/ cos dx; bn D f .x/ sin dx
2L L L L L L L L
(9.12.4)

Now, we introduce fN .x/ which is a finite Fourier series of f .x/. That is fN .x/ consists of
a finite number N 2 N of the cosine and sine terms:
N 
X nx nx 
fN .x/ D a0 C an cos C bn sin (9.12.5)
nD1
L L

With that we compute the RMS of the difference between f .x/ and fN .x/ŽŽ :
Z
1 L
ED .f .x/ fN .x//2 dx
L L (9.12.6)
1
D ..f; f / 2.f; fN / C .fN ; fN //
L
RL
ŽŽ
Although not necessary, I used the short notation .f; g/ to denote the inner product L f .x/g.x/dx.

Phu Nguyen, Monash University © Draft version


Chapter 9. Differential equations 794

The plan is like this: if we can compute .f; fN / and .fN ; fN /, then with the fact that E  0, we
shall get an inequality, and that inequality is the Bessel inequality. Let’s start with .fN ; fN / :

Z " N 
#2
L X nx nx 
.fN ; fN / D a0 C an cos C bn sin dx
L nD1
L L
Z L XN Z L XN Z L
2 2 2 nx 2 nx
D a0 dx C an cos dx C bn sin2 dx
L nD1 L L nD1 L L
!
XN X
N
D L 2a02 C an2 C bn2
nD1 nD1

The term .f; SN / is much easier:

Z " N 
#
L X nx nx 
.f; SN / D f .x/ a0 C an cos C bn sin dx
L nD1
L L
Z L XN Z L XN Z L
nx nx
D a0 f .x/dx C an f .x/ cos dx C bn f .x/ sin dx
L nD1 L L nD1 L L
!
XN X
N
D L 2a02 C an2 C bn2
nD1 nD1

To arrive at the final result, we just use Eq. (9.12.4) to replace the red integrals by the Fourier
coefficients. Substituting all these into the second of Eq. (9.12.6), we obtain
!
1 X
N X
N
E D .f; f / 2a02 C an2 C bn2 (9.12.7)
L nD1 nD1

As E  0, we get the following inequality:

X
N X
N Z L
1
2a02 C an2 C bn2  f 2 .x/dx
nD1 nD1
L L

X
1 Z
 1 L
Bessel’s inequality W 2a02 C an2 C bn2  f 2 .x/dx (9.12.8)
nD1
L L


d

Phu Nguyen, Monash University © Draft version


Chapter 9. Differential equations 795

9.12.2 Fourier transforms (Fourier integrals)

9.13 Classification of second order linear PDEs


9.14 Fluid mechanics: Navier Stokes equation
Fluid mechanics is a branch of physics concerned with the mechanics of fluids (liquids, gases,
and plasmas). It has applications in a wide range of disciplines, including mechanical, chemical
and biomedical engineering, geophysics, oceanography, meteorology, astrophysics, and biology.
Fluid mechanics is a sub-branch of continuum mechanics (which deals with solids and
fluids), a subject which models matter without using the information that it is made out of
atoms; that is, it models matter from a macroscopic viewpoint rather than from a microscopic
viewpoint. Fluid mechanics can be divided into fluid statics, the study of fluids at rest; and fluid
dynamics, the study of the effect of forces on fluid motion. Fluid dynamics is an active field
of research, typically mathematically complex. Many problems are partly or wholly unsolved
and are best addressed by numerical methods, typically using computers. A modern discipline,
called computational fluid dynamics (CFD), is devoted to this approach.
This section presents a derivation of governing equations of fluid dynamics. The presentation
follows the excellent textbook of Anderson ??.

Phu Nguyen, Monash University © Draft version


Chapter 10
Calculus of variations

Contents
10.1 Introduction and some history comments . . . . . . . . . . . . . . . . . . 798
10.2 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 798
10.3 Variational problems and Euler-Lagrange equation . . . . . . . . . . . . 802
10.4 Solution of some elementary variational problems . . . . . . . . . . . . . 805
10.5 The variational ı operator . . . . . . . . . . . . . . . . . . . . . . . . . . 809
10.6 Multi-dimensional variational problems . . . . . . . . . . . . . . . . . . 811
10.7 Boundary conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 812
10.8 Lagrangian mechanics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 815
10.9 Ritz’ direct method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 820
10.10 What if there is no functional to start with? . . . . . . . . . . . . . . . . 824
10.11 Galerkin methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 827
10.12 The finite element method . . . . . . . . . . . . . . . . . . . . . . . . . . 830

This chapter is devoted to the calculus of variations which is a branch of mathematics that
allows us to find a function y D f .x/ that minimizes a functional–a function of function, for
Rb
example I D a G.y; y 0 ; y 00 ; x/dx is a functional. The calculus of variations provides answers
to questions like ‘what is the plane curve with maximum area with a given perimeter’. You might
have correctly guessed the answer: in the absence of any restriction on the shape, the curve is a
circle. But calculus of variation provides a proof and more.
This chapter serves only as a brief introduction to this interesting theory of mathematics.
It also provides a historical account of the development of the finite element method, often
regarded as one of the greatest achievements in the twentieth century.
I have use primarily the following books for the material presented herein:

796
Chapter 10. Calculus of variations 797

 When Least Is Best: How Mathematicians Discovered Many Clever Ways to Make Things
as Small (or as Large) as Possible by Paul Nahin [49];

 A History of the Calculus of Variations from the 17th through the 19th Century by Herman
Goldstine‘ [5];

 The Variational Principles of Mechanics by Cornelius LanczosŽŽ [38];

 The lazy universe. An introduction to the principle of least action by Jennifer Coopersmith||
[14]

We start with an introduction in Section 10.1. Next, some elementary variational problems
are given in Section 10.2 to illustrate the problems that this branch of mathematics has to deal
with. Section 10.3 presents Lagrange’s derivation of the Euler-Lagrange equation. Using this
equation, Section 10.4 provides solutions to some elementary variational problems given in Sec-
tion 10.2. The variational operator ıy is introduced in Section 10.5; it is for change in a function
y.x/ similar to dx, which is change in a number x. Two dimensional variational problems are
treated in Section 10.6. Boundary conditions are presented in Section 10.7. Section 10.8 is a
brief introduction to Lagrangian mechanics–a new formulation of Newtonian mechanics using
calculus of variations.
Section 10.9 discusses the Ritz’s direct method to solve variational problems numerically.
The Ritz method begins with a functional, however, in many cases we only have a partial differen-
tial equation, not the corresponding functional. To handle those situations, Section 10.10 presents
the Dirichlet principle, which states that for a certain PDE we can find the associated variational
principle. Then Section 10.11 treats what is now called the Galerkin method–a method that can
solve numerically any PDE without knowing the variational principle.
The Ritz-Galerkin method is, however, limited to problems of simple geometries. Sec-
tion 10.12 is devoted to a discussion on the finite element method, which can be considered
as a generalization of the Ritz-Galerkin method. The finite element method can solve PDE
defined on any geometry.

Paul Joel Nahin (born November 26, 1940) is an American electrical engineer and author who has written 20
books on topics in physics and mathematics, including biographies of Oliver Heaviside, George Boole, and Claude
Shannon, books on mathematical concepts such as Euler’s formula and the imaginary unit, and a number of books
on the physics and philosophical puzzles of time travel. Nahin received, in 1979, the first Harry Rowe Mimno
writing award from the IEEE Aerospace and Electronic Systems Society, and the 2017 Chandler Davis Prize for
Excellence in Expository Writing in Mathematics.

Herman Heine Goldstine (1913 – 2004) was a mathematician and computer scientist, who worked as the
director of the IAS machine at Princeton University’s Institute for Advanced Study, and helped to develop ENIAC,
the first of the modern electronic digital computers. He subsequently worked for many years at IBM as an IBM
Fellow, the company’s most prestigious technical position.
ŽŽ
Cornelius Lanczos (1893–1974) was a Hungarian-American and later Hungarian-Irish mathematician and
physicist. In 1924 he discovered an exact solution of the Einstein field equation representing a cylindrically sym-
metric rigidly rotating configuration of dust particles. Lanczos served as assistant to Albert Einstein during the
period of 1928–29.
||
Jennifer Coopersmith (born in 1955 in Cape Town, South Africa). She obtained a BSc and a PhD in physics
from King’s College, University of London.

Phu Nguyen, Monash University © Draft version


Chapter 10. Calculus of variations 798

10.1 Introduction and some history comments


Calculus of variations or variational calculus is a branch of mathematics to solve the so-called
variational problems. A variational problem is to find a function, e.g. u.x/ that minimizes a
functional–a function of functions. A functional is mostly written as a definite integral that
involves u, u0 (i.e., du=dx) and x; for example, the following integral
Z b
I D F .u.x/; u0 .x/I x/dx
a

is a functional. Briefly, if we input a function and its derivatives into a functional we get a number
(i.e., I ).
Going back in history, variational calculus started in 1696 with the famous brachystochrone
problem stated by Johann Bernoulli . In 1744, Euler gave a general solution to variational
problems in the form of a differential equation–the well known Euler-Lagrange equation. That
is, the solution to a variational problem is the solution to a partial differential equation associated
with the variational problem. Of course we still need to solve for this partial differential equation
to get the solution to the original problem; but it was a big result. Eleven years later, 19 year old
Lagrange provided an elegant derivation for this equation.
There is a deep reason why variational calculus has become an important branch of mathe-
matics. It is the fact that nature follows laws which can be expressed as variational principles.
For example, Newtonian mechanics is equivalent to the least action variational principle that
states that among various paths that a particle can follow, the actual path minimizes a functional.
This functional is the integral of the difference between the kinetic energy and potential energy:
Z 2
LŒx.t/ D P
ŒKE.x.t// PE.x.t///dt
1

Yes, among infinitely many paths that a particle can choose, it chooses the one that minimizes
the functional LŒx.t/. It is simply super remarkable.
Even though Euler has developed many techniques to solve the Euler-Lagrange partial differ-
ential equations, it was the physicist Walter Ritz who, in 1902, proposed a direct method to solve
approximately variational problems in a systematic manner. The modifier ’direct’ means that
one can work directly with the functional instead of first finding the associated Euler-Lagrange
equation and then solving this equation; the way Euler and many other mathematicians did.

10.2 Examples
We have seen ordinary functions such as f .x/ D x 2 or f .x; y/ D x 2 C y 2 , but we have not
seen a functional before. This section presents some examples so that we get familiar with

Johann Bernoulli (1667 – 1748) was a Swiss mathematician and was one of the many prominent mathemati-
cians in the Bernoulli family. He is known for his contributions to infinitesimal calculus and educating Leonhard
Euler in the pupil’s youth.

Phu Nguyen, Monash University © Draft version


Chapter 10. Calculus of variations 799

functionals and variational problems. Note that we do not try to solve those problems in this
section.

Eucledian geodesic problem is to find the shortest path joining two points .x1 ; y1 / and .x2 ; y2 /.
To this end, we are finding a curve mathematically expressed by the function f .x/ such that the
following integral (or functional)
Z .x2 ;y2 / Z x2 p
lŒf .x/ D ds D 1 C .f 0 .x//2 dx (10.2.1)
.x1 ;y1 / x1

is minimum. We use the notation lŒf .x/ to denote a functional l that depends on f .x/ (and
possibly its derivatives f 0 .x/; f 00 .x/; : : :). In this particular example, our functional depends
only on the first derivative of the sought for function.

Brachistochrone problem–John and James Bernoulli 1697. Suppose a particle is allowed to


slide freely and frictionlessly along a wire under gravity from a point A to point B (Fig. 10.1a).
Furthermore, assume that the beads starts with a zero velocity. Find the curve y D y.x/ that
minimizes the travel time. Such a curve is called a brachistochrone curve (from Ancient Greek
brákhistos khrónos ’shortest time’). Surprisingly that curve is not a line.

A A x


m m

g B B.a; b/
a/ b/ y f .x/

Figure 10.1: A brachistochrone curve is a curve of shortest time or curve of fastest descent; g is the
acceleration of gravity.

To solve this problem we first need to compute the traveling time, then find y.x/ that mini-
mizes that time. We use differential calculus to compute dt-the infinitesimal time required for
the particle to travel a distance ds. We need to know the velocity of the particle for this purpose.
For simplicity, we select a coordinate system as shown in Fig. 10.1b where the starting point A
is at the origin and the vertical axis is pointing downward. Using the principle of conservation of
energy (at time t D 0 and any time instance t) leads to (at t D 0 the total energy of the particle
is zero)
1 2
mv mgy D 0
2
p
Thus, the particle velocity, v D ds=dt , is given by v D 2gy. It is now possible to compute dt,
and hence the total time
p Z ap
ds 1 C Œy 0 .x/2 dx 1 C Œy 0 .x/2
dt D D p H) t D p dx (10.2.2)
v 2gy 0 2gy.x/

Phu Nguyen, Monash University © Draft version


Chapter 10. Calculus of variations 800

The shortest curve is then the one with y.x/ that minimizes the above integral.

Minimal surface of revolution. Suppose the curve y D f .x/ p 0 is rotated about the
Rb
x-axis. The area of the surface of the resulting solid is 2 a f .x/ 1 C Œf 0 .x/2 dx, check
Section 4.9.3 for detail. Find the curve which makes this area minimal.

Galileo’s hanging chain. Galileo Galilei in his Discorsi1 (1638) described a method of drawing
a parabola as “Drive two nails into a wall at a convenient height and at the same level; ...
Over these two nails hang a light chain ... This chain will assume the form of a parabola, ...”.
Unfortunately, the hanging chain does not assume the form of a parabola and Galileo’s assertion
became a discussion point for followers of his work. Prominent mathematicians of the time,
Leibniz, Huygens and Johann Bernoulli, studied the hanging chain problem, which can be stated
as: Find the curve assumed by a loose flexible string hung freely from two fixed points. Every
person viewing power lines hanging between supporting poles is seeing Galileo’s hanging chain,
which is called a catenary, a name that is derived from the Latin word catena, meaning chain.
How is this problem related to the above variational problems? y B(x2 , y2 )
In other words, what quantity is to be minimized? The answer is the M, L, g, ρ
potential energy of the chain! Let’s consider a flexible chain hung A(x1 , y1)
by two points A and B. The chain has a total mass M , a total length ds

L, and thus a uniform mass per length density  D M=L. The chain
is described mathematically as y.x/. Let’s consider a (very) small y(x)

segment ds of the chain locating at a distance y.x/ above y D 0, x


and the potential energy of this segment of mass m D ds is
p
mgy D gyds D gy dx 2 C dy 2
Thus, the total potential energy is
Z x2 p Z x2 p
P.E D gy dx C dy D
2 2 gy 1 C .y 0 /2 dx
x1 x1

The problem is then: find the curve y.x/ passing through A.x1 ; y1 / and B.x2 ; y2 / such that
P.E is minimum. Not really. We forgot that not every curve is admissible; only curves of the
same length L are. So, the problem must be stated like this: find the curve y.x/ passing through
A.x1 ; y1 / and B.x2 ; y2 / such that
Z x2 p
0
I Œy; y I x D gy 1 C .y 0 /2 dx
x1

is minimum while satisfying this constraint (check arclength in Section 4.9.1 if this is not clear):
Z x2 p
1 C .y 0 /2 dx D L
x1

This is certainly a variational problem, but with constraints. As we have learned from calculus,
we need Lagrange to handle the constraints.

Phu Nguyen, Monash University © Draft version


Chapter 10. Calculus of variations 801

Calculus based solution of the hanging chain problem. Herein we present the calculus based
solution of the hanging chain problem. It was done by Leibniz and Johann Bernoulli before vari-
ational calculus was developed. I provide this solution to illustrate two points: (i) how calculus
can be used to solve problems and (ii) how the same problem (in this context a mechanics one)
can be solved by more than one way.

T .x C x/
y B.x2 ; y2 / ˛.x C x/

A.x1 ; y1 / s

M; L; g;  ˛.x/
sg
T .x/
x x x C x x
Figure 10.2: Hanging chain problem: forces acting on a segment x of the chain.

Considering a segment of the chain locating between x and x C x as shown in Fig. 10.2;
the length of this segment is s, there are three forces acting on this segment: the tension at the
left end T .x/, the tension at the right end T .x C x/ and the gravity gs. As this segment is
stationary i.e., not moving, the sum of total forces acting on it must be zero:
P
Fx D 0 W T .x/ cos ˛.x/ D T .x C x/ cos ˛.x C x/
P
Fy D 0 W T .x C x/ sin ˛.x C x/ T .x/ sin ˛.x/ gs D 0

From the first equation, we deduce that the horizontal component of the tension in the chain is
constant:
T0
T .x/ cos ˛.x/ D T0 D constant H) T .x/ D
cos ˛.x/
And from the second equation, we get:

 .T .x/ sin ˛.x// s


 .T .x/ sin ˛.x// D gs H) D g
x x
Replacing T .x/ by T0=cos ˛.x/ and considering the limit when x ! 0, we then have

d p
.T0 tan ˛.x// D g 1 C .y 0 /2
dx
And finally, we obtain the differential equation for the hanging chain (noting that T0 is constant
and tan ˛.x/ D y 0 .x/):
p
T0 y 00 D g 1 C .y 0 /2

Phu Nguyen, Monash University © Draft version


Chapter 10. Calculus of variations 802

To solve this differential equation, we followed Vincenzo RiccatiŽ with a new variable z such
that y 0 D z:
p dz T0
y 0 D z H) T0 z 0 D g 1 C z 2 ” k p D dx; k WD
1 C z2 g

Now, integrating both sides we get (see Section 4.4.15)


Z Z
dz 1
k p D dx H) C1 C k sinh zDx
1 C z2
where C1 is a constant of integration, and from this we get z, and finally from z D dy=dx, we
obtain y.x/:    
x C1 x C1
z D sinh H) y D k cosh C C2
k k
where C2 is yet another constant of integration. If the lowest point of the catenary is at .0; k/, it
can be seen that C1 D C2 D 0, and the catenry has this form
x 
y D k cosh
k
I hope that with this hanging chain problem, the introduction of hyperbolic functions into math-
ematics is easier to accept. Again, it is remarkable that mathematics, as a human invention,
captures quite well natural phenomena.

10.3 Variational problems and Euler-Lagrange equation


The classical variational problem is finding a function y.x/ such that the following functional
Z b
I Œy.x/ WD F .y; y 0 I x/ dx; y.a/ D A; y.b/ D B (10.3.1)
a

is minimized. In the above A and B are two real constants and y 0 D dy=dx is the first derivative
of y. As can be seen, all the examples given in Section 10.2 belong to this general problem.
This functional has one independent variable x and one single dependent variable y.x/. Thus,
it is the easiest variational problem. While other mathematicians solved specific problems like
those presented in Section 10.2, once got interested, the great Euler solved Eq. (10.3.1) once
and for all. And by doing just that he pioneered a new branch of mathematics. His solution was,
however, geometrical and not elegant as Lagrange’s one. I refer to [38] for Euler’s derivation.
In what follows, we present the modern solution, which is essentially due to Lagrange when he
was only 19 years old.
Before studying Lagrange’s solution, let’s recap how we find the minimum of f .x/. We
denote the minimum point by x0 and vary it a bit and saying that the corresponding change
Ž
Vincenzo Riccati (1707 – 1775) was a Venetian mathematician and physicist.

Phu Nguyen, Monash University © Draft version


Chapter 10. Calculus of variations 803

in f must be zero. Mathematically, we consider a small change in x, denoted by dx, that is


x0 C dx. We compute the corresponding change in f : df D f .x0 C dx/ f .x0 /. Next, we
use Taylor’s series for f .x0 C dx/ D f .x0 / C f 0 .x0 /dx (higher order terms are negligible).
So, df D f 0 .x0 /dx. And the condition to have a minimum at x0 becomes f 0 .x0 / D 0.
Can we do the same thing for Eq. (10.3.1)? Lagrange did exactly just that and he invented
variational calculus! Let us assume that y.x/ is the solution i.e., the function that minimizes the
functional. To denote a variation of y.x/, designated by y.x/, we consider

y.x/ WD y.x/ C .x/ (10.3.2)

where .x/ is a fixed function satisfying the conditions .a/ D .b/ D 0 so that y.a/
N D A and
N
y.b/ D B; and  is a small number. See Fig. 10.3 for an illustration of y.x/, .x/ and y.x/. For
each value of , we have a specific variation, and thus a concrete value of the functional, and
among all these values the one obtained from  D 0 is the minimum, because we have assumed
that y.x/ is the solution.
y
N
y.x/ D y.x/ C 1 .x/
A

B
y.x/ W solution

.x/

a x
b

Figure 10.3: Solution function y.x/, .x/ with .a/ D .b/ D 0 and one variation y.x/ C 1 .x/.

With the variation of the solution we proceed to the calculation of the corresponding change
in the functional, denoted by dI :
Z b Z b
0 0
dI D F .y.x/ C .x/; y .x/ C  .x/I x/ dx F .y.x/; y 0 .x/I x/ dx
Z b
a a
 
D F .y C ; y 0 C 0 I x/ F .y; y 0 I x/ dx (10.3.3)
Z b 
a
@F @F 0
D  C 0   dx
a @y @y
where in the last equality we have used the Taylor’s series expansion for F .y C ; y 0 C 0 I x/
around  D 0.
Now, as u.x/ is the minimal solution, one has to have dI= D 0 (this is similar to df =dx D 0
in ordinary differential calculus). Thus, we obtain
Z b 
@F @F 0
 C 0  dx D 0 (10.3.4)
a @y @y

Phu Nguyen, Monash University © Draft version


Chapter 10. Calculus of variations 804

In the next step we want to get rid of 0 (so that we can use a useful lemma called the fundamental
lemma of variational calculus which exploits the arbitrariness of  to obtain a nice result in terms
of y, no more  and ), and of course the trick is integration by parts:

Z b     b
@F d @F @F
 dx C  D0 (10.3.5)
a @y dx @y 0 @y 0 a

As .a/ D .b/ D 0, the boundary term (the last term in the above equation) vanishes and we
get the following
Z b  
@F d @F
 dx D 0 (10.3.6)
a @y dx @y 0

Rb
Using the fundamental lemma of variational calculus (which states that if a f .x/g.x/dx D 0
for all g.x/ then h.x/ D 0 for x 2 Œa; b), one obtains the so-called Euler-Lagrange equation

 
@F d @F
Euler-Lagrange equation W D0 (10.3.7)
@y dx @y 0

Euler derived this equation before Lagrange but his derivation was not as elegant as the one
presented herein which is due to Lagrange. To use Eq. (10.3.7), it should be noted that we treat
y; y 0 ; x as independent variables when calculating @F
@y
and @y
@F
0.

We note that the Euler-Lagrangeequation


 in Eq. (10.3.7) is a second order partial differential
equation; this is due to the term dx @y 0 as we have the derivative of y 0 , and thus y 00 .
d @F

Now to solve Eq. (10.3.1), Euler solved Eq. (10.3.7). This is known referred to as the
indirect way to solving variational problems. There is a direct method to attack the varia-
tional problem Eq. (10.3.1) directly; check Section 10.9. However, for now we are going to
use the indirect method to solve some elementary variational problems discussed in Section 10.2.

Stationary curves. Starting with the functional 10.3.1, we have assumed that y.x/ is a function
that minimizes this functional, and found that it satisfies the Euler-Lagrange equation 10.3.7.
Is the reverse true? That is if y.x/ satisfies the Euler-Lagrange equation will it minimize the
functional? The answer is, by learning from ordinary calculus, not necessarilyŽ . Therefore,
functions that satisfy the Euler-Lagrange equation are called stationary functions or stationary
curves.

Ž
For function y D f .x/, stationary points are those x  such that f 0 .x  / D 0. These points can be a maximum
or a minimum or an inflection point.

Phu Nguyen, Monash University © Draft version


Chapter 10. Calculus of variations 805

History note 10.1: Joseph-Louis Lagrange (25 January 1736 – 10 April 1813)
Joseph-Louis Lagrange was an Italian mathematician and astronomer,
later naturalized French. He made significant contributions to the fields
of analysis, number theory, and both classical and celestial mechanics.
As his father was a doctor in Law at the University of Torino, a career as
a lawyer was planned out for him by his father, and certainly Lagrange
seems to have accepted this willingly. He studied at the University of
Turin and his favorite subject was classical Latin. At first he had no
great enthusiasm for mathematics, finding Greek geometry rather dull.
Lagrange’s interest in mathematics began when he read a copy of Halley’s 1693 work
on the use of algebra in optics. In contrast to geometry, something about Halley’s al-
gebra captivated him. He devoted himself to mathematics, but largely was self taught.
Byy age 19 he was appointed to a professorship at the Royal Artillery School in Turin.
The following year, Lagrange sent Euler a better solution he had discovered for deriving
the Euler-Lagrange equation in the calculus of variations. Lagrange gave us the familiar
notation f 0 .x/ to represent a function’s derivative, f 00 .x/ a second derivative, etc., and
indeed it was he who gave us the word derivative. Mécanique analytique (1788–89) is a
two volume French treatise on analytical mechanics, written by Lagrange, and published
101 years following Newton’s Philosophiæ Naturalis Principia Mathematica. It consol-
idated into one unified and harmonious system, the scattered developments of various
contributors in the historical transition from geometrical methods, as presented in New-
ton’s Principia, to the methods of mathematical analysis. The treatise expounds a great
labor-saving and thought-saving general analytical method by which every mechanical
question may be stated in a single differential equation.

10.4 Solution of some elementary variational problems


This section presents solutions to variational problems introduced in Section 10.2. The idea is to
use Eq. (10.3.7) to find the partial differential equation associated with a functional, and solve it.
For some problems, non variational calculus solution is also provided.

10.4.1 Eucledian geodesic problem


Finding the shortest path joining two points .x1 ; y1 / and .x2 ; y2 /. For this problem F is given
by (note that it does not depend on y)
p
F .y; y 0 ; x/ D 1 C .y 0 /2
And thus
p .y 0 /2 y 00
  " # y 00 1 C .y 0 /2 p
d @F d y0 1 C .y 0 /2
D p D
dx @y 0 dx 1 C .y 0 /2 1 C .y 0 /2

Phu Nguyen, Monash University © Draft version


Chapter 10. Calculus of variations 806

Upon substitution into the Euler-Lagrange equation in Eq. (10.3.7) one gets
 
d @F
0
D 0 H) y 00 D 0 H) y D ax C b
dx @y

The solution is a straight line as expected. The two coefficients a and b are determined using the
boundary conditions:
y1 D ax1 C b; y2 D ax2 C b (10.4.1)

10.4.2 The Brachistochrone problem


Recall that for the Brachistochrone problem, we’re looking for a function y.x/ such that the
following is minimum:
p Z ap
1 C Œy 0 .x/2
2gtŒy.x/ D p dx (10.4.2)
0 y.x/
p p
So it is a classical variational calculus problem with F D 1CŒy 0 .x/2= y.x/. We can use the
Euler-Lagrange equation with this F . But there is a better way exploiting the fact that F does
not explicitly depend on x. Multiplying the Euler-Lagrange with y 0 , we obtain
  
@F d @F
0
y0 D 0
@y dx @y
Then, a few massages to it give us:
 
@F 0 0d @F
y y D0
@y dx @y 0
   
dF @F 00 0 d @F dF @F 0 @F 00
y y D 0; D y C y
dx @y 0 dx @y 0 dx @y @y
 
d 0 @F
F y D0
dx @y 0

which leads to
@F
F y0 D C; C is a constant (10.4.3)
@y 0
This result is known as Beltrami’s identity which is the simpler version of the Euler-Lagrange
equation when F does not explicitly depend on x. The identity is named after Eugenio Beltrami
(1835 – 1900) who was an Italian mathematician notable for his work concerning differential
geometry and mathematical physics.
p Now wep come back to the Brachistochrone problem. Using Eq. (10.4.3) for F D
1CŒy 0 .x/2= y.x/, we obtain

p
1 C y 02 .y 0 /2
p p DC
y.x/ y.1 C y 02 /

Phu Nguyen, Monash University © Draft version


Chapter 10. Calculus of variations 807

And from that, we get a simpler equation (squaring both sides and some terms cancel out),
1
y.1 C y 02 / D A (10.4.4)
C2
Rb
With y 0 D dy=dx , one can solve for dx in terms of dy and y, and from that we obtain x D 0 dx:
Z b r
y
xD dy
0 A y
Now, we’re back to the old business of integral calculus: using this substitution

y D A sin2 =2 D A=2.1 cos /

we can evaluate the above integral to get


A
xD . sin /
2
One determines A by the boundary condition that the curve passes through B.a; b/. The Brachis-
tochrone curve is the one defined parametrically as
A A
xD . sin /; yD .1 cos / (10.4.5)
2 2
And this curve is the cycloid in geometry. Refer to Section 4.12.3 for detail on this interesting
curve.

10.4.3 The brachistochrone: history and Bernoulli’s genius solution


Johann Bernoulli posed the problem of the brachistochrone to the readers of Acta Eruditorum in
June, 1696. He said:

I, Johann Bernoulli, address the most brilliant mathematicians in the world. Noth-
ing is more attractive to intelligent people than an honest, challenging problem,
whose possible solution will bestow fame and remain as a lasting monument. Fol-
lowing the example set by Pascal, Fermat, etc., I hope to gain the gratitude of the
whole scientific community by placing before the finest mathematicians of our time a
problem which will test their methods and the strength of their intellect. If someone
communicates to me the solution of the proposed problem, I shall publicly declare
him worthy of praise.

Bernoulli allowed six months for the solutions but none were received during this period.
At the request of Leibniz, the time was publicly extended for a year and a half. At 4 p.m. on
29 January 1697 when he arrived home from the Royal Mint, Newton found the challenge in
a letter from Johann Bernoulli. Newton stayed up all night to solve it and mailed the solution
anonymously by the next post. Bernoulli, writing to Henri Basnage in March 1697, indicated

Phu Nguyen, Monash University © Draft version


Chapter 10. Calculus of variations 808

that even though its author, "by an excess of modesty", had not revealed his name, yet even
from the scant details supplied it could be recognized as Newton’s work, "as the lion by its
claw" (in Latin, tanquam ex ungue leonem). This story gives some idea of Newton’s power,
since Johann Bernoulli needed two weeks to solve it. Newton also wrote, "I do not love to
be dunned [pestered] and teased by foreigners about mathematical things...", and Newton had
already solved Newton’s minimal resistance problem, which is considered the first of the kind
in calculus of variations.
In the end, five mathematicians had provided solutions: Newton, Jakob Bernoulli, Gottfried
Leibniz, Ehrenfried Walther von Tschirnhaus and Guillaume de l’Hôpital.

A ˛1 medium 1 v1
normal

˛2 ˛2 medium 2 v2
v1
˛1
boundary ˛3
˛3 medium 3 v3
O
˛2 v2 ˛4
˛4 medium 4 v4

(a) sin ˛1=v1 D sin ˛2=v2 (b) a medium consisting of four layers.

Figure 10.4: Light travels in a medium of variable density.

Now I present the genius solution of Johann Bernoulli. He used the Snell law (see Sec-
tion 4.5.1), re-given in Fig. 10.4a. He applied that to a medium consisting of multiple layers,
Fig. 10.4b. For this medium, he got

sin ˛1 sin ˛2 sin ˛3 sin ˛4


D D D
v1 v2 v3 v4

Now, you can guess what he would do next. He imagined that if the medium has an infinite
number of layers, the light would then travel in a curved path, and at any point on this path, we
have:
sin ˛
D constant (10.4.6)
v
Finally, he applied this result to the Brachistochrone problem. ˛ x
Referring to Fig. 10.5, and consider a point P .x; y/, draw a tan- P .x; y/
gent line to the curve y.x/ at P . He computed sin ˛ in terms of ˇ tan ˇ D y 0
y 0 as follows
tangent line
1 1
sin ˛ D cos ˇ D p Dp
1 C tan2 ˇ 1 C .y 0 /2 y y.x/

Phu Nguyen, Monash University © Figure Draft


10.5 version
Chapter 10. Calculus of variations 809

p
And the velocity v D 2gy, and thus Eq. (10.4.6) gave him:

1 p
p D c 2gy
1 C .y 0 /2

which is equivalent to Eq. (10.4.4)–the solution obtained using variational calculus.

10.5 The variational ı operator


We are now ready to define the variation of y.x/ (playing the same role as dx in standard
minimum problems):
ıy.x/ WD y.x/ y.x/ D .x/ (10.5.1)
We might ask why ıy not dy? Note that dy is a change in the function y.x/ due to a change
in x. For variational problems, we’re not interested in change in x i.e., ıx D 0. Instead we
need change in the function, and it is denoted by ıy. Let’s find what properties this ı operator
possesses.
First, a variation ıy is a function of x, so we can take its derivative:
 
d d 0 dy
ıy D Œ.x/ D  .x/ D ı
dx dx dx

This shows that the derivative of the variation is equal to the variation of the derivative.
We also define ıI the variation of the functional and it can be shown that the variation of a
functional is the integral of the variation of its integrand (as the integration limits are fixed):
Z b Z b Z b
0 0 0
ı F .y; y I x/dx WD F .y C ıy; y C ıy I x/dx F .y; y 0 I x/dx
Z Z b
a a a
b
D ŒF .y C ıy; y 0 C ıy 0 I x/ 0
F .y; y I x/dx D ıF dx
a a

From Eq. (10.3.3) we can compute ıF as easily as (recall that F D F .y; y 0 I x/)
 
@F @F 0 @F @F
ıF D  C 0  D ıy C 0 ıy 0 (10.5.2)
@y @y @y @y

Observing the similarity to the total differential df of a function of two variables f .x; y/:
df D fx dx C fy dy when its variables change by dx and dy. We put these two side-by-side:

@f @f
df D dx C dy
@x @y
(10.5.3)
@F @F
ıF D ıy C 0 ıy 0
@y @y

Phu Nguyen, Monash University © Draft version


Chapter 10. Calculus of variations 810

To summarize, here are some important properties of the ı operator:


 
d dy
variation/differentiation are permutable: ıy D ı
dx dx
Z b Z b
0
variation/integration are permutable: ı F .y; y I x/dx D ıF dx
a a

Finally, we can see that ıy is similar to the differential operator df in differential calcu-
lus; Eq. (10.5.3) is one example. That is why Lagrange selected the symbol ı. We know that
d.f C g/ D df C dg and d.x 2 / D 2xdx. We have counterparts for ı: for u; v are some func-
tions
ı.˛u C ˇv/ D ˛ıu C ˇıv
(10.5.4)
ı.u2 / D 2uıu

Now we can use ı in the same manner we do with d . The proof is easy. For example, consider
F .u/ D u2 , when we vary the function u by ıu, we get a new functional FN D .u C ıu/2 . Thus,
the variation in the functional is ıF D .u C ıu/2 u2 D 2uıu.

One dimensional variational problem with second derivatives. Find the function y.x/ that
makes the following functional
Z b
J Œy WD F .y; y 0 ; y 00 ; x/ dx (10.5.5)
a

stationary and subjects to boundary conditions that y.a/; y.b/; y 0 .a/; y 0 .b/ fixed.
We compute the first variation ıJ due to the variation in y.x/, ıy (recall that ıy 0 D
d=dx.ıy/ and ıy 00 D d 2 =dx 2 .ıy/):
Z b Z b 
@F @F 0 @F 00
ıJ D ıF dx D ıy C 0 ıy C 00 ıy dx
a a @y @y @y
Now comes the usual integration by parts. For the term with ıy 0 :
    Z b Z b  
d @F d @F @F 0 @F 0 d @F
0
ıy D 0
ıy C 0 ıy ) 0
ıy dx D ıydx
dx @y dx @y @y a @y a dx @y 0
Now for the term with ıy 00 :
    Z b Z b  
d @F 0 d @F 0 @F 00 @F 00 d @F
00
ıy D 00
ıy C 00 ıy ) 00
ıy dx D ıy 0 dx
dx @y dx @y @y a @y a dx @y 00
And still having ıy 0 , we have to do integration by parts again:
Z b    Z b 2  
d @F 0 d @F
ıy dx D ıydx
a dx @y 00 a dx
2 @y 00

Phu Nguyen, Monash University © Draft version


Chapter 10. Calculus of variations 811

Finally, the first variation ıJ is given by


Z b    
@F d @F d2 @F
ıJ D C ıydx
a @y dx @y 0 dx 2 @y 00
which yields the following Euler-Lagrange equation:
   
@F d @F d2 @F
C D0
@y dx @y 0 dx 2 @y 00

10.6 Multi-dimensional variational problems


We now extend what we have done to higher dimensions. We consider a two dimensional space
with two Cartesian coordinates x; y serving as the independent variables. We have two dependent
variables u.x; y/ and v.x; y/. Now let’s consider the following functional
Z
J Œu.x; y/; v.x; y/ WD F .u; v; ux ; vx ; uy ; vy I x; y/ dxdy (10.6.1)
B

And we want to find functions u.x; y/ and v.x; y/ defined on a domain B such that J is mini-
mum. On the boundary @B the functions are prescribed i.e., u D g and v D h, where g; h are
known functions of .x; y/.
The first variation of J , ıJ , is given by:
Z  
@F @F @F @F @F @F
ıJ D ıu C ıux C ıuy C ıv C ıvx C ıvy dxdy
B @u @ux @uy @v @vx @vy
The next step is certainly to integrate by parts the second, third, fifth and sixth terms. We demon-
strate the steps only for the second term, starting with:
   
@ @F @ @F @F
ıu D ıu C ıux
@x @ux @x @ux @ux
And thus, Z Z   Z  
@F @ @F @ @F
ıux dV D ıu dV ıudV
B @ux B @x @ux B @x @ux
Using the gradient theorem, Eq. (7.11.37), for the second term–the red term in the above equation,
we obtain Z Z Z  
@F @F @ @F
ıux dV D nx ıuds ıudV
B @ux @B @ux B @x @ux
Repeating the same calculations for the third, fifth and sixth terms, the variation of J is eventually
written as
Z           
@F @ @F @ @F @F @ @F @ @F
ıJ D ıu C ıv dxdy
@u @x @ux @y @uy @v @x @vx @y @vy
Z    Z   
B
@F @F @F @F
C nx C ny ıu ds C nx C ny ıv ds
@B @ux @uy @B @vx @vy

Phu Nguyen, Monash University © Draft version


Chapter 10. Calculus of variations 812

As u; v are specified on the boundary @B, ıu D ıv D 0 there. Using the fundamental lemma of
variational calculus, we obtain the following Euler-Lagrange equations:
   
@F @ @F @ @F
D0
@u @x @ux @y @uy
    (10.6.2)
@F @ @F @ @F
D0
@v @x @vx @y @vy

Example 10.1
For example, if J is:
Z Z Z
J Œu.x; y/ WD .u2x C uy2 / dV D 2
jruj dV D ru  ru dV (10.6.3)
B B B

then Eq. (10.6.2) yields (we need to use the first equation only as there is no v function in our
functional)
uxx C uyy D 0 or u D 0 in B (10.6.4)

Example 10.2
In the field of fracture mechanics, we have the following functional concerning a scalar field
.x; y/, where Gc ; b; c0 are real numbers and ˛ is a function depending on :
Z  
Gc 1
J Œ.x; y/ D ˛./ C br  r dV (10.6.5)
B c0 b
then Eq. (10.6.2) yields (we need to use the first equation only as there is no v function in our
functional, and .x; y/ plays the role of u.x; y/)

Gc 1 0 2Gc b
˛ ./  D 0 in B (10.6.6)
c0 b c0

10.7 Boundary conditions


We have seen that the first variation of any functional has two terms: one term defined inside
the problem domain and one term defined on the problem boundary. For example, for I Œy D
Rb 0
a F .y; y I x/dx, the first variation reads
Z b   
@F d @F @F @F
ıI D ıy dx C .b/ıy.b/ .a/ıy.a/ (10.7.1)
a @y dx @y 0 @y 0 @y 0
where the red terms are the boundary terms. The Euler-Lagrange equation associated with
this functional is a second order partial differential equation. Thus it requires two boundary

Phu Nguyen, Monash University © Draft version


Chapter 10. Calculus of variations 813

conditions (BCs) to have a unique solution. In many cases, it is easy to determine these boundary
conditions. but there are also cases where it is very difficult to know the boundary conditions.
This is particularly true for fourth order PDEs.
It is a particularly beautiful feature of variational problems that they always furnish automat-
ically the right number of boundary conditions and the form. All comes from the first variation
of the functional. Getting back to the already mentioned ıI , we have the following cases:

 Case 1: we impose the boundary conditions of this form: y.a/ D A and y.b/ D B. In
other words, we fix the two ends of the curve y.x/, and the corresponding variational
problems are called fixed ends variational problems. As fixed quantities do not vary, we
have ıy.a/ D ıy.b/ D 0, and the boundary terms–red terms in Eq. (10.7.1)–vanish. This
type of boundary condition is called imposed boundary conditions, or essential boundary
conditions.

 Case 2: we fix one end (for example, y.a/ D A, and thus ıy.a/ D 0), and allows the
other end to be free. As y.b/ can be anything, we have ıy.b/ ¤ 0, so to have ıI D
0, we need @y@F
0 .b/ D 0. And this is the second BC that the Euler-Lagrange equation

has to satisfy. Since this BC is provided by the variational problem, it is called natural
boundary condition. In case of the brachistochrone, this BC is translated to y 0 .b/ D 0
which indicates that the tangent to the curve at x D b is horizontal.

Example 10.3
Consider an elastic bar of length L, modulus of elasticity E and cross sectional area A. We
denote by x the independent variable which runs from 0 to L, characterizing the position
of a point of the bar. Assume that the bar is fixed at the left end (x D 0) and subjected to
a distributed axial load f .x/ (per unit length) and a point load P at its right end (x D L).
The axial displacement of the bar u.x/ is the function that minimizes the following potential
energy
Z L"   #
EA du 2
˘ Œu.x/ D f u dx P u.L/ (10.7.2)
0 2 dx
where the first term is the strain energy stored in the bar and the second and third terms denote
the work done on the bar by the force f and P , respectively.
To find the Euler-Lagrange equation for this problem, we compute the first variation of
the energy functional and set it to zero. The variation is given by
Z L 
du d.ıu/
ı˘ D EA f ıu dx P ıu.L/ (10.7.3)
0 dx dx

We need to remove ıu0 D d=dx.ıu/; for this we use integration by parts. Noting that
 
d du d 2u du d.ıu/
ıu D 2
ıu C
dx dx dx dx dx

Phu Nguyen, Monash University © Draft version


Chapter 10. Calculus of variations 814

Thus, we have  L
Z L Z L
du d.ıu/ du d 2u
dx D ıu ıudx
0 dx dx dx 0 0 dx 2
Eq. (10.7.3) becomes
Z L    L
d 2u du
ı˘ D EA 2 C f ıudx C EA ıu P ıu.L/
0 dx dx
Z L     (10.7.4)
0
d 2u du
D EA 2 C f ıudx C EA P ıu.L/
0 dx dx xDL

which gives the Euler-Lagrange equation

d 2u
EA C f D 0; 0<x<L
dx 2
which requires 2 BCs: one is u.0/ D 0–the BC that we impose upon the bar, and the other is
 
du
EA P D0
dx xDL

provided by the variational formulation.

Example 10.4
Consider an elastic beam of length L, modulus of elasticity E, and second moment of area
I . The vertical displacement of the beam y.x/ is the function that minimizes the following
potential energy
Z L 
k 00 2
˘ Œu.x/ D .y / y dx; k WD EI (10.7.5)
0 2
where the first term is the strain energy stored in the bar and the second term denote the work
done on the beam by the force per unit length .x/.
We use the results developed for the functional given in Eq. (10.5.5),
Z        
b
@F d @F d2 @F @F d @F @F 0 L
ı˘ D C ıydx C ıy C 00 ıy
a @y dx @y 0 dx 2 @y 00 @y 0 dx @y 00 @y 0
(10.7.6)

With F D k=2.y 00 /2 y, we get the Euler-Lagrange equation from the first term in ı˘ D 0,

ky 0000 D .x/; 0<x<L (10.7.7)

which is a fourth order different equation; it requires four boundary conditions. We are demon-
strating that the variational character yields all these required BCs. We note that solving this
equation yields the so-called elastic curve, which is the deflected shape of a bending beam.

Phu Nguyen, Monash University © Draft version


Chapter 10. Calculus of variations 815

With Eq. (10.7.6) and F D k=2.y 00 /2 y, the boundary term of the first variation of the
functional are given by:
 
ı˘ D k y 000 .L/ıy.L/ C y 000 .0/ıy.0/ C y 00 .L/ıy 0 .L/ y 00 .0/ıy 0 .0/ (10.7.8)

And ı˘ D 0 provides all BCs that the Euler-Lagrange equation of the beam requires.
There are the following cases:

 Clamped ends: The BCs are:

y.0/ D 0; y.L/ D 0
(10.7.9)
y 0 .0/ D 0; y 0 .L/ D 0

That is we fix the displacement and the rotation at both ends of the beam. As the
variations of fixed quantities are zero, all the terms in Eq. (10.7.8) vanish. No natural
BCs have to be added.

- Supported ends: in this case, the BCs are simply

  $$y(0) = 0, \quad y(L) = 0 \tag{10.7.10}$$

  That is, we fix only the displacement of the two ends. Eq. (10.7.8) then provides two more
  natural BCs:
  $$y''(0) = 0, \quad y''(L) = 0$$
  which indicate that the bending moments are zero at both ends.

- One end clamped, one end free:

  $$y(0) = 0, \quad y'(0) = 0 \tag{10.7.11}$$

  That is, we fix the displacement and rotation of the left end, but leave the right end free.
  Eq. (10.7.8) yields the remaining two BCs:

  $$y''(L) = 0, \quad y'''(L) = 0 \tag{10.7.12}$$

  which means that the bending moment at the right end is zero and so is the shear force there. A small symbolic check of this last case is given below.
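To make the role of the natural BCs concrete, here is a minimal sketch (Julia's SymPy, as in Listing 10.1 later in this chapter) that solves $k\,y'''' = q_0$ for the clamped-free case; the constant load $q_0$ and the variable names are my own assumptions for illustration.

using SymPy

@vars x L k q0 C1 C2 C3 C4
y = q0*x^4/(24*k) + C1*x^3 + C2*x^2 + C3*x + C4   # general solution of k*y'''' = q0
bcs = [subs(y, x => 0),                  # y(0)   = 0  (essential)
       subs(diff(y, x),    x => 0),      # y'(0)  = 0  (essential)
       subs(diff(y, x, 2), x => L),      # y''(L) = 0  (natural: zero moment)
       subs(diff(y, x, 3), x => L)]      # y'''(L)= 0  (natural: zero shear)
sol = solve(bcs, [C1, C2, C3, C4])
y_cant = expand(subs(y, sol...))
# y_cant = q0*(x^4 - 4*L*x^3 + 6*L^2*x^2)/(24*k): the cantilever elastic curve

The two essential BCs come from how we support the beam; the two natural BCs are exactly the ones handed to us by the vanishing boundary term (10.7.8).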

10.8 Lagrangian mechanics


With the new mathematics (i.e., variational calculus) developed by himself, Lagrange created
a new formulation of Newtonian mechanics. That formulation is now called Lagrangian
mechanics. To study motions, we can use either Newtonian mechanics or Lagrangian mechanics;
they give the same results. To motivate the need for Lagrangian mechanics, we can cite
one weakness of Newtonian mechanics: because Newton's 2nd law is vectorial in nature, the
equations change when the coordinate system changes (Sections 7.10.6 and 7.10.7). Lagrange
gave us another way to view motion.

10.8.1 The Lagrangian, the action and the EL equations


Let's consider a particle moving from point 1 to point 2; its trajectory is given by
$(x(t), y(t), z(t))$. Now I present a result from variational calculus: define the following functional

$$J[x(t), y(t), z(t)] := \int_1^2 F(x, y, z, \dot x, \dot y, \dot z; t)\,dt \tag{10.8.1}$$

Then, the associated Euler-Lagrange equations are (there are three equations, one for each component of the particle position vector)

$$\frac{\partial F}{\partial x} = \frac{d}{dt}\frac{\partial F}{\partial \dot x}, \qquad \frac{\partial F}{\partial y} = \frac{d}{dt}\frac{\partial F}{\partial \dot y}, \qquad \frac{\partial F}{\partial z} = \frac{d}{dt}\frac{\partial F}{\partial \dot z} \tag{10.8.2}$$

We need this result in the following discussion of Lagrangian mechanics.
The central object in Lagrangian mechanics is the Lagrangian L, defined as the difference
between the kinetic energy T and the potential energy U. And from that Lagrange computed a
quantity called the action, labelled S, defined as

$$S = \int_{t_1}^{t_2} L\,dt, \qquad L = T - U \tag{10.8.3}$$

which is an integral of the Lagrangian from $t_1$ to $t_2$.


Now comes the magic: the particle's actual trajectory is the $(x(t), y(t), z(t))$ that renders the
action stationary. To show that, we just need to work out the mathematics:

$$S[x(t)] = \int_{t_1}^{t_2} L(x, \dot x)\,dt, \qquad T = \frac{1}{2}m(\dot x^2 + \dot y^2 + \dot z^2), \qquad U = U(x, y, z) \tag{10.8.4}$$

This action is of the form of Eq. (10.8.1), thus its Euler-Lagrange equations can be obtained
from Eq. (10.8.2) (replace F by L):

$$\frac{\partial L}{\partial x} = \frac{d}{dt}\frac{\partial L}{\partial \dot x}, \qquad \frac{\partial L}{\partial y} = \frac{d}{dt}\frac{\partial L}{\partial \dot y}, \qquad \frac{\partial L}{\partial z} = \frac{d}{dt}\frac{\partial L}{\partial \dot z} \tag{10.8.5}$$
Lagrange thus obtained three equations, and recall that Newton also had three equations. If
Lagrange could show that his three equations are exactly Newton's equations, then he has created
a new formulation of (classical) mechanics. That part is easy; we first need to compute $\partial L/\partial x$
and $\partial L/\partial \dot x$:

$$\frac{\partial L}{\partial x} = -\frac{\partial U}{\partial x} = F_x; \qquad \frac{\partial L}{\partial \dot x} = \frac{\partial T}{\partial \dot x} = m\dot x \tag{10.8.6}$$

Substituting these into the first of Eq. (10.8.5), we get $F_x = m\ddot x$, which is nothing but Newton's
2nd law.
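For a concrete check, here is a short sketch (Julia's SymPy, as elsewhere in this chapter) that carries out exactly this computation for a one-dimensional particle; the uniform-gravity potential and the helper names are my own choices for illustration.

using SymPy

@vars t m g
x = SymFunction("x")            # unknown trajectory x(t)
U = m*g*x(t)                    # example potential: uniform gravity (an assumption)
Lag = m*diff(x(t), t)^2/2 - U   # Lagrangian L = T - U
# Euler-Lagrange equation: dL/dx - d/dt(dL/dxdot) = 0
EL = diff(Lag, x(t)) - diff(diff(Lag, diff(x(t), t)), t)
# EL simplifies to -m*g - m*x''(t), i.e. m*x'' = -dU/dx: Newton's 2nd law
dsolve(Eq(EL, 0), x(t))         # x(t) = C1 + C2*t - g*t^2/2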

10.8.2 Generalized coordinates


An important characteristic of any mechanical system is the number of degrees of freedom:
the number of coordinates needed to specify the location of the objects. If there are N free
objects, there are 3N degrees of freedom, as each object requires three coordinates. But if there
are constraints on the objects, then each constraint removes one degree of freedom. The total
number of degrees of freedom for a system of N objects and n constraints is $3N - n$. See the
accompanying figure (two pendulum masses $m_1(x_1, y_1)$ and $m_2(x_2, y_2)$ whose coordinates are
constrained by the rod lengths $l_1$ and $l_2$) for an illustration of constraints.

To describe a system, we can use any set of parameters that unambiguously represents it.
These parameters do not need to have dimensions of length (e.g. the usual Cartesian coordinates
x, y, and z). They are referred to as generalized coordinates. In the figure, the two angles $\theta_1$
and $\theta_2$ are sufficient to describe the system.
Now, we are going to prove one important result in Lagrangian mechanics: if the EL equations
hold for one set of generalized coordinates, they hold for any other set of generalized coordinates.
Assume that we have $x_1, x_2, \ldots, x_N$ as the first set of generalized coordinates, and that the EL
equations hold, that is

$$\frac{\partial L}{\partial x_i} = \frac{d}{dt}\frac{\partial L}{\partial \dot x_i}, \qquad i = 1, 2, \ldots, N \tag{10.8.7}$$

Now, we have another set of generalized coordinates $q_1, q_2, \ldots, q_N$. We assume that it is always
possible to go back and forth between the two coordinate systems. That is,

$$x_i = x_i(q_1, q_2, \ldots, q_N; t), \qquad q_i = q_i(x_1, x_2, \ldots, x_N; t) \tag{10.8.8}$$

What we need to prove is that the EL equations hold for the $q_i$:

$$\frac{\partial L}{\partial q_i} = \frac{d}{dt}\frac{\partial L}{\partial \dot q_i}, \qquad i = 1, 2, \ldots, N \tag{10.8.9}$$

Proof. We start from the RHS of Eq. (10.8.9) with

$$\frac{\partial L}{\partial \dot q_m} = \sum_{i=1}^{N}\frac{\partial L}{\partial \dot x_i}\frac{\partial \dot x_i}{\partial \dot q_m}, \qquad m = 1, 2, \ldots, N \tag{10.8.10}$$

From the first of Eq. (10.8.8) we have

$$\dot x_i = \sum_{k=1}^{N}\frac{\partial x_i}{\partial q_k}\dot q_k + \frac{\partial x_i}{\partial t} \;\Longrightarrow\; \frac{\partial \dot x_i}{\partial \dot q_m} = \frac{\partial x_i}{\partial q_m} \tag{10.8.11}$$


Thus, we can write

$$\frac{\partial L}{\partial \dot q_m} = \sum_{i=1}^{N}\frac{\partial L}{\partial \dot x_i}\frac{\partial x_i}{\partial q_m} \tag{10.8.12}$$

From that, its time derivative is computed:

$$\begin{aligned}
\frac{d}{dt}\frac{\partial L}{\partial \dot q_m} &= \frac{d}{dt}\sum_{i=1}^{N}\frac{\partial L}{\partial \dot x_i}\frac{\partial x_i}{\partial q_m}\\
&= \sum_{i=1}^{N}\left(\frac{d}{dt}\frac{\partial L}{\partial \dot x_i}\right)\frac{\partial x_i}{\partial q_m} + \sum_{i=1}^{N}\frac{\partial L}{\partial \dot x_i}\frac{d}{dt}\frac{\partial x_i}{\partial q_m}\\
&= \sum_{i=1}^{N}\frac{\partial L}{\partial x_i}\frac{\partial x_i}{\partial q_m} + \sum_{i=1}^{N}\frac{\partial L}{\partial \dot x_i}\frac{\partial \dot x_i}{\partial q_m}\\
&= \frac{\partial L}{\partial q_m}
\end{aligned} \tag{10.8.13}$$

where in the third equality Eq. (10.8.7) was used for the first sum and, for the second sum, the order
of $d/dt$ and $\partial/\partial q_m$ was switched (see the footnote below). □

10.8.3 Examples
A bead is free to slide along a frictionless hoop of radius R. The hoop rotates with constant
angular speed $\omega$ around a vertical diameter (Fig. 10.6a). Find the equation of motion for the
angle $\theta$ shown.
From Fig. 10.6 we can determine the speed in the hoop direction and the direction perpendicular
to the hoop. From that, the kinetic and potential energies are written as

$$T = \frac{1}{2}m\left(R^2\dot\theta^2 + R^2\sin^2\theta\,\omega^2\right), \qquad U = mgR(1 - \cos\theta) \tag{10.8.14}$$
Now, we compute the terms in the EL equation:

$$\frac{\partial L}{\partial \theta} = mR^2\omega^2\sin\theta\cos\theta - mgR\sin\theta, \qquad \frac{\partial L}{\partial \dot\theta} = mR^2\dot\theta \;\Longrightarrow\; \frac{d}{dt}\frac{\partial L}{\partial \dot\theta} = mR^2\ddot\theta \tag{10.8.15}$$
Footnote: If it was not clear, here are the details:

$$\frac{d}{dt}\left(\frac{\partial x_i}{\partial q_m}\right) = \sum_{k=1}^{N}\frac{\partial}{\partial q_k}\left(\frac{\partial x_i}{\partial q_m}\right)\dot q_k + \frac{\partial}{\partial t}\left(\frac{\partial x_i}{\partial q_m}\right) = \sum_{k=1}^{N}\frac{\partial}{\partial q_m}\left(\frac{\partial x_i}{\partial q_k}\right)\dot q_k + \frac{\partial}{\partial q_m}\left(\frac{\partial x_i}{\partial t}\right)$$

Thus,

$$\frac{d}{dt}\left(\frac{\partial x_i}{\partial q_m}\right) = \frac{\partial}{\partial q_m}\left[\sum_{k=1}^{N}\frac{\partial x_i}{\partial q_k}\dot q_k + \frac{\partial x_i}{\partial t}\right] = \frac{\partial \dot x_i}{\partial q_m}$$


Figure 10.6: A bead is free to slide along a frictionless hoop of radius $R$. The hoop rotates with constant angular speed $\omega$. The bead position is specified by $\theta \in [0, \pi]$. The bead has two velocity components: one along the hoop, of magnitude $\Delta s/\Delta t = R\dot\theta$ (b), and one due to the rotation of the hoop (c): $v_p = \rho\omega = R\omega\sin\theta$.

And thus the EL equation yields the following equation of motion:

$$\frac{d}{dt}\frac{\partial L}{\partial \dot\theta} = \frac{\partial L}{\partial \theta} \;\Longrightarrow\; \ddot\theta = \left(\omega^2\cos\theta - \frac{g}{R}\right)\sin\theta \tag{10.8.16}$$
It is hard to solve this equation exactly. Still, we can get something out of Eq. (10.8.16). One thing
it can tell us is the equilibrium points. If we place the bead at rest (i.e., $\dot\theta = 0$) at an equilibrium
point $\theta_0$, it remains there. Since the bead remains at $\theta_0$, its velocity must be constant, and thus
its acceleration must be zero. So, to find the equilibrium points, we solve $\ddot\theta = 0$, which is

$$\left(\omega^2\cos\theta - \frac{g}{R}\right)\sin\theta = 0$$

A trigonometric equation! But this one is easy; it has four solutions in total:

$$\theta_{01} = 0, \quad \theta_{02} = \pi, \quad \theta_{03,4} = \pm\arccos\left(\frac{g}{R\omega^2}\right) \;\; (\text{if } \omega^2 \ge g/R)$$
So, there are four equilibrium points if the hoop spins fast, i.e., $\omega^2 \ge g/R$. Otherwise, there are only
two equilibrium points $\theta_{01,2}$; they are the bottom and top of the hoop, as you can predict. But
equilibrium points can be stable or unstable. An equilibrium point $\theta_0$ is said to be stable if, when the
bead is at that position and is given a small disturbance, it moves back to $\theta_0$. So, our question
now is: among these four equilibrium points, which ones are stable?
- First case: $\omega^2 < g/R$. There are only two equilibrium points: $\theta_{01} = 0$ and $\theta_{02} = \pi$.
  Consider first $\theta_{01} = 0$ (the bottom of the hoop). Close to 0, we have $\sin\theta \approx \theta$ and
  $\cos\theta \approx 1$, thus Eq. (10.8.16) becomes

  $$\ddot\theta = \left(\omega^2 - \frac{g}{R}\right)\theta = -k\theta, \qquad k := \frac{g}{R} - \omega^2$$

  Now if the hoop spins at a small speed such that $\omega^2 < g/R$, then $k > 0$. The above equation
  is identical to the one describing simple harmonic oscillations (discussed in Section 9.8).


  From the study of these oscillations, we know that the bead will oscillate around the
  bottom of the hoop. Therefore, the bottom of the hoop is a stable equilibrium point when
  $\omega^2 < g/R$. However, if $\omega^2 \ge g/R$, then that position is unstable.
  Now we consider $\theta_{02} = \pi$ (the top of the hoop). Intuitively, this must be an unstable
  equilibrium. We adopt a change of variable $\theta = \pi + \eta$ where $\eta$ is tiny. Eq. (10.8.16) then
  becomes
  $$\ddot\eta = \left(\omega^2 + \frac{g}{R}\right)\eta$$
  So, this point is an unstable equilibrium point.

- Second case: $\omega^2 \ge g/R$. In this case, both $\theta_{01} = 0$ and $\theta_{02} = \pi$ are unstable (see the above
  analysis). The only two stable equilibrium points are $\theta_{03,4} = \pm\arccos\left(g/(R\omega^2)\right)$. Why are they
  stable? We just need to consider $\theta_{03}$, due to symmetry. Note that $\theta_{03} \in [0, \pi/2]$; in
  this interval the sine function is positive and the cosine is decreasing. Starting at $\theta_{03}$
  and moving the bead a little bit up, we have $\ddot\theta < 0$ because

  $$\ddot\theta = \underbrace{\left(\omega^2\cos\theta - \frac{g}{R}\right)}_{<0}\underbrace{\sin\theta}_{>0}$$

  Thus, the bead accelerates back towards $\theta_{03}$. Doing the same analysis by moving the bead a
  little bit down from $\theta_{03}$, we have $\ddot\theta > 0$: again the bead accelerates towards $\theta_{03}$.
So, we have an interesting phenomenon. When the hoop rotates slowly (i.e., $\omega^2 < g/R$),
there is just one stable equilibrium at $\theta = 0$. If we speed up the rotation, as $\omega$ passes the
critical value defined by $\omega^2 = g/R$, this original equilibrium becomes unstable. However, two new stable
equilibrium points appear. This phenomenon, the disappearance of one stable equilibrium and the
appearance of other stable equilibrium points, is called a bifurcation.
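The equation of motion (10.8.16) is also easy to integrate numerically, and doing so is a nice way to see the stable equilibria. Below is a minimal sketch in Julia with a hand-written classical RK4 integrator; the parameter values and the initial condition are arbitrary choices of mine for illustration.

# theta'' = (omega^2*cos(theta) - g/R)*sin(theta), integrated with classical RK4
g, R, omega = 9.81, 1.0, 5.0            # omega^2 = 25 > g/R: four equilibria
f(u) = [u[2], (omega^2*cos(u[1]) - g/R)*sin(u[1])]   # u = [theta, thetadot]

function rk4(u, dt)
    k1 = f(u); k2 = f(u + dt/2*k1); k3 = f(u + dt/2*k2); k4 = f(u + dt*k3)
    u + dt/6*(k1 + 2k2 + 2k3 + k4)
end

u  = [acos(g/(R*omega^2)) + 0.05, 0.0]  # start slightly above the equilibrium theta_03
dt = 1e-3
for step in 1:20_000
    u = rk4(u, dt)
end
# theta oscillates about theta_03 = acos(g/(R*omega^2)) ~ 1.17 rad: a stable equilibrium
println("theta after 20 s: ", u[1], "  (theta_03 = ", acos(g/(R*omega^2)), ")")

Repeating the experiment with a small omega (so that $\omega^2 < g/R$) and an initial angle near zero shows the bead oscillating about the bottom of the hoop instead, in line with the analysis above.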

10.9 Ritz’ direct method


To introduce the Ritz method, we use the following example: find y(x) that minimizes the
following functional

$$I[y(x)] = \int_0^1 \left[y^2 + (y')^2\right] dx; \qquad y(0) = y(1) = 1 \tag{10.9.1}$$

Ritz did not follow Euler: he did not derive the Euler-Lagrange equation associated with
Eq. (10.9.1). Instead he attacked the functional directly, looking only for an approximate
solution of the following form:

$$\bar y(x) = \alpha + \beta x + \gamma x^2 \tag{10.9.2}$$

We should be aware that even if we can derive the Euler-Lagrange equation, quite often we
cannot solve it. Or it does not have solutions expressible in terms of elementary functions.


Still, physicists (or engineers) need a solution, even if it is not a nice analytical expression but just a
list of numbers.
Why the form in Eq. (10.9.2)? Note that it is easy to work with polynomials (easy
to differentiate and to integrate, for example). And the first curve we normally think of is a parabola.
So, it is natural to start with this polynomial form.
Because of the boundary conditions $y(0) = y(1) = 1$, $\bar y(x)$ has to be of the following form:

$$\bar y(x) = 1 + \beta x - \beta x^2 \tag{10.9.3}$$

(Using Eq. (10.9.2) at $x = 0$ and $x = 1$ with the given boundary conditions leads to two equations
for $\alpha$, $\beta$ and $\gamma$.) We could proceed with this form of $\bar y(x)$, but we pause here a bit to study the
form of Eq. (10.9.3) carefully:

$$\bar y(x) = 1 + \beta x - \beta x^2 = 1 + \beta x(1 - x) \tag{10.9.4}$$

It can be seen that the function $x(1 - x)$ vanishes at both $x = 0$ and $x = 1$: the boundary
points! And the constant 1 is exactly the value of y(x) at the boundary. Based on this analysis,
we can, in general, seek $\bar y(x)$ in the following general form

$$\bar y(x) = \alpha_0(x) + \sum_{i=1}^{n} c_i\,\alpha_i(x) \tag{10.9.5}$$

where the $\alpha_i(x)$ must be zero at the boundary points, and $\alpha_0(x)$ is chosen to satisfy the non-zero
boundary conditions. The coefficients $c_i$ are called the Ritz parameters.
And from $\bar y(x)$ in Eq. (10.9.4), we can determine its first derivative:

$$\bar y'(x) = \beta - 2\beta x \tag{10.9.6}$$

Introducing $\bar y(x)$ and $\bar y'(x)$ into Eq. (10.9.1), we get (obtained using a CAS as I was lazy; in the
next example I will show the code):

$$I(\beta) = \frac{11}{30}\beta^2 + \frac{1}{3}\beta + 1 \tag{10.9.7}$$

which is simply an ordinary function of $\beta$, and we want to minimize I, right? That's easy now:

$$\frac{dI}{d\beta} = 0: \quad \frac{11}{15}\beta + \frac{1}{3} = 0 \;\Longrightarrow\; \beta = -\frac{5}{11} \tag{10.9.8}$$
Now that $\beta$ has been determined, we have found the approximate solution:

$$\bar y(x) = 1 - \frac{5}{11}x + \frac{5}{11}x^2$$

How accurate is this solution? We can compare it with the exact solution, which is given by

$$y^e(x) = \frac{\sinh(x) + \sinh(1 - x)}{\sinh(1)}$$


One way to check the accuracy of an approximate solution is to plot both solutions together,
as in Fig. 10.7a. The Ritz solution is quite good; however, to have a better appreciation of the
accuracy, we can plot the error function defined as the relative difference of the Ritz solution
with respect to the exact one:

$$\mathrm{error}(x) := \frac{y^e(x) - \bar y(x)}{y^e(x)}$$

Fig. 10.7b shows the plot of this error.

Figure 10.7: Ritz solution vs exact solution to the variational problem $I[y(x)] = \int_0^1 [y^2 + (y')^2]\,dx$, $y(0) = y(1) = 1$: (a) the two solutions plotted together; (b) the relative error, which is on the order of $10^{-4}$.

Let's solve another problem with the Ritz method. Consider a simply supported beam of
length L. Find the deflection of the beam under a uniformly distributed transverse load $q_0$. Recall
from Eq. (10.7.5) that the deflection y(x) minimizes the following energy functional

$$\Pi[y(x)] = \int_0^L \left[\frac{k}{2}(y'')^2 - q_0\,y\right] dx, \qquad k := EI \tag{10.9.9}$$

What are the boundary conditions? Because the beam is simply supported, its two ends cannot
move down, thus $y(0) = y(L) = 0$.
Before using the Ritz method, note that the exact solution is a fourth-order polynomial:

$$y^e(x) = \frac{q_0 L^4}{24EI}\left(\frac{x}{L} - 2\frac{x^3}{L^3} + \frac{x^4}{L^4}\right) \tag{10.9.10}$$
Thus, as a first approximation, we seek a solution of the following form (what if we did not have
the exact solution at hand? Then we would have to rely on the functional (10.9.9)):

$$\bar y(x) = c_1\,x(x - L) + c_2\,x^2(x - L) \tag{10.9.11}$$

This form is chosen because $\alpha_1(x) = x(x - L)$ and $\alpha_2(x) = x^2(x - L)$ vanish
at $x = 0$ and $x = L$. With this $\bar y(x)$, I used SymPy to do everything for me, as shown in
Listing 10.1.


Listing 10.1: Ritz's solution for the simply supported beam with Eq. (10.9.11).

using SymPy
@vars x k L q0 c1 c2
y = c1*x*(x-L) + c2*x*x*(x-L)  # approximate solution ybar
ypp = diff(y,x,2)              # its 2nd derivative
F = 0.5*k*ypp^2 - q0*y         # the integrand in the functional
J = integrate(F, (x, 0, L))    # the functional J
J1 = diff(J,c1)                # derivative of J wrt c1
J2 = diff(J,c2)                # derivative of J wrt c2
solve([J1, J2], [c1,c2])       # solve for c1 and c2

It is found that $c_1$ and $c_2$ are

$$c_1 = -\frac{q_0 L^2}{24EI}, \qquad c_2 = 0$$

Thus, the two-parameter Ritz solution is given by

$$\bar y(x) = -\frac{q_0 L^2}{24EI}x(x - L) = \frac{q_0 L^4}{24EI}\left(\frac{x}{L} - \frac{x^2}{L^2}\right)$$

We can now check the accuracy. It can be shown that the Ritz maximum deflection, at the middle of
the beam $x = L/2$, is off by about 20% from the exact deflection.
Even though programming gave us the solution quickly (Listing 10.1), it did not tell us
everything. So, it is always a good idea to develop everything manually. Upon introduction of
Eq. (10.9.11) into Eq. (10.9.9), we obtain a functional $\Pi$ which is a function of $c_1$ and $c_2$.
To minimize it, we set $d\Pi/dc_1 = 0$ and $d\Pi/dc_2 = 0$. Here is what we get from these two
equations:

$$\begin{bmatrix} A_{11} & A_{12}\\ A_{21} & A_{22}\end{bmatrix}\begin{bmatrix} c_1\\ c_2\end{bmatrix} = \begin{bmatrix} b_1\\ b_2\end{bmatrix} \tag{10.9.12}$$

with

$$A_{ij} = \int_0^L k\,\alpha_i''(x)\,\alpha_j''(x)\,dx, \qquad b_j = \int_0^L q_0\,\alpha_j(x)\,dx \tag{10.9.13}$$

Thus, Ritz converted a problem of solving a PDE (or minimizing a functional) into a linear algebra
problem of finding the solution to $Ac = b$. The matrix is of size $n \times n$, where n is the
number of terms in the Ritz approximation; furthermore the matrix is symmetric. What is nice
about Eq. (10.9.12) is that it has a pattern: the i-th row can be written in the form

$$A_{ij}c_j = b_i$$

which works for any value of n. Thus, we have a recipe to build up our system, i.e. A and b, to
solve for the $c_i$'s.
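This recipe is straightforward to code. Below is a hedged sketch in Julia/SymPy that builds A and b from Eq. (10.9.13) for any list of trial functions $\alpha_i$; the container names are mine, and adding a new trial function only means appending it (and a new unknown) to the lists.

using SymPy

@vars x k L q0 c1 c2
alphas = [x*(x - L), x^2*(x - L)]   # trial functions of Eq. (10.9.11), zero at x = 0 and x = L
cs = [c1, c2]
n  = length(alphas)
A  = [integrate(k*diff(alphas[i], x, 2)*diff(alphas[j], x, 2), (x, 0, L)) for i in 1:n, j in 1:n]
b  = [integrate(q0*alphas[j], (x, 0, L)) for j in 1:n]
eqs = [sum(A[i, j]*cs[j] for j in 1:n) - b[i] for i in 1:n]   # row i of A c = b
solve(eqs, cs)    # recovers c1 = -q0*L^2/(24*k), c2 = 0, as above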


To improve the Ritz solution, what should we do? We use a better approximation! A better
approximation can be obtained if we add more terms to $\bar y(x)$; we add a new term $c_3 x^3(x - L)$
to the two-parameter approximation:

$$\bar y(x) = c_1\,x(x - L) + c_2\,x^2(x - L) + c_3\,x^3(x - L)$$

Repeating the same procedure by modifying the code in Listing 10.1, we get (see the footnote)

$$c_1 = -\frac{q_0 L^2}{24EI}, \qquad c_2 = -\frac{q_0 L}{24EI}, \qquad c_3 = \frac{q_0}{24EI}$$

Thus, the three-parameter Ritz solution is given by

$$\bar y(x) = \frac{q_0 L^4}{24EI}\left(\frac{x}{L} - 2\frac{x^3}{L^3} + \frac{x^4}{L^4}\right)$$

which is exactly the exact solution!

10.10 What if there is no functional to start with?


At this stage we should pause and think about what we have covered. The story is like this.
Some separate minimization problems were proposed by mathematicians like the Bernoullis;
they all have the same form of seeking a function to minimize an integral that depends on the
function and its derivatives. While eminent mathematicians in the 18th century had solved these
so-called variational problems, they did not develop a systematic approach. Then came Euler and
Lagrange. Euler and Lagrange proceeded on what Gander and Wanner in their interesting article
From Euler, Ritz, and Galerkin to modern computing [25] call the Euler-Lagrange highway, see
the left branch of Fig. 10.8. They took the variation of the functional, then integrated by parts,
and used the fundamental lemma of variational calculus to obtain a partial differential equation,
which now bears their name. Thus, the solution to the original variational problem is now the solution
to the Euler-Lagrange equation. Euler then spent time developing techniques to solve his
PDE, including the first version of the finite difference method (see the footnote).
On the other hand, in physics quite often we need to solve a partial differential equation. One
such example is Laplace's equation. In mathematics and physics, Laplace's equation is a
second-order partial differential equation named after Pierre-Simon Laplace, who first studied
its properties. One example: we have a thin plate and its edge is heated up to a certain degree;
we then ask this question: what is the temperature inside the plate? That temperature is the
solution to Laplace's equation; if u(x, y) denotes the temperature in the plate:

$$\Delta u = 0 \ \text{in}\ B, \qquad \Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} \tag{10.10.1}$$
Footnote: If you want to do it manually, use Eq. (10.9.13) to compute the members $A_{ij}$ of the $3 \times 3$ matrix A and the $3 \times 1$ vector b. Solving the equation $Ac = b$ gives you exactly the same $c_i$'s.
Footnote: The finite difference method is discussed in Section 12.7.


Figure 10.8: The Euler-Lagrange highway of variational calculus: the forward direction goes from a functional $I[y(x)] := \int_a^b F(y, y'; x)\,dx \to \min$ to the Euler-Lagrange PDE, and the backward direction goes from a PDE back to a functional.


where $\Delta$ is the Laplacian operator, see Eq. (7.11.35). Eq. (10.10.1) means that u(x, y) is a
function such that $\Delta u = 0$ for all points $(x, y) \in B$ of the plate.
Now, we start with a partial differential equation, and some mathematicians asked the question:
does there exist a functional associated with this equation? The answer to this
question in the case of Laplace's equation is yes; the result is now known as Dirichlet's
principle. Dirichlet's principle states that (see the footnote), if the function u is the solution to Laplace's
equation, Eq. (10.10.1), with boundary condition $u = g$ on the boundary $\partial B$, then u can be
obtained as the minimizer of the Dirichlet energy functional

$$E[v] = \int_B \frac{1}{2}\|\nabla v\|^2\,dV \tag{10.10.2}$$
The name "Dirichlet’s principle" is due to Riemann, who applied it in the study of complex
analytic functions.
What is the significance of Dirichlet's principle? It tells us that we can travel the Euler-Lagrange
highway in the inverse direction, see the right branch of Fig. 10.8. Facing the task of solving a PDE,
we do not solve it directly; instead we multiply it with $\delta y$, integrate the result and do integration by
parts, eventually arriving at a functional. Then we find the minimizer of this functional.
And this was exactly what Walther Heinrich Wilhelm Ritz (1878 to 1909), a Swiss theoretical
physicist, did when he solved the problem of an elastic plate. Thus, in 1908 Ritz developed the
method which was later coined the Ritz method, presented in Section 10.9. This name was due to
Galerkin. The main motivation for Ritz was the announcement of the Prix Vaillant for 1907
by the Academy of Science in Paris. This announcement was sent to him by his friend Paul
Ehrenfest on a postcard. The deformation of an elastic plate under an external force f(x, y)
was a very difficult problem at that time; it was first considered by Sophie Germain in several
articles. The breakthrough was achieved by Kirchhoff in the form of the differential equation

$$\frac{\partial^4 w}{\partial x^4} + 2\frac{\partial^4 w}{\partial x^2\partial y^2} + \frac{\partial^4 w}{\partial y^4} = f(x, y) \tag{10.10.3}$$
where w(x, y) is the deflection of the plate. Of course we skip the required boundary conditions.
A compact way to write the bending plate equation is to use the Laplacian operator $\Delta$:

$$\Delta\Delta w = f(x, y) \tag{10.10.4}$$

Ritz went the Euler-Lagrange highway backwards, and came up with the following functional:

$$J[w(x, y)] = \int_B \left[\frac{1}{2}(\Delta w)^2 - f\,w\right] dV \to \min \tag{10.10.5}$$

Then, he introduced his approximation for the solution function w(x, y), assuming that the
boundary condition is zero deflection on the plate edges:

$$\bar w(x, y) = c_1\phi_1(x, y) + c_2\phi_2(x, y) + \cdots + c_n\phi_n(x, y) \tag{10.10.6}$$
Footnote: A proof will be presented shortly.


Substituting this into Eq. (10.10.5), we obtain $J(c_1, c_2, \ldots)$, and minimizing it gives us a system
of linear equations to solve for the Ritz parameters $c_i$. The effort was high, as Ritz did not have
a computer to help him, but of course he managed to get good results.
Because we need the functions $\phi_i(x, y)$ to be zero on the plate boundary, Ritz selected the
easiest plate problem: a square plate of size $2 \times 2$. Thus, $\phi_1(x, y) = (1 - x^2)^2(1 - y^2)^2$, with
the origin of the coordinate system at the plate center, and so on (see the footnote on more general geometries).
Proof of Dirichlet's principle. Assume that u is the solution to Laplace's equation, thus
$\Delta u = 0$ in B. Furthermore, we have $u = g$ on $\partial B$. We have to show that

$$E[u] \le E[w] \quad \text{for all } w \text{ such that } w = g \text{ on } \partial B$$

Now consider the function $v = u - w$; we have $v = 0$ on $\partial B$. We now write $w = u - v$, and
compute $E[w]$ using Eq. (10.10.2); if the final step is not clear, note that $\nabla u$ is a vector and
check Box 11.2 for the rules of the dot product:

$$E[w] = \int_B \frac{1}{2}\|\nabla(u - v)\|^2\,dV = \frac{1}{2}\int_B \nabla(u - v)\cdot\nabla(u - v)\,dV
= \frac{1}{2}\int_B \left(\|\nabla u\|^2 + \|\nabla v\|^2 - 2\nabla u\cdot\nabla v\right)dV$$

Note that $\int_B \nabla u\cdot\nabla v\,dV = 0$, thanks to the first Green's identity, see Section 7.11.13:

$$\int_B \nabla u\cdot\nabla v\,dV = \int_{\partial B} (v\,\nabla u)\cdot n\,dS - \int_B v\,\Delta u\,dV = 0$$

because $\Delta u = 0$ in B and $v = 0$ on $\partial B$. Thus,

$$E[w] = E[u] + E[v] \ge E[u] \qquad (\text{because } E[v] \ge 0)$$

10.11 Galerkin methods


The Ritz method was picked up by Russian mathematicians and engineers. For example, Ivan
Grigoryevich Bubnov (1872 to 1919), a Russian marine engineer and designer of submarines for
the Imperial Russian Navy, and Boris Galerkin (1871 to 1945), a Soviet mathematician and
engineer, used the method for practical problems and also made new developments.
To solve the beam problem in Eq. (10.9.9), Bubnov used trigonometric functions instead of
polynomials. For example, his two-parameter approximation reads

$$\bar y(x) = c_1\sin\frac{\pi x}{L} + c_2\sin\frac{3\pi x}{L} \tag{10.11.1}$$

Footnote: What if we have to deal with an L-shaped plate? Or, even worse, an arbitrary three-dimensional shape? For that we need an extended version of the Ritz method known as the finite element method.


And the corresponding solution is

$$\bar y(x) = \frac{4q_0 L^4}{EI\pi^5}\sin\frac{\pi x}{L} + \frac{4q_0 L^4}{243\,EI\pi^5}\sin\frac{3\pi x}{L} \tag{10.11.2}$$

You can plot the exact solution in Eq. (10.9.10), the two-parameter Ritz solution using polynomials
in Eq. (10.9.11), and Bubnov's solution in Eq. (10.11.2), and you will see that using two
trigonometric functions yields a better solution than using two polynomials. What is more,
during the computation you see that $A_{12} = A_{21} = 0$. Actually this was not a surprise
to Bubnov. He knew the orthogonality of trigonometric functions (Fourier's work) and took
advantage of it to simplify the computations. Note that because the off-diagonal terms in matrix A
are zero, solving $Ac = b$ is super easy.
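A hedged sketch of this computation, again with Julia's SymPy (PI is SymPy's symbolic pi), shows both the vanishing off-diagonal entries and the coefficients in Eq. (10.11.2); the variable names are mine.

using SymPy

@vars x k L q0
phi = [sin(PI*x/L), sin(3*PI*x/L)]              # Bubnov's trial functions
A = [integrate(k*diff(phi[i], x, 2)*diff(phi[j], x, 2), (x, 0, L)) for i in 1:2, j in 1:2]
b = [integrate(q0*phi[j], (x, 0, L)) for j in 1:2]
# A is diagonal (A12 = A21 = 0), thanks to the orthogonality of the sines,
# so each coefficient follows from a single equation:
c = [b[i]/A[i, i] for i in 1:2]
# c[1] = 4*q0*L^4/(pi^5*k), c[2] = 4*q0*L^4/(243*pi^5*k), as in Eq. (10.11.2)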
Another contribution that Bubnov and Galerkin made is their observation that it is not necessary
to know the functional associated with the given PDE to find the Ritz solution. So, they
also followed the Euler-Lagrange highway backwards, but stopped before the final destination,
Fig. 10.8:

$$\frac{\partial F}{\partial y} - \frac{d}{dx}\frac{\partial F}{\partial y'} = 0 \;\Longrightarrow\; \int_a^b\left(\frac{\partial F}{\partial y} - \frac{d}{dx}\frac{\partial F}{\partial y'}\right)\delta y\,dx = 0 \;\Longrightarrow\; \boxed{\int_a^b\left(\frac{\partial F}{\partial y}\,\delta y + \frac{\partial F}{\partial y'}\,\delta y'\right)dx = 0}$$

With the boxed equation, they introduced the usual Ritz approximations for y and $\delta y$ to obtain a
system of linear equations. To demonstrate their method, we solve the bending beam problem
again, starting with the PDE:

$$k\,y'''' = q_0, \quad 0 < x < L; \qquad y(0) = y(L) = 0; \quad y''(0) = y''(L) = 0 \tag{10.11.3}$$

We first put the PDE in the form $k\,y'''' - q_0 = 0$, multiply it with $\delta y$ and integrate over the
problem domain:

$$\int_0^L (k\,y'''' - q_0)\,\delta y\,dx = 0 \tag{10.11.4}$$

Then, integrating by parts twice, we get

$$\int_0^L (k\,y''\,\delta y'' - q_0\,\delta y)\,dx = 0 \tag{10.11.5}$$

Of course, this equation is nothing but the variation of a functional set to zero. But we do
not need to know the form of that functional if our aim is primarily to find the solution y(x).

Why integration by parts? In theory, we could stop at Eq. (10.11.4) and introduce the Ritz
approximation into it to get a system of equations for the Ritz parameters. However, it
involves $y''''$, thus the Ritz approximation for y must use at least a third-order polynomial. Furthermore,
we have asymmetry in the formulation: there is $y''''$ but only $\delta y$. Integration
by parts solves these two issues! Integrating by parts twice, we get Eq. (10.11.5), in
which the derivative of y(x) has been lowered from four to two, with the other two passed to $\delta y$. Thus,


we have a symmetric formulation. Thanks to this, the resulting matrix A will be symmetric, i.e., $A_{ij} = A_{ji}$.
Now, Galerkin used the Ritz approximation for y(x). For illustration, only two terms are
used:

$$y = c_1\phi_1(x) + c_2\phi_2(x) \;\Longrightarrow\; y'' = c_1\phi_1''(x) + c_2\phi_2''(x) \tag{10.11.6}$$

What about the variation $\delta y$? What should its approximation be? As a variation is a small
perturbation of the actual solution y(x), if y(x) is of the form $\sum_i c_i\phi_i(x)$, then its variation is of
the same form (see the footnote below):

$$\delta y = d_1\phi_1(x) + d_2\phi_2(x) \;\Longrightarrow\; \delta y'' = d_1\phi_1''(x) + d_2\phi_2''(x) \tag{10.11.7}$$

What are the $d_i$? They are real numbers which can take any value, because a variation is
anything that is zero at the boundary.
With these approximations of y and $\delta y$ introduced into Eq. (10.11.5), we get

$$\int_0^L \left[k\,(c_1\phi_1'' + c_2\phi_2'')(d_1\phi_1'' + d_2\phi_2'') - q_0(d_1\phi_1 + d_2\phi_2)\right]dx = 0$$

which is re-arranged in the form $(\cdot)\,d_1 + (\cdot)\,d_2 = 0$:

$$\begin{aligned}
&\left[\left(\int_0^L k\,\phi_1''\phi_1''\,dx\right)c_1 + \left(\int_0^L k\,\phi_1''\phi_2''\,dx\right)c_2 - \int_0^L q_0\,\phi_1\,dx\right]d_1 \;+\\
&\left[\left(\int_0^L k\,\phi_1''\phi_2''\,dx\right)c_1 + \left(\int_0^L k\,\phi_2''\phi_2''\,dx\right)c_2 - \int_0^L q_0\,\phi_2\,dx\right]d_2 = 0
\end{aligned} \tag{10.11.8}$$

Now, because $d_1$ and $d_2$ are arbitrary, we conclude that the two bracketed terms must be zero:

$$\left(\int_0^L k\,\phi_1''\phi_1''\,dx\right)c_1 + \left(\int_0^L k\,\phi_1''\phi_2''\,dx\right)c_2 = \int_0^L q_0\,\phi_1\,dx, \qquad
\left(\int_0^L k\,\phi_1''\phi_2''\,dx\right)c_1 + \left(\int_0^L k\,\phi_2''\phi_2''\,dx\right)c_2 = \int_0^L q_0\,\phi_2\,dx$$

Look at what we have obtained? A system of equations to determine the Ritz coefficients, and
the system is identical to the one got from the Ritz method, see Eqs. (10.9.12) and (10.9.13).
That’s probably why Galerkin called his method the Ritz method, and nowadays we call what
Galerkin did the Galerkin method!
Let’s summarize the steps of the method, which I refer to as the Bubnov-Galerkin method– a
common term nowadays–in Box 10.1, even though a better term should have been Ritz-Bubnov-
Galerkin method. What more this method gives us compared with its predecessor that Ritz
ŽŽ
In theory, the only requirement is that ıy.0/ D ıy.L/ D 0. Thus, it is possible to use another approximation
for it, for example ıy D di i .x/. But that would be some years later after Galerkin’s work. Advancements are
made in small steps.


developed? It has a wider range of applications as there are many partial differential equations
that are not Euler-Lagrange equations of any variational problem.

Box 10.1: Bubnov-Galerkin method to solve numerically any PDE.

- Starting point: the PDE
  $$k\,y'''' = q_0, \qquad 0 < x < L$$

- Derive the weak form (multiply the PDE with $\delta y$, integrate over the domain, integrate by parts):
  $$\int_0^L (k\,y''\,\delta y'' - q_0\,\delta y)\,dx = 0$$

- Approximate the function y(x) using $\phi_i(x)$:
  $$y = \phi_0(x) + \sum_{i=1}^{n} c_i\,\phi_i(x)$$

- Approximate the variation $\delta y(x)$ also using $\phi_i(x)$:
  $$\delta y = \sum_{i=1}^{n} d_i\,\phi_i(x)$$

- Obtain a system of linear equations to solve for the $c_i$:
  $$A_{ij}c_j = b_i$$

10.12 The finite element method


10.12.1 Basic idea
The Ritz-Galerkin method has one serious limitation: it is difficult to apply the method to PDEs
with complex geometry. For one-dimensional problems this limitation does not trouble us;
it shows up only in two dimensions. In 1942 Richard Courant, in his classic paper entitled
'Variational methods for the solution of problems of equilibrium and vibrations', presented the
first appearance of what we now call the finite element method. Unfortunately, the relevance of
this article was not recognized at the time and the idea was forgotten. In the early 1950s the
method was rediscovered by aerospace engineers at Boeing (M. J. Turner) and structural engineers
(J. H. Argyris). The term 'finite elements' was coined by Ray Clough (see the footnote) in his classic paper "The
Finite Element Method in Plane Stress Analysis" in 1960. The mathematical analysis of finite
element approximations began much later, in the 1960s, the first important results being due to
Milos Zlamal in 1968. Since then finite element methods have developed into one of the
most general and powerful classes of techniques for the numerical solution of partial differential
equations and are widely used in engineering design and analysis.

Footnote: Ray William Clough (1920 to 2016) was Byron L. and Elvira E. Nishkian Professor of structural engineering in the department of civil engineering at the University of California, Berkeley, and one of the founders of the finite element method.
The finite element method is a Ritz-Galerkin method, but with one vital difference (Fig. 10.9)
regarding the construction of the approximate solution:

- The domain is divided (or partitioned) into a number of sub-domains called elements.
  These elements are of simple shapes: in 2D the elements are triangles and quadrilaterals,
  in 3D they are tetrahedra and hexahedra. The vertices of the elements are called the
  nodes. The elements, the nodes and the relations between elements and nodes altogether make
  a mesh. Let n be the total number of nodes in the mesh.

- The theory of interpolation is used to build the approximate solution. Assuming that u(x)
  is the function we're trying to find, and denoting by $u_I$ the value of u(x) at node I,
  the approximate solution is written as

  $$u(x) = \sum_{I}^{n} N_I(x)\,u_I \tag{10.12.1}$$

  where $N_I(x)$ are called the shape functions. The shape functions are constructed such that
  they satisfy the Kronecker delta property:

  $$N_I(x_J) = \delta_{IJ}, \qquad \delta_{IJ} = \begin{cases} 1 & \text{if } I = J\\ 0 & \text{otherwise}\end{cases} \tag{10.12.2}$$

  Therefore, $u(x_J) = \sum_I N_I(x_J)u_I = u_J$. The Ritz parameters $u_I$ now have a meaning: $u_I$
  is the value of the function evaluated at node I. Furthermore, the shape functions have
  local support, i.e., $N_I(x)$ is non-zero only over the few elements connected to node I; see the
  bottom right of Fig. 10.9.

The finite element method is extremely flexible about geometry. It can solve PDEs on arbitrary
3D domains (Fig. 10.10). Furthermore, because the approximation is local, the construction
of the $N_I(x)$ is easier (than building shape functions over the entire domain).

10.12.2 FEM for 1D wave equation


For the simplest introduction to FEM, let's consider the one-dimensional momentum equation,
which governs the deformation of a bar due to applied external forces:

$$\rho\frac{\partial^2 u}{\partial t^2} = E\frac{\partial^2 u}{\partial x^2} + b \tag{10.12.3}$$

Figure 10.9: Basic ideas of the finite element method: (1) domain division into triangular elements connected via the nodes and (2) FE approximation using local shape functions.

Figure 10.10: The finite element method enjoys geometric flexibility: it can handle any geometry, (a) in 2D and (b) in 3D.

where u(x, t) is the displacement field, E is the Young's modulus of the material, $\rho$ is the density
and b is the body force. The spatial domain is $0 \le x \le L$, where L is the length of the bar, and the time
domain is $0 \le t \le T$.
For the case of zero body force (i.e. $b = 0$) the above equation becomes the well-known one-dimensional
wave equation:

$$\frac{\partial^2 u}{\partial t^2} = c^2\frac{\partial^2 u}{\partial x^2}, \qquad c = \sqrt{\frac{E}{\rho}} \tag{10.12.4}$$

In order for a PDE to have unique solutions, initial and boundary conditions have to be
provided. For example, the so-called Dirichlet boundary conditions read

$$u(0, t) = a, \quad u(L, t) = b, \quad t > 0 \tag{10.12.5}$$

where a, b are some constants. As Eq. (10.12.4) involves a second derivative with respect to t, two
initial conditions are required, which are given by

$$u(x, 0) = f(x), \qquad \dot u(x, 0) = g(x) \tag{10.12.6}$$

where $\dot u := \partial u/\partial t$ and f, g are some known functions (i.e., data of the problem).


Putting all the above together, we come up with the following initial-boundary value problem:

$$\begin{aligned}
&\frac{\partial^2 u}{\partial t^2} = c^2\frac{\partial^2 u}{\partial x^2} && \text{(wave equation)}\\
&u(0, t) = a, \quad u(L, t) = b, \quad t > 0 && \text{(boundary conditions)}\\
&u(x, 0) = f(x), \quad \dot u(x, 0) = g(x) && \text{(initial conditions)}
\end{aligned} \tag{10.12.7}$$

Eq. (10.12.7) is called the strong form of the wave equation. The finite element method (or
generally Galerkin-based methods) adopts a weak formulation where the partial differential
equations are restated in an integral form called the weak form. A weak form of the differential
equations is equivalent to the strong form. In many disciplines, the weak form has a physical
meaning; for example, the weak form of the momentum equation is called the principle of virtual
work in solid/structural mechanics.
To obtain the weak form, one multiplies the PDE, i.e. the wave equation in this particular
context, with an arbitrary function w(x), called the weight function, and integrates the resulting
equation over the entire domain. That is,

$$\int_0^L \left(\frac{\partial^2 u}{\partial t^2} - c^2\frac{\partial^2 u}{\partial x^2}\right)w(x)\,dx = 0, \qquad \forall w(x) \ \text{with}\ w(0) = w(L) = 0 \tag{10.12.8}$$

The arbitrariness of the weight function is crucial, as otherwise the weak form is not equivalent to
the strong form. In this way, the weight function can be thought of as an enforcer: whatever it
multiplies is enforced to be zero by its arbitrariness.
Using integration by parts for the second term, the above equation becomes

$$\int_0^L \frac{\partial^2 u}{\partial t^2}\,w(x)\,dx + \int_0^L c^2\frac{\partial u}{\partial x}\frac{\partial w}{\partial x}\,dx = 0 \tag{10.12.9}$$

where the spatial derivative of the unknown field u(x, t) has been lowered from two to one.
The weak form of the wave equation is thus given by: find the smooth function u(x, t) such
that

$$\int_0^L \frac{\partial^2 u}{\partial t^2}\,w(x)\,dx + \int_0^L c^2\frac{\partial u}{\partial x}\frac{\partial w}{\partial x}\,dx = 0, \qquad
u(0, t) = a, \quad u(L, t) = b, \qquad
u(x, 0) = f(x), \quad \dot u(x, 0) = g(x) \tag{10.12.10}$$

for all w(x) with $w(0) = w(L) = 0$.
Our weak form has both spatial and temporal variables. One simple method to deal with them
is the method of lines. The method of lines proceeds by first discretizing the spatial derivatives
only, leaving the time variable continuous. Therefore, the approximation of the unknown
field u(x, t) is written as

$$u(x, t) \approx u^h(x, t) = \sum_{I}^{n} N_I(x)\,u_I(t) \tag{10.12.11}$$


where $N_I(x)$ are the shape functions and $u_I(t)$ denotes the value of u at node I at time instant t;
the $u_I(t)$ constitute the unknowns to be solved for. The weak form (10.12.10) requires the acceleration
and the first spatial derivative of u(x, t); they are given by

$$\frac{\partial^2 u}{\partial t^2} = \sum_{I}^{n} N_I(x)\,\ddot u_I(t), \qquad \frac{\partial u}{\partial x} = \sum_{I}^{n} \frac{dN_I(x)}{dx}\,u_I(t)$$

Even though there are many choices for the weight function w, in the Bubnov-Galerkin
method, which is the most commonly used method at least for solid mechanics applications, the
weight function is approximated using the same shape functions as u. That is,

$$w(x) = \sum_{I}^{n} N_I(x)\,w_I \tag{10.12.12}$$

where $w_I$ are the nodal values of the weight function; they are not functions of time. It is
straightforward to compute $w'$ as required in Eq. (10.12.10).
With these approximations, the weak form of the wave equation, i.e., Eq. (10.12.10), becomes:
find $u_J$ such that

$$\int_0^L (N_I(x)\ddot u_I)\,(N_J(x)w_J)\,dx + \int_0^L c^2\left(\frac{dN_I}{dx}u_I\right)\left(\frac{dN_J}{dx}w_J\right)dx = 0 \tag{10.12.13}$$

for all $w_J$. Note that the Einstein summation rule was adopted: indices which are repeated twice
in a term are summed.
The arbitrariness of $w_J$ results in the following system of ordinary differential equations (see the footnote):

$$\begin{bmatrix}
\int_0^L N_1N_1\,dx & \int_0^L N_1N_2\,dx & \cdots & \int_0^L N_1N_n\,dx\\
\int_0^L N_2N_1\,dx & \int_0^L N_2N_2\,dx & \cdots & \int_0^L N_2N_n\,dx\\
\vdots & \vdots & \ddots & \vdots\\
\int_0^L N_nN_1\,dx & \int_0^L N_nN_2\,dx & \cdots & \int_0^L N_nN_n\,dx
\end{bmatrix}
\begin{bmatrix}\ddot u_1\\ \ddot u_2\\ \vdots\\ \ddot u_n\end{bmatrix}
+ c^2\begin{bmatrix}
\int_0^L dN_1\,dN_1\,dx & \int_0^L dN_1\,dN_2\,dx & \cdots & \int_0^L dN_1\,dN_n\,dx\\
\int_0^L dN_2\,dN_1\,dx & \int_0^L dN_2\,dN_2\,dx & \cdots & \int_0^L dN_2\,dN_n\,dx\\
\vdots & \vdots & \ddots & \vdots\\
\int_0^L dN_n\,dN_1\,dx & \int_0^L dN_n\,dN_2\,dx & \cdots & \int_0^L dN_n\,dN_n\,dx
\end{bmatrix}
\begin{bmatrix}u_1\\ u_2\\ \vdots\\ u_n\end{bmatrix}
= \begin{bmatrix}0\\ 0\\ \vdots\\ 0\end{bmatrix} \tag{10.12.14}$$
Footnote: This is exactly identical to what we have done in the Ritz-Galerkin method; see for example Eq. (10.11.8).


where the short notation $dN_I$ denotes the first spatial derivative of the shape function $N_I$, $dN_I =
dN_I/dx$. The integrals in the above equation are called weak form integrals. For this simple 1D
problem they can be computed exactly, but in general numerical integration is used to evaluate
these integrals.
Eq. (10.12.14) can be cast in the following compact form using matrix notation

$$M\ddot u + Ku = 0, \qquad M_{IJ} = \int_0^L N_I N_J\,dx, \qquad K_{IJ} = \int_0^L c^2\,dN_I\,dN_J\,dx \tag{10.12.15}$$

where u and $\ddot u$ are the vectors of displacements and accelerations of the whole problem, respectively;
they are one-dimensional arrays of length n. M and K are the mass and stiffness
matrices, of dimension $n \times n$. Furthermore, these matrices are symmetric.
Equation (10.12.15) is referred to as the semi-discrete equation, as time has not yet been
discretized. Any time integration method for ODEs can be used to discretize Eq. (10.12.15)
in time; refer to Section 12.6 for details. After having obtained $u_I(t)$, Eq. (10.12.11) is used to
compute the function at any other point.
Up to this point, how the shape functions $N_I$ are constructed has not yet been discussed. In the next
section, we discuss this construction of shape functions.
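To make the discrete objects concrete, here is a minimal sketch (my own, not from the text) that assembles M and K on a uniform mesh of linear two-node elements; for linear "hat" shape functions the element integrals of $N_I N_J$ and $N_I' N_J'$ over an element of length h have the standard closed forms used in the comments.

# Assemble the mass and stiffness matrices of Eq. (10.12.15) for the 1D wave
# equation, using n_el linear (two-node) elements on [0, L].
function assemble(n_el, L, c)
    n = n_el + 1                       # number of nodes
    h = L/n_el                         # element length
    M = zeros(n, n); K = zeros(n, n)
    Me = h/6*[2 1; 1 2]                # element mass matrix: integral of N_I*N_J
    Ke = c^2/h*[1 -1; -1 1]            # element stiffness: c^2 * integral of N_I'*N_J'
    for e in 1:n_el
        nodes = (e, e+1)               # global node numbers of element e
        for a in 1:2, b in 1:2
            M[nodes[a], nodes[b]] += Me[a, b]
            K[nodes[a], nodes[b]] += Ke[a, b]
        end
    end
    return M, K
end

M, K = assemble(10, 1.0, 1.0)   # 10 elements on a bar of unit length, c = 1
# M and K are symmetric and tridiagonal: a consequence of the local support of the hat functions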

10.12.3 Shape functions


Irons, B. M. and O. C. Zienkiewicz (1968). "The Isoparametric Finite Element System – A New Concept in Finite Element Analysis".

10.12.4 Role of FEM in computational sciences and engineering


The finite element method is an important tool in computational sciences and engineering (CSE).
CSE is a relatively new discipline that deals with the development and application of computational
models, often coupled with high-performance computing, to solve complex problems
arising in engineering analysis and design (computational engineering) as well as in natural
phenomena (computational science). CSE has been described as the "third mode of discovery",
next to theory and experimentation.
Within the realm of CSE, these are the steps to solve a problem:

- First, a mathematical model that best describes the problem is selected or developed. This
  step of model development is done manually by people with sufficient mathematical skills.
  A majority of mathematical models are developed using calculus, and thus they are continuous
  models not suitable for digital computers.
- Second, a computational model of this mathematical model is derived. A computational
  model is an approximation of the mathematical model and is in a discrete form which can
  be solved using computers.
- Third, this discrete model is implemented in a programming language (Fortran in the past,
  C++ and Python nowadays) to produce a computational code or platform.


Computer simulations are not only useful for solving problems too complex to be resolved
analytically, but are also increasingly replacing costly and time-consuming experiments. Furthermore,
they can provide tremendous information at scales of space and time where experimental
visualization is difficult or impossible. And finally, simulations also have value in their ability
to predict the behavior of materials and structures that are yet to be created; experiments are
limited to materials and structures that have already been created.



Chapter 11
Linear algebra

Contents
11.1 Vector in R3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 839
11.2 Vectors in Rn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 857
11.3 System of linear equations . . . . . . . . . . . . . . . . . . . . . . . . . . 860
11.4 Matrix algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 870
11.5 Subspaces, basis, dimension and rank . . . . . . . . . . . . . . . . . . . . 881
11.6 Introduction to linear transformation . . . . . . . . . . . . . . . . . . . . 887
11.7 Linear algebra with Julia . . . . . . . . . . . . . . . . . . . . . . . . . . 894
11.8 Orthogonality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 894
11.9 Determinant . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 904
11.10 Eigenvectors and eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . 911
11.11 Vector spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 925
11.12 Singular value decomposition . . . . . . . . . . . . . . . . . . . . . . . . 947

This chapter is about linear algebra. Linear algebra is central to almost all areas of mathemat-
ics. Linear algebra is also used in most sciences and fields of engineering. Thus, it occupies a
vital part in the university curriculum. Linear algebra is all about matrices, vector spaces, systems
of linear equations, eigenvectors, you name it. It is common that a student of linear algebra can
do the computations (e.g. compute the determinant of a matrix, or the eigenvector), but he/she
usually does not know the why and the what–the theoretical essence of the subject. This chapter
hopefully provides some answers to these questions.
There is one more strong motivation to learn linear algebra: it plays a vital part in machine
learning, which is basically ubiquitous in our modern lives.
The following books were consulted for the materials presented in this chapter:


- Introduction to Applied Linear Algebra: Vectors, Matrices, and Least Squares by Stephen Boyd and Lieven Vandenberghe;

- Introduction to Linear Algebra by the famous maths teacher Gilbert Strang [24];

- Linear Algebra: A Modern Introduction by David Poole [56];

- Linear Algebra and Learning from Data by Strang [68].

I follow David Poole's organization for the subject to a great extent. Sometimes I felt lost
reading Strang's [24]; with Poole, I could read his book from beginning to end. Since I understand
that my linear algebra is still shaky (it is a big field and I rarely did exercises),
reading Strang's [68] was useful. That book gives a concise review of the linear algebra required
in applications. If I could understand Strang this time, I can say that I understand
linear algebra.
The chapter starts with the familiar physical vectors in the 2D plane and in the 3D space
we are living in (Section 11.1). Nothing is abstract and it is straightforward to introduce vector-
vector addition and scalar-vector multiplication–the two most important vector operations in
linear algebra. For use in vector calculus (and applications in physics), the cross product of two
3D vectors is also presented. But keep in mind that this product (with a weird definition and we
can define a cross product of two 3D vectors only) is not used in linear algebra. The description
using vectors of lines and planes is discussed, which plays an important role later.
Section 11.2 then presents a generalization of 2D and 3D vectors to vectors in Rn –the n
dimensional space, whatever it is geometrically. The section introduces the important concept
of linear combinations of a set of vectors, which plays a vital role in the treatment of systems of
linear equations.
Systems of linear equations, those of the form $Ax = b$, are the subject of Section 11.3. More
than 2000 years ago Chinese mathematicians already knew how to solve these systems. Due to
its linearity, solving a system of linear equations is not hard. But we introduce the new concept of
a matrix to the subject, and of course the Gaussian elimination method to take the matrix associated
with $Ax = b$ to a (reduced) row echelon form.
And with that we study the algebraic rules of matrices: how we can add two matrices,
multiply a matrix with a vector, and so on. The subject is known as matrix algebra (Section 11.4).
Also discussed are the transpose of a matrix, the inverse of a matrix, and the LU factorization of a matrix.
Subspaces, basis and dimension are discussed in Section 11.5. A brief introduction to linear

Footnotes: Stephen Boyd is the Samsung Professor of Engineering, and Professor of Electrical Engineering in the Information Systems Laboratory at Stanford University. His current research focus is on convex optimization applications in control, signal processing, machine learning, and finance.
Lieven Vandenberghe is a Professor of Electrical Engineering at the University of California, Los Angeles.
Strang's lectures are available at https://www.youtube.com/watch?v=ZK3O402wf1c&list=PL49CF3715CB9EF31D&index=1.
David Poole is a professor of mathematics at Trent University. He has been recognized with a number of awards for his inspirational teaching. His research interests are algebra, discrete mathematics, ring theory and mathematics education.


transformation is given to illustrate the geometric meaning of matrix-vector multiplication as
well as the geometric meaning of the determinant of a matrix (Section 11.6).
In Section 11.7 the use of Julia to do some linear algebra calculations is presented. The
purpose is to use computers to do boring/tedious computations so that we can focus on the
meaning underlying those computations. There is no gain if one can compute the determinant
of a 10 × 10 matrix manually but does not understand what a determinant is!
Section 11.8 is all about orthogonality: one vector orthogonal to another vector, vector or-
thogonal to a set of vectors, vectors orthogonal to a subspace and so on. Orthogonal projection
is presented together with the Gram-Schmidt orthogonalization process to build an orthogonal
basis of a subspace. And that leads to another matrix factorization–the QR factorization.
Section 11.9 is about determinants of matrices. With the introduction of linear transformations,
it is easy to obtain the formula for the determinant of 2 × 2 matrices and of 3 × 3 matrices. The
derivation is purely geometric. Based on this geometric meaning of the determinant of 3 × 3
matrices, some properties of the determinant are observed. We then derive a general formula
for the determinant of an n × n matrix based on these properties.
Section 11.10 is about the eigenvalue problem.
The culmination of linear algebra is vector spaces (Section 11.11). Vector spaces include not
only the space of n-dimensional vectors but also functions and matrices.

11.1 Vector in R3
To begin our journey into vector algebra let's make some observations about various concepts we
use daily. For example, consider a cube of side 2 cm; its volume is 8 cm³. Now if we rotate
this cube, whatever the rotation angle is, its volume is always 8 cm³. We say that volume is
a direction-independent quantity. Mass, volume, density, and temperature are such quantities. The
formal term for them is scalar quantities. To specify a scalar quantity, we need only provide
its magnitude (8 cm³, for example). And we know how to do mathematics with these scalars: we
can add, subtract, multiply, take roots etc. Furthermore, we know the rules of these operations,
see e.g. Eq. (2.1.2).
On the other hand, there are quantities that are direction-dependent. It is not hard to see
that velocity is such a quantity. We need to specify the magnitude (or speed) and a direction
when speaking of a velocity. After all, your car running at 50 km/h north-west is completely
different from running at 50 km/h south-east. Quantities such as velocity, force, acceleration, and (linear and
angular) momentum are called vectorial quantities; they need a magnitude and a direction.
Geometrically, we use arrows to represent vectors (Fig. 11.1). Symbolically, we can write
$\overrightarrow{AB}$ or a bold-face a, a notation introduced by Josiah Willard Gibbs (1839 to 1903), an American
scientist. We employ Gibbs' notation in this book. So, in what follows a (and similar symbols
such as b) are vectors. However, in some figures the old $\overrightarrow{AB}$ still appears, as it is easier to draw an
arrow.
Now, we need to define some operations for vectors similar to what we have done for numbers.
It turns out there are only a few: addition of vectors (two or more), multiplication of a vector


Figure 11.1: Vectors are geometrically represented by arrows. A is the tail of the vector and B is its head.

with a scalar, dot product of two vectors (yielding a scalar) and cross product of two vectors
giving a vector (remember the torque in physics?).

11.1.1 Addition and scalar multiplication


Addition of two vectors is simple: if we walk from A to B, then from B to C, it is equivalent to
walking from A to C directly; see Fig. 11.2a. So, to compute the sum of a and b, we move the
tail of b to the head of a. Doing so does not change b, as two vectors are the same if they have
identical lengths and directions. And that is the well-known parallelogram rule.

Figure 11.2: Addition of vectors: (a) addition of two vectors, the parallelogram rule, and (b) addition of more than two vectors.

Having defined the addition operation, we need to find the properties that vector addition
obeys. From Fig. 11.2a, we can see immediately that $a + b = b + a$. Furthermore, it can be
seen that $(a + b) + c = a + (b + c)$. That is, addition of vectors follows the commutative and
associative rules, similar to numbers. Why is $(a + b) + c = a + (b + c)$ useful? Because it allows
us not to worry about the order, and thus we can remove the brackets unambiguously.
Repeated addition leads to multiplication. If we add a vector a to itself we get 2a, a vector
that has the same direction as a but twice the length. We can generalize this by defining a
scalar multiplication for vectors. Given $\alpha \in \mathbb{R}$, $\alpha a$ is the scaled vector whose length is
the length of the original vector multiplied by $\alpha$, and which maintains the direction of a. From first
principles of Euclidean geometry (e.g. similar triangles), we can see that $\alpha(a + b) = \alpha a + \alpha b$.
Up to this point we have considered vectors as purely geometrical objects. To simplify the
computations, we adopt the approach of analytic geometry: use algebra to describe geometrical
objects. To this end, we use a Cartesian coordinate system, where each point is described by an
ordered pair (see the footnote) of numbers $(x, y)$ in 2D or an ordered triplet of numbers $(x, y, z)$ or $(x_1, x_2, x_3)$
in 3D. A vector is then a directed line segment from the origin to any point in space (Fig. 11.3).
The list $(x_1, x_2, x_3)$ is called the coordinates or the components of the vector.

Figure 11.3: With the introduction of a coordinate system, any vector is represented by an ordered pair of numbers $(x, y)$ in 2D, written as a column vector, or an ordered triplet of numbers $(x, y, z)$ in 3D. To save space, in text we write $a = (a_1, a_2)^\top$ instead of the column form. There is more to say about the transpose operator $\top$. Note that a vector is a geometrical object, not a list of numbers; the list $(x, y, z)$ is just a representation of a vector in a chosen coordinate system.

On this plane we see a remarkable thing: any vector, say $a = (a_1, a_2)^\top$, is obtained by going
to the right (from the origin) a distance $a_1$ and then going vertically a distance $a_2$; see the right
figure in Fig. 11.3. We can write this down as

$$a = a_1\begin{bmatrix}1\\0\end{bmatrix} + a_2\begin{bmatrix}0\\1\end{bmatrix} \tag{11.1.1}$$

Footnote: The word ordered is used because $(x, y)$ is totally different from $(y, x)$.


or, with the introduction of two new vectors i and j called the unit coordinate vectors (see the footnote):

$$a = a_1 i + a_2 j, \qquad i := \begin{bmatrix}1\\0\end{bmatrix}, \quad j := \begin{bmatrix}0\\1\end{bmatrix} \tag{11.1.2}$$

Of course for 3D, we have three such vectors $i = (1, 0, 0)^\top$, $j = (0, 1, 0)^\top$ and $k = (0, 0, 1)^\top$.
Why write such a trivial equation as Eq. (11.1.2)? Because it says that any vector can be
written as a linear combination of the unit coordinate vectors. In other words, we say that the
two unit coordinate vectors span the 2D space. This is how mathematicians express the idea that
'the two directions, east and north, are sufficient to get us anywhere on a plane'. Note that this
geometric view does not, however, exist if we talk about high-dimensional spaces.
Vector addition is simple with components: to add vectors, add the components. The proof
is straightforward as follows, where $a = (a_1, a_2, a_3)^\top$:

$$a + b = (a_1 i + a_2 j + a_3 k) + (b_1 i + b_2 j + b_3 k) = (a_1 + b_1)i + (a_2 + b_2)j + (a_3 + b_3)k$$

Similarly, to scale a vector, scale its components: for a vector in 2D, $\alpha a = (\alpha a_1, \alpha a_2)$. Do we
have to define vector subtraction? No! This is because $a - b = a + (-1)b$. Scaling a vector
with a negative number changes its length and flips its direction.
Being able to add vectors and scale them by numbers, it is natural to compute a vector given by
$\alpha_1 a_1 + \alpha_2 a_2 + \cdots + \alpha_n a_n$, a linear combination of n vectors $a_i$. We have seen such a combination
in Eq. (11.1.2).
With components, it is easy to prove $\alpha(a + b) = \alpha a + \alpha b$. Indeed, $\alpha(a_i + b_i) = \alpha a_i + \alpha b_i$.
Similar trivial proofs show up frequently in linear algebra.
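In Julia (the language used for computations in this book), these componentwise operations are built in for arrays; a tiny illustration, with values chosen by me:

a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]

a + b                   # componentwise addition: [5.0, 7.0, 9.0]
2.5 * a                 # scalar multiplication:  [2.5, 5.0, 7.5]
2a - 3b + 0.5*(a + b)   # any linear combination works the same way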
Box 11.1 summarizes the laws of vector addition and scalar multiplication. Note that 0 is the
zero vector, i.e., $0 = (0, 0, 0)^\top$ for 3D vectors.

Box 11.1: The laws of vector addition and scalar multiplication.

(a) commutative law: $a + b = b + a$
(b) associative law: $a + (b + c) = (a + b) + c$
(c) zero vector: $a + 0 = a$
(d) distributive law: $\alpha(a + b) = \alpha a + \alpha b$
(e) distributive law: $(\alpha + \beta)a = \alpha a + \beta a$
(f) $1a = a$
(g) $\alpha(\beta a) = (\alpha\beta)a$

Footnote: Now imagine that we scale these unit vectors to 2i and 2j and use the scaled vectors as the new basis vectors. What will happen to the components of our vector a? Apparently, its components will be $0.5a_1$, $0.5a_2$. As the basis vectors get bigger, the components of the vector get smaller. That's why a better name for them is contravariant components. This is important when we study tensors.

Phu Nguyen, Monash University © Draft version


Chapter 11. Linear algebra 843

11.1.2 Dot product


While vector addition and scalar-vector multiplication are quite natural, it is hard to immediately
grasp the dot product of two vectors. We give the definition first and from there we deduce the
meaning of the dot product. At the end of the section, we provide a discussion that leads to the
definition of the dot product. The discussion is based on the observation that the length of a
vector does not change whatever transformation we apply to the vector.

Definition 11.1.1
The dot product of two 3D vectors a D .a1 ; a2 ; a3 /> and b D .b1 ; b2 ; b3 /> is a number
defined as
a  b D a1 b1 C a2 b2 C a3 b3 (11.1.3)

Why this definition? One way to understand is to consider the special case that the two
vectors are the same. When b D a, we have a  a D a12 C a22 C a32 , which is the square of
the length
p of a, see Fig. 11.4. So, the dot product gives us the length of a vector, defined by
kak WD a  a. We recall that the notation jxj gives the distance from x to 0. Note the similarity
in the notations.
z

p p
y kak = a21 + a22 kak = a21 + a22 + a23

a
a2
a3
a2
x O y
a1
a1 p
a21 + a22

x a2
q
Figure 11.4: Length of a 2D and 3D vector: kak D a12 C a22 C a32 from the Pythagorean theorem.

The dot product has many applications. For example, the kinetic energy of a 1D point mass
R 2with speed v is 0:5mv and its extension to 3D is 0:5mv  v. The work done by a force F is
2
m
1 F  ds. And the list goes on.
There is a geometric meaning of this dot product: a  b D kakkbk cos.a; b/. The notation
.a; b/ means the angle between the two vectors a and b. The proof is based on the generalized
Pythagorean theorem c 2 D a2 C b 2 2ab cos C (Section 3.13). We need a triangle here: two
edges are vectors a and b, and the remaining edge is c D b a. To this triangle, we can write
(using the generalized Pythagorean theorem)
kb ak2 D kak2 C kbk2 2kakkbk cos  (11.1.4)

Phu Nguyen, Monash University © Draft version


Chapter 11. Linear algebra 844

On the other hand, the squared length of vector b a can also be written as using the dot product
and its property a  .b ˙ c/ D a  b ˙ a  c (known as the distributive law, see Box 11.2):
kb ak2 D .b a/  .b a/
(11.1.5)
DbbCaa 2a  b D kak2 C kbk2 2a  b
From Eqs. (11.1.4) and (11.1.5) we get:

a  b D kakkbk cos  (11.1.6)

And this formula reveals one nice geometric property. As cos  D 0 when  D =2, two vectors
are perpendicular/orthogonal to each other if their dot product is zero. We can now see that the
unit vectors i ; j ; k are mutually perpendicular: i  j D 0, i  k D 0, j  k D 0. Why call them
unit vectors? Because their lengths are 1. We can always make a non-unit vector a unit vector
simply by dividing it by its length, a process known as normalizing a vector:
v
normalizing a vector: vO D (11.1.7)
kvk
When we need just the direction of a vector, kvk
v
is the answer.
Again, we observe some properties or laws governing the behavior of the dot product. We
summarize them in Box 11.2. The proofs are quite straightforward and thus skipped. From (a)
and (b) we are going to derive another rule with a D e C f

a  .b C c/ D a  b C a  c ” .e C f /  .b C c/ D .e C f /  b C .e C f /  c

And using (a,b) again, we have

.e C f /  .b C c/ D e  b C e  c C f  b C Cf  c

And what is this? This is the FOIL (First-Outer-Inner-Last) rule of algebra discussed in Sec-
tion 2.1!

Box 11.2: The laws of the dot product.

(a): commutative law a  b Dba


(b): distributive law a  .b C c/ DabCac
(c): .˛a/  b D ˛.a  b/ D a  .˛b/
(d): aa  0 .equality holds iff a D 0/

One application of (b): if a is perpendicular to both b and c (i.e., a  b D a  c D 0), then it


is perpendicular to b C c and in fact it is perpendicular to all linear combinations of b and c (or
it is perpendicular to the plane spanned by b and c):

a  .˛b C ˇc/ D ˛.a  b/ C ˇ.a  c/ D ˛0 C ˇ0 D 0

Phu Nguyen, Monash University © Draft version


Chapter 11. Linear algebra 845

The triangle inequality. Consider two vectors a and b, they make two edges of a triangle, the
remaining edge is its sum a C b. From the property of triangle, we then have the following
inequality:
jja C bjj  jjajj C jjbjj (11.1.8)
Proof. We need to use the Cauchy-Schwarz inequality proved in Section 2.20.3. Note that
Eq. (11.1.6) also provides a geometric proof for the Cauchy-Schwarz inequality at least for
2D/3D cases . Now, we can writeŽŽ :
.a C b/  .a C b/ D a  a C 2a  b C b  b
 jjajj2 C 2jjajjjjbjj C jjbjj2 .Cauchy-Schwarz inequality/
2
 .jjajj C jjbjj/

And if we have something for two vectors, we should extend that to n vectors. First, it’s easy
to see that, for 3 vectors we have
jja C b C cjj  jjajj C jjbjj C jjcjj
Proof goes as: using Eq. (11.1.8) with two vectors a and d D b C c, then Eq. (11.1.8) gain
for b and c. You see the pattern to go to n vectors. And to practice proof by induction you can
prove the general case.

Solving plane geometry problems using vectors. Vectors can be used to solve easily many
plane geometry problem. (algebraic manipulations of some vectors only) See Fig. 11.5 for some
examples. First, we consider a segment AB with M being its midpoint. Let’s denote by a and b
the vectors from the origin to A and B, and m for point M . And we would like to express m in
terms of a; b.
It is not hard to derive the result shown in the left of Fig. 11.5:
8 !
<AM D m a 1 1
! ! H) m a D .b a/ H) m D .a C b/
:AM D AB D .b a/
1 1 2 2
2 2
And in the same manner, we get the result in the middle picture of the mentioned figure. Now,
we’re ready to prove the theorem about the centroid of a triangle.
We consider the median CM3 , and point G such that GM3 D 1=3CM3 . Using the results of
Fig. 11.5, we have

! 1
ˆ
<OM3 D .a C b/
2  
ˆ ! 1 2 1 1
:̂ OG D c C .a C b/ D .a C b C c/
3 3 2 3

As a  b D jjajjjjbjj cos  , we have a  b  jjajjjjbjj.
ŽŽ
We can use this to prove the Pythagoras’s theorem: if a is orthogonal to b then a  b D 0, thus we have
.a C b/  .a C b/ D a  a C b  b. which is nothing than jja C bjj2 D jjajj2 C jjbjj2 . And this vector-based proof of
the Pythagoras theorem works for 2D and 3D and actually nD.

Phu Nguyen, Monash University © Draft version


Chapter 11. Linear algebra 846

C
A M B A M B

M1
M2
a m b a m b
G
B
1 1 1 2 M3
m= a+ b m= a+ b
O 2 2 O 3 3 A

Figure 11.5: Solving plane geometry using vectors. A median of a triangle is the line segment from a
vertex to the midpoint of the opposite side. So, AM1 is a median of the triangle ABC . We want to prove
this fact: in any triangle, the three medians intersect at a common point (G) which is 2=3 of the way along
each median. In other words, for each median, the distance from a vertex to the G is twice that of the
distance from G to the midpoint of the side opposite that vertex. And that common point is called the
centroid of the triangle.

Of course, we next consider the median AM1 , and a point G 0 such that G 0 M1 D 1=3AM1 . It can
!
be shown that OG 0 D 1=3.a C b C c/. Thus, G 0 is nothing but G. And finally, considering the
median BM2 and we’re done.

Another way to come up with the dot product.

It is obvious that the length of a vector, which is a scalar quantity, is invariant under
translation and rotation. That is, if we rotate a vector, its length does not change. So, we
can define a ‘dot product’ that applies to a single vector only i.e., a  a D a12 C a22 C a32 .
We can thus write
a  a D kak2 D constant
b  b D kbk2 D constant
.a C b/  .a C b/ D ka C bk2 D constant

The length of vector a C b can be evaluated using our dot product definition:

.a C b/  .a C b/ D .a1 C b1 /2 C .a2 C b2 /2 C .a3 C b3 /2


D .a12 C a22 C a32 / C .b12 C b22 C b32 / C2.a1 b1 C a2 b2 C a3 b3 /
„ ƒ‚ … „ ƒ‚ …
constant constant

So, we come up with the fact that a1 b1 C a2 b2 C a3 b3 is also constant. That is why people
came up with this dot product between two vectors. It preserves lengths and angle.

Phu Nguyen, Monash University © Draft version


Chapter 11. Linear algebra 847

11.1.3 Lines and planes


Using vectors we can write equation of lines in 2D and 3D uniformly. There are two ways. In
the first way, one uses a normal vector to the line (let’s denote that normal by n D .a; b/) and a
point P0 .x0 ; y0 / on the line, see Fig. 11.6. Then consider an arbitrary point P .x; y/ on the line.
The fact that PP0 is perpendicular to the normal gives us the equation of the line. Of course,
perpendicularity is expressed using the dot product. So:

.a; b/  .x x0 ; y y0 / D 0 H) a.x x0 / C b.y y0 / D 0; or ax C by D c

Noting that scaling n does not change the equation, so we usually just need to use a unit vector
n. Geometry becomes easy with numbers!
In the second way one uses a vector tangent to the line called a direction vector d. Now,
any point P on the line is just a step from P0 .x0 ; y0 / along d. Thus, the equation of the line
is: r 0 C t d where t 2 R denotes the step size; the resulting equation has a vectorial form, see
Fig. 11.6. Later on for linear algebra, the vector form is helpful, as it shows that a line passing
through the origin (with r 0 D .0; 0/) can be expressed as a scalar of a direction vector.
y y

P (x, y) P
n

P0 (x0, y0 ) d(a, b)
P0 (x0, y0)

(x − x0)a + (y − y0 )b = 0 r0 r0 + td

x x

Figure 11.6: Equations of a line using vectors.

With the dot product we can now write the equation for a
plane in 3D. In 2D, a line needs a point .x0 ; y0 / and a slope. For
a plane, we need also a point P0 D .x0 ; y0 ; z0 / and a normal
n D .a; b; c/ (not a slope as there are infinitely many tangents to
a plane). For a point P D .x; y; z/ on the plane, the vector from
P0 to P is perpendicular to the normal. And of course perpendicularity is expressed by the dot
product of these two vectors:

.x x0 /a C .y y0 /b C .z z0 /c D 0; or ax C by C cz D d (11.1.9)

with d D ax0 C by0 C cz0 . And that is the equation of a plane: ax C by C cz D d . Noting the
similarity with the equation ax C by D c of a line.
Using two direction vectors u D .u1 ; u2 ; u3 /; v D .v1 ; v2 ; v3 / living on the plane, which
are not parallelŽ , we can write the equation for a plane in 3D passing through the point P 0 . To
Ž
We need two directions, if u is parallel to v, we would have only one direction.

Phu Nguyen, Monash University © Draft version


Chapter 11. Linear algebra 848

get this equation, consider a point P D .x; y; z/> on the plane, then the vector P0 P is a vector
on the plane. And that vector can be written as a linear combination of u and v:
2 3 2 3 2 3 2 3
x x0 u1 v1
6 7 6 7 6 7 6 7
x D P 0 C us C tv; or 4y 5 D 4y0 5 C s 4u2 5 C t 4v2 5 (11.1.10)
z z0 u3 v3
Again, a plane passing through the origin can be expressed as a linear combination of two
(direction) vectors:
Plane through .0; 0; 0/ with two direction vectors u; v W x D us C t v (11.1.11)
When u and t take all the values in R, x D us C t v generates all the vectors (infinitely many of
them) lying on this plane. We can see that this plane in R3 is similar to the plane R2 : if we take
a linear combination of u; v we can never escape the plane. It is a space of itself and later on it
leads to the important concept of subspace.
Table 11.1: Lines and planes in R2 and R3 : a summary.

Objects Dim. of objects General form Vector form

Lines in 2D 1 ax C by D c x D p C su
8
<a x C b y C c z D d
1 1 1 1
Lines in 3D 1 x D p C su
:a x C b y C c z D d
2 2 2 2

Planes in 3D 2 ax C by C cz D d x D p C su C tv

dim(object)=number of general equations +dim(space) (11.1.12)

11.1.4 Projections
Considering two vectors u and v that make an angle . In case A
that u is short we can always scale it ˛u and get a line of which
direction is determined by u. Let’s denote by p the projection of v
v on u (or on the line ˛u). We obtain this projection by dropping
!
a line from A perpendicular to u. Then, p D OH . The idea of 
u
a vector projection, in its simplest form is just the question of O H
how much one vector goes in the direction of another. We have
p D OH u=jjujj. And consider the right triangle OHA, we also have OH D jjvjj cos , now
relating cos  to the dot product of u; v, we can write p as (another common notation for vector
projection projv .u/ is also introduced):
u uv u u  v
p D projv .u/ D jjvjj cos  D jjvjj D u (11.1.13)
jjujj jjujjjjvjj jjujj uu

Phu Nguyen, Monash University © Draft version


Chapter 11. Linear algebra 849

Finding a projection of a vector onto another one has many applications. For example, calculation
of the distance from a point to a line in space is one of them, but not an important one. As can
be seen, while finding the projection of v on u, we also get the vector perpendicular to u (vector
!
AB). This is very useful later on (Section 11.8.6). But I want to show you what will come next.
The vector p is, among all vectors along the line defined by u, the closest vector to v. This will be
generalized to the best approximation theorem when we extend our 3D space to n dimensional
space (Section 11.11.10).
The length of the projected vector can be computed as:
ˇˇ u  v  ˇˇ ˇ u  v ˇ
ˇˇ ˇˇ ˇ ˇ ju  vj
jjpjj D ˇˇ uˇˇ D ˇ ˇ jjujj D
uu uu jjujj

y
Q(x0, y0)
d d = kproj of P Q on nk
n
|n · P Q|
=
knk
|(a, b) · (x0 − x̄, y0 − ȳ)|
= √
P (x̄, ȳ) a2 + b2
|ax0 + bx0 − c|
line : ax + by = c = √
a2 + b2
x

Figure 11.7: Distance from a point Q.x0 ; y0 / to a 2D line ax C by D c. Note that axN C b yN D c as .x;
N y/
N
is a point on the line.

One application of this formula is to compute the distance from a point B.x0 ; y0 ; z0 / to a
plane P W ax C by C cz D d . To derive the formula for this distance, first we consider a simpler
problem: distance from a point to a 2D line (Fig. 11.7). Then, it is a simple generalization to 3D:
jax0 C by0 C cz0 d j
d.B; P / D p
a2 C b 2 C c 2
Projection of a point on a line. Consider a point P0 and a line passing through two points A
and B. Now, we want to find the projection of P0 on that line. We first project vector AP0 onto
vector B A using Eq. (11.1.13), then we add A to that result:
.P0 A/  .B A/
P DAC .B A/
jjB Ajj2

11.1.5 Cross product


There are different ways to introduce the cross product. I prefer the way in which the cross
product would appear naturally when we talk about rotational motions. I refer to Table 11.2 for
analogies between linear (or translational) and rotational motions. For rotational motions, we

Phu Nguyen, Monash University © Draft version


Chapter 11. Linear algebra 850

Table 11.2: Analogy between linear motion and angular motion.

Linear motion Angular motion

linear displacement x Angular displacement 

linear velocity x=t Angular velocity =t

linear acceleration 2 x=t 2 Angular acceleration !=t

Work done W D F  x Work done W D‹  

use a rotation angle to measure the movement. This way of presenting the cross product is due
to Feynman in his celebrated lectures on physics, volume I.
First, we consider a two dimensional rotation i.e., an object is circulating around in the xy
plane (Fig. 11.8). Our analysis is guided by the last row in Table 11.2. That is we are going to
write the work F  x in terms of . If force times distance is work, then we define torque
such that work is torque times angle. That;s our plan.
Assume that at a given time instant, the object is located at point P , which is specified
by .x; y/ using the Cartesian coordinates or .r; / using polar coordinates. A moment later,
under the influence of a force F it moves to point Q by rotating a tiny angle of . We
compute the change in positions x and y in terms of  . Then, we compute the work
W D Fx x C Fy y D .xFy yFx / . So this term .xFy yFx /–a strange-looking
combination of the force and the distance–should be defined as torque which is a kind of force
that makes objects turn.
y ∆x
Q H
θ P Q = r∆θ
∆y

P ∆x = −r∆θ sin θ = −y∆θ


∆θ ∆y = r∆θ cos θ = x∆θ
r
y

θ
O x x

∆W = Fx ∆x + Fy ∆y
= (xFy − yFx )∆θ

Figure 11.8: Work in terms of . Noting that  is tiny, that’s why we have PQ D r and PQ is
perpendicular to OP . There is a minus in x because x is decreasing.

Yes, we have obtained one formula for the torque. But we can also obtain another formula
for it to reveal the geometry behind the algebraic expression. To this end, we recall that work is
tangential force multiplied with displacement. As seen from Fig. 11.9, torque can also be defined

Phu Nguyen, Monash University © Draft version


Chapter 11. Linear algebra 851

as the magnitude of the force times the length of the level arm. And this formula agrees with our
experiences with torques: if the force is radial i.e., ˛ D 0 the torque is zero, or for zero length
of level arm the torque is also zero.

Ft
˛

Fr
 m
˛
W D Ft  r
r D .F sin ˛/  r
D .F r sin ˛/  
O

r sin 

Figure 11.9: Torque is defined as the magnitude of the force times the length of the level arm: D
F r sin  . A force F is acting at an object m locating at r. When the object has rotated a small angle ,
the object has moved tangentially a distance of r . Thus the work is W D .F sin ˛/  .r /.

With forces, we have linear momentum p D mv and Newton’s 2nd law saying that the
external force is equal to the time derivative of the linear momentum: F ext D p.
P A question
arises, with torques, do we have another kind of momentum in the sense that ext D .P Let’s
do the analysis. We start with the formula for the torque, D xFy yFx , then we replace Fx
and Fy using Newton’s 2nd law so that derivative with time appears:

dvy dvx d d
D xFy yFx D xm ym D .xmvy y mvx / D .xpy ypx / (11.1.14)
dt dt dt dt

Indeed, the torque is the time rate of change of something. And that something xpy ypx is
what we now call the angular momentum, denoted by L. And by doing the same analysis as
done in Fig. 11.9 for the torque, we can see that the angular momentum is the magnitude of the
linear momentum times the length of the level arm.
We have conservation of linear momentum when the total external forces in a system is zero.
Do we have the same principle for angular momentum? As can be seen from Fig. 11.10 for a
system of 2 particles, the torque due to F 12 cancels the torque due to F 21 . Thus, the the rate of
change of the total momenta depends only on the external torques:
9
C 12 >
dL1
D 1
ext
= dL
dt H) D ext
C ext
(11.1.15)
dL2 ext >
; dt 1 2
D 2 C 21
dt
Phu Nguyen, Monash University © Draft version
Chapter 11. Linear algebra 852

F1
m1 m2
F 21

F 12
F2

O
Figure 11.10: The torque due to F 12 cancels the torque due to F 21 due to Newton’s third law of action
and reaction F 12 D F 21 and the level arms are the same.

Thus, if the net torque is zero, the angular momentum is conserved. Indeed, we also have an
analog for the principle of conservation of linear momentum. This encourages us to keep moving
on. We have kinetic energy for translational motions, what it will look like for rotational motions?
Kinetic energy is T D 0:5mv 2 : mass time velocity squared. So we anticipate that for
rotations, it should be T D 0:5f .m/! 2 . Let’s do the maths (note that v D r! see Fig. 7.39):

1 1
T D mv 2 D mr 2 ! 2 H) I D mr 2 (11.1.16)
2 2

The quantity I D mr 2 is called moment of inertia by Leonhard Euler. It is a function of mass


(of course) but it depends also on r i.e., how far the mass is away from the rotation axis, see for
an application in Fig. 11.11.

Figure 11.11: Moment of inertia in rotations: it is a function of mass (of course) but it depends also on
r i.e., how far the mass is away from the rotation axis. A spinning figure skater pull in her outstretched
arms to spin faster. This is because the angular momentum l D I! is conserved, when I is decreased, !
is increased i.e., spinning faster.

Now, if we repeat the analysis that we have just done in the xy-plane but now for the yz-plane
and zx plane, we obtain three terms:

xy plane W xFy yFx


yz plane W yFz zFy (11.1.17)
zx plane W yFz zFy

Phu Nguyen, Monash University © Draft version


Chapter 11. Linear algebra 853

And that is the torque which is defined from two vectors r D .x; y; z/ and F ; xFy yFx is just
the z component of this torque. Now we generalize that to any two 3D vectors a and b:
2 3
a2 b3 a3 b2
6 7
c WD a  b H) c D 4a3 b1 a1 b3 5 (11.1.18)
a1 b2 a2 b1

From this definition, it can be seen that b  a D a  b:


2 3
b2 a3 b3 a2
6 7
b  a D 4b3 a1 b1 a3 5 D ab (11.1.19)
b1 a2 b2 a1

The vector product is not commutative! One consequence is that a 


a D 0. Now, we need to know the direction of ab. Just apply Eq. (11.1.18) ab
to two special vectors .1; 0; 0/ and .0; 1; 0/, and we get the cross product a
of them is .0; 0; 1/, which is perpendicular to .1; 0; 0/ and .0; 1; 0/. The
b 
rule is: c is perpendicular to both a; b. This can be proved simply by just
calculating the dot product of a  b with a, and you will see it is zero. But
c points up or down? The right hand rule tells us which exact direction it
follows (see figure next to the text).
We now know the direction of the cross product, how about its length?
Let’s compute it and see what we shall get:

ka  bk2 D .b2 a3 b3 a2 /2 C .b3 a1 b1 a3 /2 C .b1 a2 b2 a1 /2


D .a12 C a22 C a32 /.b12 C b22 C b32 / .a1 b1 C a2 b2 C ab b3 /2
D kak2 kbk2 kak2 kbk2 cos2 ; .used dot product formula/
D kak2 kbk2 sin2 

We get a nice formula for the length of the cross product of two 3D vectors a and b in terms of
the length of the vectors and the angle between them:

ka  bk D kakkbk sin  (11.1.20)

Note the striking similarity with Eq. (11.1.6) about the dot product! With the dot product we
have cos , and now with the cross product we have sin . The dot product tells us when two
vectors are perpendicular and the cross product tells us when they are parallel. Perfect duo. A
geometric interpretation of this formula is that the length of the cross product of a and b is the
area of the parallelogram formed by a and b. We also get a result: the area of a triangle formed
by a and b is 0:5ka  bk. See Fig. 11.12a.
As the area of a triangle formed by a and b is 0:5ka  bk, if the three verices are .x1 ; y1 /,
.x2 ; y2 / and .x3 ; y3 /, the area of the triangle explicitly expressed in terms of the coordinates of

Phu Nguyen, Monash University © Draft version


Chapter 11. Linear algebra 854

ab

jjcjj cos 

c
b jjbjj sin 
b jja  bjj

a a
(a) jja  bjj (b) c  .a  b/ D ka  bkkck cos 

Figure 11.12: A geometric interpretation of the cross product of two vectors: the length of the cross
product of a and b is the area of the parallelogram formed by a and b (a); c  .a  b/ is the volume of a
parallelepiped with three sides being our three vectors a; b; c (b).

its vertices is given by:


2 3
1 1 1
1 6 7
A D det 4x1 x2 x3 5 (11.1.21)
2
y1 y2 y3
Here are some rules regarding the cross product:

abD ba
aaD0
.˛a/  b D ˛.a  b/ D a  .˛b/
a  .b C c/ D a  b C a  c
(11.1.22)
.a C b/  c D a  c C b  c
a  .b  c/ D b.a  c/ c.a  b/
.a  b/2 D a2 b2 .a  b/2
c  .a  b/ D .c  a/  b

The first three rules are straightforward. How others have been discovered? Herein, we prove
the last rule, known as the scalar triple product of three vectors. As two vectors give us an area
so three vectors could give us a volume. So, let’s build a box with three sides being our three
vectors a; b; c (see Fig. 11.12b); this box is called a parallelepipedŽ . It is seen that the volume
of this box is c  .a  b/: consider the base with two sides a; b, its area is ka  bk; the volume
is: base area times the height; that is ka  bkkck cos . As the volume does not change if we
consider a different base, the rule of the scalar triple product of three vectors is proved. Of course,
Ž
Parallelepiped is a 3-D shape whose faces are all parallelograms. It is obtained from a Greek word which
means ’an object having parallel plane’.

Phu Nguyen, Monash University © Draft version


Chapter 11. Linear algebra 855

a proof using pure algebra exists:

c  .a  b/ D c1 .a2 b3 a3 b2 / C c2 .a3 b1 a1 b3 / C c3 .a1 b2 a2 b 1 /


D b1 .c2 a3 c3 a2 / C b2 .c3 a1 c1 a3 / C b3 .c1 a2 c2 a1 / D b  .c  a/

The rule a  .b  c/ D b.a  c/ c.a  b/ is known as the triple product. You’re encouraged
to prove it using of course the definition of cross product. You would realize that the process is
tedious and boring (lengthy algebraic expressions). Refer to Section 7.11.14 for a more elegant
proof when we’re equipped with more mathematics tools.

11.1.6 Hamilton and quartenions


The essence of mathematics lies in its freedom. (George Cantor)

This section is about the story of how Hamilton discovered quartenions in 1843. The story
started with complex numbers (Section 2.25). Let’s consider two complex numbers z1 D a C bi
and z2 D c C d i, where a; b; c; d 2 R and i 2 D 1. Addition/subtraction of complex numbers
are straightforward, but multiplication is much harder. So, we focus on the product of z1 and z2 :

z1 z2 D .ac bd / C .ad C bc/i

Note that to get this result we only neededp to use high school algebra and i D 1. Thus, the
2

modulus (or length) of z1 z2 is jz1 z2 j D .ac bd / C .ad C bc/ . Next, we’re trying to find
2 2

the relation between jz1 z2 j and jz1 j and jz2 j. To this end, we square jz1 z2 j and obtain:

jz1 z2 j2 D .ac bd /2 C .ad C bc/2 D .a2 C b 2 /.c 2 C d 2 / D jz1 j2 jz2 j2 (11.1.23)

or,
jz1 z2 j D jz1 jjz2 j (11.1.24)
And this result is called the law of the moduli by Hamilton: it states that the modulus of the
product of two complex numbers is equal to the product of the modulus of the two numbers.
Hamilton wanted to extend complex numbers–which he called couples as each complex
number contains two real numbers–to triplets. Thus, he considered a triplet of the following
form
z D a C bi C cj; with i 2 D j 2 D 1 and ij D j i
Hamilnto considered ij D j i because at that time Hamilton still insisted on the commutativity
of multiplication. Although it is straightforward to add two triplets, multiplication was, however,
not easy to even a mathematician of high caliber such as Hamilton. He wrote to his son Archibald
shortly before his death:

“Every morning in the early part of the above-cited month, on my coming down to
breakfast, your brother William Edwin and yourself used to ask me, ‘Well, Papa, can
you multiply triplets?’ Whereto I was obliged to reply, with a sad shake of the head,
‘No, I can only add and subtract them.’ ”

Phu Nguyen, Monash University © Draft version


Chapter 11. Linear algebra 856

He started with zz or z 2 , and he obtained:


z 2 D .a C bi C cj /.a C bi C cj / D .a2 b2 c 2 / C 2abi C 2acj C 2bcij (11.1.25)
The red term troubled him. To get a triplet from z 2 , he needed to have ij D a1 C a2 i C a3 j
with ai 2 R. But this is impossible:
ij D a1 C a2 i C a3 j
i 2j D a1 i C a2 i 2 C a3 ij (multiplying the above by i )
j D a1 i a2 C a3 ij (i 2 D 1)
j D a1 i a2 C a3 .a1 C a2 i C a3 j / (replacing ij using 1st eq.)
j D a1 a3 a2 C .a1 C a2 a3 /i C a32 j
The last equation holds only when a32 D 1, which is impossible as a3 is a real number. So, ij
cannot be a triplet.
But if this troubling term 2bcij is zero, then it is simple to see that jz 2 j D .a2 C b 2 C c 2 /,
which is jzjjzj. The law of the moduli, Eq. (11.1.24), works! But when 2bcij is zero? It is
absurd to think that ij D 0. So, Hamilton thought that if ij ¤ j i , then it is possible for the red
term to vanish. So, with ij ¤ j i , he computed z 2 :
.a C bi C cj /.a C bi C cj / D .a2 b2 c 2 / C 2abi C 2acj C bc.ij C j i/ (11.1.26)
If ij D j i , then the red term in the above expression is zero, and the law of the moduli
holds. At this time, due to the red term, Hamilton decided that he had to consider not triplets but
quadruplets of the form z D a C bi C cj C d k. This k is for ij p D j i D k! He called such
number z a quartenion. He defined the modulus of a quartenion is a2 C b 2 C c 2 C d 2 , which
is reasonable.
What should be the rules of i; j; k? We have i 2 D 1, thus we should have j 2 D k 2 D 1.
After all, there is no reason that i is more special than j and k. And we need ij D j i , and
Hamilton considered ij D j i D k. Thus, his i; j; k must satisfy the followingŽ :
i2 D j 2 D k2 D 1
ij D ji D k
(11.1.27)
jk D kj D i
ki D i Dj
Hamilton now needed to verify that his quartenions satisfy the rule of modulus
(Eq. (11.1.24)). He computed z 2 and with Eq. (11.1.27), he got:
.a C bi C cj C d k/.a C bi C cj C d k/ D a2 C abi C acj C ad kC
C abi b 2 C bcij C bd ik
C acj C bcj i c 2 C cdj k
C ad k C bd ki C dckj d2
D .a2 b2 c2 d 2 / C 2abi C 2acj C 2ad k
Ž
which can also be compactly written as i 2 D j 2 D k 2 D ij k D 1.

Phu Nguyen, Monash University © Draft version


Chapter 11. Linear algebra 857

Thus, the modulus of zz is


p
jzzj D .a2 b 2 c 2 d 2 /2 C .2ab/2 C .2ac/2 C .2ad /2 D a2 C b 2 C c 2 C d 2 D jzjjzj
Therefore, we have again the old rule about modulus that jzzj D jzjjzj.
Hamilton’s discovery of the quartenions was one of those very
rare incidents in science where a breakthrough was captured in
real time. Hamilton had been working on this problem for over
10 years, and finally had a breakthrough on October 16th, 1843
while on a walk along the Royal Canal in Dublin towards the
Royal Irish Academy with his wife, Lady Hamilton. And when
this exciting idea took hold, he couldn’t resist the urge to etch his
new equation into the stone of Broom Bridge and give life to a
new system of four-dimensional numbers.
Hamilton described the ‘eureka’ moment in a letter to his son some years later:
Although your mother talked with me now and then, yet an undercurrent of thought
was going on in my mind, which gave at last a result, whereof it is not too much
to say that I felt at once an importance. An electric current seemed to close; and
a spark flashed forth, the herald (as I foresaw, immediately) of many long years
to come of definitely directed thought and work . . . Nor could I resist the im-
pulse—unphilosophical as it may have been—to cut with a knife on a stone of
Brougham Bridge as we passed it, the fundamental formula ...
Hamilton had created a completely new structure in mathematics. What is interesting is that
the quartenions did not satisfy the commutative rule ab D ba (note that complex numbers still
follow this rule). This did not bother Hamilton because this is what usually happens in nature.
For example, consider an empty swimming pool and the two operations of diving into the pool
head first and turning the water on. The order in which the operations take place is important!
The set of all quartenions is now denoted by H to honour Hamilton.
It was Hamilton who gave us the terms scalar and vector for he considered the quartenion
a C bi C cj C d k as consisted of a scalar part (a) and a vector part bi C cj C d k. Considering
two quartenions with zero scalar parts ˛ D xi Cyj Czk and ˛ 0 D x 0 i Cy 0 j Cz 0 k, he computed
their product using Eq. (11.1.27):
˛˛ 0 D .xi C yj C zk/.x 0 i C y 0 j C z 0 k/
D .xx 0 C yy 0 C zz 0 / C .yz 0 zy 0 /i C .zx 0 xz 0 /j C .xy 0 x 0 y/k
What is the red term? It is nothing but the dot product of two 3D vectors. And the blue term is
nothing but the cross product. Gibbs gave us the dot product and the cross product. But it was
Hamilton who was the first to write down these products.

11.2 Vectors in Rn
So we have seen 2D and 3D vectors. They are easy to grasp as we have counterparts in real
life. But mathematicians do not stop there. Or actually they encounter problems in which they

Phu Nguyen, Monash University © Draft version


Chapter 11. Linear algebra 858

have to stretch their imaginations. One such problem is solving a system of large simultaneous
equations, like the following one
2x1 C 3x2 C 4x3 C x4 D5
x1 C 2x2 C 3x3 x4 D1
2x1 C 3x2 C x3 x4 D7
3x1 C 2x2 C 2x3 3x4 D2
which they simply write Ax D b where the vector x D .x1 ; x2 ; x3 ; x4 / is a vector in a four-
dimension space. And if we have a system of 1000 equations for 1000 unknowns, we are talking
about its solution as a vector living in a 1000-dimensional space! Obviously it is impossible to
visualize spaces of dimensions higher than 3, the study of vectors in higher-dimensional spaces
must proceed entirely by analytic means.
In this section, we move to spaces of n dimensions where n is most of the time (much) larger
than three. We use the symbol x 2 Rn to denote such a vector, and we write (with respect to a
chosen basis which is usually the standard basis)
2 3
x1
6 7
6x2 7
xD6 7
6 :: 7 ; or x D .x1 ; x2 ; : : : ; xn /
4:5
xn
where the second notation is to save space. When we say a vector we mean a column vector. For
2D/3D vectors, we called xi the i th coordinate. However for n dimensional vector we call it the
i th component, as x is no longer representing a positional vector. Actually, xi can be anything:
price of a product, deflection of a point in a beam etc. It should be emphasized that a vector
exists independent of a coordinate system. So, when we write (or see) x D .x1 ; x2 ; : : : ; xn /, we
should be aware that a certain choice of a coordinate system was made.
For vectors a and b in a n-dimensional space and a scalar ˛, we have the following definitions
for vector addition, scalar vector multiplication, dot product of two vectors, which are merely
extensions of what we know for 3D vectors:
X
n
addition: aCbD .ai C bi /
i D1
scalar multiplication: ˛a D .˛a1 ; ˛a2 ; : : : ; ˛an /
X
n
dot product: a  b D ai bi D ai bi
i D1
!1=2
p X
length (norm): jjajj D aaD ai2
i
Pn
where we have used Einstein summation rule in i D1 ai bi D ai bi . According to this rule, when
an index variable (i in this example) appears twice in a single term, it implies summation of that

Phu Nguyen, Monash University © Draft version


Chapter 11. Linear algebra 859

term over all the values of the index. The index i is thus named summation index
Pnor dummy
index. The dummy word is used because we can replace it by any other symbol: i D1 ai bi D
P n
j D1 aj bj D aj bj .
ŽŽ

All the rules about vector addition and scalar vector multiplication in Box 11.1 still
apply for vectors in Rn . And note that we did not define the cross product for vectors
Remark
living in a space with dimensions larger than three! Lucky for us that in the world of
linear algebra we do not need the cross product.

Notation Rn . Let’s discuss how mathematicians say about 1D, 2D, 3D and nD spaces. When x
is a number living on the number line, they write x 2 R. When a point x D .x; y/ lives on a
plane, they write x 2 R2 ; this is because x 2 R and y 2 R. Similarly, they write x 2 R3 and
x 2 Rn . This notation follows the Cartesian product of two sets discussed in Section 5.5.
We have special numbers: 0 and 1, and we also have special vectors. The zero vector 0, note
the bold font for 0, has all components being zeros, and the ones vector 1 has all components
equal to one. And the unit vectors (remember i; j; k of the 3D space?):
2 3 2 3 2 3
1 0 0
6 7 6 7 6 7
607 617 607
6 7 6 7 6 7
e1 D 6 7 6 7 6 7
607 ; e 2 D 607 ; : : : ; e n D 607 (11.2.1)
6 :: 7 6 :: 7 6 :: 7
4:5 4:5 4:5
0 0 1

That is vector e i has all component vanished except the ith component which is one.

Linear combination. If u1 ; : : : ; um are m vectors in Rn and ˛1 ; : : : ; ˛m are m real numbers,


then the vector
˛1 u1 C ˛2 u2 C    C ˛m um (11.2.2)
is called a linear combination of the vectors u1 ; : : : ; um . The scalars ˛1 ; : : : ; ˛m are the coeffi-
cients of the combination.
For some special values for ˛i we obtain some special combinations:

sum (˛i D 1): u1 C u2 C    C um


1
average (˛i D 1=m): Œu1 C u2 C    C um 
m
m1 r 1 C m2 r 2 C    C mn r n
center of mass R CM :
m1 C m2 C    C mn
And we shall see more and more linear combinations of vectors in coming sections. The key
operation in linear algebra is taking a (linear) combination of some vectors. One special linear
combination is that any vector 2 Rn can be written as a linear combination of the unit vectors
ŽŽ
Of course it is not a requirement to use Einstein notation in linear algebra; but it can be very useful elsewhere.

Phu Nguyen, Monash University © Draft version


Chapter 11. Linear algebra 860

with its components being the coefficients:


2 3 2 3 2 3 2 3
a1 1 0 0
6 7 6 7 6 7 6 7
6 a2 7 6 7 6 7 6 7
6 7 D a1 607 C a2 617 C    C an 607 (11.2.3)
6:7: 6:7
: 6:7
: 6 :: 7
4:5 4:5 4:5 4:5
an 0 0 1

11.3 System of linear equations


The central problem of linear algebra is to solve a system of linear equations. In these linear
equations we never meet xy or sin x, we just have unknowns multiplied by constants e.g. 2x C
3y D 7. If you have not see any system of linear equations before, check Section 2.16 out first.
Let’s start humbly with a system of two equations for two unknowns:
2x y D 1
(11.3.1)
xCy D5
All of us know the technique to solve it: elimination method. We keep the first equation, but
replace the second by the sum of the second equation and the first (to remove y):
2x 1y D 1
(11.3.2)
3x C 0y D 6
Then, we have x D 2 from the second equation, and back substituting x D 2 into the first
equation gives us y D 3. This is pretty easy. What is interesting is the fact that we write the
second equation 3x D 6 as 3x C 0y D 6. Furthermore, we can work on the two equations
without referring to x; y (after all, instead of x; y we can equally use u; v or whatever pleases
us); we just need to focus on the numbers 2; 1; 1; 1; 1; 5. So, we put the numbers appearing in
the LHS in a rectangular array with 2 rows and 2 columns, denoted by a capital boldface symbol
A, the numbers in the RHS in a vector (b), and the unknowns in another vector (x):
" #" # " #
2 1 x 1
D ; or Ax D b (11.3.3)
1 1 y 5
and this 2 row and 2 col array is called the coefficient matrixŽŽ and the vector on the RHS is
called the RHS vector. Note that this is not simply a notation. Eq. (11.3.3) says that the matrix A
acts on the vector x to produce the vector b. Matrices do something as they are associated with
linear transformations. More about this later in Section 11.11.3.
In a matrix there are rows and columns, thus we can view Eq. (11.3.3) from the row picture
or the column picture. In the row picture, each row is an equation, which is geometrically a
line in a 2D plane. There are two lines, Fig. 11.13-left, and they intersect at .2; 3/, which is the
solution of the system. And this solution is unique, as there is no other solutions.
ŽŽ
Historically it was the 19th-century English mathematician James Sylvester (1814 – 1897) who first coined the
term matrix, even though Chinese mathematicians knew about matrices from the 10th–2nd century BCE, written in
The Nine Chapters on the Mathematical Art.

Phu Nguyen, Monash University © Draft version


Chapter 11. Linear algebra 861

y y

 
−1
  3
1 +1
3
5

=1

x
+
 

y
2

−y
 

=
−1

5
+1 2  

2x
1 solution 1 2
2
1

1 2 3 x x

Figure 11.13: System of linear equations: row view (left) and column view (right).

In the column picture, we do not see two equations with scalar unknowns x and y, but we
see only one vector equation: " # " # " #
2 1 1
x Cy D (11.3.4)
1 1 5
And we are seeking for the right linear combination of the columns of the coefficient matrix to
get the RHS vector. In Fig. 11.13-right, we see that if we go along the first column two times its
length and then follow the second column three times the length, then we reach the RHS .1; 5/.
For this simple example in 2D, the row picture is easier to work with. However, for a system
of more than three unknowns such a geometric view does not exist.

No solution and many solutions. Using the row picture it is easy to see that Ax D b either: (i)
has a unique solution, (ii) has no solution and (iii) has many solutions. The following systems
have no solution and many solutions:
( (
2x y D 1 2x y D 1
; (11.3.5)
2x y D 2 4x 2y D 2

In the first system, the two lines are parallel and thus do not intersect. In the second system, the
second equation is just a multiple of the first; we have then just one equation and all the points
on the line of the first equation are the solutions, see Fig. 11.14.

Underdetermined versus overdetermined systems. A system of linear equations is consid-


ered underdetermined if there are fewer equations than unknowns. On the other hand, in an
overdetermined system, there are more equations than unknowns. For example,
2 3 2 3 2 3
2 3 x1 1 1 2 3 2 3 2 3
1 2 2 2 6 7 6 7 6 7 x1 3
6 7 6x2 7 627 62 4 6 76 7 6 7
42 4 6 8 5 6 7 D 6 7 (underdetermined); 6 7 4x2 5 D 445
4x3 5 455 42 6 85
3 6 8 10 x3 5
x4 7 2 8 10

Phu Nguyen, Monash University © Draft version


Chapter 11. Linear algebra 862

y y
2x yD 2
2x 1y D 1
2x yD1
4x 2y D 2

NO SOLUTION MANY SOLUTIONS

x x
0 0

Figure 11.14: Linear systems of equations: no solution versus infinitely many solutions.

The first matrix has more columns than rows–it is short and wide. The second matrix has more
rows than columns–it is thin and tall.

Elementary row operations. It is clear that we can perform some massages to a system of linear
equations without altering the solutions. For example, the system in Eq. (11.3.1) is equivalent to
the following ones:
( ( (
xCy D5 2.x C y/ D 10 xCy D5
” ”
2x y D 1 2x y D 1 3x D 6

in which the first system was obtained by swapping the two original equations; from it, the
second system obtained by multiplying the first equation by two and the third system by adding
the first equation to the second equation. Using the row picture, what we have done is called
elementary row operationsŽŽ . This is because the coefficients of the system are stored in the
coefficient matrix, and thus what done to the equations are done to the rows of this matrix. There
are only three types of elementary row operations:

 (Row Swap) Exchange any two rows.

 (Scalar Multiplication) Multiply any row by a constant.

 (Row Sum) Add a multiple of one row to another row.

The Gaussian elimination method, discussed in the next section, uses the elementary row opera-
tions to transform the system into a simpler form.

ŽŽ
We only mentioned about multiplying a row by a constant, but if the constant is 1=c, where c ¤ 0, then we
also cover division. Similarly, by adding a negative multiple of one row to another, we’re actually subtracting.

Phu Nguyen, Monash University © Draft version


Chapter 11. Linear algebra 863

11.3.1 Gaussian elimination method


To demonstrate the Gaussian elimination method, let’s consider the following system of three
unknowns and three equations:

2x1 C 4x2 2x3 D 2


4x1 C 9x2 3x3 D 8
2x1 3x2 C 7x3 D 10

which is re-written in this matrix form,


2 3 2 3 32
2 4 2 x1 2
6 7 6 7 6 7
Ax D b; A D 4 4 9 35 ; x D 4x2 5 ; bD485
2 3 7 x3 10

Once the elimination process–to be discussed shortly–has been done, we get a new form Ux D
c: 2 3 2 3
2 4 2 2
6 7 6 7
U D 40 1 1 5 ; c D 445
0 0 4 8
where U, a matrix of which all elements below the main diagonal are zeros, is called an upper
triangular matrix; the non-zero red terms form a triangle. All the pivots of this upper triangular
matrix are on the diagonal. Obviously solving Ux D c is super easy: back substitution. The last
row gives us 4x3 D 8 or x3 D 2, substituting that x3 into the 2nd row: x2 C x3 D 4 we get
x2 D 2. Finally substituting x3 ; x2 into the first row we get x1 D 1.
The elimination process brings A to U which is in a row echelon form (REF). A matrix is
said to be in row echelon form if all entries below the pivots are zero.
Now, I present the elimination process. We start with the elimination of x1 in the second row
(or equivalently the blue number 4); this is obtained by subtracting two times the first row from
the second row (the red number 2 is the first non-zero in the row that does the elimination, it is
called a pivot):

2x1 C 4x2 2x3 D 2 R2$R2 2R1 2x1 C 4x2 2x3 D 2


‚…„ƒ
4x1 C 9x2 3x3 D 8 H) 0x1 C 1x2 C 1x3 D 4
2x1 3x2 C 7x3 D 10 2x1 3x2 C 7x3 D 10

We observe that after this elimination step, only the second equation changes, highlighted by the
red terms. We continue to remove x1 in the third equation, or in other words, remove -2 below
the first zero in the second equation:

2x1 C 4x2 2x3 D 2 R3$R3CR1 2x1 C 4x2 2x3 D 2


‚…„ƒ
0x1 C 1x2 C 1x3 D 4 H) 0x1 C 1 x2 C 1x3 D 4
2x1 3x2 C 7x3 D 10 0x1 C 1x2 C 5x3 D 12

Phu Nguyen, Monash University © Draft version


Chapter 11. Linear algebra 864

Now, the 1st column has been finished, we move to the second column; and we want to remove
x2 in the third equation i.e., the one below the pivot on row 2:

2x1 C 4x2 2x3 D 2 R3$R3 R2 2x1 C 4x2 2x3 D 2


‚…„ƒ
0x1 C 1x2 C 1x3 D 4 H) 0x1 C 1 x2 C 1x3 D 4
0x1 C 1x2 C 5x3 D 12 0x1 C 0x2 C 4x3 D 8

11.3.2 The Gauss-Jordan elimination method

As the Gaussian elimination applies to both the coefficient matrix and the RHS vector, it is more
efficient to put the matrix and the vector into a so-called augmented matrix and carry out the
elimination:
2 3 2 3
h i 2 4 2 2 2 4 2 2
6 7 6 7
A b D4 4 9 3 8 5 H) 40 1 1 45
2 3 7 10 0 0 4 8

Gauss would finish here and do back substitution. JordanŽŽ continued with elimination until
the left block is the unit matrix: A becomes I. And the obtained form is called the reduced row
echelon form; it makes the back substitution super easy. A matrix is said to be in reduced row
echelon form (RREF) if all the entries below and above the pivots are zero. What we have to do
is to remove the red terms–making zeros above the pivots and making the pivots ones:

2 3 2 3 2 3 2 3
2 4 2 2 2 0 6 14 2 0 0 2 1 0 0 1
6 7 6 7 6 7 6 7
40 1 1 45 H) 40 1 1 4 5 H) 40 1 0 2 5 H) 40 1 0 25
0 0 4 8 0 0 4 8 0 0 1 2 0 0 1 2

The solution is now simply the right block, which is . 1; 2; 2/. Note that the columns in A
transformed to the three unit vectors .1; 0; 0/; .0; 1; 0/ and .0; 0; 1/ of R3 in the reduced row
echelon form.

ŽŽ
Wilhelm Jordan (1842 – 1899) was a German geodesist who conducted surveys in Germany and Africa. He
is remembered among mathematicians for the Gauss–Jordan elimination algorithm, with Jordan improving the
stability of the algorithm so it could be applied to minimizing the squared error in the sum of a series of surveying
observations. This algebraic technique appeared in the third edition (1888) of his Textbook of Geodesy.Wilhelm
Jordan is not to be confused with the French mathematician Camille Jordan (Jordan curve theorem), nor with the
German physicist Pascual Jordan (Jordan algebras).

Phu Nguyen, Monash University © Draft version


Chapter 11. Linear algebra 865

Is this solution making sense? We have three un-


knowns and three equations; each equation is then a
plane in R3 . The intersection of two such planes gives
a line, and a line intersects the remaining plane at a
single point (if it is not parallel to the plane): that is
Solution
intersection

the solution to the system. This system is similar to


Fig. 11.13-left; it is harder to plot three planes and
show their intersection. See the figure for an illus-
tration (noting that this is not for the system we’re
solving).

Many solutions: underdetermined systems. We consider the following underdetermined sys-


tem where there are more unknowns than equations:
2 3 2 3
x1 x2 x3 C 2x4 D 1 1 1 1 2 1 1 1 0 1 2
6 7 6 7
2x1 2x2 x3 C 3x4 D 3 H) 4 2 2 1 3 35 H) 4 0 0 1 1 15
1x1 C 1x2 x3 C 0x4 D 3 1 1 1 0 3 0 0 0 0 0
where to save space we have carried out the Gauss-Jordan elimination process in the final
step§ . Looking at the RREF, we have the third row full of zeros: it is meaningless because it is
equivalent to the equation 0 D 0. This indicates that the hyperplane 1x1 C1x2 x3 C0x4 D 3
is just a linear combination of the other hyperplanes. Indeed, the third row of A is equal to three
times the first row minus two times the second one.
Now, we have 4 unknowns but only 2 equations; there are so many freedom here. We say that
there are 4 2 D 2 free variables. And we also have two pivots (indicated by boxes in the above
equation). The columns containing the pivots are called the pivot columns; in this example, they
are the 1st and 3rd columns. They are of course the unit vectors .1; 0; 0/ and .0; 1; 0/ of R3 . The
other columns are called the non-pivot columns; they are the 2nd and 4th columns.
Now comes an important fact: the non-pivot columns can be written as linear combinations
of the pivot columns. Look at the first non-pivot column, it is the second column. Its nonzero
entries must be in the first entry (if not the case, then it would be a pivot column). Obviously, we
can write . 1; 0; 0/ D . 1/  .1; 0; 0/. The first non-pivot column is a linear combination of
the first pivot column. The second non-pivot column is .1; 1; 0/: it has the nonzero entries at
the first two slots, thus it is a linear combination of the first two unit vectors (or the 1st two pivot
columns): .1; 1; 0/ D .1/  .1; 0; 0/ C . 1/  .0; 1; 0/. To illustrate this point, let’s consider a
RREF for a 4  6 matrix with 3 pivots:
2 3
1 b12 0 b14 0 b16
6 7
60 0 1 b24 0 b26 7
RD6 7
40 0 0 0 1 b36 5
0 0 0 0 0 0
§
As I did not aim to practice the Gauss-Jordan method, I used Julia to do this for me. The aim was to see the
solution of the system.

Phu Nguyen, Monash University © Draft version


Chapter 11. Linear algebra 866

Another important fact: in the RREF the 4th col is the 1st col minus the third col, if not clear
check again Eq. (11.2.3). And we also have the same relation in A: check that the 4th col of A is
exactly the 1st col minus the third one. To explain why we need to consider Ax D 0 discussed
in Section 11.3.3ŽŽ .
It is a choice we made to select the variables associated with the non-pivot columns as the
free variables, and compute other variables, called the pivot variables, in terms of the free ones.
Thus, x2 ; x4 are the free variables and x1 ; x3 are the pivot variables. For the free variables we
can assign x2 D s and x4 D t, then
2 3 2 3 2 3 2 3
2Cs t 2 1 1
6 7 6 7 6 7 6 7
x1 x2 C x4 D 2 x1 D 2 C s t 6 s 7 607 617 607
H) H) x D 6 7 D 6 7Cs6 7Ct6 7
x3 x4 D 1 x3 D 1 C t 4 1Ct 5 415 405 415
t 0 0 1
(11.3.6)

This specific example tells us that the number of free variables equals the number of
unknowns minus the number of nonzero rows in the echelon form of A. Thus, we need to
introduce another number that characterizes the matrix better (for a matrix we have already two
numbers: the number of rows and cols): that is the concept of the rank of the matrix.

Definition 11.3.1
The rank of a matrix is the number of nonzero rows in its row echelon form (or its reduced
REF). It is also the number of pivots.

Theorem 11.3.1: The rank theorem


Let A be the coefficient matrix of a system of linear equations with n variables. If the system
is solvable (or consistent), then

number of free variables D n rank.A/

11.3.3 Homogeneous linear systems


Now the focus is on the solutions to Ax D 0. Such a system is called a homogeneous system.
The coefficient matrix A is of shape m  n which can be either rectangular or square. There
should be three questions to ask now

 Why Ax D 0 called a homogeneous system?

 And why we care about such systems?


ŽŽ
The short answer is that Ax D 0 is equivalent to Rx D 0.

Phu Nguyen, Monash University © Draft version


Chapter 11. Linear algebra 867

 What can be the solutions of such systems?

The answer to the first question is simple: if x  is a solution we have Ax  D 0, and thus
A.cx  / D 0 with c 2 R; in other words cx  is also a solution. And that’s why mathematicians
call Ax D 0 a homogeneous equation. If the RHS is not 0, then we get an inhomogeneous
system.
We focus on the third question for now. It is obvious that one possible solution is the zero
vector, which is called understandably the trivial solution. This is similar to the equation 5x D 0.
But for the equation 0x D 0, then there are infinitely many solutions. So, Ax D 0 either has
one unique solution which is the zero vector or has infinitely many solutions. From the previous
section, we know that only when we have free variables we have infinitely many solutions.

Theorem 11.3.2
If ŒAj0 is a homogeneous system of m linear equations with n unknowns, where m < n, then
the system has infinitely many solutions.

Proof. Note that the system is solvable, then we use the rank theorem to have

number of free variables D n rank.A/

Now comes the fact that rank.A/  mŽŽ , thus

number of free variables D n rank.A/  n m>0

which indicates that there is at least one free variable, and hence, infinitely many solutions.


11.3.4 Spanning sets of vectors and linear independence


One important operation in linear algebra is to consider a linear combination of a given set
P
of vectors. If S D fv1 ; v2 ; :::; vk g, then one linear combination of v1 ; v2 ; :::; vk is kiD1 ˛i vi .
More often that not, we’re interested in ALL the linear combinations of v1 ; v2 ; :::; vk . To this
end, the concept of a spanning set of vectors is introduced.

Definition 11.3.2
If S D fv1 ; v2 ; :::; vk g is a set of vectors in Rn , then the set of ALL linear combination of
v1 ; v2 ; :::; vk is called the span of v1 ; v2 ; : : : ; vk , and is denoted by span.v1 ; v2 ; : : : ; vk / or
span.S /. If span.S/ D Rn , then S is called a spanning set for Rn .

ŽŽ
Rank of A is the number of nonzero rows and we have maximum m rows.

Phu Nguyen, Monash University © Draft version


Chapter 11. Linear algebra 868

Example 11.1
Show that R2 D span.f.2; 1/; .1; 3/g/. What we need to prove is that, for an arbitrary vector
in R2 , namely .a; b/ it is possible to write it as a linear combination of f.2; 1/; .1; 3/g. That
is, the following system

2x C y D a
x C 3y D b

always has solution for all a; b. We can use the Gaussian elimination to solve this system and
see that it always has solution.

Example 11.2
Find the span.f.1; 0/; .0; 1/; .2; 3/g/. We simply use the definition to compute the span:
" # " # " #
1 0 2
span.f.1; 0/; .0; 1/; .2; 3/g/ D c1 C c2 C c3
0 1 3

What is interesting is that the third vector .2; 3/ is nothing new, it is a linear combination of
the first two, so the span can be written in terms of only the first two vectors:
" # " # " # " #! " # " #
1 0 1 0 1 0
span.f.1; 0/; .0; 1/; .2; 3/g/ D c1 C c2 C c3 2 C3 D˛ Cˇ
0 1 0 1 0 1

Linear independence. We have seen that in matrices, it is possible that some columns can be
written in terms of others. For example, we can have

a3 D 2a1 3a1

In this case, we say that the three columns or vectors are linear dependent. Noting that the
above writing is not symmetric, as a3 was received special treatment. Thus, mathematicians will
re-write the above relation as
2a1 3a1 a3 D 0
And with that we have the following definitions about linear independence/dependence of a set
of vectors.

Definition 11.3.3
A collection of k vectors u1 ; : : : ; uk is linear dependent if

˛1 u1 C ˛2 u2 C    C ˛k uk D 0

holds with at least one ˛k ¤ 0.

Phu Nguyen, Monash University © Draft version


Chapter 11. Linear algebra 869

Definition 11.3.4
A collection of k vectors u1 ; : : : ; uk is linear independent if it is not linear dependent. That is

˛1 u1 C ˛2 u2 C    C ˛k uk D 0 H) ˛i D 0 .i D 1; 2; : : : ; k/

In summary, a collection of n vectors is said to be linear dependent if we can express one


vector in terms of the n 1 remaining vectors. On the other hand, it is linear independent if none
of them can be written as a linear combination of others: they are independent.
Here is an important fact: Any set of vectors containing the zero vector is linearly dependent.
This is because we can always write: 10C0a2 C  C0ak D 0. Thus, according to our definition,
the set f0; v1 ; v2 ; : : : ; vk g is linear dependent.

Example 11.3
Determine whether the vectors f.1; 2; 0/; .1; 1; 1/; .1; 4; 2/g are linear independent. This is
equivalent to see if the following system
2 32 3 2 3
1 1 1 ˛1 0
6 76 7 6 7
42 1 45 4˛2 5 D 405
0 1 2 ˛3 0

has trivial solution (zero vector) or not. Using the Gauss elimination method, we get one zero
row, thus this system has infinitely many solutions, and one solution is not the zero vector.
Thus, the vectors are linear dependent.

It can be seen then that in a 2D plane, 3 (or more) vectors are surely linear dependent. This
can be intuitively explained: on a 2D plane, two directions (two vectors which are not parallel)
are sufficient to get us anywhere, so the third vector can be nothing new: it must be a combination
of the first two directions. Similarly, in a 3D space, any four vectors are linearly dependent. We
can state this fact as the following theorem

Theorem 11.3.3
Any set of m vectors in Rn is linearly dependent if m > n.

Proof. The proof is based on theorem 11.3.2, which tells us that a system of equation Ax D 0,
where A is a n  m matrix, has a nontrivial solution whenever n < m. Thus, we build A with
its columns are the set of m vectors in Rn . Because x ¤ 0, the columns of A are linearly
dependentŽ . 

Ž
Do not forget the column picture of Ax D b that x is the coefficients of the linear combination of A’s columns.


11.4 Matrix algebra


This section is about operations that can be performed on matrices and the rules that govern these
operations. Many rules are similar to the rules of vectors (which are similar to the arithmetic
rules of numbers). This is so because a vector is a list of numbers and a matrix is a list of vectors.
First, we start with a formal definition of what a matrix is.

Definition 11.4.1
A matrix is a rectangular array of numbers, called the entries or elements of the matrix.

The size of a matrix gives the number of rows and columns it has. An $m \times n$ matrix (pronounced "$m$ by $n$ matrix") has $m$ rows and $n$ columns:
$$A = \begin{bmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ A_{m1} & A_{m2} & \cdots & A_{mn} \end{bmatrix}, \quad \text{or} \quad A = \begin{bmatrix} A_{*1} & A_{*2} & \cdots & A_{*n} \end{bmatrix}$$
and we denote by $A_{ij}$ the entry at row $i$ and column $j$ of $A$. The columns of $A$ are vectors in $\mathbb{R}^m$ (i.e., they have $m$ components) and the rows of $A$ are vectors in $\mathbb{R}^n$. In the above, the columns of $A$ are $A_{*i}$, $i = 1, 2, \ldots, n$. When $m = n$ we have a square matrix. The most special square matrix is the identity matrix $I$, or $I_n$ to explicitly reveal the size, where all the entries on the diagonal are 1: $I_{ii} = 1$:
$$I = I_n := \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} = \begin{bmatrix} e_1 & e_2 & \cdots & e_n \end{bmatrix} \tag{11.4.1}$$

This matrix is called the identity matrix because $Ix = x$ for all $x$; it is the counterpart of the number one. As can be seen, $I$ consists of all the unit vectors of $\mathbb{R}^n$.

11.4.1 Matrix operations


We can do things with numbers and vectors; things such as addition and multiplication. It is no
surprise that we can add (and subtract) two matrices, we can multiply a scalar with a matrix, we
can multiply a matrix with a vector and finally we can multiply (and divide) two matrices.
Considering two $m \times n$ matrices $A$ and $B$, the sum of the two matrices is an $m \times n$ matrix defined as
$$(A + B)_{ij} = A_{ij} + B_{ij} \tag{11.4.2}$$


So, to add two matrices, add the entries. This is similar to adding vectors! Similarly, we can scale a matrix by a factor $c$ as
$$(cA)_{ij} = c A_{ij} \tag{11.4.3}$$
which is similar to scaling a vector.
The next operation is the multiplication of an $m \times n$ matrix $A$ with an $n$-vector $x$ (a vector in $\mathbb{R}^n$ is referred to as an $n$-vector). The result is a vector of length $m$ of which the $i$th entry is given by
$$(Ax)_i = \sum_{k=1}^{n} A_{ik} x_k \tag{11.4.4}$$

That is, the $i$th entry of the result vector is the dot product of the $i$th row of $A$ and $x$. This definition comes directly from the system $Ax = b$. Because the dot product has the distributive property $a \cdot (b + c) = a \cdot b + a \cdot c$, matrix-vector multiplication has the same property:
$$A(a + b) \overset{\text{def}}{=} \begin{bmatrix} \text{row 1 of } A \cdot (a + b) \\ \text{row 2 of } A \cdot (a + b) \\ \vdots \\ \text{row } m \text{ of } A \cdot (a + b) \end{bmatrix} = \begin{bmatrix} \text{row 1 of } A \cdot a + \text{row 1 of } A \cdot b \\ \text{row 2 of } A \cdot a + \text{row 2 of } A \cdot b \\ \vdots \\ \text{row } m \text{ of } A \cdot a + \text{row } m \text{ of } A \cdot b \end{bmatrix} = Aa + Ab$$

Now comes the harder matrix-matrix multiplication. One simple example for motivation: consider the following two linear systems:
$$\begin{aligned} x_1 + 2x_2 &= y_1 \\ 0x_1 + 3x_2 &= y_2 \end{aligned} \qquad ; \qquad \begin{aligned} y_1 - y_2 &= z_1 \\ 2y_1 + 0y_2 &= z_2 \end{aligned}$$
Now, we want to eliminate $y_1, y_2$ to have a system with unknowns $x_1, x_2$. This is simple: we can just substitute $y_1, y_2$ in the second system by the first system. The result is:
$$\begin{aligned} x_1 - x_2 &= z_1 \\ 2x_1 + 4x_2 &= z_2 \end{aligned} \tag{11.4.5}$$
Now, we do the same but using matrices:
$$\begin{bmatrix} 1 & -1 \\ 2 & 0 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} z_1 \\ z_2 \end{bmatrix}$$
Thus, the product of the two matrices in this equation must be another $2 \times 2$ matrix, and this matrix must be, because we know the result from Eq. (11.4.5),
$$\begin{bmatrix} 1 & -1 \\ 2 & 0 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix} = \begin{bmatrix} 1 & -1 \\ 2 & 4 \end{bmatrix}$$


This result can be obtained if we first multiply the left matrix on the LHS with the first column
of the right matrix (red colored), and we get the first column of the RHS matrix. Doing the
same we get the second column. And with that, we now can define the rule for matrix-matrix
multiplication. Assume that A is an m  n matrix and B is an n  p matrix, then the product
$AB$ is an $m \times p$ matrix of which the $ij$ entry is:
$$(AB)_{ij} = \sum_{k=1}^{n} A_{ik} B_{kj} \tag{11.4.6}$$

In words: the entry at row i and column j of the product AB is the dot product of row i of A
and column j of BŽŽ . And we understand why for matrix-matrix multiplication the number of
columns in the first matrix must be equal to the number of rows in the second matrix.
It must be emphasized that the above definition of matrix-matrix multiplication is not the only
way to look at this multiplication. In Section 11.4.4 other ways are discussed. This definition is
used for the actual computation of the matrix-matrix product, but it does not tell much what is
going on.
Remark: Of course you can define matrix-matrix multiplication in a different way; in the process you would create another branch of algebra. However, the presented definition is compatible with matrix-vector multiplication. Thus, it inherits many nice properties, as we shall discuss shortly.
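As a quick sanity check of the motivating example, the same product can be computed in a couple of lines of Julia:

# Coefficient matrices of the two systems above
B = [1 -1;
     2  0]     # relates (y1, y2) to (z1, z2)
A = [1  2;
     0  3]     # relates (x1, x2) to (y1, y2)

B * A          # gives [1 -1; 2 4], the matrix of the eliminated system (11.4.5)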

11.4.2 The laws for matrix operations


Let's denote by $A, B$ and $C$ three matrices (of appropriate shapes) and by $\alpha$ a real number. We obtain the following laws for matrix operations, which are exactly identical to the arithmetic rules of real numbers (except the broken $AB \neq BA$):

(a): commutative law $A + B = B + A$
(b): distributive law $\alpha(A + B) = \alpha A + \alpha B$
(c): associative law $A + (B + C) = (A + B) + C$
(d): associative law for $ABC$: $A(BC) = (AB)C$
(e): distributive law (left) $A(B + C) = AB + AC$
(f): distributive law (right) $(A + B)C = AC + BC$
(g): broken commutative law (multiplication) $AB \neq BA$
(11.4.7)

Certainly mathematicians ask for proofs. Proving the first three laws is straightforward. This
is not unexpected as these laws are exactly identical to the laws for vector addition and scalar
multiplication. If we want we can think of a matrix as a ’long’ vector (and this is actually how
computers store matrices).
ŽŽ
Thus matrix-matrix multiplication is not actually something entirely new.


For the distributive law from the left: we consider one column of $A(B + C)$; it is $A(b_i + c_i)$, which is $Ab_i + Ac_i$ (due to the linearity of matrix-vector multiplication).
After multiplication come powers, so we now define powers of a matrix. With $p$ a positive integer, the $p$th power of a square matrix $A$ is defined as
$$A^p := \underbrace{AA\cdots A}_{p \text{ factors}}$$
And the usual laws of exponents, e.g. $2^m \cdot 2^n = 2^{m+n}$, hold for matrix powers:
$$(A^p)(A^q) = A^{p+q}, \qquad (A^p)^q = A^{pq}$$
Similar to $2^0 = 1$, when $p = 0$ we have $A^0 = I$, the identity matrix.

11.4.3 Transpose of a matrix


Considering two (column) vectors $x = (x, y)$ and $a = (a, b)$ in $\mathbb{R}^2$, our problem is how to write the dot product $x \cdot a = ax + by$ using matrix notation. We cannot use $xa$ because matrix multiplication requires that the shapes of the matrices be compatible. We can solve the problem if we turn the column vector $x$ into a row vector, then we're done:
$$x \cdot a = ax + by = \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = x^\top a \; (= a^\top x) \tag{11.4.8}$$

where the notation x > is to denote the transpose of x. It turns a column vector into a row vector.
As a matrix can be seen as a collection of some column vectors, we can also transpose a matrix.
With two vectors we can multiply them to get a number with the above dot product. A question should arise: is it possible to get a matrix from the product of two vectors? The answer is yes:
$$a = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \quad b = \begin{bmatrix} 3 \\ 4 \end{bmatrix} \implies ab^\top = \begin{bmatrix} 1 \\ 2 \end{bmatrix} \begin{bmatrix} 3 & 4 \end{bmatrix} = \begin{bmatrix} 3 & 4 \\ 6 & 8 \end{bmatrix}$$
So, a vector $a$ of length $m$ with a vector $b$ of length $n$ via the outer product $ab^\top$ yields an $m \times n$ matrix.
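In Julia the inner and outer products differ only in where the transpose sits; a minimal sketch:

a = [1, 2]     # column vector of length 2
b = [3, 4]     # column vector of length 2

a' * b         # inner (dot) product: the scalar 11
a * b'         # outer product: the 2x2 matrix [3 4; 6 8]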

Definition 11.4.2
The transpose of an m  n matrix A is the n  m matrix A> obtained by interchanging the
rows and columns of A. That is the i th column of A> is the i th row of A.

One example suffices to clarify the definition:
$$A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 6 \end{bmatrix} \implies A^\top = \begin{bmatrix} 1 & 2 \\ 2 & 4 \\ 3 & 6 \end{bmatrix}$$


With the introduction of the transpose operator, we can define a symmetric matrix as:

Definition 11.4.3
A square matrix $A$ of size $n \times n$ is symmetric if it is equal to its transpose, i.e. $A_{ij} = A_{ji}$.

Obviously transpose is an operator or a function, and thus it obeys certain rules. Here are
some basic rules regarding the transpose operator for matrices:

$$\begin{aligned} \text{(a)} &\quad (A^\top)^\top = A \\ \text{(b)} &\quad (A + B)^\top = A^\top + B^\top \\ \text{(c)} &\quad (kA)^\top = kA^\top \\ \text{(d)} &\quad (AB)^\top = B^\top A^\top \end{aligned} \tag{11.4.9}$$

Recall that in Section 4.2.1, we have seen that it is possible to decompose any function into an even function and an odd function:
$$f(x) = \frac{1}{2}\left[f(x) + f(-x)\right] + \frac{1}{2}\left[f(x) - f(-x)\right]$$
in which the first term is an even function, i.e., $g(-x) = g(x)$, and the second is an odd function, i.e., $g(-x) = -g(x)$ (see Section 4.2.1). And we have applied this decomposition to the exponential function $y = e^x$: we had $e^x = \frac{1}{2}[e^x + e^{-x}] + \frac{1}{2}[e^x - e^{-x}]$, which led to the definition of the hyperbolic cosine and sine functions. Now, we do the same thing for square matrices. Given a square matrix $A$, we can write
$$A = \frac{1}{2}(A + A^\top) + \frac{1}{2}(A - A^\top)$$
and applying that decomposition to the following matrix,
$$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} = \frac{1}{2}\begin{bmatrix} 2 & 6 & 10 \\ 6 & 10 & 14 \\ 10 & 14 & 18 \end{bmatrix} + \frac{1}{2}\begin{bmatrix} 0 & -2 & -4 \\ 2 & 0 & -2 \\ 4 & 2 & 0 \end{bmatrix}$$
we get a symmetric matrix (the first matrix) and a skew-symmetric matrix (the second matrix). A skew-symmetric matrix $A$ is a square matrix with the property $A^\top = -A$.
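The decomposition takes one line per part in Julia; a quick sketch with the matrix above:

A = [1 2 3;
     4 5 6;
     7 8 9]

S = (A + A') / 2    # symmetric part:      S' == S
W = (A - A') / 2    # skew-symmetric part: W' == -W

A == S + W          # true: the two parts add back to A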

11.4.4 Partitioned matrices


Considering two $3 \times 3$ matrices $A$ and $B$, their product $AB$ is given by
$$AB = \begin{bmatrix} (\text{row 1 of } A)\cdot(\text{col 1 of } B) & (\text{row 1 of } A)\cdot(\text{col 2 of } B) & (\text{row 1 of } A)\cdot(\text{col 3 of } B) \\ (\text{row 2 of } A)\cdot(\text{col 1 of } B) & (\text{row 2 of } A)\cdot(\text{col 2 of } B) & (\text{row 2 of } A)\cdot(\text{col 3 of } B) \\ (\text{row 3 of } A)\cdot(\text{col 1 of } B) & (\text{row 3 of } A)\cdot(\text{col 2 of } B) & (\text{row 3 of } A)\cdot(\text{col 3 of } B) \end{bmatrix}$$


Thus, we can split $B$ into three columns $B_{*1}, B_{*2}, B_{*3}$, and $AB$ is equal to the product of $A$ with each column, with the results put together:
$$AB = A\begin{bmatrix} B_{*1} & B_{*2} & B_{*3} \end{bmatrix} = \begin{bmatrix} AB_{*1} & AB_{*2} & AB_{*3} \end{bmatrix}$$
The form on the right is called the matrix-column representation of the product. What does this representation tell us? It tells us that the columns of $AB$ are linear combinations of the columns of $A$ (e.g. $AB_{*1}$ is a linear combination of the columns of $A$, from the definition of matrix-vector multiplication). It follows that any linear combination of the columns of $AB$ is just a linear combination of the columns of $A$. Later on, this results in $\mathrm{rank}(AB) \le \mathrm{rank}(A)$.
And nothing stops us from partitioning matrix $A$ instead, but we have to split it by rows:
$$AB = \begin{bmatrix} A_{1*} \\ A_{2*} \\ A_{3*} \end{bmatrix} B = \begin{bmatrix} A_{1*}B \\ A_{2*}B \\ A_{3*}B \end{bmatrix}$$
And this is called the row-matrix representation of the product.
It is also possible to partition both matrices, and we obtain the column-row representation of the product:
$$AB = \begin{bmatrix} A_{*1} & A_{*2} & A_{*3} \end{bmatrix} \begin{bmatrix} B_{1*} \\ B_{2*} \\ B_{3*} \end{bmatrix} = \underbrace{A_{*1}B_{1*} + A_{*2}B_{2*} + A_{*3}B_{3*}}_{\text{sum of rank-1 matrices}}$$
This reminds us of the dot product, but the individual terms are matrices, not scalars, because $A_{*1}B_{1*}$ is an outer product. For example, $A_{*1}B_{1*}$ is a $3 \times 3$ matrix as $A_{*1}$ is a $3 \times 1$ matrix and $B_{1*}$ is a $1 \times 3$ matrix:
$$A_{*1}B_{1*} = \begin{bmatrix} A_{11} \\ A_{21} \\ A_{31} \end{bmatrix} \begin{bmatrix} B_{11} & B_{12} & B_{13} \end{bmatrix} = \begin{bmatrix} A_{11}B_{11} & A_{11}B_{12} & A_{11}B_{13} \\ A_{21}B_{11} & A_{21}B_{12} & A_{21}B_{13} \\ A_{31}B_{11} & A_{31}B_{12} & A_{31}B_{13} \end{bmatrix}$$
Matrices like $A_{*1}B_{1*}$ are called rank-1 matrices because their rank is one; this is because the rank of either $A_{*1}$ or $B_{1*}$ is one†.
Each of the foregoing partitions is a special case of partitioning a matrix in general. A matrix is said to be partitioned if horizontal and vertical lines are introduced, subdividing it into submatrices or blocks. Partitioning a matrix allows it to be written as a matrix whose entries are its blocks (which are matrices themselves). For example,
$$A = \left[\begin{array}{ccc|cc} 1 & 0 & 0 & 2 & 1 \\ 0 & 1 & 0 & 1 & 3 \\ 0 & 0 & 1 & 4 & 0 \\ \hline 0 & 0 & 0 & 1 & 6 \\ 0 & 0 & 0 & 7 & 1 \end{array}\right] = \begin{bmatrix} I & A_{12} \\ 0 & A_{22} \end{bmatrix}, \qquad B = \left[\begin{array}{cc|cc|c} 4 & 3 & 1 & 2 & 1 \\ 1 & 2 & 2 & 1 & 1 \\ 1 & 5 & 3 & 3 & 1 \\ \hline 1 & 0 & 0 & 0 & 2 \\ 0 & 1 & 0 & 0 & 3 \end{array}\right] = \begin{bmatrix} B_{11} & B_{12} & B_{13} \\ I & 0 & B_{23} \end{bmatrix}$$
† This comes from the fact that $\mathrm{rank}(AB) \le \min(\mathrm{rank}(A), \mathrm{rank}(B))$. Another way to see it: every column of $A_{*1}B_{1*}$ is a multiple of $A_{*1}$, so the column space is just the line in the direction of $A_{*1}$, and a line has rank 1.


where $A$ has been partitioned as a $2 \times 2$ block matrix and $B$ as a $2 \times 3$ block matrix. (Note that $I$ was used to denote the identity matrix but the size of $I$ varies; similarly for $0$.) With these partitions, the product $AB$ can be computed blockwise as if the entries were numbers:
$$AB = \begin{bmatrix} I & A_{12} \\ 0 & A_{22} \end{bmatrix} \begin{bmatrix} B_{11} & B_{12} & B_{13} \\ I & 0 & B_{23} \end{bmatrix} = \begin{bmatrix} B_{11} + A_{12} & B_{12} & B_{13} + A_{12}B_{23} \\ A_{22} & 0 & A_{22}B_{23} \end{bmatrix}$$

Using Julia it is quick to check that the usual way to compute AB gives the same result as the
way using partitioned matrices.
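A minimal Julia sketch of that check, using the printed digits for the block entries:

using LinearAlgebra

# Blocks, with the entries as printed in the example above
A12 = [2 1; 1 3; 4 0];    A22 = [1 6; 7 1]
B11 = [4 3; 1 2; 1 5];    B12 = [1 2; 2 1; 3 3]
B13 = [1, 1, 1];          B23 = [2, 3]
I3  = Matrix{Int}(I, 3, 3);  I2 = Matrix{Int}(I, 2, 2)

# Assemble the full 5x5 matrices from their blocks
A = [I3              A12;
     zeros(Int,2,3)  A22]
B = [B11  B12             B13;
     I2   zeros(Int,2,2)  B23]

# Blockwise formula, treating the blocks as if they were numbers
AB_blocks = [B11 + A12   B12              B13 + A12*B23;
             A22         zeros(Int,2,2)   A22*B23]

A * B == AB_blocks    # true: both ways give the same matrix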

11.4.5 Inverse of a matrix


Inverse matrices are related to inverse functions. Recall that if we start with an angle $x$ and press the sin button on a calculator we get $y = \sin x$. To get back to where we started, i.e., $x$, we press the inverse function arcsin and we have $\arcsin y = x$. Now, we have a square matrix $A$ and a vector $x$; when $A$ acts on $x$ we get a new vector $b$. To get back to $x$, do:
$$Ax = b \implies \underbrace{A^{-1}A}_{I}x = A^{-1}b \implies x = A^{-1}b \tag{11.4.10}$$
The matrix $A^{-1}$ is called the left inverse matrix of $A$. There exists the right inverse matrix of $A$ as well: it is defined by $AA^{-1} = I$. If a matrix is invertible, then its inverse, $A^{-1}$, is the matrix that inverts $A$:
$$A^{-1}A = I \quad \text{and} \quad AA^{-1} = I \tag{11.4.11}$$
Property 1. If a matrix is invertible, its inverse is unique.

Property 2. The inverse of the product $AB$ is the product of the inverses, but in reverse order:
$$(AB)^{-1} = B^{-1}A^{-1}$$
Even though this is natural (this property is sometimes called the socks-and-shoes rule: you put on the socks and then the shoes; to undo this, you take off the shoes first, then remove the socks), an algebraic proof goes:
$$(AB)^{-1}(AB) = (B^{-1}A^{-1})(AB) = B^{-1}(A^{-1}A)B = B^{-1}IB = B^{-1}B = I$$
And of course, starting with this property for two matrices, we can develop the rule for three matrices,
$$(ABC)^{-1} = C^{-1}B^{-1}A^{-1}$$
and then for any number of matrices that we want.

Property 3. If $A$ is invertible then $A^{-1}$ is invertible and its inverse is $A$. That is, $(A^{-1})^{-1} = A$.

Property 4. If $A$ is invertible then $A^\top$ is invertible and $(A^\top)^{-1} = (A^{-1})^\top$. (Proof: $A^{-1}A = I$, thus $A^\top(A^{-1})^\top = I^\top = I$.)


Property 5. If $A$ is invertible then $A^n$ is invertible for all nonnegative integers $n$ and $(A^n)^{-1} = (A^{-1})^n$. (Proof for $n = 2$: $A^2(A^{-1})^2 = AAA^{-1}A^{-1} = AIA^{-1} = AA^{-1} = I$.)

Elementary matrices. We are going to use matrix multiplication to describe the Gaussian elimination method used in solving $Ax = b$. The key idea is that each elimination step corresponds to the multiplication of an elimination matrix $E$ with the augmented matrix.
We reuse the example in Section 11.3.1. We are seeking a matrix $E$ that expresses the process of subtracting two times the first equation from the second equation. To find that matrix, look at the RHS vector: we start with $(2, 8, 10)$ and we need to get $(2, 4, 10)$ after the elimination step; the identity matrix (for which $Ib = b$) almost does the job:
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ 8 \\ 10 \end{bmatrix} = \begin{bmatrix} 2 \\ 8 \\ 10 \end{bmatrix}$$
We need to change this matrix slightly as follows, and we get what we were after:
$$\begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ 8 \\ 10 \end{bmatrix} = \begin{bmatrix} 2 \\ 4 \\ 10 \end{bmatrix}$$
Thus, starting from the identity matrix $I$, the elimination matrix $E_{21}$ is $I$ with the extra non-zero entry $-2$ in the $(2, 1)$ position. How to get that $-2$ from $I$? Replace the second row (of $I$) by the second row minus two times the first row. But that is exactly what we wanted to do to $b$!
Multiplying $E_{21}$ with $A$ has the same effect:
$$\begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 2 & 4 & -2 \\ 4 & 9 & -3 \\ -2 & -3 & 7 \end{bmatrix} = \begin{bmatrix} 2 & 4 & -2 \\ 0 & 1 & 1 \\ -2 & -3 & 7 \end{bmatrix}$$

Definition 11.4.4
An elementary matrix is a matrix that can be obtained from the identity matrix by one single elementary row operation. Multiplying a matrix $A$ by an elementary matrix $E$ (on the left) causes $A$ to undergo the elementary row operation represented by $E$. This can be expressed by symbols, where $R$ denotes a row operation:
$$A' = R(A) \iff A' = E_R A \tag{11.4.12}$$

Now, as the row operation affects the matrix A and the RHS vector b altogether, we can
put the coefficient matrix A and the RHS vector b side-by-side to get the so-called augmented



matrix, and we apply the elimination operation to this augmented matrix by left-multiplying it with $E_{21}$:
$$E_{21}\begin{bmatrix} A & b \end{bmatrix} = \begin{bmatrix} E_{21}A & E_{21}b \end{bmatrix} = \begin{bmatrix} 2 & 4 & -2 & 2 \\ 0 & 1 & 1 & 4 \\ -2 & -3 & 7 & 10 \end{bmatrix} \tag{11.4.13}$$
To proceed, we want to eliminate the $-2$ using the pivot 2. The row operation is: replace row 3 by row 3 + row 1, and that can be achieved with the matrix $E_{31}$ as follows (obtained from $I$ by replacing its row 3 by row 3 + row 1)
$$E_{31} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}$$
to remove $x_1$ in the third equation. Together, the two elimination steps can be expressed as:
$$E_{31}E_{21}\begin{bmatrix} A & b \end{bmatrix} = \begin{bmatrix} E_{31}E_{21}A & E_{31}E_{21}b \end{bmatrix} = \begin{bmatrix} 2 & 4 & -2 & 2 \\ 0 & 1 & 1 & 4 \\ 0 & 1 & 5 & 12 \end{bmatrix}$$
Finally, we use $E_{32}$ as follows (we want to remove the remaining 1, i.e. $x_2$, in row 3, and that is obtained by replacing row 3 with row 3 minus row 2)
$$E_{32} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix}$$
Altogether, the three elimination steps can be expressed as:
$$E_{32}E_{31}E_{21}\begin{bmatrix} A & b \end{bmatrix} = \begin{bmatrix} E_{32}E_{31}E_{21}A & E_{32}E_{31}E_{21}b \end{bmatrix} = \begin{bmatrix} 2 & 4 & -2 & 2 \\ 0 & 1 & 1 & 4 \\ 0 & 0 & 4 & 8 \end{bmatrix} \tag{11.4.14}$$
And we have obtained the same matrix U that we got before. Notice the pivots along the diagonal.

The inverse of an elementary matrix. The inverse of an elementary matrix $E$ is also an elementary matrix: it undoes the row operation that $E$ has done. For example,
$$E_{32} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix} \implies (E_{32})^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}$$

Finding the inverse: Gauss-Jordan elimination method. We have an invertible matrix and
we want to find its inverse. To illustrate the method, let’s consider a 3  3 matrix A. We know


that its inverse $A^{-1}$ is a $3 \times 3$ matrix such that $AA^{-1} = I$. Let's denote by $x_1, x_2, x_3$ the three columns of $A^{-1}$. We're looking for these columns. The equation $AA^{-1} = I$ is equivalent to three systems of linear equations, one for each column:
$$Ax_i = e_i, \quad i = 1, 2, 3 \tag{11.4.15}$$
where the $e_i$ are the unit vectors.
We know how to solve a system of linear equations using the Gaussian elimination method. The idea of the Gauss-Jordan elimination method is to solve Eq. (11.4.15) all at once. So, the augmented matrix is $[A \,|\, e_1 \; e_2 \; e_3]$ or $[A \,|\, I]$, and we perform the usual row operations on it. Let's consider a concrete matrix with 2's on the diagonal and $-1$'s next to the 2's; the augmented matrix is
$$\left[\begin{array}{ccc|ccc} 2 & -1 & 0 & 1 & 0 & 0 \\ -1 & 2 & -1 & 0 & 1 & 0 \\ 0 & -1 & 2 & 0 & 0 & 1 \end{array}\right]$$
The Gaussian elimination steps are:
$$\Longrightarrow \left[\begin{array}{ccc|ccc} 2 & -1 & 0 & 1 & 0 & 0 \\ 0 & 3/2 & -1 & 1/2 & 1 & 0 \\ 0 & -1 & 2 & 0 & 0 & 1 \end{array}\right] \quad (1/2 \text{ row 1 + row 2})$$
$$\Longrightarrow \left[\begin{array}{ccc|ccc} 2 & -1 & 0 & 1 & 0 & 0 \\ 0 & 3/2 & -1 & 1/2 & 1 & 0 \\ 0 & 0 & 4/3 & 1/3 & 2/3 & 1 \end{array}\right] \quad (2/3 \text{ row 2 + row 3})$$
What we have to do next is to make zeros above the pivots:
$$\Longrightarrow \left[\begin{array}{ccc|ccc} 2 & -1 & 0 & 1 & 0 & 0 \\ 0 & 3/2 & 0 & 3/4 & 3/2 & 3/4 \\ 0 & 0 & 4/3 & 1/3 & 2/3 & 1 \end{array}\right] \quad (3/4 \text{ row 3 + row 2})$$
$$\Longrightarrow \left[\begin{array}{ccc|ccc} 2 & 0 & 0 & 3/2 & 1 & 1/2 \\ 0 & 3/2 & 0 & 3/4 & 3/2 & 3/4 \\ 0 & 0 & 4/3 & 1/3 & 2/3 & 1 \end{array}\right] \quad (2/3 \text{ row 2 + row 1})$$
$$\Longrightarrow \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 3/4 & 1/2 & 1/4 \\ 0 & 1 & 0 & 1/2 & 1 & 1/2 \\ 0 & 0 & 1 & 1/4 & 1/2 & 3/4 \end{array}\right] \quad (\text{making the pivot of each row equal 1})$$

Now, the three columns of $A^{-1}$ are in the second half of the above matrix (the 1st column after the vertical bar is $x_1$, the first column of the inverse of $A$, and so on). Thus,
$$\begin{bmatrix} A & I \end{bmatrix} \Longrightarrow \begin{bmatrix} I & A^{-1} \end{bmatrix}$$


which can be written as
$$\left(R_k \cdots R_2 R_1\right)\begin{bmatrix} A & I \end{bmatrix} = \begin{bmatrix} I & A^{-1} \end{bmatrix}$$
And each row operation corresponds to an elementary matrix, so the above can also be written as
$$E_k \cdots E_2 E_1 \begin{bmatrix} A & I \end{bmatrix} = \begin{bmatrix} I & A^{-1} \end{bmatrix}$$
From that, we obtain
$$E_k \cdots E_2 E_1 A = I, \qquad E_k \cdots E_2 E_1 = A^{-1}$$
Taking the inverse of the second equation and using the rule $(AB)^{-1} = B^{-1}A^{-1}$, we can express $A$ as a product of the inverses of the $E_i$:
$$A = E_1^{-1} E_2^{-1} \cdots E_k^{-1} \tag{11.4.16}$$
As the inverse of an elementary matrix is also an elementary matrix, this tells us that every invertible matrix can be decomposed as the product of elementary matrices.
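The Gauss-Jordan computation above is easy to reproduce in Julia; a sketch using the rref function from the RowEchelon package:

using LinearAlgebra
using RowEchelon                 # provides rref()

# The 3x3 matrix with 2's on the diagonal and -1's next to them
A = [ 2.0 -1.0  0.0;
     -1.0  2.0 -1.0;
      0.0 -1.0  2.0]

# Gauss-Jordan: row-reduce the augmented matrix [A | I]
R = rref([A Matrix{Float64}(I, 3, 3)])

Ainv = R[:, 4:6]                 # the right half is the inverse of A
Ainv ≈ inv(A)                    # true (up to round-off)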

11.4.6 LU decomposition/factorization
This section shows that the Gaussian elimination process results in a factorization of the matrix
A into two matrices: one lower triangular matrix L and the familiar upper triangular matrix U
that we have met. Recall Eq. (11.4.14):
$$E_{32}E_{31}E_{21}\begin{bmatrix} A & b \end{bmatrix} = \begin{bmatrix} E_{32}E_{31}E_{21}A & E_{32}E_{31}E_{21}b \end{bmatrix} = \begin{bmatrix} 2 & 4 & -2 & 2 \\ 0 & 1 & 1 & 4 \\ 0 & 0 & 4 & 8 \end{bmatrix}$$

From which, we can write
$$E_{32}E_{31}E_{21}A = U \implies A = (E_{32}E_{31}E_{21})^{-1}U = E_{21}^{-1}E_{31}^{-1}E_{32}^{-1}U$$
From Property 3 of the matrix inverse, we know the inverse matrices $E_{21}^{-1}, E_{31}^{-1}, E_{32}^{-1}$: they are all lower triangular matrices with 1's on the diagonal. Therefore, we get a lower triangular matrix as their product. Thus, we have decomposed $A$ into two matrices:
$$A = \underbrace{\begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -1 & 1 & 1 \end{bmatrix}}_{L} \underbrace{\begin{bmatrix} 2 & 4 & -2 \\ 0 & 1 & 1 \\ 0 & 0 & 4 \end{bmatrix}}_{U}$$
just similar to how we can decompose a number, e.g. $12 = 2 \times 6$. And this is always a good thing: dealing with 2 and 6 is much easier than with 12. $L$ and $U$ contain many zeros.


What are the benefits of this decomposition? It is useful because we replace $Ax = b$ by two problems with triangular matrices:
$$Ax = b \iff LUx = b \iff \begin{cases} Ly = b \\ Ux = y \end{cases} \tag{11.4.17}$$
in which we first solve for $y$, then solve for $x$. Using the LU decomposition to solve $Ax = b$ is faster than the Gaussian elimination method when we have a constant matrix $A$ but many different RHS vectors $b_1, b_2, \ldots$ This is because we just need to factor $A$ into $LU$ once.
Another benefit of the LU decomposition is that it allows us to compute the determinant of a matrix as the product of the pivots of the $U$ matrix:
$$\det(A) = \det(LU) = \det(L)\det(U) = 1 \times \prod_i u_{ii} \tag{11.4.18}$$
where $u_{ii}$ are the entries on the diagonal of $U$ (the pivots). There is more to say about determinants in Section 11.9.
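In Julia the factorization and the two triangular solves of Eq. (11.4.17) might look as follows. Note that Julia's lu performs partial pivoting (it factors PA = LU), so its factors can differ from the ones computed by hand above, but the two-triangular-solve idea is the same:

using LinearAlgebra

A = [ 2  4 -2;
      4  9 -3;
     -2 -3  7]
b = [2, 8, 10]

F = lu(A)              # LU factorization with partial pivoting: P*A = L*U
y = F.L \ (F.P * b)    # forward substitution  (solve L y = P b)
x = F.U \ y            # backward substitution (solve U x = y)

x ≈ A \ b              # true: same solution as solving Ax = b directly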

11.5 Subspaces, basis, dimension and rank


Subspaces. Inside a vector space there might be a subspace, that is, a smaller set of vectors but big enough to be a space of itself. One example demonstrates the idea. Note that a plane passing through the origin can be expressed as a linear combination of two (direction) vectors: $P: x = us + tv$, where $s, v \in \mathbb{R}^3$ and $u, t \in \mathbb{R}$. Now, considering two vectors $x_1$ and $x_2$ lying on this plane, we can write
$$\begin{aligned} x_1 &= u_1 s + t_1 v \\ x_2 &= u_2 s + t_2 v \end{aligned} \implies \begin{aligned} x_1 + x_2 &= (u_1 + u_2)s + (t_1 + t_2)v \in P \\ \alpha x_1 &= \alpha u_1 s + \alpha t_1 v \in P \end{aligned}$$
This indicates that if we take two vectors on this plane, their sum is also on this plane, and the product of one vector with a real number is also on the plane. We say that the plane going through the origin $(0, 0, 0)$ is a subspace of $\mathbb{R}^3$. And this example leads to the following definition of a subspace.

Definition 11.5.1
A subspace of $\mathbb{R}^n$ is a set of vectors in $\mathbb{R}^n$ that satisfies two requirements: if $u$ and $v$ are two vectors in the subspace and $\alpha$ is a scalar, then
(i) $u + v$ is in the subspace; (ii) $\alpha u$ is in the subspace.

We can combine the two requirements into one: $\alpha u$ is in the subspace and $\beta v$ is in the subspace (from requirement (ii)), then $\alpha u + \beta v$ is also in the subspace (requirement (i)). And that means that any linear combination of $u$ and $v$ is in the subspace:
$$\text{if } u \text{ and } v \text{ are in the subspace, then } \alpha u + \beta v \text{ is in the subspace} \tag{11.5.1}$$


If we take $\alpha = 0$, then $\alpha u = \boldsymbol{0}$, which is in the subspace: a subspace must contain the zero vector. Back to the example of a plane in $\mathbb{R}^3$: the plane went through $(0, 0, 0)$.
The plane $P: x = us + tv$ is a subspace. This leads us to think that, given a set of vectors $v_1, v_2, \ldots, v_k$ in $\mathbb{R}^n$, the set of all linear combinations of them is a subspace of $\mathbb{R}^n$. And that is true:
$$\begin{cases} u = \sum_i \alpha_i v_i \\ v = \sum_i \beta_i v_i \end{cases} \implies \begin{cases} u + v = \sum_i (\alpha_i + \beta_i) v_i \\ \lambda u = \sum_i \lambda\alpha_i v_i \end{cases}$$

This gives us the following theorem (check definition 11.3.2 for what a span is)

Theorem 11.5.1: Span is a subspace


Let v1 ; v2 ; : : : ; vk be vectors in Rn . Then span.v1 ; v2 ; : : : ; vk / is a subspace of Rn .

And this theorem leads to the following subspaces of matrices: column space, row space,
nullspace.

Subspaces associated with matrices. We know that solving Ax D b is to find the linear
combination of the columns of A with the coefficients being the components of vector x so that
this combination is exactly b. And this leads naturally to the concept of the column space of
a matrix. And why not row space. And there are more. We put all these subspaces related to a
matrix in the following definition.

Definition 11.5.2
Let A be an m  n matrix.

(a) The row space of A is the subspace R.A/ of Rn spanned by the rows of A.

(b) The column space of A is the subspace C.A/ of Rm spanned by the columns of A.

(c) The null space of A is the subspace N.A/ of Rn that contains all the solutions to
Ax D 0.

With this definition, we can deduce that Ax D b is solvable if and only if b is in the column
space of A. Therefore, C.A/ describes all the attainable right hand side vectors b.

Basis. A plane through $(0, 0, 0)$ in $\mathbb{R}^3$ is spanned by two linearly independent vectors. Fewer than two independent vectors will not work; more than two is not necessary (e.g. with three vectors in $\mathbb{R}^3$, assuming the third vector is a combination of the first two, a linear combination of these three vectors is essentially a combination of the first two vectors). We just need the smallest number of independent vectors.


Definition 11.5.3
A basis for a subspace S of Rn is a set of vectors in S that

(a) spans S , and

(b) is linearly independent

The first requirement makes sure that a sufficient number of vectors is included in a basis;
and the second requirement ensures that a basis contains a minimum number of vectors that
spans the subspace. We do not need more than that.
It is easy to see that the following sets of vectors are bases for $\mathbb{R}^2$ (because they span $\mathbb{R}^2$ and they are linearly independent):
$$\left(\begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix}\right), \qquad \left(\begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \end{bmatrix}\right)$$

Even though R2 has many bases, these bases all have the same number of vectors (2). And this
is true for any subspace by the following theorem.

Theorem 11.5.2: The basis theorem


Let S be a subspace of Rn . Then any two bases of S have the same number of vectors.

Proof. Let $B = \{u_1, u_2, \ldots, u_s\}$ and $C = \{v_1, v_2, \ldots, v_r\}$ be two bases of $S$. We want to prove that $s = r$. As Sherlock Holmes noted, "When you have eliminated the impossible, whatever remains, however improbable, must be the truth" (from The Sign of Four by Sir Arthur Conan Doyle), we will prove that neither $s > r$ nor $s < r$ is possible, and thus we're left with $r = s$. Assuming first that $s < r$, we then prove that $C$ is linearly dependent†, which contradicts the fact that it is a basis. $\blacksquare$

Any two bases of a subspace of Rn have the same number of vectors. That number should
be special. Indeed, it is the dimension of the subspace. So, we have the following definition for it.

Definition 11.5.4
Let S be a subspace of Rn , then the number of vectors in a basis for S is called the dimension
of S , denoted by dim.S/. Using the language of set theory, the dimension of S is the cardinality
of one basis of S.


† Express each $v_i$ in terms of $u_1, u_2, \ldots$. Then build $c_1 v_1 + \cdots = 0$, which in turn reads $(\cdot)u_1 + (\cdot)u_2 + \cdots = 0$. As $B$ is a basis, all the terms in the brackets must be zero. This is equivalent to a linear system $Ac = 0$ with $A \in \mathbb{R}^{s \times r}$. This system has a nontrivial solution $c$ due to theorem 11.3.2.


Example 11.4
Find a basis for the row space of
$$A = \begin{bmatrix} 1 & 1 & 3 & 1 & 6 \\ 2 & -1 & 0 & 1 & -1 \\ -3 & 2 & 1 & -2 & 1 \\ 4 & 1 & 6 & 1 & 3 \end{bmatrix}$$
The way to do this is the observation that if we perform a number of elementary row operations on $A$ to get another matrix $B$, then $R(A) = R(B)$ (the rows of $B$ are simply linear combinations of the rows of $A$, so a linear combination of the rows of $B$ is a linear combination of the rows of $A$; this gives $R(B) \subseteq R(A)$; but the row operations can be reversed to go from $B$ to $A$, so we also have $R(A) \subseteq R(B)$). So, the same old tool of Gauss-Jordan elimination gives us:
$$R = \begin{bmatrix} 1 & 0 & 1 & 0 & -1 \\ 0 & 1 & 2 & 0 & 3 \\ 0 & 0 & 0 & 1 & 4 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$
Now, the final row consisting of all zeros is useless; thus the first three non-zero rows form a basis for $R(A)$ (they are a basis because the nonzero rows are independent). And we also get $\dim(R(A)) = 3$.
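The reduced row echelon form used here can be checked with the RowEchelon package:

using RowEchelon

A = [ 1  1 3  1  6;
      2 -1 0  1 -1;
     -3  2 1 -2  1;
      4  1 6  1  3]

R = rref(A)    # the three nonzero rows of R form a basis for the row space,
               # so dim(R(A)) = 3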

Example 11.5
Find a basis for the column space of $A$ given in Example 11.4. We have row operations, not column operations. So, one solution is to transpose the matrix to get $A^\top$, in which the rows are the columns of $A$; with $A^\top$ we can proceed as in the previous example. The second way is better as we just work with $A$. Note that finding a basis is about the linear independence of the columns of $A$, that is, about the solutions of $Ax = 0$; with this view, we can study $Rx = 0$ instead, where $R$ is the RREF of $A$ (both systems have the same solutions).
There are three pivot columns in $R$: the 1st, 2nd and 4th columns. These pivot columns are standard unit vectors $e_i$, so they are linearly independent. The pivot columns also span the column space of $R$ (the non-pivot columns are linear combinations of the pivot ones, so they do not add anything new to the span). Now, we know that the pivot columns of $R$ are a basis for the column space of $R$. And this means that the pivot columns of $A$ are a basis for the column space of $A$ (be careful though: $C(A) \neq C(R)$). And we also obtain $\dim(C(A)) = 3$.

From the previous examples, we see that the column and row space of that specific matrix
have the same dimension. And in fact it is true for any matrix. So, we have the following theorem.


Theorem 11.5.3
The row and column spaces of a matrix have the same dimension.

A nice thing with this theorem is that it allows us to have a better definition for the rank of a matrix: the rank of a matrix is the dimension of its row and column spaces. Compared with the definition of the rank as the number of nonzero rows (of the echelon form), this definition is symmetric with respect to both rows and columns. And it should be. With this row-column symmetry, it is no surprise that $\mathrm{rank}(A) = \mathrm{rank}(A^\top)$.
Suppose that $A$ and $B$ are two matrices such that $AB$ makes sense. From the definition of the matrix-matrix product, we know that the columns of $AB$ are linear combinations of the columns of $A$. Thus $C(AB) \subseteq C(A)$, and therefore $\mathrm{rank}(AB) \le \mathrm{rank}(A)$. Similarly, we have $R(AB) \subseteq R(B)$, so $\mathrm{rank}(AB) \le \mathrm{rank}(B)$. Finally, $\mathrm{rank}(AB) \le \min(\mathrm{rank}(A), \mathrm{rank}(B))$.

Proof. [Proof of theorem 11.5.3] Consider a matrix $A$; we need to prove that $\dim(R(A)) = \dim(C(A))$. We start with the row space, using the fact that $R(A) = R(R)$ where $R$ is the RREF of $A$. Thus, $\dim(R(A)) = \dim(R(R))$. But $\dim(R(R))$ is equal to the number of unit pivots, which equals the number of pivot columns of $A$. And we know that the pivot columns of $A$ form a basis for $C(A)$, so this number is also $\dim(C(A))$. $\blacksquare$
We have the dimension for the row space and column space. What about the null space?

Definition 11.5.5
The nullity of a matrix A is the dimension of its null space and is denoted by nullity(A).

Example 11.6
Find a basis for the null space of $A$ given in Example 11.4. This is equivalent to solving the homogeneous system $Ax = 0$. We get the RREF of the augmented matrix as
$$\left[\begin{array}{ccccc|c} 1 & 1 & 3 & 1 & 6 & 0 \\ 2 & -1 & 0 & 1 & -1 & 0 \\ -3 & 2 & 1 & -2 & 1 & 0 \\ 4 & 1 & 6 & 1 & 3 & 0 \end{array}\right] \Longrightarrow \left[\begin{array}{ccccc|c} 1 & 0 & 1 & 0 & -1 & 0 \\ 0 & 1 & 2 & 0 & 3 & 0 \\ 0 & 0 & 0 & 1 & 4 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{array}\right]$$
Looking at the matrix $R$, we see that there are two free variables, $x_3$ and $x_5$. We then solve for the pivot variables in terms of the free ones, with $x_3 = s$ and $x_5 = t$:
$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix} = \begin{bmatrix} -s + t \\ -2s - 3t \\ s \\ -4t \\ t \end{bmatrix} = s\begin{bmatrix} -1 \\ -2 \\ 1 \\ 0 \\ 0 \end{bmatrix} + t\begin{bmatrix} 1 \\ -3 \\ 0 \\ -4 \\ 1 \end{bmatrix}$$
Therefore, the null space of $A$ has as a basis these two vectors, and the nullity of $A$ is 2.
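Julia's nullspace function returns an orthonormal basis of the null space; its two columns differ from the two vectors above by a change of basis, but they span the same subspace. A quick sketch:

using LinearAlgebra

A = [ 1  1 3  1  6;
      2 -1 0  1 -1;
     -3  2 1 -2  1;
      4  1 6  1  3]

N = nullspace(A)   # 5x2 matrix: an orthonormal basis of N(A)
size(N, 2)         # 2, the nullity of A
norm(A * N)        # ~0: A sends every null-space vector to zero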


Theorem 11.5.4: Rank theorem


Let $A$ be an $m \times n$ matrix, then
$$\mathrm{rank}(A) + \mathrm{nullity}(A) = n$$

Theorem 11.5.5
Let $A$ be an $m \times n$ matrix, then

(a) $\mathrm{rank}(A^\top A) = \mathrm{rank}(A)$

(b) The matrix $A^\top A$ is invertible if and only if $\mathrm{rank}(A) = n$.

Proof. Using Theorem 11.5.4 for matrices $A$ and $A^\top A$ (both have the same number of columns $n$), we have
$$\mathrm{rank}(A) + \mathrm{nullity}(A) = n, \qquad \mathrm{rank}(A^\top A) + \mathrm{nullity}(A^\top A) = n$$
Thus, we only need to show that $\mathrm{nullity}(A) = \mathrm{nullity}(A^\top A)$. In other words, we have to show that if $x$ is a solution to $Ax = 0$, then it is also a solution to $A^\top Ax = 0$, and vice versa. I present only the direction from $A^\top Ax = 0$ to $Ax = 0$:
$$A^\top Ax = 0 \implies x^\top A^\top Ax = 0 \iff (Ax)\cdot(Ax) = 0 \implies Ax = 0$$
The last step is due to a property of the dot product, see Box 11.2, property (d). $\blacksquare$

Expansion in a basis, or coordinates. Let $S$ be a subspace of $\mathbb{R}^n$ and let $B = \{u_1, \ldots, u_k\}$ be a basis for $S$; then any vector $b \in S$ can be written as a linear combination of the basis vectors in a unique way.

Proof. The proof is as follows (we write two linear combinations for $b$, subtract them, and use the definition of linearly independent vectors to show that the two sets of coefficients are identical):
$$\begin{aligned} b &= \alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_k u_k \\ b &= \beta_1 u_1 + \beta_2 u_2 + \cdots + \beta_k u_k \end{aligned} \implies 0 = (\alpha_1 - \beta_1)u_1 + (\alpha_2 - \beta_2)u_2 + \cdots + (\alpha_k - \beta_k)u_k$$
As $u_1, \ldots, u_k$ are linearly independent, it must follow that $\beta_i = \alpha_i$ for $i = 1, 2, \ldots, k$. This is one common way to prove that something is unique: we assume this something can be written in two ways and prove that the two ways are identical. $\blacksquare$

If $S$ is a subspace of $\mathbb{R}^n$ and $B = \{u_1, \ldots, u_k\}$ a basis for $S$, then $b \in S$ can be written as $b = \sum_i c_i u_i$. There is a special name for these $c_i$'s; we have the following definition for them.


Definition 11.5.6
Let $S$ be a subspace of $\mathbb{R}^n$, let $B = \{u_1, \ldots, u_k\}$ be a basis for $S$, and let $b \in S$ be a vector in $S$. We can then write $b = \sum_i c_i u_i$. Then $(c_1, c_2, \ldots, c_k)$ are called the coordinates of $b$ with respect to $B$ (some authors use "expansion coefficients" instead of "coordinates"). And the vector made of the $c$'s is called the coordinate vector of $b$ with respect to $B$.

Let's demonstrate the fact that the same vector will have different coordinates in different bases. For example, in 2D, consider two bases: the first one is the traditional one that Descartes gave us, $e_1 = (1, 0)$ and $e_2 = (0, 1)$; the second basis is $a_1 = (1, 0)$ and $a_2 = (1, 1)$. Now, consider a fixed point $p = (2, 3)$ in the first basis. In the second basis, we write
$$\begin{bmatrix} 2 \\ 3 \end{bmatrix} = (-1)\begin{bmatrix} 1 \\ 0 \end{bmatrix} + (3)\begin{bmatrix} 1 \\ 1 \end{bmatrix}$$
That is, in the second basis, the coordinates of $p$ are $(-1, 3)$. How did we find these coordinates? We had to solve the following system of equations:
$$\alpha\begin{bmatrix} 1 \\ 0 \end{bmatrix} + \beta\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$$
which is easy, but nevertheless takes time. Imagine that our space is $\mathbb{R}^n$: then we would need to solve a system of $n$ linear equations for $n$ unknowns. A time-consuming task! Why were things easy for $e_1 = (1, 0)$ and $e_2 = (0, 1)$? They are orthogonal to each other. We shall discuss orthogonal vectors in Section 11.8.
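Finding coordinates in a basis is just solving a small linear system; in Julia:

# Coordinates of p = (2, 3) w.r.t. the basis {(1,0), (1,1)}:
# solve c1*(1,0) + c2*(1,1) = p, i.e. B*c = p with the basis vectors as columns
B = [1 1;
     0 1]
p = [2, 3]
c = B \ p      # [-1.0, 3.0]: the coordinates in the new basis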

11.6 Introduction to linear transformation


Let's consider this function of one variable: $y = f(x) = ax$ where $a, x \in \mathbb{R}$. This function has the following two properties
$$\text{linearity:} \quad f(x_1 + x_2) = f(x_1) + f(x_2) \qquad\qquad \text{homogeneity:} \quad f(\alpha x_1) = \alpha f(x_1)$$
which also means that $f(\alpha x_1 + \beta x_2) = \alpha f(x_1) + \beta f(x_2)$. The function $y = g(x) = ax + b$, although its graph is also a straight line, does not satisfy these two properties: it is not a linear function in this sense. Instead, $y = g(x) = ax + b$ is an affine function.
Any function possessing the linearity property $f(\alpha x_1 + \beta x_2) = \alpha f(x_1) + \beta f(x_2)$ is called a linear function. And there exist lots of such functions. But we need to generalize our


concept of function. A function $f: D \to R$ maps an object of $D$ to an object of $R$. By objects, we mean anything: a number $x$, a point in 3D space $x = (x, y, z)$, a point in an $n$-dimensional space, a function, a matrix, etc.

Of course linear algebra studies vectors and functions that take a vector and return another vector. However, a new term is used: instead of functions, mathematicians use transformations. For a vector $u \in \mathbb{R}^n$, a transformation $T$ turns it into a new vector $v \in \mathbb{R}^m$. For example, we can define $T: \mathbb{R}^2 \to \mathbb{R}^3$ as:
$$T\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = \begin{bmatrix} x_1 + x_2 \\ x_1 - x_2 \\ x_1 x_2 \end{bmatrix}$$
However, among the many types of transformations, linear algebra focuses on one special transformation: the linear transformation. This is similar to how ordinary calculus focuses on functions that are differentiable.

Definition 11.6.1
A linear transformation is the transformation T W Rn ! Rm satisfying the following two
properties:

linearity: T .u C v/ D T .u/ C T .v/


homogeneity: T .˛u/ D ˛T .u/

for all u; v 2 Rn and ˛ 2 R. The domain of T is Rn and the codomain of T is Rm . For a


vector v in the domain of T , the vector T .v/ in the codomain is called the image of v under
the action of T . The set of all possible images T .v/ is called the range of T .

For abstract concepts (concepts for objects do not exist in real life) we need to think about
some examples to understand more about them. So, in what follows we present some linear
transformations.

Some 2D linear transformations. Fig. 11.15 shows a shear transformation. The equation for a 2D shear transformation is
$$T\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = \begin{bmatrix} x_1 + \lambda x_2 \\ x_2 \end{bmatrix} \tag{11.6.1}$$
If we apply this transformation to the two unit vectors $i$ and $j$, $i$ is not affected but $j$ is sheared to the right ($\lambda = 1$ in the figure). So the unit square made by $i$ and $j$ is transformed to a parallelogram.
In Fig. 11.15 we applied the transformation $T$ to all the grid lines of the 2D space. You can see that grid lines (the grey lines in Fig. 11.15a) are transformed to lines (the red lines in Fig. 11.15b), the origin is kept fixed, and equally spaced points are transformed to equally spaced points. These are consequences of the following properties of any linear transformation.
Let T W Rn ! Rm be a linear transformation, then

(a) A square before transformation (b) A parallelogram after transformation

Figure 11.15: Shear transformation is a linear transformation from plane to plane. Side note: a shear
transformation does not change the area. That’s why a parallelogram has the same area as the rectangle
of same base and height.

(a) $T(\boldsymbol{0}) = \boldsymbol{0}$. (Proof: $T(v) = T(\boldsymbol{0} + v) = T(\boldsymbol{0}) + T(v)$.)

(b) For a set of vectors $v_1, v_2, \ldots, v_k$ and a set of scalars $c_1, c_2, \ldots, c_k$, we have
$$T(c_1 v_1 + c_2 v_2 + \cdots + c_k v_k) = c_1 T(v_1) + c_2 T(v_2) + \cdots + c_k T(v_k)$$
(Proof for $k = 2$: $T(c_1 v_1 + c_2 v_2) = T(c_1 v_1) + T(c_2 v_2) = c_1 T(v_1) + c_2 T(v_2)$.)

The second property is the mathematical expression of the fact that linear transformations preserve linear combinations. For example, if $v$ is a certain linear combination of other vectors $s, t$, and $u$, say $v = 3s + 5t - 2u$, then $T(v)$ is the same linear combination of the images of those vectors, that is, $T(v) = 3T(s) + 5T(t) - 2T(u)$.

The standard matrix associated with a linear transformation. Consider again the linear transformation in Eq. (11.6.1) (with $\lambda = 1$). Now, we choose three vectors: the first two are very special, they are the unit vectors $e_1 = (1, 0)$ and $e_2 = (0, 1)$; the third vector is an arbitrary one, $a = (1, 2)$. After the transformation $T$, we get three new vectors:
$$T(e_1) = (1, 0), \quad T(e_2) = (1, 1), \quad T(a) = (3, 2)$$
As $a = e_1 + 2e_2$ and a linear transformation preserves linear combinations, we have
$$T(a) = 1\,T(e_1) + 2\,T(e_2)$$



Knowing matrix-vector multiplication as a linear combination of some columns, we can write $T(a)$ as a matrix-vector multiplication:
$$T(a) = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \end{bmatrix}$$
Of course, carrying out this matrix-vector multiplication will give us the same result as direct use of Eq. (11.6.1). It is even slower. Why bother then? Because a linear transformation $T: \mathbb{R}^n \to \mathbb{R}^m$ determines an $m \times n$ matrix $A$, and conversely, an $m \times n$ matrix $A$ determines a linear transformation $T: \mathbb{R}^n \to \mathbb{R}^m$. This is important, as from now on when we see $Ax = b$ we do not see a bunch of meaningless numbers, but a linear transformation: $A$ acts on $x$ to bring it to $b$.
We now just need to generalize what we have done. Let's consider a linear transformation $T: \mathbb{R}^n \to \mathbb{R}^m$. Now, for a vector $u = (u_1, u_2, \ldots, u_n)$ in $\mathbb{R}^n$, we can always write $u$ as a linear combination of the standard basis vectors $e_i$ (we could use a different basis, but that would lead to a different matrix):
$$u = u_1 e_1 + u_2 e_2 + \cdots + u_n e_n$$
So, the linear transformation applied to $u$ can be written as
$$T(u) = T(u_1 e_1 + u_2 e_2 + \cdots + u_n e_n) = u_1 T(e_1) + u_2 T(e_2) + \cdots + u_n T(e_n) \tag{11.6.2}$$
which indicates that the transformed vector $T(u)$ is a linear combination of the transformed basis vectors $T(e_i)$, in which the coefficients are the coordinates of the vector. In other words, if we know where the basis vectors land after the transformation, we can determine where any vector $u$ lands in the transformed space.
Now, assume that the $n$ basis vectors in $\mathbb{R}^n$ are transformed to $n$ vectors in $\mathbb{R}^m$ with coordinates (implicitly assuming that the standard basis for $\mathbb{R}^m$ is used)
$$\begin{aligned} T(e_1) &= (a_{11}, a_{21}, \ldots, a_{m1}) \\ T(e_2) &= (a_{12}, a_{22}, \ldots, a_{m2}) \\ &\;\;\vdots \\ T(e_n) &= (a_{1n}, a_{2n}, \ldots, a_{mn}) \end{aligned}$$
So we can characterize a linear transformation by storing $T(e_i)$, $i = 1, 2, \ldots, n$ in an $m \times n$ matrix like this:
$$A := \begin{bmatrix} | & | & & | \\ T(e_1) & T(e_2) & \cdots & T(e_n) \\ | & | & & | \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \tag{11.6.3}$$
That is, each column of this matrix is $T(e_i)$, which is a vector of length $m$. This matrix is called the standard matrix representing the linear transformation $T$. Why standard? Because we have used the standard basis for $\mathbb{R}^n$ and the standard basis for $\mathbb{R}^m$.
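As a tiny Julia illustration, build the standard matrix of the shear above column by column from the images of the basis vectors, and check that it reproduces $T$:

# The shear T(x1, x2) = (x1 + x2, x2), i.e. Eq. (11.6.1) with lambda = 1
T(x) = [x[1] + x[2], x[2]]

A = hcat(T([1, 0]), T([0, 1]))   # columns are T(e1), T(e2): A = [1 1; 0 1]

a = [1, 2]
A * a == T(a)                    # true: the matrix reproduces the transformation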


With this introduction of $A$, the linear transformation in Section 11.11.3 can be re-written as a matrix-vector product:
$$T(u) := Au \tag{11.6.4}$$
A visual way to understand linear transformations is to use a geogebra applet and play with it. In Fig. 11.16, we present some transformations of a small image of Mona Lisa. By changing the transformation matrix $M$, we can see the effect of the transformation immediately.

Figure 11.16: Transformation in a plane: geogebra applet.

Proof. Now we can prove that a linear transformation maps a line to a line. Considering a line $l$ described by $p + td$ for $t \in \mathbb{R}$, then, for every $t \in \mathbb{R}$, we have
$$T(p + td) = A(p + td) = Ap + tAd = p_1 + td_1$$
Hence $T(l) = l_1$, where $l_1: p_1 + td_1$ is obviously a line. Using similar arguments, it can be shown that a linear transformation preserves the property of parallelism among lines and line segments (i.e., parallel lines are mapped to parallel lines).
To prove that equally spaced points are mapped to equally spaced points, we consider points $A$ and $B$ such that $A$ is specified by vector $u$ and $B$ by $v$, with $v = 2u$ (i.e., $OA = AB$). After a linear transformation, $A$ becomes $A'$ and $B$ becomes $B'$, specified by vectors $Au$ and $A(2u) = 2(Au)$. Hence, $OA' = A'B'$. $\blacksquare$
Determinants. While playing with the geogebra applet we can see that sometimes a transformation enlarges the image and sometimes it shrinks the image. Can we quantify this effect of a linear transformation? Let's do it, but in a plane only. We consider a general transformation matrix
$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$$


It can be found easily using google https://www.geogebra.org/m/pDU4peV5.


which tells us that the unit vector $i$ is now at $(a, c)$ and $j$ is at $(b, d)$. We are going to compute the area of the parallelogram made up by these two vectors. This parallelogram is what the unit square (which has an area of 1) has been transformed to. Based on the accompanying figure, this area is $ad - bc$: the area of the big square minus the total area of the simple shapes around the parallelogram. So, any unit square in the plane is transformed to a parallelogram with an area of $ad - bc$. What about a square of $2 \times 2$? It is transformed to a parallelogram of area $4(ad - bc)$. So, $ad - bc$ is the scaling of the transformation. But how about a curvy domain in the plane? Is it still true that its area is scaled up by the same amount? We're in linear algebra, but do not forget calculus! The area of any shape is equal to the sum of the areas of infinitely many tiny squares, each of which is scaled by $ad - bc$; thus the area of any shape is scaled by $ad - bc$. Mathematicians call this scaling the determinant of the transformation matrix. They use either $\det A$ or $|A|$ to denote the determinant of matrix $A$.
Looking at the determinant (i.e., $ad - bc$) of a $2 \times 2$ matrix, it can be suspected that this determinant can be positive or negative. See Fig. 11.17 for an illustration of what a negative determinant means.

Figure 11.17: Determinant of a matrix can be negative. In that case the linear transformation flips the
space or changes the orientation. Look at the orientation of the unit vectors before the transformation and
after.

It is obvious that the next move is to repeat the same analysis but in 3D. The three unit vectors make a cube whose volume is one, see Fig. 11.18a. After the (linear) transformation these vectors are transformed to $a = (a, d, g)$, $b = (b, e, h)$ and $c = (c, f, i)$. Thus, we consider the following $3 \times 3$ matrix
$$A = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}$$
which is the matrix of a 3D linear transformation. The determinant of this matrix is the volume
of the parallelepiped formed by these three vectors (Fig. 11.18b). We know how to compute


(a) before transformation (b) after transformation

Figure 11.18: The determinant of a $3 \times 3$ matrix is the volume of the parallelepiped formed by the columns of the matrix.

such a volume using the scalar triple product in Section 11.1.5 (particularly Fig. 11.12b):
$$\text{volume} = c \cdot (a \times b) = (aei + dhc + gbf) - (ceg + ahf + dbi)$$
and this is the determinant of our $3 \times 3$ matrix $A$.
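A quick numerical check of this identity in Julia, with an arbitrary $3 \times 3$ matrix:

using LinearAlgebra

M = [1 2 0;
     0 1 3;
     2 0 1]

a, b, c = M[:, 1], M[:, 2], M[:, 3]   # the three column vectors

det(M) ≈ dot(c, cross(a, b))          # true: determinant = scalar triple product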
That’s the most we can do about determinant using geometry. We cannot find out the
formula for the determinant of a 4  4 matrix. How did mathematicians proceed then? We refer
to Section 11.9 for more on the determinant of a square matrix.


Matrix-matrix product as composition of transformations. Still remember function composition, like $\sin x^3$, which is composed of two functions: $y = \sin t$ and $t = x^3$? Now we apply the same concept, but our functions are linear transformations. And doing so will reveal the rule for matrix-matrix multiplication.
Assume we have a linear transformation $T: \mathbb{R}^n \to \mathbb{R}^m$ and a second linear transformation $S: \mathbb{R}^m \to \mathbb{R}^p$. From the previous sub-section, we know that there exists a matrix $A$, of size $m \times n$, associated with the transformation $T$ and another matrix $B$ (of size $p \times m$) associated with $S$. Now, consider the composite transformation of first applying $T$ and then applying $S$ to the outcome of the first transformation. Mathematically, we write $(S \circ T)(u) = S(T(u))$, which transforms $u \in \mathbb{R}^n$ to a vector in $\mathbb{R}^p$.
Assume that there exists a matrix $C$ associated with $(S \circ T)(u)$ (we can see this by $(S \circ T)(u) = S(Au) = B(Au) = (BA)u$). Then the $j$th column of $C$ is $(S \circ T)(e_j)$:
$$C[:, j] = (S \circ T)(e_j) = S(T(e_j)) = S(A[:, j]) = B\,A[:, j]$$
Therefore,
$$BA = \begin{bmatrix} BA_{*1} & BA_{*2} & \cdots & BA_{*n} \end{bmatrix} \tag{11.6.5}$$
So, if we denote by $C = BA$ the matrix-matrix product, then the $j$th column of $C$ is the product of matrix $B$ and the $j$th column of $A$. Using Eq. (11.4.4), we can write the entry $C_{ij}$


as $C_{ij} = \sum_{k=1}^{m} B_{ik}A_{kj}$, which is the familiar matrix-matrix multiplication rule: the entry at row $i$ and column $j$ of $BA$ is the dot product of row $i$ of $B$ and column $j$ of $A$.
i and column j of BA is the dot product of row i of B and column j of A.
We have $ab = ba$ for $a, b \in \mathbb{R}$. Do we have the same for the matrix-matrix product, $AB = BA$ (assuming, of course, that their sizes are consistent for matrix-matrix multiplication)? The answer is no. And the reason is simple: composition of functions is not commutative, $f(g(x)) \neq g(f(x))$ in general; for example $\sin(x^3) \neq (\sin x)^3$.
How about $ABC$? From function composition discussed in Section 4.2.3, we know that composition is associative, so $(AB)C = A(BC)$. This is a nice proof, much better than the proof based on the definition of matrix-matrix multiplication (you can try it to see my point).
With the geometric meaning of the determinant and the matrix-matrix product, it is easy to see that the determinant of the product of two matrices is the product of the determinants of each matrix:
$$|AB| = |A||B| \tag{11.6.6}$$
This is because $AB$ is associated with first a linear transformation whose area scaling is $|B|$, followed by another transformation whose area scaling is $|A|$. Thus, in total the area scaling must be $|A||B|$.
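A two-line numerical illustration of Eq. (11.6.6) in Julia, with arbitrary matrices:

using LinearAlgebra

A = [2 1; 1 3]
B = [0 4; 1 2]

det(A * B) ≈ det(A) * det(B)   # true
det(A * B) ≈ det(B * A)        # also true, even though A*B != B*A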

11.7 Linear algebra with Julia


Let the computer do the computation so that we can focus on the theory of linear algebra. This would not serve you well in exams, but I am not a fan of exams. To serve this purpose, in Listing 11.1 I summarize some common matrix operations using Julia.

11.8 Orthogonality
We begin with orthogonal vectors in Section 11.8.1. Orthogonality of two $n$-vectors is a generalization of the notion of perpendicularity of two vectors in $\mathbb{R}^3$. Orthogonal vectors are always linearly independent and thus make a good basis for a subspace. When these vectors are normalized, i.e., have unit lengths, they make orthonormal basis vectors (Section 11.8.2). Stacking orthonormal vectors column by column, we obtain an orthogonal matrix (Section 11.8.3).

11.8.1 Orthogonal vectors & orthogonal bases


We know that two vectors in $\mathbb{R}^2$ or $\mathbb{R}^3$, $a$ and $b$, are called orthogonal when their dot product is zero, i.e., $a \cdot b = 0$. We extend this to vectors in $\mathbb{R}^n$. So, vectors $x, y$ in $\mathbb{R}^n$ are said to be orthogonal (denoted by $x \perp y$) if $x \cdot y = 0$ or $x^\top y = 0$. Usually, we are interested in a bunch of vectors that are orthogonal to each other, as in the following definition.


Listing 11.1: Basic linear algebra in Julia.


using LinearAlgebra              # you have to install this package first
using RowEchelon
A = [2 4 -2; 4 9 -3; -2 -3 7]    # create a 3x3 matrix
b = [2, 8, 10]                   # create a vector of length 3
x = A\b                          # solving Ax=b
E21 = [1 0 0; -2 1 0; 0 0 1]     # elimination matrix of Section 11.4.5
E21*A                            # matrix matrix multiplication
detA = det(A)                    # determinant of A
invE = inv(E21)                  # inverse of E21
rref(A)                          # get reduced row echelon form
A'                               # get transpose of A
x = A[1,:]                       # get 1st row
y = A[:,1]                       # get 1st col
AA = zeros(3,3)                  # 3x3 zero matrix
BB = ones(3,3)                   # 3x3 one matrix
dot(x,y)                         # dot product of two vecs
V = eigvecs(A)                   # cols of matrix V = eigenvectors of A
v = eigvals(A)                   # vector v contains eigenvalues of A
svd(A)                           # SVD of A
norm(b, Inf)                     # max norm of b
norm(b, 1)                       # sum norm of b, or 1-norm of b, see Norms

Definition 11.8.1
A set of vectors $a_1, \ldots, a_k$ in $\mathbb{R}^n$ is an orthogonal set if all pairs of distinct vectors in the set are orthogonal. That is, if
$$a_i \cdot a_j = 0 \quad \text{for any } i, j \text{ with } i \neq j, \quad i, j = 1, 2, \ldots, k$$

The most famous example of an orthogonal set of vectors is the standard basis $\{e_1, e_2, \ldots, e_n\}$ of $\mathbb{R}^n$. And we know that these basis vectors are linearly independent. Therefore, we guess that orthogonal vectors are linearly independent. And that guess is correct, as stated by the following theorem.

Theorem 11.8.1: Orthogonality-Independence


Given a set of non-zero orthogonal vectors $a_1, \ldots, a_k$ in $\mathbb{R}^n$, they are linearly independent.

Proof. The idea is to assume that the zero vector is expressed as a linear combination of these orthogonal vectors, then take the dot product of both sides with $a_i$ and use the orthogonality to


obtain $\alpha_i = 0$ for $i = 1, 2, \ldots, k$:
$$\begin{aligned} &\alpha_1 a_1 + \alpha_2 a_2 + \cdots + \alpha_k a_k = 0 \\ \implies & a_i \cdot (\alpha_1 a_1 + \alpha_2 a_2 + \cdots + \alpha_k a_k) = 0 \\ \implies & \alpha_i (a_i \cdot a_i) = 0 \\ \implies & \alpha_i = 0 \end{aligned} \tag{11.8.1}$$
(If the last step was not clear, use a specific $a_1$, and assume there are only three vectors $a_1, a_2, a_3$. Then the LHS of the second line in Eq. (11.8.1) is $a_1 \cdot (\alpha_1 a_1 + \alpha_2 a_2 + \alpha_3 a_3)$, which is $\alpha_1 a_1 \cdot a_1 + \alpha_2 a_1 \cdot a_2 + \alpha_3 a_1 \cdot a_3 = \alpha_1 \|a_1\|^2 + 0 + 0$. Thus, we get $\alpha_1 = 0$. Similarly, we get $\alpha_2 = 0$ if we started with $a_2$, and so on.) $\blacksquare$

Example 11.7
Considering these three vectors in $\mathbb{R}^3$: $v_1 = (2, 1, -1)$, $v_2 = (0, 1, 1)$ and $v_3 = (1, -1, 1)$. We can see that: (i) they form an orthogonal set of vectors; then (ii) from theorem 11.8.1, they are linearly independent; then (iii) three independent vectors in $\mathbb{R}^3$ form a basis for $\mathbb{R}^3$. If these vectors form a basis, then we can find the coordinates of any vector in $\mathbb{R}^3$ w.r.t. this basis. Find the coordinates of $v = (1, 2, 3)$.
We simply have to solve the following system to find the coordinates of $v = (1, 2, 3)$:
$$\begin{bmatrix} 2 & 0 & 1 \\ 1 & 1 & -1 \\ -1 & 1 & 1 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \implies c = \begin{bmatrix} 1/6 \\ 5/2 \\ 2/3 \end{bmatrix}$$

Solving a 3  3 system is not hard, but what if the question is for a vector in R100 ? Is there
any better way? The answer is yes, and thus orthogonal bases are very nice to work with. We
need to define first what an orthogonal basis is.

Definition 11.8.2
An orthogonal basis for a subspace S of Rn is a basis of S that is an orthogonal set.

Now, we are going to find the coordinates of $v = (1, 2, 3)$ in an easier way. We write $v$ in terms of the orthogonal basis vectors $(v_1, v_2, v_3)$, and we take the dot product of both sides with $v_1$; due to the orthogonality, all terms but one vanish, and we're left with $c_1$:
$$v = c_1 v_1 + c_2 v_2 + c_3 v_3 \implies v \cdot v_1 = (c_1 v_1 + c_2 v_2 + c_3 v_3)\cdot v_1 = c_1 (v_1 \cdot v_1) \implies c_1 = \frac{v \cdot v_1}{v_1 \cdot v_1} = \frac{v \cdot v_1}{\|v_1\|^2}$$
What does this formula tell us? To find $c_1$, just compute two dot products: one of $v$ with the first basis vector, and the other the squared length of this basis vector. The ratio of these two is $c_1$.


Nothing can be simpler. Wait, I wish we did not have to do the division with the squared
length of v1 . It is possible if that vector has a unit length. And we know that we can always make
a non-unit vector a unit vector simply by dividing it by its length, a process known as normalizing
a vector, see Eq. (11.1.7). Thus, we now move from orthogonal bases to orthonormal bases.
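The dot-product shortcut, applied to the orthogonal basis of Example 11.7 in Julia:

using LinearAlgebra

v1 = [2, 1, -1];  v2 = [0, 1, 1];  v3 = [1, -1, 1]   # orthogonal basis of R^3
v  = [1, 2, 3]

# coordinates via dot products: no linear system to solve
c = [dot(v, vi) / dot(vi, vi) for vi in (v1, v2, v3)]   # [1/6, 5/2, 2/3]

c ≈ hcat(v1, v2, v3) \ v    # true: same as solving the 3x3 system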

11.8.2 Orthonormal vectors and orthonormal bases


First, a definition is given for orthonormal vectors and orthonormal basis.

Definition 11.8.3
A set of vectors in Rn is an orthonormal set if it is an orthogonal set of unit vectors. An
orthonormal basis for a subspace S of Rn is a basis of S that is an orthonormal set.

If a collection of vectors $a_1, \ldots, a_k$ is mutually orthogonal and $\|a_i\| = 1$, $i = 1, 2, \ldots, k$ (i.e., the vectors have unit length), then it is orthonormal. We can combine the two conditions of orthogonality and normality into one:
$$\text{Orthonormal vectors } a_1, a_2, \ldots: \quad a_i \cdot a_j = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases} = \delta_{ij}$$

where we have introduced the Kronecker delta notation (named after Leopold Kronecker) $\delta_{ij}$.
A vector $b$ in a subspace $S$ with an orthonormal basis $v_1, v_2, \ldots, v_k$ has coordinates w.r.t. the basis given by
$$b = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_k v_k, \qquad \alpha_i = b \cdot v_i \tag{11.8.2}$$

Did we see this before? Remember Monsieur Fourier? What he did was to write a periodic function $f(x)$ as a linear combination of sine/cosine functions:
$$f(x) = a_0 + \sum_{n=1}^{\infty}\left(a_n \cos\frac{n\pi x}{L} + b_n \sin\frac{n\pi x}{L}\right)$$
And below is how he obtained the coefficients in this linear combination (to be historically precise, Euler did this before Fourier, even though Euler doubted the idea of a trigonometric expansion of a periodic function):
$$a_n = \frac{1}{L}\int_{-L}^{L} f(x)\cos\frac{n\pi x}{L}\,dx, \qquad b_n = \frac{1}{L}\int_{-L}^{L} f(x)\sin\frac{n\pi x}{L}\,dx$$

Leopold Kronecker (7 December 1823 – 29 December 1891) was a German mathematician who worked on
number theory, algebra and logic. He criticized Georg Cantor’s work on set theory, and was quoted by Weber (1893)
as having said, "God made the integers, all else is the work of man".


What was he doing? To find the coefficients $a_n$, he multiplied the function $f(x)$ with $\cos n\pi x/L$ and integrated. This is similar to $b \cdot v_i$, and here the basis vectors are the functions $\sin x, \cos x, \sin 2x, \cos 2x, \ldots$ They are orthogonal to each other. Of course, we need to define what the dot product of two functions is; see Eq. (11.11.9) for the definition of the dot product of two functions.
We have gone a long way: two vectors in $\mathbb{R}^2$ can be orthogonal to each other, then two $n$-vectors can also be orthogonal to each other. We even have two functions orthogonal to each other. Why not orthogonal matrices?

11.8.3 Orthogonal matrices


Orthonormal vectors are special because v_i \cdot v_j = \delta_{ij}. If we put some orthonormal vectors in a matrix, we should get a special matrix. Let's do that with three orthonormal vectors v_1, v_2, v_3 \in R^3. They make a 3 \times 3 matrix A. Now, consider the product A^\top A, and see what matrix we get:

\begin{bmatrix} - v_1^\top - \\ - v_2^\top - \\ - v_3^\top - \end{bmatrix}
\begin{bmatrix} | & | & | \\ v_1 & v_2 & v_3 \\ | & | & | \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}

We got an identity matrix, which is special. This reminds us of the inverse, and indeed we have found a special matrix: one whose inverse is equal to its transpose:

A^\top A = I \implies A^\top = A^{-1}

This leads to the following definition for matrices whose inverse is simply their transpose. The notation Q is reserved for such matrices.

Definition 11.8.4
An n  n matrix Q whose columns form an orthonormal set is called an orthogonal matrix.

We now present an example of an orthogonal matrix. Assume that we want to rotate a point P to P' by an angle \beta as shown in Fig. 11.19. The coordinates of P' are given by

x' = R x, \qquad R = \begin{bmatrix} \cos\beta & -\sin\beta \\ \sin\beta & \cos\beta \end{bmatrix}

It is easy to check that the columns of R are orthonormal vectors. Therefore, R> R D I, which
can be checked directly. We know that any rotation preserves length (that is jjx 0 jj D jjxjj or
jjRxjj D jjxjj); which is known as isometry in geometry. It turns out that every orthogonal
matrix transformation is an isometry. Note also that det R D 1. It is not a coincidence. Indeed,
from the property A> A D I, we can deduce the determinant of A:

A> A D I H) det A> A D 1 H) .det.A//2 D 1 H) det.A/ D ˙1


Figure 11.19: Rotation in a plane is a matrix transformation that preserves length; the matrix of the rotation is an orthogonal matrix. The annotations in the figure derive the rotation formulas: with x = r\cos\alpha and y = r\sin\alpha,

x' = r\cos(\alpha + \beta) = r\cos\alpha\cos\beta - r\sin\alpha\sin\beta = x\cos\beta - y\sin\beta
y' = r\sin(\alpha + \beta) = r\sin\alpha\cos\beta + r\cos\alpha\sin\beta = x\sin\beta + y\cos\beta


I used det.AB/ D det.A/ det.B/ and det A> D det.A/. With this special example of an
orthogonal matrix (and its properties), we now have a theorem on orthogonal matrices.

Theorem 11.8.2
Let Q be an n  n matrix. The following statements are equivalent.

(a) Q is orthogonal.

(b) Qx  Qy D x  y for all x; y 2 Rn .

(c) jjQxjj D jjxjj for all x 2 Rn .

Proof. We prove (a) to (b) first:

Qx  Qy D .Qx/> .Qy/ D .x > Q> /Qy D x > .Q> Q/y D x > Iy D x > y D x  y

Going from (b) to (c) is easy: use (b) with y D x. We need to go backwards: (c) to (b) to (a),
which is left as an exercise. Check Poole’s book if stuck. 
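The rotation matrix above, and the equivalences in Theorem 11.8.2, are easy to check numerically; a small NumPy sketch (my own, not from the text):

    import numpy as np

    beta = 0.7
    R = np.array([[np.cos(beta), -np.sin(beta)],
                  [np.sin(beta),  np.cos(beta)]])

    print(np.allclose(R.T @ R, np.eye(2)))      # (a) R is orthogonal
    print(np.isclose(np.linalg.det(R), 1.0))    # rotations have determinant +1

    x = np.array([2., -1.])
    y = np.array([0.5, 3.])
    print(np.isclose((R @ x) @ (R @ y), x @ y))                    # (b) dot products preserved
    print(np.isclose(np.linalg.norm(R @ x), np.linalg.norm(x)))    # (c) lengths preserved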

11.8.4 Orthogonal complements

Considering a plane in R3 and a vector n normal to the plane, then n is orthogonal to all vectors
in the plane. We now extend this to any subspace of Rn .


Definition 11.8.5
Let W be a subspace of Rn .

(a) We say that a vector v is orthogonal to W if it is orthogonal to every vector in W h .

(b) The set of all vectors that are orthogonal to W is called the orthogonal complement of
W , denoted by W ? . That is,

W ? D fv 2 Rn W v  w D 0 for all w 2 W g

(c) Two subspaces S and W are said to be orthogonal i.e., S ? W if and only if x ? y, or,
x > y D 0 for all x 2 S and for all y 2 W .
h
For a vector to be orthogonal to a subspace, it just needs to be orthogonal to the span of that subspace.

This definition actually consists of three definitions. The first one extends the idea that we discussed at the beginning of this section. Why do we need W^\perp? Because it is a subspace. We know how to prove whether something is a subspace: assume that v_1, v_2 \in W^\perp, and we need to show that c_1 v_1 + c_2 v_2 is also in W^\perp:

v_1 \cdot w = 0,\; v_2 \cdot w = 0 \implies (c_1 v_1 + c_2 v_2) \cdot w = 0 \quad \text{for all } w \in W

And the third definition is about the orthogonality of two subspaces. We have gone a long way: from the orthogonality of two vectors in R^2, to that of two vectors in R^n, then to the orthogonality of a vector and a subspace, and finally to the orthogonality of two subspaces.

Fundamental subspaces of a matrix. With the concept of orthogonal complements, we can


understand more about the row space, column space and null space of a matrix. Furthermore,
we will see that there is another subspace associated with a matrix. The results are summarized
in the following theorem.
Theorem 11.8.3
Let A be an m  n matrix. Then the orthogonal complement of the row space of A is the null
space of A, and the orthogonal complement of the column space of A is the null space of A> :

.R.A//? D N.A/; .C.A//? D N.A> /

The proof is straightforward. The null space of A consists of all vectors x such that Ax = 0, and from matrix-vector multiplication this is equivalent to saying that x is orthogonal to the rows of A. Now, replace A by its transpose, and we have the second result in the theorem above.
To conclude, an m \times n matrix A has four fundamental subspaces, namely R(A), N(A), C(A), N(A^\top). They go in pairs: the first two are orthogonal complements in R^n, and the last two are orthogonal complements in R^m.
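A small NumPy sketch of this pairing (my own example, not from the text): the null space computed from the SVD is orthogonal to every row of A, and the dimensions add up.

    import numpy as np

    A = np.array([[1., 2., 3.],
                  [4., 5., 6.]])
    m, n = A.shape

    U, s, Vt = np.linalg.svd(A)
    rank = int(np.sum(s > 1e-10))
    null_basis = Vt[rank:].T        # columns form a basis of N(A)

    print(rank + null_basis.shape[1] == n)    # dim R(A) + dim N(A) = n
    print(np.allclose(A @ null_basis, 0))     # every row of A is orthogonal to N(A)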


11.8.5 Orthogonal projections


We recall the orthogonal projection of a vector u onto another vector v, defined by the projection operator proj_v(u); check Section 11.1.4 if needed:

\text{proj}_v(u) := \frac{u \cdot v}{v \cdot v}\, v   (11.8.3)

While projecting u onto vector v, we also get \text{perp}_v(u) := u - \text{proj}_v(u), which is orthogonal to v, see Fig. 11.20a^{ŽŽ}. This indicates that we can decompose a vector u into two vectors,

u = \text{proj}_v(u) + \text{perp}_v(u)

of which one is in span(v) and the other is in span(v)^\perp.


The next step is of course to project a vector u on a plane in R3 . Suppose that this plane (of
dim 2) has a basis .i ; j /. We can project u onto the first basis vector i , then project it onto the
second basis and sum the two, see Fig. 11.20b,

proji ;j .u/ WD proji .u/ C projj .u/ (11.8.4)

Is this still an orthogonal projection? We just need to check whether proji ;j .u/  i D 0 and
proji ;j .u/  j D 0. The answer is yes, and due to the fact that i ? j .

Figure 11.20: Orthogonal projection of a vector onto another vector (or a line), panel (a), and onto a plane, panel (b).

ŽŽ Proof: v \cdot \text{perp}_v(u) = v \cdot \left( u - \frac{u \cdot v}{v \cdot v}\, v \right) = v \cdot u - u \cdot v = 0.


Definition 11.8.6
Let W be a subspace of R^n and let \{v_1, v_2, \ldots, v_k\} be an orthogonal basis for W. For any vector v \in R^n, the orthogonal projection of v onto W is defined as

\text{proj}_W(v) = \frac{v_1 \cdot v}{v_1 \cdot v_1} v_1 + \frac{v_2 \cdot v}{v_2 \cdot v_2} v_2 + \cdots + \frac{v_k \cdot v}{v_k \cdot v_k} v_k

The component of v orthogonal to W is the vector

\text{perp}_W(v) = v - \text{proj}_W(v)
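Definition 11.8.6 translates directly into code; a minimal Python sketch (my own, assuming the basis passed in is orthogonal):

    import numpy as np

    def proj_onto(v, basis):
        """Orthogonal projection of v onto span(basis); basis must be orthogonal."""
        return sum((np.dot(vi, v) / np.dot(vi, vi)) * vi for vi in basis)

    v1 = np.array([1., 1., 0.])
    v2 = np.array([1., -1., 1.])    # orthogonal to v1
    v  = np.array([2., 0., 5.])

    p = proj_onto(v, [v1, v2])
    perp = v - p
    print(np.isclose(perp @ v1, 0.0), np.isclose(perp @ v2, 0.0))   # perp is orthogonal to W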

11.8.6 Gram-Schmidt orthogonalization process


The Gram-Schmidt algorithm takes a set of linear independent vectors v1 ; v2 ; : : : ; vk and gen-
erates an orthogonal linear independent set of vectors u1 ; u2 ; : : : ; uk . In the process, if needed,
these vectors can be normalized to get an orthonormal set e 1 ; e 2 ; : : : ; e k .
The method is named after the Danish actuary and mathematician Jørgen Pedersen Gram
(1850 – 1916) and the Baltic German mathematician Erhard Schmidt (1876 – 1959), but Pierre-
Simon Laplace had been familiar with it before Gram and Schmidt.
The idea is to start with the first vector v1 , nothing to do here so we take u1 D v1 and
normalize it to get e 1 . Next, move to the second vector v2 , we make it orthogonal to u1 by
u2 D v2 proju1 .v2 /. Then, we normalize u2 . Now, move to the third vector v3 . We make it
orthogonal to the hyperplane spanned by u1 and u2 . And the process keeps going until the last
vector:
u_1 = v_1, \qquad e_1 = \frac{u_1}{\|u_1\|}
u_2 = v_2 - \text{proj}_{u_1}(v_2), \qquad e_2 = \frac{u_2}{\|u_2\|}
u_3 = v_3 - \text{proj}_{u_1}(v_3) - \text{proj}_{u_2}(v_3), \qquad e_3 = \frac{u_3}{\|u_3\|}
\vdots
u_k = v_k - \sum_{i=1}^{k-1} \text{proj}_{u_i}(v_k), \qquad e_k = \frac{u_k}{\|u_k\|}

The calculation of the sequence u1 ; : : : ; uk is known as Gram–Schmidt orthogonalization, while


the calculation of the sequence e 1 ; : : : ; e k is known as Gram–Schmidt orthonormalization as
the vectors are normalized.
If we denote W_1 = span(v_1), then u_1 is a basis of W_1. Moving on, let W_2 = span(v_1, v_2); we claim that \{u_1, u_2\} is a basis for W_2. Why? First, by definition u_1 and u_2 are linear combinations of v_1 and v_2, thus they are in W_2. Second, they are linearly independent (because they are orthogonal and nonzero). More generally, \{u_1, u_2, \ldots, u_k\} form an
orthogonal basis for the subspace W_k = span(v_1, v_2, \ldots, v_k).
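The process is short enough to code directly; below is a minimal NumPy sketch (my own, not from the text):

    import numpy as np

    def gram_schmidt(V):
        """Columns of V: linearly independent vectors. Returns orthonormal columns."""
        E = []
        for v in V.T:
            u = v.astype(float).copy()
            for e in E:
                u -= (e @ v) * e        # subtract proj_e(v); e already has unit length
            E.append(u / np.linalg.norm(u))
        return np.column_stack(E)

    V = np.array([[1., 1., 0.],
                  [1., 0., 1.],
                  [0., 1., 1.]])
    E = gram_schmidt(V)
    print(np.allclose(E.T @ E, np.eye(3)))    # orthonormal columns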

11.8.7 QR factorization
The Gaussian elimination process for Ax = b results in the LU factorization A = LU. Now, the Gram-Schmidt orthogonalization process applied to the linearly independent columns of a matrix A results in another factorization, known as the QR factorization: A = QR. To demonstrate this factorization, consider a matrix with three independent columns, A = [a_1 | a_2 | a_3]. Applying Gram-Schmidt orthonormalization to these three vectors we obtain e_1, e_2, e_3. We can then write

a1 D .e 1 ; a1 /e 1
a2 D .e 1 ; a2 /e 1 C .e 2 ; a2 /e 2
a3 D .e 1 ; a3 /e 1 C .e 2 ; a3 /e 2 C .e 3 ; a3 /e 3

which can be written as (using block matrix multiplication)

A = \begin{bmatrix} a_1 & a_2 & a_3 \end{bmatrix}
  = \begin{bmatrix} e_1 & e_2 & e_3 \end{bmatrix}
    \begin{bmatrix} (e_1, a_1) & (e_1, a_2) & (e_1, a_3) \\ 0 & (e_2, a_2) & (e_2, a_3) \\ 0 & 0 & (e_3, a_3) \end{bmatrix}
  = QR

The matrix Q consists of orthonormal columns and thus is an orthogonal matrix (that explains why the notation Q was used). The matrix R is an upper triangular matrix.
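In practice one calls a library routine rather than hand-coding Gram-Schmidt; a quick check with NumPy's built-in QR (my own sketch, not from the text):

    import numpy as np

    A = np.array([[1., 1., 0.],
                  [1., 0., 1.],
                  [0., 1., 1.]])

    Q, R = np.linalg.qr(A)
    print(np.allclose(Q.T @ Q, np.eye(3)))    # Q has orthonormal columns
    print(np.allclose(np.triu(R), R))         # R is upper triangular
    print(np.allclose(Q @ R, A))              # A = QR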

History note 11.1: James Joseph Sylvester (1814 – 1897)


James Joseph Sylvester was an English mathematician. He made fun-
damental contributions to matrix theory, invariant theory, number the-
ory, partition theory, and combinatorics. He played a leadership role in
American mathematics in the later half of the 19th century as a profes-
sor at the Johns Hopkins University and as founder of the American
Journal of Mathematics. At his death, he was a professor at Oxford
University.
James Joseph was born in London on 3 September 1814, the son of
Abraham Joseph, a Jewish merchant. James later adopted the surname
Sylvester when his older brother did so upon emigration to the United States—a country
which at that time required all immigrants to have a given name, a middle name, and
a surname. Sylvester began his study of mathematics at St John’s College, Cambridge
in 1831, where his tutor was John Hymers. Although his studies were interrupted for
almost two years due to a prolonged illness, he nevertheless ranked second in Cambridge’s
famous mathematical examination, the tripos, for which he sat in 1837.


11.9 Determinant
To derive the formula for the determinant of a square matrix n  n when n > 3, we cannot
rely on geometry. To proceed, it is better to deduce the properties of the determinant from the
special cases of 2  2 and 3  3 matrices. From those properties, we can define what should be
a determinant. It is not so hard to observe the following properties of the determinant of a 2  2
matrix (they also apply for 3  3 matrices):

 The determinant of the 2  2 unit matrix is one; this is obvious because this matrix does
not change the unit square at all;

 If the two columns of a 2  2 matrix are the same, its determinant is zero; this is obvious
either from the formula or from the fact that the two transformed basic vectors collapse
onto each other, a domain transforms to a line with zero area;

 If one column is a multiple of the other column, the determinant is also zero; The ex-
planation is similar to the previous property; this one is a generalization of the previous
property;

 If one column of a 2  2 matrix is scaled by a factor ˛, the determinant is scaled by the


same factor:

B = \begin{bmatrix} a & \alpha b \\ c & \alpha d \end{bmatrix} \implies |B| = \alpha a d - \alpha b c = \alpha |A|

 Additive property:
jŒu v C wj D jŒu vj C jŒu wj
This is a consequence of the fact that we can decompose the area into two areas, see
Fig. 11.21.

 If we interchange the columns of A, the determinant changes sign (changes by a factor of


-1):

\det\begin{bmatrix} b & a \\ d & c \end{bmatrix} = bc - da = -(ad - bc) = -\det\begin{bmatrix} a & b \\ c & d \end{bmatrix}

11.9.1 Defining the determinant in terms of its properties


Up to now we know that a matrix 2  2 or 3  3 has a number associated with it, which is called
the determinant of the matrix. We can see it as a function D W Rnn ! R which assigns to
each n  n matrix a single real number. We write D.A/ to label this number, and we also write
D in terms of the columns of A: D.a1 ; a2 ; : : : ; an / where a1 ; a2 ; : : : ; an are the columns of
the matrix A. We did this because from the previous discussion we know that the determinant
depends heavily on the columns of the matrix.


Figure 11.21: Additive area property; to ease the proof, vector u is aligned with the x-axis, so that det[u v] = u_1 v_2 and, with z = v + w (hence z_2 = v_2 + w_2), det[u z] = u_1 z_2 = u_1 v_2 + u_1 w_2.

Now, we propose the following properties for D inspired from the properties of the
determinants of 3  3 matrices.

Property 1. D.I/ D 1.
Property 2. D.a1 ; a2 ; : : : ; an / D 0 if ai D aj for i ¤ j .
Property 3. If n - 1 columns of A are held fixed, then D(A) is a linear function of the remaining column. Stated in terms of the j-th column, this property says that:

D(a_1, \ldots, a_{j-1}, u + \alpha v, \ldots, a_n) = D(a_1, \ldots, a_{j-1}, u, \ldots, a_n) + \alpha D(a_1, \ldots, a_{j-1}, v, \ldots, a_n)

This comes from the additive area property and the fact that if we scale one column by ˛, the
determinant is scaled by the same factor.

Property 4. D is an alternating function of the columns, i.e., if two columns are interchanged, the value of D changes by a factor of -1. Let's focus on the i-th and j-th columns, so we write D(a_i, a_j), leaving the other columns untouched and behind the scene. What we need to show is that D(a_j, a_i) = -D(a_i, a_j).
Proof. The proof is based on Property 2 and Property 3. The trick for using Property 2 is to add or subtract a zero term:

D(a_j, a_i) = D(a_j, a_i) + D(a_i, a_i)                       (added 0 due to Property 2)
            = D(a_i + a_j, a_i)                                (due to Property 3)
            = D(a_i + a_j, a_i) - D(a_i + a_j, a_i + a_j)      (minus 0 due to Property 2)
            = D(a_i + a_j, -a_j)                               (due to Property 3)
            = -D(a_i + a_j, a_j)                               (due to Property 3 with \alpha = -1)
            = -D(a_i, a_j) - D(a_j, a_j)                       (due to Property 3)
            = -D(a_i, a_j)                                     (due to Property 2)



Property 5. If the columns of A are linearly dependent then D = 0. One interesting case: if A has at least one column of all zeros, its determinant is zero^{ŽŽ}.
Proof. Without loss of generality, we can express a_1 as a_1 = \alpha_2 a_2 + \alpha_3 a_3 + \cdots + \alpha_n a_n. Now, D(A) is computed as

D = D(a_1, a_2, \ldots, a_n)
  = D(\alpha_2 a_2 + \alpha_3 a_3 + \cdots + \alpha_n a_n, a_2, \ldots, a_n)
  = D(\alpha_2 a_2, a_2, \ldots, a_n) + D(\alpha_3 a_3, a_2, \ldots, a_n) + \cdots + D(\alpha_n a_n, a_2, \ldots, a_n)
  = \alpha_2 D(a_2, a_2, \ldots, a_n) + \alpha_3 D(a_3, a_2, \ldots, a_n) + \cdots + \alpha_n D(a_n, a_2, \ldots, a_n)
  = 0 + 0 + \cdots + 0     (Property 2)

where Property 3 (additivity) was used in the third equality and Property 3 again (its scalar part) in the fourth equality.

Property 6. Adding a multiple of one column to another does not change the determinant.
Proof. Suppose we obtain matrix B from A by adding \alpha times column j to column i. Then,

D(B) = D(a_1, \ldots, a_{i-1}, a_i + \alpha a_j, \ldots, a_n)
     = D(a_1, \ldots, a_{i-1}, a_i, \ldots, a_n) + \alpha D(a_1, \ldots, a_{i-1}, a_j, \ldots, a_n)   (Property 3)
     = D(a_1, \ldots, a_{i-1}, a_i, \ldots, a_n)   (the second term is zero by Property 2, since column a_j appears twice)
     = D(A)


11.9.2 Determinant of elementary matrices


It is obvious that we have

\det\begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix} = ab, \qquad \det\begin{bmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{bmatrix} = abc

which can be verified using the formula, or from the geometric meaning of the determinant. What is more interesting is the following results:

\det\begin{bmatrix} a & c \\ 0 & b \end{bmatrix} = ab, \qquad \det\begin{bmatrix} a & d & e \\ 0 & b & f \\ 0 & 0 & c \end{bmatrix} = abc

ŽŽ In case it is not clear: any set of vectors containing the zero vector is linearly dependent: 1 \cdot 0 + 0 a_2 + \cdots + 0 a_k = 0.


A geometric explanation for these results is that, for the 2D matrix, shearing a rectangle does not change its area, and for the 3D matrix, shearing a cube does not change its volume. Still, we need an algebraic proof so that it can be extended to larger matrices. For the 3 \times 3 matrix, the second column can be decomposed as

\begin{bmatrix} d \\ b \\ 0 \end{bmatrix} = \begin{bmatrix} d \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ b \\ 0 \end{bmatrix}

Then, using Property 3, the determinant is given by

\begin{vmatrix} a & d & e \\ 0 & b & f \\ 0 & 0 & c \end{vmatrix}
= \begin{vmatrix} a & d & e \\ 0 & 0 & f \\ 0 & 0 & c \end{vmatrix}
+ \begin{vmatrix} a & 0 & e \\ 0 & b & f \\ 0 & 0 & c \end{vmatrix}
= \begin{vmatrix} a & 0 & e \\ 0 & b & f \\ 0 & 0 & c \end{vmatrix}

The first determinant on the right is zero because of Property 5: the first and second columns are linearly dependent. Now, we do the same for the remaining determinant by decomposing column 3:

\begin{vmatrix} a & 0 & e \\ 0 & b & f \\ 0 & 0 & c \end{vmatrix}
= \begin{vmatrix} a & 0 & e \\ 0 & b & 0 \\ 0 & 0 & 0 \end{vmatrix}
+ \begin{vmatrix} a & 0 & 0 \\ 0 & b & f \\ 0 & 0 & 0 \end{vmatrix}
+ \begin{vmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{vmatrix}
= \begin{vmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{vmatrix} = abc

Property 7. The determinant of a triangular matrix is the product of its diagonal entries. This leads to another fact: if A is a triangular matrix, its transpose is also triangular with the same diagonal entries, thus D(A) = D(A^\top). As we shall see next, this identity holds for any square matrix, not just for triangular ones.

Property 8. D(A^\top) = D(A). The proof goes as follows: if A is invertible, it can be written as a product of elementary matrices:

A = E_1 E_2 \cdots E_k

Thus, with D(EF) = D(E)D(F), we can write

D(A) = D(E_1 E_2 \cdots E_k) = D(E_1) D(E_2) \cdots D(E_k)
     = D(E_1^\top) D(E_2^\top) \cdots D(E_k^\top) = D(E_k^\top) \cdots D(E_2^\top) D(E_1^\top)
     = D(E_k^\top \cdots E_2^\top E_1^\top) = D((E_1 E_2 \cdots E_k)^\top) = D(A^\top)

where the fact that D(E^\top) = D(E) for an elementary matrix E was used. The importance of Property 8 is that it allows us to conclude that all the properties of the determinant stated for columns also work for rows; e.g., if two rows of a matrix are the same, its determinant is zero. This is so because the columns of A^\top are the rows of A.

Property 9. If A is invertible then \det(A^{-1}) = 1/\det(A)^{ŽŽ}. So, we do not need to know what A^{-1} is to compute its determinant.
ŽŽ We start with A A^{-1} = I, which leads to \det(A^{-1}) \det(A) = \det(I) = 1.


11.9.3 A formula for the determinant


Let's start with a 3 \times 3 matrix. We can compute its determinant as follows (decomposing the first column as the sum of three vectors and using Property 3):

\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
= \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a_{22} & a_{23} \\ 0 & a_{32} & a_{33} \end{vmatrix}
+ \begin{vmatrix} 0 & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ 0 & a_{32} & a_{33} \end{vmatrix}
+ \begin{vmatrix} 0 & a_{12} & a_{13} \\ 0 & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}

Next, for the second determinant in the RHS, we exchange rows 1 and 2 (Property 4) and get a minus. Then, for the third determinant in the RHS, we exchange rows 1/3 and then rows 3/2 (Property 4 twice; two minuses give a plus):

\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
= \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a_{22} & a_{23} \\ 0 & a_{32} & a_{33} \end{vmatrix}
- \begin{vmatrix} a_{21} & a_{22} & a_{23} \\ 0 & a_{12} & a_{13} \\ 0 & a_{32} & a_{33} \end{vmatrix}
+ \begin{vmatrix} a_{31} & a_{32} & a_{33} \\ 0 & a_{12} & a_{13} \\ 0 & a_{22} & a_{23} \end{vmatrix}   (11.9.1)
The nice thing is that all three determinants on the RHS have the form

\begin{vmatrix} a_{11} & d_{12} & d_{13} \\ 0 & b_{22} & b_{23} \\ 0 & b_{32} & b_{33} \end{vmatrix}

where the lower-right 2 \times 2 block is called B. This can be re-written as (reducing to a triangular matrix)

\begin{vmatrix} a_{11} & d_{12} & d_{13} \\ 0 & b_{22} & b_{23} \\ 0 & b_{32} & b_{33} \end{vmatrix}
= \begin{vmatrix} a_{11} & d_{12} & d_{13} \\ 0 & b_{22} & b_{23} \\ 0 & 0 & b_{33} - b_{23} b_{32}/b_{22} \end{vmatrix}
= a_{11} b_{22} \left( b_{33} - \frac{b_{23} b_{32}}{b_{22}} \right)
= a_{11} (b_{22} b_{33} - b_{32} b_{23})

where in the first equality Property 6 (applied to rows) was used and in the second equality Property 7 was used. The term b_{22} b_{33} - b_{32} b_{23} is called a cofactor, and it is exactly the determinant of B. With this result, Eq. (11.9.1) can be re-written as

\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
= a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}
- a_{21} \begin{vmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{vmatrix}
+ a_{31} \begin{vmatrix} a_{12} & a_{13} \\ a_{22} & a_{23} \end{vmatrix}   (11.9.2)
Finally, note that the matrix B is obtained by deleting a certain row and column of A. So, we define A_{ij} as the matrix obtained by deleting the i-th row and j-th column of A. With this definition, the determinant of A can be expressed as:

\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}|A_{11}| - a_{21}|A_{21}| + a_{31}|A_{31}|   (11.9.3)

There is nothing special about the first column; this is just one way to go.


There is a pattern in this formula. The formula also works for 2 \times 2 matrices (you can check it). So, for an n \times n matrix A, the determinant is given by:

|A| = a_{11}|A_{11}| - a_{21}|A_{21}| + a_{31}|A_{31}| - \cdots = \sum_{i=1}^{n} (-1)^{i-1} a_{i1} |A_{i1}|   (11.9.4)

This is called the cofactor expansion along the first column of A.

But why only column 1? We can choose any column to start from, and for the j-th column we have the following definition of the determinant:

|A| = \sum_{i=1}^{n} (-1)^{i+j} a_{ij} |A_{ij}|   (11.9.5)

Why does this definition work? Because it allows us to define the determinant of a matrix inductively: we define the determinant of an n \times n matrix in terms of determinants of (n-1) \times (n-1) matrices. We begin by defining the determinant of a 1 \times 1 matrix A = [a] by \det(A) = a. Then we proceed to 2 \times 2 matrices, then 3 \times 3 and so on. This is similar to how the factorial was defined: n! = n(n-1)!. Note that a definition is not necessarily the best way to compute the factorial, nor the determinant.

11.9.4 Cramer’s rule


Cramer's rule solves Ax = b using determinants. Its unique feature is that it provides an explicit formula for x. The key idea (assuming a system of 3 equations and 3 unknowns) is the following identity:

A \begin{bmatrix} x_1 & 0 & 0 \\ x_2 & 1 & 0 \\ x_3 & 0 & 1 \end{bmatrix}
= \begin{bmatrix} b_1 & a_{12} & a_{13} \\ b_2 & a_{22} & a_{23} \\ b_3 & a_{32} & a_{33} \end{bmatrix} := B_1

where B_1 is the matrix A with the first column replaced by b. Now, take the determinant of both sides, noting that the determinant of the second matrix in the product (a triangular matrix) is x_1 and that the determinant of a product is equal to the product of the determinants:

x1 jAj D jB1 j

which gives us x1 :
jB1 j
x1 D
jAj
which is strikingly similar to x D b=a for the linear equation ax D b. But now, we have to
live with determinants. Similarly, we have x2 D jB2 j=jAj. The geometric meaning of Cramer’s
rule is given in Fig. 11.22 for the case of 2  2 matrices for y D jB2 j=jAj (noting that y is x2 ).
The area of the parallelogram formed by e 1 and x is y (or x2 ). After the transformation by A,
e 1 becomes a1 D .a11 ; a21 / and x becomes b. The transformed parallelogram’s area is thus


Figure 11.22: Geometric meaning of Cramer's rule illustrated for 2 \times 2 matrices. Consider A as a linear transformation which maps e_1 to a_1 (the first column of A) and x to b. Before the transformation the parallelogram formed by e_1 and x has area y; after the transformation the parallelogram formed by a_1 and b has area det[a_1 b] = y det(A).

det.Œa1 b/. But we know that this new area is the original area scaled by the determinant of A.
The Cramer rule follows.
It is now possible to have Cramer’s rule for a system of n equations for n unknowns, if
jAj ¤ 0

jB1 j jB2 j
x1 D ; x2 D ; : : : ; Bj is matrix A with j th col replaced by b (11.9.6)
jAj jAj

It is named after the Genevan mathematician Gabriel Cramer (1704–1752), who published the
rule for an arbitrary number of unknowns in 1750, although Colin Maclaurin also published
special cases of the rule in 1748 (and possibly knew of it as early as 1729).
Cramer's rule is of more theoretical than practical value, as it is not efficient to solve Ax = b this way; use Gaussian elimination instead. However, it leads to a formula for the inverse of a matrix in terms of determinants. We discuss this now.

Cramer's rule and the inverse of a matrix. Suppose that we want to find the inverse of a 2 \times 2 matrix A. Let's denote

A^{-1} = \begin{bmatrix} x_1 & y_1 \\ x_2 & y_2 \end{bmatrix}

We then solve for x_1, x_2, y_1, y_2 such that A A^{-1} = I, i.e., two systems of linear equations:

A \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad
A \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}

And Cramer's rule is used for this, so we have

x_1 = \frac{\det\begin{bmatrix} 1 & a_{12} \\ 0 & a_{22} \end{bmatrix}}{\det A}, \quad
x_2 = \frac{\det\begin{bmatrix} a_{11} & 1 \\ a_{21} & 0 \end{bmatrix}}{\det A}, \quad
y_1 = \frac{\det\begin{bmatrix} 0 & a_{12} \\ 1 & a_{22} \end{bmatrix}}{\det A}, \quad
y_2 = \frac{\det\begin{bmatrix} a_{11} & 0 \\ a_{21} & 1 \end{bmatrix}}{\det A}

Thus, we obtain the explicit formula for the inverse of a 2 \times 2 matrix (and we also understand why, for an invertible matrix, the determinant must be non-zero):

A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \implies A^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}   (11.9.7)

What next? Many people would go for a general n \times n matrix, but I am slow, so I do the same for a 3 \times 3 matrix. However, I just need to compute the (3, 2) entry of the inverse:

(A^{-1})_{32} = \frac{\det\begin{bmatrix} a_{11} & a_{12} & 0 \\ a_{21} & a_{22} & 1 \\ a_{31} & a_{32} & 0 \end{bmatrix}}{|A|}
= \frac{(-1)\det\begin{bmatrix} a_{11} & a_{12} \\ a_{31} & a_{32} \end{bmatrix}}{|A|}
= \frac{-|A_{23}|}{|A|}

Notice that the numerator of the (3, 2) entry of the inverse is (up to sign) the minor |A_{23}|. Now, we have the formula for the inverse of an n \times n matrix:

(A^{-1})_{ij} = \frac{(-1)^{i+j}|A_{ji}|}{\det A}, \qquad
A^{-1} = \frac{\text{adj}\,A}{\det A}, \qquad
\text{adj}\,A = \begin{bmatrix} |A_{11}| & -|A_{21}| & \cdots & (-1)^{n+1}|A_{n1}| \\ -|A_{12}| & |A_{22}| & \cdots & (-1)^{n+2}|A_{n2}| \\ \vdots & \vdots & \ddots & \vdots \\ (-1)^{1+n}|A_{1n}| & (-1)^{2+n}|A_{2n}| & \cdots & |A_{nn}| \end{bmatrix}   (11.9.8)

where two formulas are presented: the first one gives the ij-entry of A^{-1}, and the second one gives the entire matrix A^{-1} through the so-called adjoint (or adjugate) matrix of A. This matrix is the transpose of the matrix of cofactors of A (the cofactor of entry a_{ij} being (-1)^{i+j}|A_{ij}|).
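Eq. (11.9.8) can be checked numerically; a small sketch (my own, not from the text) building the adjugate from signed minors:

    import numpy as np

    def inverse_adjugate(A):
        n = A.shape[0]
        C = np.empty_like(A)                      # matrix of cofactors
        for i in range(n):
            for j in range(n):
                minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
                C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
        return C.T / np.linalg.det(A)             # adj(A) = C^T

    A = np.array([[2., 1., 0.],
                  [1., 3., 1.],
                  [0., 1., 4.]])
    print(np.allclose(inverse_adjugate(A), np.linalg.inv(A)))   # True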

11.10 Eigenvectors and eigenvalues


The definition of eigenvectors and eigenvalues is actually simple: an eigenvector x of a matrix A is a vector such that Ax is in the same direction as x: Ax = \lambda x. In other words, multiplying the matrix A with an eigenvector gives a new vector \lambda x, where \lambda, an eigenvalue, is a stretching/shrinking factor. The computation of eigenvectors and eigenvalues for small matrices of sizes 2 \times 2, 3 \times 3 can be done manually and in a quite straightforward manner.
However, it is hard to understand why people came up with the idea of eigenvectors. To
present a motivation for eigenvectors, we followed Euler in his study of rotation of rigid bodies.
In this context, eigenvectors appear naturallyŽŽ . So, we discuss briefly angular momentum and
inertia tensor in Section 11.10.1. Then, in Section 11.10.2 we discuss principal axes and principal
moments for a 3D rotating rigid body. From this starting point, we leave mechanics behind, and
move on to the maths of eigenvectors. I have read An Introduction To Mechanics by Daniel
Kleppner, Robert Kolenkow [34] and Classical Mechanics by John Taylor [70] for the materials
in this section.
Eigenvectors appear in many fields and thus I do not know exactly in what context eigenvalues first appeared.
ŽŽ

My decision to use the rotation of rigid bodies as a natural context for eigenvalues is that the maths is not hard.


11.10.1 Angular momentum and inertia tensor


Let's consider a rigid body which is divided into many small pieces with masses m_\alpha (\alpha = 1, 2, 3, \ldots). Now assume that this rigid body is rotating about an arbitrary axis with an angular velocity \omega. The total angular momentum of the body is given by:

l = \sum_\alpha r_\alpha \times p_\alpha   (11.10.1)

where p_\alpha = m_\alpha\, \omega \times r_\alpha, and r_\alpha denotes the position vector of mass \alpha^{Ž}. With the vector identity a \times (b \times c) = b(a \cdot c) - c(a \cdot b), we can elaborate the angular momentum l further as

l = \sum_\alpha m_\alpha\, r_\alpha \times (\omega \times r_\alpha)
  = \sum_\alpha \left( m_\alpha r_\alpha^2\, \omega - m_\alpha r_\alpha (r_\alpha \cdot \omega) \right)   (11.10.2)

With a coordinate system, the angular velocity and position vector are written as

\omega = \begin{bmatrix} \omega_x \\ \omega_y \\ \omega_z \end{bmatrix}, \qquad
r_\alpha = \begin{bmatrix} x_\alpha \\ y_\alpha \\ z_\alpha \end{bmatrix}

Thus, we can work out explicitly the components of the angular momentum in Eq. (11.10.2) as

\begin{bmatrix} l_x \\ l_y \\ l_z \end{bmatrix}
= \sum_\alpha m_\alpha
\begin{bmatrix}
(y_\alpha^2 + z_\alpha^2)\,\omega_x - x_\alpha y_\alpha\, \omega_y - x_\alpha z_\alpha\, \omega_z \\
- y_\alpha x_\alpha\, \omega_x + (x_\alpha^2 + z_\alpha^2)\,\omega_y - y_\alpha z_\alpha\, \omega_z \\
- z_\alpha x_\alpha\, \omega_x - z_\alpha y_\alpha\, \omega_y + (x_\alpha^2 + y_\alpha^2)\,\omega_z
\end{bmatrix}

which can be re-written in matrix-vector notation as

\begin{bmatrix} l_x \\ l_y \\ l_z \end{bmatrix}
= \underbrace{\begin{bmatrix}
\sum m_\alpha (y_\alpha^2 + z_\alpha^2) & -\sum m_\alpha x_\alpha y_\alpha & -\sum m_\alpha x_\alpha z_\alpha \\
-\sum m_\alpha y_\alpha x_\alpha & \sum m_\alpha (x_\alpha^2 + z_\alpha^2) & -\sum m_\alpha y_\alpha z_\alpha \\
-\sum m_\alpha z_\alpha x_\alpha & -\sum m_\alpha z_\alpha y_\alpha & \sum m_\alpha (x_\alpha^2 + y_\alpha^2)
\end{bmatrix}}_{I_\omega}
\begin{bmatrix} \omega_x \\ \omega_y \\ \omega_z \end{bmatrix}   (11.10.3)

The matrix I_\omega is called the moment of inertia matrix; it is a symmetric matrix^{ŽŽ}.

Next, we show that by calculating the kinetic energy of a 3D rotating body, the matrix I_\omega shows up again. The kinetic energy is given by

K = \sum_\alpha \frac{m_\alpha\, v_\alpha \cdot v_\alpha}{2} = \sum_\alpha \frac{m_\alpha\, (r_\alpha \times \omega) \cdot (r_\alpha \times \omega)}{2}   (11.10.4)

Ž The length of this position vector is denoted by r_\alpha.
ŽŽ To be precise, I_\omega is a second order tensor, and its representation in a coordinate system is a matrix. However, for the discussion herein, the fact that I_\omega is a tensor is not important.


which in conjunction with this vector identity jja  bjj D jjajj2 jjbjj2 .a  b/2 becomes
X m˛ .r 2 ! 2 .r ˛  !/2 /
˛
KD (11.10.5)
˛
2

Using the components of r ˛ and !, K is written as



1 X
KD m˛ .y˛2 C z˛2 /!x2 C m˛ .x˛2 C z˛2 /!y2 C m˛ .x˛2 C y˛2 /!z2
2 ˛
 (11.10.6)
2m˛ x˛ y˛ !x !y 2m˛ y˛ z˛ !y !z 2m˛ x˛ z˛ !x !z

which is a quadratic form; check Section 7.7.4 for a refresh. So, we can re-write it in this familiar
vector-matrix-vector product, and of course the matrix is I:
1
K D !> I! ! (11.10.7)
2
Moment of inertia for continuous bodies. For a continuous body B, the matrix of moment of inertia is given by (the sum is replaced by an integral and the mass m_\alpha by \rho\, dV):

I_{xx} = \int_B \rho (y^2 + z^2)\, dV, \quad I_{yy} = \int_B \rho (x^2 + z^2)\, dV, \quad I_{zz} = \int_B \rho (x^2 + y^2)\, dV
I_{xy} = -\int_B \rho\, xy\, dV, \quad I_{xz} = -\int_B \rho\, xz\, dV, \quad I_{yz} = -\int_B \rho\, yz\, dV   (11.10.8)

Example 11.8
As a first example, compute the matrix of inertia for a cube of side a and mass M (the mass is uniformly distributed, i.e., the density \rho is constant) for two cases: (a) for a rotation w.r.t. one corner and (b) w.r.t. the center of the cube. The coordinate system axes are parallel to the sides.
For case (a), we have:

I_{xx} = I_{yy} = I_{zz} = \int \rho y^2\, dV + \int \rho z^2\, dV = 2\rho \int_0^a dx \int_0^a y^2\, dy \int_0^a dz = \frac{2Ma^2}{3}

I_{xy} = I_{xz} = I_{yz} = -\rho \int_0^a x\, dx \int_0^a y\, dy \int_0^a dz = -\frac{Ma^2}{4}

where M = \rho a^3. Thus, the inertia matrix is given by (this matrix has a determinant of 242\,(Ma^2/12)^3):

I_\omega = \frac{Ma^2}{12} \begin{bmatrix} 8 & -3 & -3 \\ -3 & 8 & -3 \\ -3 & -3 & 8 \end{bmatrix}   (11.10.9)

Check the discussion around Eq. (11.1.20) if this identity is not clear.


Now, we will compute the angular momentum if the cube is rotated around the x-axis (due
to symmetry it does not matter which axis is chosen) with an angular velocity ! D .!; 0; 0/.
The angular velocity in this case is
2 32 3 2 3
8 3 3 ! 8!
Ma 62
7 6 7 Ma 6
2
7
lD 4 3 8 35 4 0 5 D 4 3! 5
12 12
3 3 8 0 3!

What we learn from this? Two things: first the inertia matrix is full and the angular momentum
is not parallel to the angular velocity. That is I! is in different direction than !. Let’s see
p
what we get if the angular velocity is along the diagonal of the cube i.e., ! D != 3.1; 1; 1/:
2 32 3 2 3
8 3 3 1 2
M a2 ! 6 7 6 7 Ma ! 6 7 Ma
2 2
lD p 4 3 8 35 415 D p 425 D !
12 3 12 3 6
3 3 8 1 2

In this case, the angular momentum is parallel to the angular velocity. In other words, I! ! D
!,  D M a2 =6.
For case (b), we have (same calculations, with integration limits from -a/2 to a/2 instead):

I_{xx} = I_{yy} = I_{zz} = \int \rho y^2\, dV + \int \rho z^2\, dV = 2\rho \int_{-a/2}^{a/2} dx \int_{-a/2}^{a/2} y^2\, dy \int_{-a/2}^{a/2} dz = \frac{Ma^2}{6}

I_{xy} = I_{xz} = I_{yz} = -\rho \int_{-a/2}^{a/2} x\, dx \int_{-a/2}^{a/2} y\, dy \int_{-a/2}^{a/2} dz = 0

Figure 11.23

Actually I_{xy} is zero because the integrand xy is an odd function. Another explanation: by looking at Fig. 11.23, we see that the material above the plane y = 0 cancels the contribution of the material below this plane (so I_{xy} = I_{yz} = 0). Thus, the inertia matrix is


given by

I_\omega = \frac{Ma^2}{6} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}

If we now compute the angular momentum for any angular velocity \omega, we get l = (Ma^2/6)\,\omega because I_\omega is a multiple of the identity matrix. So, we see two things: (1) the inertia matrix is diagonal (the off-diagonal entries are all zeros), and (2) the angular momentum is parallel to the angular velocity, i.e., I_\omega \omega = \lambda \omega with \lambda = Ma^2/6. And this holds for any \omega because of the symmetry of the cube w.r.t. its center.

Example 11.9
The second example is finding the inertia matrix for a spinning top that is a uniform solid cone (mass M, height h and base radius R) spinning about its tip O; cf. Fig. 11.23. The z-axis is chosen along the axis of symmetry of the cone.
All the integrals in the inertia matrix are computed using cylindrical coordinates. Due to symmetry, all the off-diagonal terms are zero, and I_{xx} = I_{yy}. So, we just need to compute the three diagonal terms. Let's start with I_{zz} rather than I_{xx} (we will see why this saves us some calculations):

I_{zz} = \int \rho (x^2 + y^2)\, dV = \rho \int r^3\, dr\, d\theta\, dz
       = \rho \int_0^h \left[ \int_0^{zR/h} r^3\, dr \int_0^{2\pi} d\theta \right] dz = \frac{3M}{10} R^2

From this we also get \int \rho y^2\, dV = I_{zz}/2 = (3M/20)R^2. And this saves us a bit of work when calculating I_{xx}:

I_{xx} = \int \rho (y^2 + z^2)\, dV = \int \rho y^2\, dV + \int \rho z^2\, dV
       = \frac{3M}{20} R^2 + \rho \int_0^h \left[ \int_0^{zR/h} r\, dr \int_0^{2\pi} d\theta \right] z^2\, dz = \frac{3M}{20}(R^2 + 4h^2)

So, the inertia matrix for this cone is a diagonal matrix:

I_\omega = \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_1 & 0 \\ 0 & 0 & \lambda_2 \end{bmatrix}, \qquad
\lambda_1 = \frac{3M}{20}(R^2 + 4h^2), \qquad \lambda_2 = \frac{3M}{10} R^2

For an angular velocity (\omega_x, \omega_y, \omega_z), the corresponding angular momentum is (\lambda_1 \omega_x, \lambda_1 \omega_y, \lambda_2 \omega_z). To get something interesting, consider the angular velocity \omega = (\omega, 0, 0) (that is, rotation about the x-axis); then the angular momentum is (\lambda_1 \omega, 0, 0), or \lambda_1 \omega.


11.10.2 Principal axes and eigenvalue problems


We have studied some inertia matrices and we have observed that, for the same solid, depending on the chosen axes, the inertia matrix is either full or diagonal; and when the inertia matrix is diagonal and the rotation axis is one of the coordinate axes, the angular momentum is parallel to the rotation axis. We also observed that by exploiting the symmetry of the solid we can select axes so that the inertia matrix is diagonal. A question naturally arises: does a non-symmetric solid also have axes such that its matrix of inertia is diagonal? The answer is yes (probably due to Euler) and such axes were called principal axes by the great man. The diagonal inertia matrix has this form:

I_\omega = \begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix}

where the \lambda_i are called principal moments.


Ok, now we have two problems. The first problem is how to prove that any non-symmetric
solid has principal axes and the second problem is how to find the principal axes. Herein, we
focus on the second problem, being pragmatic. But wait, why the angular momentum being
parallel to the rotation axis is important? Otherwise, people did not spend time studying this
case. ???
To find the principal axes, we use the fact that for a principal axis through a certain origin O,
if the angular velocity points along this axis, then the angular momentum is parallel to !, that is:

I! ! D ! (11.10.10)

And this is an eigenvalue equation. A vector ! satisfying Eq. (11.10.10) is called an eigenvec-
tor, and the corresponding number , the corresponding eigenvalueŽŽ . To solve the eigenvalue
equation, we re-write it in this form .I! I/! D 0. This equation only has non-zero solution
(i.e., ! ¤ 0) only when the determinant of the coefficient matrix is zero (if the determinant is
not zero, then the only solution is ! D 0, similar to equation 2x D 0). That is,

det.I! I/ D 0 (11.10.11)

This is called the characteristic equation which is a cubic equation in terms of . Solving this for
 and substitute  into Eq. (11.10.10), we get a system of linear equations for three unknowns
! of which solutions are the eigenvectors (or principal axes).
We consider the cube example again (case a). With \mu = Ma^2/12, the characteristic equation is (see Eq. (11.10.9)):

\begin{vmatrix} 8\mu - \lambda & -3\mu & -3\mu \\ -3\mu & 8\mu - \lambda & -3\mu \\ -3\mu & -3\mu & 8\mu - \lambda \end{vmatrix} = 0
\implies (2\mu - \lambda)(11\mu - \lambda)^2 = 0
\implies \lambda_1 = 2\mu, \quad \lambda_2 = \lambda_3 = 11\mu

ŽŽ The German adjective eigen means "own" or "characteristic of". Eigenvalues and eigenvectors are characteristic of a matrix in the sense that they contain important information about the nature of the matrix.


with  D Ma2 =12. First observation: 1 C 2 C 3 D 24 and is equal to I11 C I22 C I33 .
Second observation 1 2 3 D 2423 , which is det I! . So, at least for this example, the sum of
the eigenvalues is equal to the trace of the matrix, and the product of the eigenvalues is equal to
the determinant of the matrix.
For the first eigenvalue  D 2, we have this system of equations:
2 32 3 2 3
6 3 3 !1 0
6 76 7 6 7
4 3 6 35 4!2 5 D 405
3 3 6 !3 0
p
of which the solution is !1 D !2 D !3 . So, the first principal axis is e 1 D .1= 3/.1; 1; 1/.
For the second and third eigenvalues  D 11, we have this system of equations:
2 32 3 2 3
3 3 3 !1 0
6 76 7 6 7
4 3 3 35 4!2 5 D 405
3 3 3 !3 0
of which the solution is !1 C !2 C !3 D 0. We are looking for the other two axes, so we think
of vectors perpendicular to the first principal axis i.e., e 1 . So, we write !1 C !2 C !3 D 0 as
!  e 1 D 0. This indicates that the other two axes are perpendicular to the first axis. Later on
we shall prove that the eigenvectors corresponding to distinct eigenvalues are orthogonal if the
matrix is symmetric.
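We can let the computer confirm these principal axes and the trace/determinant observations; a quick NumPy check (my own sketch, taking M = a = 1 so that \mu = 1/12):

    import numpy as np

    mu = 1.0 / 12.0                                  # M a^2 / 12 with M = a = 1
    I_omega = mu * np.array([[ 8., -3., -3.],
                             [-3.,  8., -3.],
                             [-3., -3.,  8.]])

    lam, V = np.linalg.eigh(I_omega)                 # eigh: for symmetric matrices
    print(np.round(lam / mu, 6))                     # [ 2. 11. 11.]
    print(np.round(V[:, 0], 6))                      # +/- (1,1,1)/sqrt(3): the cube diagonal
    print(np.isclose(lam.sum(), np.trace(I_omega)))          # sum of eigenvalues = trace
    print(np.isclose(lam.prod(), np.linalg.det(I_omega)))    # product = determinant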

Principal stresses and principal planes. It is a fact that the same thing happens again and
again in many different fields. Herein, we demonstrate this by presenting principal stresses and
principal planes from a field called solid mechanics or mechanics of materials. This field is
studied by civil engineers, mechanical engineers, aerospace engineers and those people who
want to design structures and machines.
Similar to I_\omega, \omega and l, in solid mechanics there are the (second order) stress tensor \sigma, the normal vector n and the traction vector t. And we also have a relation between them, due to Cauchy:

t = \sigma n   (11.10.12)

Again, t is in general not in the same direction as n. So, principal planes are those with normal vectors n such that \sigma n = \lambda n, the scalars \lambda being called the principal stresses (there are three principal stresses).

11.10.3 Eigenvalues and eigenvectors


We now provide a formal definition of eigenvectors and eigenvalues of a square matrix.

Definition 11.10.1
Let A be an n  n matrix. A scalar  is called an eigenvalue of A if there is a nonzero vector
x such that Ax D x. Such a vector is called an eigenvector of A corresponding to .


If x is an eigenvector of A with the corresponding eigenvalue \lambda, then Ax = \lambda x, which leads to A(cx) = c(Ax) = c\lambda x = \lambda(cx). This means that any non-zero multiple of x (that is, cx) is also an eigenvector. Thus, if we want to search for eigenvectors geometrically, we need only consider the effect of A on unit vectors. Fig. 11.24 shows what happens when we transform unit vectors with matrices. All transformed vectors lie on an ellipse; refer to Section 11.12.2 for an explanation.

Figure 11.24: Eigenpicture (two panels, (a) and (b)): x are points on the unit circle and the transformed vectors Ax are plotted head to tail with x. An eigenvector is one for which x and Ax are aligned.

Example 11.10
Find the eigenvalues and the eigenspaces of

A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 2 & -5 & 4 \end{bmatrix}

The characteristic polynomial is

\det(A - \lambda I) = -\lambda^3 + 4\lambda^2 - 5\lambda + 2 = -(\lambda - 1)(\lambda - 1)(\lambda - 2)

Thus, the characteristic equation is (\lambda - 1)(\lambda - 1)(\lambda - 2) = 0, which has solutions \lambda_1 = \lambda_2 = 1 and \lambda_3 = 2. Note that even though A is a 3 \times 3 matrix, it has only two distinct eigenvalues. But if we count multiplicities (repeated roots), then A has exactly three eigenvalues. The algebraic multiplicity of an eigenvalue is its multiplicity as a root of the characteristic equation. Thus, \lambda = 1 has algebraic multiplicity 2 and \lambda = 2 has algebraic multiplicity 1.
Now, to find the eigenvectors for a certain \lambda, we search for x such that

(A - \lambda I)x = 0

Thus, the eigenvector x is in the null space of A - \lambda I. The set of all eigenvectors together with the zero vector forms a subspace known as an eigenspace, denoted by E_\lambda. Now, for \lambda_1 = \lambda_2 = 1 we need to find the null space of A - I (using Gauss elimination)^a:

A - I = \begin{bmatrix} -1 & 1 & 0 \\ 0 & -1 & 1 \\ 2 & -5 & 3 \end{bmatrix}
\implies \begin{bmatrix} 1 & 0 & -1 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\implies E_1 = \text{span}\left( \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \right)

Similarly, to find the eigenvectors for \lambda_3 = 2, we look for the null space of A - 2I:

A - 2I = \begin{bmatrix} -2 & 1 & 0 \\ 0 & -2 & 1 \\ 2 & -5 & 2 \end{bmatrix}
\implies \begin{bmatrix} 1 & 0 & -1/4 & 0 \\ 0 & 1 & -1/2 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\implies E_2 = \text{span}\left( \begin{bmatrix} 1 \\ 2 \\ 4 \end{bmatrix} \right)

Note that \dim(E_1) = \dim(E_2) = 1. Let us define the geometric multiplicity of an eigenvalue to be the dimension of its eigenspace. Why do we need this geometric multiplicity? Because of this fact: an n \times n matrix is diagonalizable if and only if the sum of the dimensions of the eigenspaces is n, i.e., the matrix has n linearly independent eigenvectors. (Thus, the matrix considered in this example is not diagonalizable.)
a Why do we see a row full of zeros? Because A - \lambda I is singular, by the definition of eigenvectors.
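A numerical cross-check of this example (my own sketch, not from the text); computing the geometric multiplicity directly from the rank of A - I avoids the numerical delicacy of repeated eigenvalues:

    import numpy as np

    A = np.array([[0., 1., 0.],
                  [0., 0., 1.],
                  [2., -5., 4.]])

    lam = np.linalg.eigvals(A)
    print(np.round(np.sort(lam.real), 4))            # approximately [1, 1, 2]

    # geometric multiplicity of lambda = 1: dimension of the null space of A - I
    geo_mult = 3 - np.linalg.matrix_rank(A - np.eye(3))
    print(geo_mult)      # 1, smaller than the algebraic multiplicity 2: A is not diagonalizable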

11.10.4 More on eigenvectors/eigenvalues


If we take a 3 \times 3 lower (or upper) triangular matrix A and compute its eigenvalues, we shall find that the process is easy. This is because when A is a triangular matrix, so is A - \lambda I, with diagonal entries a_{ii} - \lambda. And we know that the determinant of a triangular matrix is the product of the diagonal terms, thus the characteristic equation is simply the following factored cubic equation:

(a_{11} - \lambda)(a_{22} - \lambda)(a_{33} - \lambda) = 0

From that, the eigenvalues are simply the diagonal entries of the triangular matrix. This fact holds for any n \times n triangular matrix, including diagonal matrices.
Below are some properties of eigenvectors and eigenvalues:

1. If Ax = \lambda x, then A^2 x = \lambda^2 x and A^n x = \lambda^n x for a positive integer n^{Ž}.

2. If Ax = \lambda x and A is invertible, then A^{-1} x = \lambda^{-1} x.

3. If Ax = \lambda x, then A^n x = \lambda^n x for any integer n^{ŽŽ}.

Ž We write A^2 x = AAx = A(Ax) = A(\lambda x) = \lambda(Ax) = \lambda(\lambda x) = \lambda^2 x.
ŽŽ This holds because (A^n)^{-1} = (A^{-1})^n for a positive integer n.


4. A square matrix A is invertible if and only if 0 is not an eigenvalue of A‘ .

5. Let A be an n  n matrix and let 1 ; 2 ; : : : ; m be distinct eigenvalues of A with corre-


sponding eigenvectors v1 ; v2 ; : : : ; vm . Then, these eigenvectors are linear independent.

6. Let
Q 1 ; 2 ; : : : ; n bePa complete set of eigenvalues of an n  n matrix A, then det.A/ D
i i , and tr.A/ D i i .

Proof. [Proof of 5] For simplicity the proof is for a 2  2 matrix only. The two eigenvectors of A
are x 1 ; x 2 . Suppose that c1 x 1 C c2 x 2 D 0. Multiplying it with A yields: c1 1 x 1 C c2 2 x 2 D 0
and multiplying it with 2 gives: c1 2 x 1 C c2 2 x 2 D 0. Subtracting the obtained two equations
yields
.1 2 /c1 x 1 D 0
Now that 1 ¤ 2 and x 1 ¤ 0 (the premise of the problem), thus we must have c1 D 0. Doing
the same thing we also get c2 D 0. Thus, the eigenvectors are linear independent. 

Proof. Proof of 6

det.A I/ D p./ D . 1/n . 1 /. 2 /    . n /


D .1 /.2 /    .n /

11.10.5 Symmetric matrices


The question addressed in this section is: what is special about Ax = \lambda x when A is symmetric? As we shall see, there are many nice results about eigenvectors/eigenvalues when the matrix is symmetric. Strang wrote: 'It is no exaggeration to say that symmetric matrices are the most important matrices the world will ever see'.
The first nice result is stated in the following theorem.

Theorem 11.10.1
If A is a symmetric real matrix, then its eigenvalues are real.

Proof. How are we going to prove this theorem? Let us denote by x and \lambda an eigenvector and eigenvalue of A; \lambda might be a complex number of the form a + bi and the components of x may be complex numbers. Our task is to prove that \lambda is real. One way is to prove that the complex conjugate of \lambda, which is \bar{\lambda} = a - bi, is equal to \lambda. That is, prove \bar{\lambda} = \lambda. To this end, we need to extend the notion of complex conjugate to vectors and matrices. It turns out to be easy: just replace the entries of the vectors/matrices by their conjugates. That is, if A = [a_{ij}], then its

Since A is only invertible when det A ¤ 0, which is equivalent to det.A 0I/ ¤ 0. Thus 0 is not an
eigenvalue of A when it is invertible.


conjugate A D Œaij . Properties of complex conjugates as discussed in Section 2.25 still apply
for matrices/vectors; e.g. AB D AN B.
N
We start with Ax D x, and to make  appear, take the conjugate of this equation to get
Ax D Ax D x D x
Now, to use the information that A is real (which means that A D A) and it is symmetric (which
means that A> D A), we transpose the above Ax D x:
x > A D x >
Now we have two equations:
Ax D x; x > A D x >
Now we compute the dot product of the first equation with x > , and the dot product of the second
equation with x, we obtain
x > Ax D x > x; x > Ax D x > x; H) x > x D x > x H) . /x > x D 0
But, x > x ¤ 0 as x is not a zero vector (it is an eigenvector). Thus, we must have  D  or
a C bi D a bi which leads to b D 0. Hence, the eigenvalues are real. 
We know that for any square matrix, eigenvectors corresponding to distinct eigenvalues are linearly independent. For symmetric matrices, something stronger is true: such eigenvectors are orthogonal^{||}. So, we have the following theorem.

Theorem 11.10.2
If A is a symmetric matrix, then any two eigenvectors corresponding to distinct eigenvalues of A are orthogonal.

The proof of this theorem is not hard, but how could we anticipate this result? From Section 11.11.4 on matrix diagonalization, we know that we can decompose A as A = V\Lambda V^{-1}. Transposing it gives A^\top = (V^{-1})^\top \Lambda V^\top. As A is symmetric, we then have V\Lambda V^{-1} = (V^{-1})^\top \Lambda V^\top. We then guess that V^\top = V^{-1}, or V^\top V = I: V is an orthogonal matrix!
Theorem 11.10.3: Spectral theorem
Let A be an n \times n real symmetric matrix; then it has the factorization A = Q\Lambda Q^\top with real eigenvalues in \Lambda and orthonormal eigenvectors in the columns of Q:

A = Q\Lambda Q^{-1} = Q\Lambda Q^\top \quad \text{with } Q^\top = Q^{-1}

Next, we derive the so-called spectral decomposition of A. To see the point, assume that A is a 2 \times 2 matrix; we can then write (from the spectral theorem)

A = Q\Lambda Q^\top
  = \begin{bmatrix} q_1 & q_2 \end{bmatrix}
    \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}
    \begin{bmatrix} q_1^\top \\ q_2^\top \end{bmatrix}
  = \begin{bmatrix} q_1 & q_2 \end{bmatrix}
    \begin{bmatrix} \lambda_1 q_1^\top \\ \lambda_2 q_2^\top \end{bmatrix}
  = \sum_{i=1}^{2} \lambda_i q_i q_i^\top   (11.10.13)

|| The proof goes as \lambda_1 x_1 \cdot x_2 = (A x_1) \cdot x_2 = x_1 \cdot (A x_2) = \lambda_2 x_1 \cdot x_2, thus (\lambda_1 - \lambda_2)\, x_1 \cdot x_2 = 0. But \lambda_1 \neq \lambda_2.
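Eq. (11.10.13) is easy to verify numerically; a small sketch (my own example, not from the text):

    import numpy as np

    A = np.array([[5., 4.],
                  [4., 5.]])

    lam, Q = np.linalg.eigh(A)                 # symmetric matrix: orthonormal eigenvectors
    print(np.allclose(Q.T @ Q, np.eye(2)))     # Q is orthogonal

    # spectral decomposition: A = sum_i lambda_i q_i q_i^T
    A_rebuilt = sum(lam[i] * np.outer(Q[:, i], Q[:, i]) for i in range(2))
    print(np.allclose(A_rebuilt, A))           # True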


11.10.6 Quadratic forms and positive definite matrices


We have seen quadratic forms (e.g. ax 2 C bxy C cy 2 ) when discussing the extrema of functions
of two variables (Section 7.7) and when talking about the kinetic energy of a 3D rotating body
(Section 11.10.1). Now is the time for a formal definition of quadratic forms:

Definition 11.10.2
A quadratic form in n variables is a function f W Rn ! R of the form

f .x/ D x > Ax

where A is a symmetric n  n matrix and x 2 Rn . We refer to A as the matrix associated with


the quadratic form f .

Definiteness of quadratic forms. Let x be a vector in R^n and let f(x) = x^\top A x be a quadratic form. An important property of f(x) is its definiteness, defined as:

f(x) is positive definite        f(x) > 0 for all x \neq 0
f(x) is positive semi-definite   f(x) \geq 0 for all x
f(x) is negative definite        f(x) < 0 for all x \neq 0      (11.10.14)
f(x) is negative semi-definite   f(x) \leq 0 for all x
f(x) is indefinite               f(x) takes both positive and negative values

If f(x) is positive definite, then its associated matrix A is said to be a positive definite matrix.
The next problem we have to solve is: when is a quadratic form positive definite? What are then the properties of A? To answer this question, one observation is that if there is no cross term in f(x), then it is easy to determine its definiteness. One example is enough to convince us: f(x) = 2x^2 + 4y^2 is clearly positive definite. Furthermore, without the cross term, the associated matrix is diagonal:

f(x) = 2x^2 + 4y^2 = \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & 4 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}   (11.10.15)

Diagonal matrices? We need the spectral theorem (Theorem 11.10.3), which states that an n \times n real symmetric matrix has the factorization A = Q\Lambda Q^\top with real eigenvalues in the diagonal matrix \Lambda and orthonormal eigenvectors in the columns of Q. Thus, we make the change of variable x = Qy and compute the quadratic form with this new variable y; magic will happen^{ŽŽ}:

f(x) = x^\top A x = (Qy)^\top A (Qy) = y^\top \underbrace{Q^\top A Q}_{\Lambda}\, y = y^\top \Lambda y = \sum_{i=1}^{n} \lambda_i y_i^2   (11.10.16)

ŽŽ We cannot know in advance that this will work, but we have to try, and usually pieces of mathematics fit nicely together.


Obviously y > y is of the form in the RHS of Eq. (11.10.15).


With this result, it is now straightforward to say about the definiteness of a quadratic form:
it all depends on the eigenvalues of A:

f .x/ is positive definite A has all positive eigenvalues


f .x/ is positive semi-definite A has all non-negative eigenvalues
f .x/ is negative definite A has all negative eigenvalues (11.10.17)
f .x/ is negative semi-definite A has all non-positive eigenvalues
f .x/ is indefinite A has both positive/negative eigenvalues

Principal axes theorem and ellipses. Eq. (11.10.16) is the principal axes theorem. This theorem tells us that any quadratic form can be written in a form without cross terms; this is achieved by the change of variable x = Qy. Now, we explain the name of the theorem. Consider the following conic section (Section 4.1.5):

5x^2 + 8xy + 5y^2 = 1 \iff x^\top A x = 1, \quad x = (x, y), \quad A = \begin{bmatrix} 5 & 4 \\ 4 & 5 \end{bmatrix}

First, the eigenvalues and eigenvectors of A^{}:

\lambda_1 = 1, \quad \lambda_2 = 9; \qquad v_1 = (1/\sqrt{2}, -1/\sqrt{2}), \quad v_2 = (1/\sqrt{2}, 1/\sqrt{2})

Then, the following change of variable^{ŽŽ}

x = Q x', \qquad Q = \begin{bmatrix} +1/\sqrt{2} & 1/\sqrt{2} \\ -1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix}, \qquad x' = (y_1, y_2)

results in (see Eq. (11.10.16))

1 y_1^2 + 9 y_2^2 = 1, \quad \text{or} \quad \left( \frac{y_1}{1/\sqrt{1}} \right)^2 + \left( \frac{y_2}{1/\sqrt{9}} \right)^2 = 1

Thus, our conic is an ellipse. The second expression gives us the lengths of the ellipse axes. Now, to graph this ellipse we need to know its axes. To this end, we need to know where the unit vector of the (y_1, y_2) coordinate system, e'_1 = (1, 0), goes. Using x = Qx', we have

Q e'_1 = \begin{bmatrix} +1/\sqrt{2} & 1/\sqrt{2} \\ -1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} +1/\sqrt{2} \\ -1/\sqrt{2} \end{bmatrix}

 You can reverse the direction of v_1.
ŽŽ If you check Section 4.1.5 again you will see that this change of variable is exactly the rotation mentioned in that section. Here, we have A = C = 5, thus the rotation angle is \pi/4.

which is the first eigenvector of A. Similarly, e'_2 = (0, 1) is mapped to the second eigenvector. Thus, the eigenvectors of A (the matrix associated with a quadratic form) give the directions of the principal axes of the corresponding graph of the quadratic form. This explains why A = Q\Lambda Q^\top is called the principal axis theorem: it displays the axes. What is more, the eigenvalues of A give us the lengths of the axes. The smaller eigenvalue (1) gives the length of the semi-major axis (1/\sqrt{1}) and the larger eigenvalue (9) gives the shorter axis (of half length 1/\sqrt{9}). (The accompanying sketch shows the ellipse with the rotated axes y_1, y_2 along v_1, v_2, at 45° to the x-axis.) This geometry will help us solve constrained optimization problems relating to quadratic forms, as explained in what follows.
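A quick numerical confirmation of this example (my own sketch, not from the text): the change of variable x = Qx' removes the cross term, and points on the conic satisfy the diagonal equation (the columns of Q returned by NumPy may differ from v_1, v_2 by a sign, which does not matter here):

    import numpy as np

    A = np.array([[5., 4.],
                  [4., 5.]])
    lam, Q = np.linalg.eigh(A)                      # lam = [1, 9]
    print(np.allclose(Q.T @ A @ Q, np.diag(lam)))   # no cross terms in the new variables

    # take a point x on the conic 5x^2 + 8xy + 5y^2 = 1 and map it to (y1, y2)
    x = np.array([1., -1.]) / np.sqrt(2.0)          # satisfies x^T A x = 1
    y = Q.T @ x
    print(np.isclose(lam[0]*y[0]**2 + lam[1]*y[1]**2, 1.0))   # 1*y1^2 + 9*y2^2 = 1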

Constrained optimization problems. I now present one application of the definiteness of a quadratic form. Assume that a quadratic form f(x) = x^\top A x is positive semi-definite; then, since f(0) = 0, the minimum value of f(x) is zero, without any calculus. More often, we have to find the maximum/minimum of f(x) with x subject to the constraint \|x\| = 1. Thus, we pose the following constrained optimization problem^{||}:

\max_{x \neq 0} \frac{x^\top A x}{x^\top x} \quad \text{or} \quad \max_{\|x\| = 1} x^\top A x

The solution to this problem actually lies in Eq. (11.10.16): to see that, just look at f = 1 y_1^2 + 9 y_2^2 with the constraint y_1^2 + y_2^2 = 1; the maximum is f = 9, the largest eigenvalue of the matrix associated with the quadratic form. In general, sort the eigenvalues of A in the order \lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n; then

f(x) = \sum_{i=1}^n \lambda_i y_i^2 = \lambda_1 y_1^2 + \lambda_2 y_2^2 + \cdots + \lambda_n y_n^2
     \leq \lambda_1 y_1^2 + \lambda_1 y_2^2 + \cdots + \lambda_1 y_n^2
     = \lambda_1 (y_1^2 + y_2^2 + \cdots + y_n^2) = \lambda_1

|| To see that the two forms are equivalent, we can write

\frac{x^\top A x}{x^\top x} = \frac{x^\top A x}{\|x\|^2} = \left( \frac{x}{\|x\|} \right)^\top A \left( \frac{x}{\|x\|} \right)


Another derivation.

A symmetric matrix A has orthonormal eigenvectors v1 ; : : : ; vn . Thus, we can write


x D c1 v1 C    C cn vn , hence

kxk2 D x > x D c12 C    C cn2 ; x > Ax D 1 c12 C    C n cn2

Therefore, the Rayleigh quotient R.x/ is now written as

x > Ax 1 c12 C    C n cn2


D
x>x c12 C    C cn2

From that it can be seen that the maximum of R.x/ is 1 . What is nice with this form of
R.x/ is the ease to find the maximum of R.x/ when the constraint is x is perpendicular
to v1 . This constraint means that c1 D 0, thus

x > Ax 2 c22 C    C n cn2


max D max D 2
x>x c22 C    C cn2
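A quick numerical illustration of both claims (my own sketch, not from the text): random vectors never beat \lambda_1, and restricting to vectors orthogonal to v_1 caps the quotient at \lambda_2:

    import numpy as np

    rng = np.random.default_rng(0)
    A = np.array([[4., 1., 0.],
                  [1., 3., 1.],
                  [0., 1., 2.]])
    lam, V = np.linalg.eigh(A)      # ascending order: lam[-1] is lambda_1 in the text's ordering
    v1 = V[:, -1]

    def rayleigh(x):
        return (x @ A @ x) / (x @ x)

    samples = rng.normal(size=(10000, 3))
    print(max(rayleigh(x) for x in samples) <= lam[-1] + 1e-9)   # never exceeds lambda_1

    # remove the component along v1, so the samples are orthogonal to v1
    orth = samples - np.outer(samples @ v1, v1)
    print(max(rayleigh(x) for x in orth) <= lam[-2] + 1e-9)      # capped by lambda_2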

11.11 Vector spaces


Up to this point we have seen many mathematical objects: numbers, vectors, matrices and
functions. Do these different objects share any common thing? Many, actually. First, we can add
two numbers, we can add two vectors, we can add two matrices and of course we can add two
functions. Second, we can multiply a vector by a scalar, a matrix by a scalar and a function by
a scalar. Third, adding two vectors gives us a new vector, adding two matrices returns a matrix,
and adding two functions gives us a function (not anything else).
We believe the following equation, showing a vector in R^4, a polynomial of degree less than or equal to 3, and a 2 \times 2 matrix,

u = \begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix}, \qquad p(x) = a + bx + cx^2 + dx^3, \qquad A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}

is a good illustration that all these objects are related. After all, they are represented by four numbers a, b, c, d^{ŽŽ}.
It seems reasonable and logical now for mathematicians to unify all these seemingly different
but similar objects. Here comes vector spaces, which constitute the most abstract part of linear
ŽŽ
We can view p.x/ D a C bx C cx 2 C dx 3 as a space–similar to Rn –with a basis of f1; x; x 2 ; x 3 g. Thus,
.a; b; c; d / are the coordinates of p.x/ with respect to that basis. And .a; b; c; d / can also be seen as the coordinates
of a point in R4 !


algebra. The term vector spaces is a bit confusing because not all objects in vector spaces are
vectors; e.g. matrices are not vectors. A better name would probably be linear spaces. About the
power of algebra, Jean le Rond d'Alembert wrote: "Algebra is generous; she often gives more
than is asked of her".

11.11.1 Vector spaces


To define a vector space, let V be a set of objects u, v, w, \ldots on which two operations, called addition and scalar multiplication, are defined: the sum of u and v is denoted by u + v, and if \alpha is a scalar, the scalar multiple of v is denoted by \alpha v. Then V is a vector space (sometimes also referred to as a linear space) if the following ten axioms are satisfied (\alpha, \beta are scalars):

(1) commutativity of addition:               u + v = v + u
(2) associativity of addition:               (u + v) + w = u + (v + w)
(3) identity element of addition:            u + 0 = u
(4) inverse element of addition:             u + (-u) = 0
(5) distributivity w.r.t. vector addition:   \alpha(u + v) = \alpha u + \alpha v
(6) distributivity w.r.t. field addition:    (\alpha + \beta)u = \alpha u + \beta u        (11.11.1)
(7) compatibility of scalar multiplication:  \alpha(\beta u) = (\alpha\beta)u
(8) identity element of multiplication:      1u = u
(9) closure under addition:                  u + v \in V
(10) closure under multiplication:           \beta u \in V

So, a vector space is a set of objects called vectors, which may be added together and multiplied
("scaled") by numbers, called scalars and these vectors satisfy the above ten axioms. Sometimes
we see this notation .V; R; C; / to denote a vector space V over R with the two operations of
addition and multiplication.

Example 1. Of course Rn with n  1 is a vector space. All the ten axioms of a vector space can
be verified easily.

Example 2. Let P2 be the set of all polynomials of degree less than or equal 2 with real coeffi-
cients. To see if P2 is a vector space, we first need to define the two basic operations of addition
and scalar multiplication. If p.x/; q.x/ are two objects in P2 , then p.x/ D a0 C a1 x C a2 x 2
and q.x/ D b0 C b1 x C b2 x 2 . Addition and scalar multiplication are defined as

p(x) + q(x) = (a_0 + b_0) + (a_1 + b_1)x + (a_2 + b_2)x^2,     αp(x) = αa_0 + αa_1 x + αa_2 x^2


This verifies the last two axioms on closure. The identity element for addition is the
polynomial with all coefficients being zero. The inverse element of addition of p(x) is
-p(x) = -a_0 - a_1 x - a_2 x^2. Verification of the other axioms is straightforward as they come from
the arithmetic rules of real numbers.

Example 3. Let denote by F the set of all real-valued functions defined on the real line. If f .x/
and g.x/ are such two functions and ˛ is a scalar, then we define .f C g/.x/ and ˛f .x/ as

.f C g/.x/ WD f .x/ C g.x/; .˛f /.x/ WD ˛f .x/ (11.11.2)

The zero function is f .x/ D 0 for all x. The negative function . f /.x/ is f .x/. It can
then be seen that F is a vector space, but a vector space of infinite dimension. Usually linear
algebra deals with finite dimensional vector spaces and functional analysis concerns infinite
dimensional vector space. But we do not follow this convention and cover both spaces in this
chapter. Similarly, we have another vector space, FŒa; b that contains all real-valued functions
defined on the interval Œa; b.

Example 4. All rectangular matrices of shape m  n belong to a vector space Rmn . From
Section 11.4.2, we can verify that matrices obey the ten axioms of linear spaces. And the columns
of a m  n matrix are also vector spaces because a column is a Rm vector.
If matrices are vectors, then we can do a linear combination of matrices, we can talk about
linearly independent matrices. For example, consider the space of all 2  2 matrices M . It is
obvious that we can write any such matrix as:

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} = a\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + b\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} + c\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} + d\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$$

The four matrices on the right hand side are linearly independent, and they are the basis vectors of M; they play the
same role as the unit vectors e_i that we're familiar with.
If a C c D b C c then a D b for a; b; c being scalars, n vectors or matrices. Thus, we
guess that this holds for any vector in a vector space. The following theorem is a summary of
some properties that vectors in a vector space satisfy. These properties are called the trivial
consequences of the axioms as they look obvious.


Theorem 11.11.1
Let V be a vector space, let a, b, c, v be vectors in V , and let k be a scalar. Then, we have

(a) If a + c = b + c then a = b.

(b) If a + b = b then a = 0.

(c) k0 = 0 and 0v = 0.

(d) (-1)v = -v.

(e) If kv = 0, then either k = 0 or v = 0.

Proof. Proof of (a) goes as follows.

    a = a + 0            (axiom 3)
      = a + (c + x)      (x is the additive inverse of c, so c + x = 0)
      = (a + c) + x      (axiom 2)
      = (b + c) + x      (given)
      = b + (c + x)      (axiom 2)
      = b + 0            (c + x = 0)
      = b                (axiom 3)

Proof of (b) is based on (a):

    a + b = b = b + 0  ==>  a = 0      (using (a))

Proof of (c) is based on axioms 5/6:

    k0 = k(0 + 0) = k0 + k0  ==>  k0 = 0      (axiom 5, then (b))
    0v = (0 + 0)v = 0v + 0v  ==>  0v = 0      (axiom 6, then (b))

Proof of (d) is:

    v + (-1)v = 1v + (-1)v = (1 + (-1))v = 0v = 0      (axiom 8, axiom 6, then (c))

But we know that v + (-v) = 0, thus (-1)v = -v. Proof of (e) is (we're interested in the case
k ≠ 0 only, otherwise (e) is simply (c)):

    v = 1v = (k^{-1} k)v = k^{-1}(kv) = k^{-1} 0 = 0      (axiom 8, axiom 7, then (c))

ŽŽ
Why? Because these axioms involve the scalar multiplication of vectors.


11.11.2 Change of basis


Similar to R^n (which is a vector space), there exist subspaces inside a vector space V; each
subspace has a basis, that is, a set of vectors that are linearly independent and span the subspace.
I do not state the definitions of those concepts here for the sake of brevity. Below are some examples.

Example 1. In P2 , determine whether the set f1 C x; x C x 2 ; 1 C x 2 g is linearly independent.


Let c1 ; c2 ; c3 be scalars such that

c1 .1 C x/ C c2 .x C x 2 / C c3 .1 C x 2 / D 0

This is equivalent to

.c1 C c3 / C .c1 C c2 /x C .c2 C c3 /x 2 D 0  1 C 0  x C 0  x 2

Equating the coefficients of 1; x; x 2 results in

c1 C c3 D 0 c1 C c2 D 0; c2 C c3 D 0 H) c1 D c2 D c3 D 0

It follows that f1 C x; x C x 2 ; 1 C x 2 g is linearly independent.

Example 2. Show that the set {1 + x, x + x^2, 1 + x^2} is a basis for P_2. In the previous example, we


have shown that this set is linearly independent. Now to prove that it is the basis for P2 , we just
need to show that it spans P2 . In other words, we need to show that we can always find c1 ; c2 ; c3
such that
c1 .1 C x/ C c2 .x C x 2 / C c3 .1 C x 2 / D a C bx C cx 2
holds for all a; b; c. This is equivalent to

.c1 C c3 / C .c1 C c2 /x C .c2 C c3 /x 2 D a C bx C cx 2

And we then get again a linear system:

c1 C c3 D a c1 C c2 D b; c2 C c3 D c

The coefficient matrix of this system is invertible, thus it always has a solution. As {1 + x, x + x^2, 1 + x^2}
is a basis for P_2, we deduce that dim(P_2) = 3. And P_2 is a finite-dimensional vector space. The
following definition aims to make this precise.

Definition 11.11.1
A vector space V is called finite-dimensional if it has a basis consisting of finitely many
vectors. The dimension of V , denoted by dimV , is the number of vectors in a basis for V . The
dimension of the zero vector space f0g is defined to be zero. A vector space that has no finite
basis is called infinite-dimensional.

Coordinates. Consider a vector space V with a basis B D fv1 ; v2 ; : : : ; vn g, any vector v 2 V


can be written as a unique linear combination of vi ’s: v D c1 v1 C c2 v2 C    C cn vn . The vector


ŒvB D .c1 ; c2 ; : : : ; cn / is called the coordinate vector of v with respect to the basis B, or the
B-coordinates of v. Whatever v is, its B coordinates is a vector in the familiar Rn . Now with
the new object ŒvB , certainly we have some rules that this object obey, as stated by the following
theoremŽŽ .
Theorem 11.11.2
Consider a vector space V with a basis B D fv1 ; v2 ; : : : ; vn g. If we have two vectors u and v
in V and we know their coordinates ŒuB and ŒvB , then we can determine the coordinates of
their sum and the coordinates of ˛v

(a) Œu C vB D ŒuB C ŒvB

(b) Œ˛vB D ˛ŒuB

Example 11.11
Consider a vector in P2 : p.x/ D a C bx C cx 2 . If we use the standard basis B D f1; x; x 2 g
for P2 , then it is easy to see that the coordinate vectors of p.x/ w.r.t. B is
    [p(x)]_B = [a  b  c]^T

which is simply a vector in R3 . Thus, Œp.x/B connects the possibly unfamiliar space P2 with
the familiar space R3 . Points in P2 can now be identified by their coordinates in R3 , and every
vector-space calculation in P2 is accurately reproduced in R3 (and vice versa). Note that P2
is not R3 but it does look like R3 as a vector space.
What is Œ1B ? As we can write 1 D .1/.1/ C 0.x/ C 0.x 2 /, therefore Œ1B D .1; 0; 0/ D e 1 .
Similarly, ŒxB D .0; 1; 0/ D e 2 . So, if B D fv1 ; v2 ; : : : ; vn g is a basis for a vector space,
then Œvi B D e i .

The above example demonstrates that there is a connection between a vector space V
and Rn , and the following theorem is one of such connection. We shall use this theorem in
definition 11.11.2 when we discuss the change of basis matrix and use it to show that this matrix
is invertible.

Theorem 11.11.3
Let B D fv1 ; v2 ; : : : ; vn g be a basis for a vector space V and let u1 ; u2 ; : : : ; uk be vectors in
V , then fu1 ; u2 ; : : : ; uk g is linear independent in V if and only if fŒu1 B ; Œu2 B ; : : : ; Œuk B g is
linear independent in Rn .

Proof. First, we prove that if fu1 ; u2 ; : : : ; uk g is linear independent in V then


fŒu1 B ; Œu2 B ; : : : ; Œuk B g is linear independent in Rn . To this end, we consider a linear
ŽŽ
The proof is straightforward and uses the definition of ŒvB . If you’re stuck check [56].


combination of fŒu1 B ; Œu2 B ; : : : ; Œuk B g and set it to zero

c1 Œu1 B C c2 Œu2 B C    C ck Œuk B D 0

Our task is now to show that c1 D c2 D    D ck D 0. Theorem 11.11.2 allows us to rewrite the
above as
Œc1 u1 C c2 u2 C    C ck uk B D 0

which means that the coordinate vector of c1 u1 C c2 u2 C    C ck uk w.r.t. B is the zero vector.
Therefore, we can write

c1 u1 C c2 u2 C    C ck uk D 0v1 C 0v2 C    C 0vn D 0

Since fu1 ; u2 ; : : : ; uk g is linear independent the above equation forces ci ’s to be all zero.


Change of basis. Now, we discuss the topic of change of basis. The reason is simple: it is
more convenient to work with some bases than with others. We study how to do a change of basis herein.
Consider the familiar R^2 plane with two nonstandard bases: B with u_1 = (-1, 2) and u_2 = (2, -1);
and C with v_1 = (1, 0) and v_2 = (1, 1). Certainly, all these vectors (e.g. u_1) are written with
respect to the standard basis (1, 0) and (0, 1). The question is: given a vector x with [x]_B = (1, 3),
what is [x]_C?
The first thing we need to do is to write the basis vectors of B in terms of those of C:

$$\begin{bmatrix} -1 \\ 2 \end{bmatrix} = -3\begin{bmatrix} 1 \\ 0 \end{bmatrix} + 2\begin{bmatrix} 1 \\ 1 \end{bmatrix} \;\Longrightarrow\; [u_1]_C = \begin{bmatrix} -3 \\ 2 \end{bmatrix}, \qquad
\begin{bmatrix} 2 \\ -1 \end{bmatrix} = 3\begin{bmatrix} 1 \\ 0 \end{bmatrix} - 1\begin{bmatrix} 1 \\ 1 \end{bmatrix} \;\Longrightarrow\; [u_2]_C = \begin{bmatrix} 3 \\ -1 \end{bmatrix}$$

Now, the vector x with [x]_B = (1, 3) is x = 1u_1 + 3u_2. Thus,

$$[x]_C = [1u_1 + 3u_2]_C = 1[u_1]_C + 3[u_2]_C = \begin{bmatrix} [u_1]_C & [u_2]_C \end{bmatrix}\begin{bmatrix} 1 \\ 3 \end{bmatrix} = \begin{bmatrix} -3 & 3 \\ 2 & -1 \end{bmatrix}\begin{bmatrix} 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 6 \\ -1 \end{bmatrix}$$

where Theorem 11.11.2 was used in the second step. And with the 2 × 2 matrix above whose columns
are the coordinate vectors of the basis vectors in B w.r.t. C, denoted for now by P, the calculation
of the coordinates of any vector in C is easy: [x]_C = P[x]_B.
Thus, we have the following definition of this important matrix.

ŽŽ
You can either draw these vectors and see this or you can simply solve two 2-by-2 systems.


Definition 11.11.2
Let B D fu1 ; u2 ; : : : ; un g and C D fv1 ; v2 ; : : : ; vn g be bases for a vector space V . The n  n
matrix whose columns are the coordinate vectors Œu1 C ; : : : ; Œun C of the vectors in the old
basis B with respect to the new basis C is denoted by PC B and is called the change-of-basis
matrix from B to C.
That matrix allows us to compute the coordinates of a vector in the new base:

ŒxC D PC B ŒxB

Change of basis formula relates the coordinates of one and the same vector in two different
bases, whereas a linear transformation relates coordinates of two different vectors in the same
basis. One more thing is that P_{C←B} is invertible, thus we can always go back and forth between
the bases:

    [x]_C = P_{C←B} [x]_B  ==>  [x]_B = P_{C←B}^{-1} [x]_C

Why is the change-of-basis matrix invertible? This is thanks to Theorem 11.11.3: the vectors
{u_1, u_2, ..., u_n} are linearly independent in V , thus the vectors {[u_1]_C, ..., [u_n]_C} are linearly
independent in R^n: the columns of the change-of-basis matrix are thus linearly independent. Hence, it
is invertible.
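
To make the change-of-basis computation concrete, here is a small Julia sketch that reproduces the
worked example above (the variable names are my own choices; the backslash operator simply solves
the little 2-by-2 systems that express u_1 and u_2 in the basis C):

```julia
using LinearAlgebra

# Bases of R^2 from the example: B = {u1, u2} and C = {v1, v2}, in standard coordinates.
u1 = [-1.0, 2.0];  u2 = [2.0, -1.0]
v1 = [ 1.0, 0.0];  v2 = [1.0,  1.0]

C = [v1 v2]          # columns are the C basis vectors
P = C \ [u1 u2]      # P_{C<-B}: its columns are [u1]_C and [u2]_C

xB = [1.0, 3.0]      # [x]_B
xC = P * xB          # [x]_C, gives (6, -1) as computed above
xB_again = P \ xC    # applying P^{-1} recovers [x]_B
```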

11.11.3 Linear transformations


In Section 11.6 a brief introduction to linear transformations from Rn to Rm was presented. In
this section, we present linear transformations from a vector space V to another vector space
W . I do not repeat the definition because we just need to replace Rn with V and Rm with W .
Instead, ‘new’ linear transformations are shown in what follows.

Example 1. The differential operator, D.f / D df =dx, is a linear transformation because

$$\frac{d(f+g)}{dx} = \frac{df}{dx} + \frac{dg}{dx}, \qquad \frac{d(cf)}{dx} = c\frac{df}{dx}$$
Example 2. Let F[a, b] be the vector space of all real-valued functions defined on the interval
[a, b]. The integration operator S : F[a, b] → R defined by S(f) = ∫_a^b f(x) dx is a linear transformation.
Linear transformation is a fancy term and thus seems scary. Let’s get back to the friendly
y D f .x/: pop in a number x and it is transformed to a new number f .x/. Thus, a linear
transformation is simply a generalization of the concept of function, instead of taking a single
number now it takes in a vector and gives another vector. The key difference is that linear
transformations are similar to y D ax not y D sin x: the transformation is linear only. In
Section 4.2.4 we have discussed the concept of the range of a function. We extend that to linear
transformations and introduce a new concept: the kernel of the transformation. For y = f(x), the
roots of this function are all x* such that f(x*) = 0. The kernel of a linear transformation is
exactly this.


Definition 11.11.3
Let T W V ! W be a linear transformation.

(a) The kernel of T , denoted by ker.T /, is the set of all vectors in V that are mapped by T
to 0 in W . That is,

ker.T / D fv 2 V W T .v/ D 0g

(b) The range of T , denoted by range.T /, is the set of all vectors in W that are images of
vectors in V under T . That is,

range.T / D fw 2 W W w D T .v/ for some v 2 V g

Definition 11.11.4: Rank and nullity of a transformation


Let T W V ! W be a linear transformation. The rank of T is the dimension of the range of T
and is denoted by rank.T /. The nullity of T is the dimension of the kernel of T and is denoted
by nullity.T /.

One-to-one and onto linear transformations. If we consider the function y = x^2, we have
y(-x) = y(x) = x^2; that is, the horizontal line y = x_0^2 cuts the curve y = x^2 at two points. In
that case it is impossible to invert: does x^2 come from x or from -x? We say that the function y = x^2 is
not one-to-one. Functions such as y = e^x or y = x^3 are one-to-one functions.
A function is called onto if its range is equal to its codomain. The function sin W R ! R
is not onto. Indeed, taking b D 2, the equation sin.x/ D 2 has no solution. The range of the
sine function is the closed interval Œ 1; 1, which is smaller than the codomain R. The function
y D e x is not onto: the range of y D e x is .0; 1/ which is not R. Functions such as y D x 3 are
onto functions.
A function such as y D x 3 , which is both one-to-one and onto, is special: we can always
perform an inverse: x ! x 3 ! x. Now, we generalize all this to linear transformations.

Definition 11.11.5
Consider a linear transformation T W V ! W .

(a) T is called one-to-one if it maps distinct vectors in V to distinct vectors in W . That is,
for all u and v in V , then u ¤ v implies that T .u/ ¤ T .v/.

(b) T is called onto if range.T / D W . In the words, the range of T is equal to the codomain
of T . Or, every vector in the codomain is the output of some input vector. That is, for
all w 2 W , there is at least one v 2 V such that T .v/ D w.


Again, the definition above is not convenient for checking whether a transformation is one-to-one or
onto. There exist theorems which provide simpler ways to do that. Below is such a theorem:

(a) A linear transformation T : V → W is one-to-one if and only if ker(T) = {0}.

(b) Let T : V → W be a one-to-one linear transformation. If S = {v_1, v_2, ..., v_k} is a linearly
independent set in V , then T(S) = {T(v_1), T(v_2), ..., T(v_k)} is a linearly independent set
in W .

(c) A linear transformation T : V → W is invertible if it is one-to-one and onto.

Isomorphism of vector spaces.

Definition 11.11.6
A linear transformation T W V ! W is called an isomorphism if it is one-to-one and onto. If
V and W are two vector spaces such that there is an isomorphism form V to W , then we say
that V is isomorphic to W and write V Š W .

The idea is that an isomorphism T W V ! W means that W is “just like” V in the context of
any question involving addition and scalar multiplication. The word isomorphism and isomor-
phic are derived from the Greek words isos, meaning “equal” and morph, meaning “shape”.

Example 11.12
Show that Pn 1 and Rn are isomorphic. To this end, we need to prove that there exists a linear
transformation T : P_{n-1} → R^n that is one-to-one and onto. Actually, we already know such a
transformation: the one that gives us the coordinates of a vector in P_{n-1} with respect to a basis
of Pn 1 .
Let E D f1; x; :::; x n 1 g be a basis for Pn 1 . Then, any vector p.x/ in Pn 1 can be written
as

p.x/ D a0 .1/ C a1 .x/ C    C an 1 .x n 1 / H) Œp.x/E D .a0 ; a1 ; : : : ; an 1 /

Now, we define the following transformation T W Pn 1 ! Rn (this transformation is known


as a coordinate map):

    T(p(x)) := [p(x)]_E

Is this a linear transformation? Yes, thanks to Theorem 11.11.2. What is left is to prove that T
is one-to-one and onto. For the former, we just need to show that ker(T) = {0}, which holds because
only the zero polynomial has all of its coordinates equal to zero. For the latter, dim P_{n-1} = dim R^n = n,
and a one-to-one linear transformation between spaces of the same finite dimension is also onto.

Matrix associated with a linear transformation. Let V and W be two finite dimensional
vector spaces with bases B and C, respectively, where B D fv1 ; v2 ; : : : ; vn g. Now consider a
linear transformation T W V ! W . Our task is to find the matrix associated with T . To this end,
consider a vector u 2 V , we can write it as


Table 11.3: The parallel universes of P_2 and R^3: P_2 is isomorphic to R^3 by the coordinate map
T(p(x)) := [p(x)]_E, where E = {1, t, t^2} is the standard basis of P_2.

    P_2                                                     R^3
    p(t) = a + bt + ct^2                                    (a, b, c)
    (-1 + 2t + 3t^2) + (2 + 4t + 3t^2) = 1 + 6t + 6t^2      (-1, 2, 3) + (2, 4, 3) = (1, 6, 6)
    3(2 + t + 3t^2) = 6 + 3t + 9t^2                         3(2, 1, 3) = (6, 3, 9)

u D u1 v 1 C u2 v 2 C    C un v n
So, the linear transformation T applied to u can be written as
T .u/ D T .u1 v1 C u2 v2 C    C un vn / D u1 T .v1 / C u2 T .v2 / C    C un T .vn /
Now T(u) is a vector in W , and with respect to the basis C, its coordinates are

    [T(u)]_C = [u_1 T(v_1) + u_2 T(v_2) + ... + u_n T(v_n)]_C
             = u_1 [T(v_1)]_C + u_2 [T(v_2)]_C + ... + u_n [T(v_n)]_C

So we can characterize a linear transformation by storing [T(v_i)]_C, i = 1, 2, ..., n in an
m × n matrix like this:

$$[T]_{C \leftarrow B} := \begin{bmatrix} [T(v_1)]_C & [T(v_2)]_C & \cdots & [T(v_n)]_C \end{bmatrix} \tag{11.11.3}$$
This matrix is called the matrix of T with respect to the bases B and C. Then, any vector x 2 V
with B coordinate vector ŒxB is transformed to vector T .x/ 2 W with C coordinate vector
ŒT .x/C :
ŒT .x/C D ŒT C B ŒxB (11.11.4)
Thus, we have shown that any linear transformation can be described by a matrix. In the special
case where V D W and B D C, we have
ŒT .x/B D ŒT B ŒxB (11.11.5)
Matrices of a transformation in different bases. Consider a linear transformation T W V ! V .
The choice of a basis for V identifies this transformation with a matrix multiplication. Now, we
consider two bases, then we will have two matrices:
basis B W Rn ! Rn W ŒxB ! ŒT B ŒxB
basis C W Rn ! Rn W ŒxC ! ŒT C ŒxC


Our problem is now to relate ŒT B to ŒT C or vice versa. First, we consider an arbitrary vector
x 2 V and the basis B, we can write the transformation T on x as

ŒT .x/B D ŒT B ŒxB

Now, we look at the transformed vector T .x/ but in the basis C, by multiplying ŒT .x/B with
the change-of-basis matrix PC B :

PC B ŒT .x/B D PC B ŒT B ŒxB
„ ƒ‚ …
ŒT .x/C

Of course we have ŒT .x/C D ŒT C ŒxC , thus

ŒT C ŒxC D PC B ŒT B ŒxB

Now, to get rid of ŒxC , we replace it by PC B ŒxB , and obtain

ŒT C PC B ŒxB D PC B ŒT B ŒxB

This equation holds for any ŒxB , thus we get the following identity ŒT C PC B D PC B ŒT B
from which we obtain
ŒT B D PC 1 B ŒT C PC B (11.11.6)
This is often used when we are trying to find a good basis with respect to which the matrix of
a linear transformation is particularly simple (e.g. diagonal). For example, we can ask whether
there is a basis B such that the matrix ŒT B of T W V ! V is a diagonal matrix. The next section
is answering this question.

11.11.4 Diagonalizing a matrix


A diagonal matrix is so nice to work with. For example, the eigenvalues can be read off
immediately–the entries on the diagonal. It turns out that we can always transform a full matrix
to a diagonal one using ... eigenvalues and eigenvectors. This is not so surprising if we already
know principal axes of rotating rigid bodies. Let’s start with an example.

Example 11.13
Let’s consider the following matrix, which is associated to a linear transformation T , with its
eigenvalues and eigenvectors:
" # " # " #
3 1 1 1
AD ; 1 D 3; 2 D 2; v1 D ; v2 D
0 2 0 C1

Now, we consider two bases: the first basis C is the standard basis with .1; 0/ and .0; 1/ as
the basis vectors, and the second basis B with the basis vectors being the eigenvectors v1 ; v2 .


Now, we have

$$[T]_C = \mathbf{A}, \qquad \mathbf{P}_{C \leftarrow B} = \begin{bmatrix} v_1 & v_2 \end{bmatrix} = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}$$

Now using Eq. (11.11.6), the transformation T, which is associated with A w.r.t. C, is given
w.r.t. the eigenbasis B by:

$$[T]_B = \mathbf{P}_{C \leftarrow B}^{-1}[T]_C \mathbf{P}_{C \leftarrow B} = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}^{-1}\begin{bmatrix} 3 & 1 \\ 0 & 2 \end{bmatrix}\begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 3 & 0 \\ 0 & 2 \end{bmatrix}$$

Look at what we have obtained: a diagonal matrix with the eigenvalues on the diagonal! In
other words, we have diagonalized the matrix A.

Suppose the n × n matrix A has n linearly independent eigenvectors v_1, v_2, ..., v_n (and
eigenvalues λ_1, λ_2, ..., λ_n). Put them into the columns of an eigenvector matrix V. We now
compute A times V:

$$\mathbf{A}\mathbf{V} := \mathbf{A}\begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix} = \begin{bmatrix} \mathbf{A}v_1 & \mathbf{A}v_2 & \cdots & \mathbf{A}v_n \end{bmatrix} = \begin{bmatrix} \lambda_1 v_1 & \lambda_2 v_2 & \cdots & \lambda_n v_n \end{bmatrix}$$

Now, the trick is to split the matrix [λ_1 v_1  λ_2 v_2  ...  λ_n v_n] into V times a diagonal matrix Λ
with the λ_i's on the diagonal:

$$\mathbf{A}\mathbf{V} = \begin{bmatrix} \lambda_1 v_1 & \lambda_2 v_2 & \cdots & \lambda_n v_n \end{bmatrix} = \begin{bmatrix} v_1 & v_2 & \cdots & v_n \end{bmatrix}\begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} = \mathbf{V}\boldsymbol{\Lambda}$$

Thus we have obtained AV = VΛ, and since V has linearly independent columns, it can be
inverted, so we can diagonalize A:

$$\mathbf{A}\mathbf{V} = \mathbf{V}\boldsymbol{\Lambda} \;\Longrightarrow\; \mathbf{A} = \mathbf{V}\boldsymbol{\Lambda}\mathbf{V}^{-1}$$

With this form, it is super easy to compute powers of A. For example,

$$\mathbf{A}^3 = (\mathbf{V}\boldsymbol{\Lambda}\mathbf{V}^{-1})(\mathbf{V}\boldsymbol{\Lambda}\mathbf{V}^{-1})(\mathbf{V}\boldsymbol{\Lambda}\mathbf{V}^{-1}) = \mathbf{V}\boldsymbol{\Lambda}(\mathbf{V}^{-1}\mathbf{V})\boldsymbol{\Lambda}(\mathbf{V}^{-1}\mathbf{V})\boldsymbol{\Lambda}\mathbf{V}^{-1} = \mathbf{V}\boldsymbol{\Lambda}^3\mathbf{V}^{-1}$$

And nothing can stop us from going to A^k = VΛ^k V^{-1}, whatever k might be: 1000 or 10000.
This equation tells us that the eigenvalues of A^k are λ_1^k, ..., λ_n^k, and the eigenvectors of A^k are
the same as the eigenvectors of A.
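
A quick numerical check of this diagonalization, using the matrix of Example 11.13 and Julia's
standard LinearAlgebra library (a sketch; eigen and Diagonal are built-in calls, the names are mine):

```julia
using LinearAlgebra

A = [3.0 1.0; 0.0 2.0]
F = eigen(A)                 # eigenvalues and eigenvectors of A
V = F.vectors                # columns are the eigenvectors
Λ = Diagonal(F.values)       # diagonal matrix of eigenvalues

A   ≈ V * Λ   * inv(V)       # true: A = VΛV⁻¹
A^3 ≈ V * Λ^3 * inv(V)       # true: powers of A come from powers of Λ
```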

If this is not clear, check Section 11.4.4 on the matrix-column representation of the product AB:
AB = [AB_1  AB_2  AB_3]. And AB_1 is a linear combination of the columns of A with the coefficients being the
components of B_1. Here, A is V and B_1 = (λ_1, 0, ...).


11.11.5 Inner product and inner product spaces


In this section we present the inner product which is an extension of the familiar dot product
between two vectors in Rn . First, let’s recall the dot product of two n-vectors:
$$\boldsymbol{a} \cdot \boldsymbol{b} = \sum_{i=1}^{n} a_i b_i \tag{11.11.7}$$

This dot product has these properties: a·b = b·a, a·a ≥ 0, and (αa + βb)·c = α(a·c) + β(b·c).
Now, we define an inner product between two vectors a; b in a vector space V , denoted by ha; bi,
which is an operation that assigns these two vectors a real number such that this product has
properties identical to those of the dot product:

    symmetry:     ⟨a, b⟩ = ⟨b, a⟩
    positivity:   ⟨a, a⟩ ≥ 0
    definiteness: ⟨a, a⟩ = 0 if and only if a = 0                                  (11.11.8)
    linearity:    ⟨αa + βb, c⟩ = α⟨a, c⟩ + β⟨b, c⟩

Other notations for the inner product are .a; b/. From the linearity property, we can show that
the inner product has the bilinearity property that readsŽŽ

    ⟨ax + by, cu + dv⟩ = ac⟨x, u⟩ + ad⟨x, v⟩ + bc⟨y, u⟩ + bd⟨y, v⟩

The word bilinearity is used to indicate that the inner product is linear with respect to both input
vectors.
Proof.

hax C by; cu C d vi D hax; cu C d vi C hby; cu C d vi (linearity prop.)


D hcu C d v; axi C hcu C d v; byi (symmetry prop.)
D hcu; axi C hd v; axi C hcu; byi C hd v; byi (linearity prop.)
D cahu; xi C dahv; xi C cbhu; yi C dbhv; yi (linearity prop.)

Example 11.14
Let u D .u1 ; u2 / and v D .v1 ; v2 / be two vectors in R2 . Then, the following

hu; vi D 5u1 v1 C 7u2 v2

defines an inner product. It’s not hard to check that this really satisfies all the properties in
Eq. (11.11.8). Now, we generalize it to Rn . Let u D .u1 ; u2 ; : : : ; un / and v D .v1 ; v2 ; : : : ; vn /
With β = 0, the linearity property gives us ⟨αa, c⟩ = α⟨a, c⟩. And from that, together with symmetry, we also
have ⟨a, αc⟩ = α⟨a, c⟩.


be two vectors in R^n and w_1, w_2, ..., w_n are n positive weights, then

$$\langle u, v \rangle = w_1 u_1 v_1 + w_2 u_2 v_2 + \cdots + w_n u_n v_n = u^\top \mathbf{W} v, \qquad \mathbf{W} = \begin{bmatrix} w_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & w_n \end{bmatrix}$$

defines an inner product called a weighted dot product.

A vector space equipped with an inner product is called an inner product space. Don’t
be scared as the space Rn is an inner product space! It must be as it was the inspiration for
mathematicians to generalize it to inner product spaces. We shall meet other inner product
spaces when we define concrete inner product. But first, with the inner product, similar to how
the dot product defines length, distance, orthogonality, we are now able to define these concepts
for vectors in an inner product space.

Definition 11.11.7
Let u and v be two vectors in an inner product space V .
(a) The length (or norm) of v is ||v|| = √⟨v, v⟩.

(b) The distance between u and v is d.u; v/ D jju vjj.

(c) u and v are orthogonal if hu; vi D 0.

Example 11.15
If we consider two functions f and g in C[a, b], the vector space of continuous functions on
[a, b], show that

$$\langle f, g \rangle = \int_a^b f(x)g(x)\,dx \tag{11.11.9}$$

defines an inner product on CŒa; b.


We need of course to verify that this definition satisfies all four conditions in Eq. (11.11.8).
This satisfaction comes from the properties of definite integrals.

Gram-Schmidt orthogonalization and Legendre polynomials. If we apply the Gram-Schmidt


orthogonalization, with Eq. (11.11.9) in place of the dot product, to 1; x; x 2 ; : : : we will obtain
the so-called Legendre polynomials.
Applying the Gram-Schmidt orthogonalization to 1; x; x 2 ; x 3 we obtain the first four Legen-


dre polynomials (note that Legendre polynomials are defined on the interval [-1, 1]):

$$\begin{aligned}
L_0(x) &= 1 \\
L_1(x) &= x - \frac{\langle 1, x\rangle}{\langle 1, 1\rangle} 1 = x - \frac{1}{2}\int_{-1}^{1} x\,dx = x \\
L_2(x) &= x^2 - \frac{\langle 1, x^2\rangle}{\langle 1, 1\rangle} 1 - \frac{\langle x, x^2\rangle}{\langle x, x\rangle} x = x^2 - \frac{1}{3} \\
L_3(x) &= x^3 - \frac{\langle 1, x^3\rangle}{\langle 1, 1\rangle} 1 - \frac{\langle x, x^3\rangle}{\langle x, x\rangle} x - \frac{\langle x^2, x^3\rangle}{\langle x^2, x^2\rangle} x^2 = x^3 - \frac{3}{5}x
\end{aligned} \tag{11.11.10}$$
Actually, we need to scale these polynomials so that L_n(1) = 1; then we have the standard
Legendre polynomials as shown in Table 11.4. One surprising fact about Legendre polynomials:
their roots are symmetric with respect to x = 0, and L_n(x) has n real roots within [-1, 1], see
Fig. 11.25. And these roots define the quadrature points in Gauss' rule, a well known numerical
integration rule (Section 12.4.3).

Table 11.4: The first six Legendre polynomials.

    n    L_n(x)
    0    1
    1    x
    2    (1/2)(3x^2 - 1)
    3    (1/2)(5x^3 - 3x)
    4    (1/8)(35x^4 - 30x^2 + 3)
    5    (1/8)(63x^5 - 70x^3 + 15x)

Figure 11.25: Plots of the Legendre polynomials L_0, ..., L_5 on the interval [-1, 1].

Adrien-Marie Legendre (1752 – 1833) was a French mathematician who made numerous
contributions to mathematics. Well-known and important concepts such as the Legendre polyno-
mials and Legendre transformation are named after him.
Now, we focus on the inner product space of polynomials. Because Legendre polynomials
are orthogonal to each other, they can serve as a basis for the inner product space of polynomials.
For example, any 2nd degree polynomial can be uniquely written as
p2 .x/ D c0 L0 .x/ C c1 L1 .x/ C c2 L2 .x/
where the L_i(x) are the orthogonal Legendre polynomials, see Table 11.4. Next, we compute the
inner product of p_2(x) with L_3(x), because the result is beautiful:

$$\begin{aligned}
\int_{-1}^{1} L_3(x)p_2(x)\,dx &= \int_{-1}^{1} \left[c_0 L_0(x) + c_1 L_1(x) + c_2 L_2(x)\right] L_3(x)\,dx \\
&= c_0\int_{-1}^{1} L_0(x)L_3(x)\,dx + c_1\int_{-1}^{1} L_1(x)L_3(x)\,dx + c_2\int_{-1}^{1} L_2(x)L_3(x)\,dx \\
&= 0
\end{aligned}$$


This is due to the orthogonality of the Legendre polynomials. We will use this in Section 12.4.3 to
derive the famous Gauss-Legendre quadrature rule.
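
To see this orthogonality numerically, here is a short Julia sketch (my own, using a plain trapezoidal
rule rather than any quadrature package) that approximates the inner product of Eq. (11.11.9) on
[-1, 1] for the Legendre polynomials of Table 11.4:

```julia
# First four Legendre polynomials from Table 11.4
L = [x -> 1.0,
     x -> x,
     x -> (3x^2 - 1)/2,
     x -> (5x^3 - 3x)/2]

# <f, g> = ∫_{-1}^{1} f(x) g(x) dx, approximated with the trapezoidal rule
function inner(f, g; n = 10_000)
    xs = range(-1.0, 1.0; length = n + 1)
    s  = sum(f(x)*g(x) for x in xs) - (f(-1.0)*g(-1.0) + f(1.0)*g(1.0))/2
    return step(xs) * s
end

inner(L[3], L[4])   # ≈ 0: L2 and L3 are orthogonal
inner(L[3], L[3])   # ≈ 2/5: the squared norm of L2
```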

The Cauchy-Schwarz inequality. In Section 2.20.3, we have met the Cauchy-Schwarz inequal-
ity. At that time, we did not know of Rn . But now, we can see that this inequality is, for two
vectors u and v in Rn
    |u · v| ≤ ||u|| ||v||
The nice thing of mathematics is that the same inequality holds for two vectors in an inner
product space. We just replace the dot product by the more general inner product.

Proof. The proof is pretty similar to the one given in Section 2.20.3. We construct the following
function, which is always non-negative

f .t/ D hu C t v; u C tvi

which can be re-written as

    f(t) = ⟨u + tv, u + tv⟩ = ⟨v, v⟩t^2 + 2⟨u, v⟩t + ⟨u, u⟩ ≥ 0   for all t

So, f(t) is a quadratic function in t; we hence compute the discriminant Δ, and it has to be less
than or equal to 0:

    Δ = 4⟨u, v⟩^2 - 4⟨v, v⟩⟨u, u⟩ ≤ 0


And with this, we also get the triangle inequality for vectors in an inner product space:

    ||a + b|| ≤ ||a|| + ||b||                                                      (11.11.11)

11.11.6 Complex vectors and complex matrices


A complex vector is a vector whose components are complex numbers. For example, z =
(1 + 2i, 3 - 4i) is a complex vector; we use the notation z ∈ C^2 for this. A general complex n-vector
is given by

    z = [a_1 + ib_1,  a_2 + ib_2,  ...,  a_n + ib_n]^T

The first question we have to ask is: how do we compute the length of a complex vector? If a is a
real n-vector, then its length is √(a_1^2 + ... + a_n^2). Can we use this for complex vectors? Just try
it for z = (1, i): then ||z|| = √(1^2 + i^2) = 0, which cannot be correct: a non-zero vector cannot
have a zero length!


Definition 11.11.8
If u = (u_1, u_2, ..., u_n) and v = (v_1, v_2, ..., v_n) are vectors in C^n, then the complex dot
product of them is defined by

    u · v = ū_1 v_1 + ū_2 v_2 + ... + ū_n v_n

where ū_i is the complex conjugate of u_i. Recall that if z = a + bi, then z̄ = a - bi.

Now, using this definition of the complex dot product, we can get the correct length of a
complex vector. If z = (1, i), then ||z|| = √(1 + (-i)(i)) = √2.
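
In Julia (a sketch; the variable name is mine), the built-in dot conjugates its first argument, so it
matches Definition 11.11.8 and gives the sensible length:

```julia
using LinearAlgebra

z = [1.0, im]

dot(z, z)          # 2 + 0im: conjugates the first argument, as in Definition 11.11.8
norm(z)            # √2, the correct length of z
sqrt(sum(z .^ 2))  # 0: the naive unconjugated formula gives "zero length"
```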

11.11.7 Norm, distance and normed vector spaces


We live in a 3D Euclidean world, and therefore concepts
from Euclidean geometry govern our way of looking at the
world. For example, when thinking about the distance between
two points A and B, we think of the shortest distance as
√((x_B - x_A)^2 + (y_B - y_A)^2). Thus, we're using the length
defined as the square root of the inner product of x_B - x_A with
itself. However, there exist different types of distance. For example,
suppose we're at an intersection in a city, trying to get to
another intersection. In this case, we do not use the Euclidean
distance; instead we use the so-called taxicab distance. This is so because that is how taxicab
drivers measure distance.
This section presents new ways to measure distance. The starting point cannot be the inner
product (from which we can only define the Euclidean length or norm). Instead, we start directly
with the concept of a norm with certain properties that we want it to have.

Definition 11.11.9
A norm on a vector space V is a mapping that associates with each vector v a real number
||v||, called the norm of v, such that the following properties are satisfied for all vectors u and
v and all scalars c:

(a) (non-negativity) ||v|| ≥ 0, and ||v|| = 0 if and only if v = 0

(b) (scaling) ||cv|| = |c| ||v||.

(c) (triangle inequality) ||u + v|| ≤ ||u|| + ||v||.

A vector space equipped with a norm is called a normed vector space.

In the following example, we consider the vector space Rn and show that there are many
norms rather than the usual Eucledian norm.


Example 11.16
Consider v = (v_1, v_2, ..., v_n) and the following common norms for v:

(a) (l^1)  ||v||_1 = |v_1| + |v_2| + ... + |v_n|

(b) (l^2)  ||v||_2 = (|v_1|^2 + |v_2|^2 + ... + |v_n|^2)^{1/2}

(c) (l^∞)  ||v||_∞ = max{|v_1|, |v_2|, ..., |v_n|} = max_i |v_i|

(d) (l^p)  ||v||_p = (|v_1|^p + |v_2|^p + ... + |v_n|^p)^{1/p},  1 ≤ p < ∞

where ||v||_2 is the usual Euclidean norm. It is not hard to prove that l^1, l^2 and l^∞ are indeed
norms (we just need to verify the three properties stated in the definition of a norm). For
l^p, the proof is harder and thus skipped. Note that I wrote |v_1|^2 instead of v_1^2 because the
discussion covers complex vectors as well. Thus, the symbol |·| indicates the modulus.

Fig. 11.26 presents the geometry of these norms in R^2. Is this just for fun? Maybe, but it
reveals that the different norms are close to each other. Precisely, the norms are all equivalent on
R^n in the sense that

    ||v||_2 ≤ ||v||_1 ≤ √n ||v||_2

Figure 11.26: Geometry of different norms, illustrated in R^2 with v = (v_1, v_2): the unit balls
||v||_1 = |v_1| + |v_2| = 1 (a diamond), ||v||_2 = √(v_1^2 + v_2^2) = 1 (a circle) and
||v||_∞ = max{|v_1|, |v_2|} = 1 (a square).
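
These vector norms are all available through Julia's norm function (a quick sketch with an arbitrary
test vector of my choosing):

```julia
using LinearAlgebra

v = [3.0, -4.0, 1.0]

norm(v, 1)     # l^1 norm: |3| + |-4| + |1| = 8
norm(v)        # l^2 (Euclidean) norm: sqrt(26)
norm(v, Inf)   # l^∞ norm: 4
norm(v, 3)     # a general l^p norm, here with p = 3
```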

11.11.8 Matrix norms


To tell when a matrix is large or small or to know when the two matrices are close to each other,
we need to define the norm of a matrix. If we can accept that a function has a length, then it
is Ok that a matrix has a norm (kind of length). We do not know the definition of that, but we
know what properties a matrix norm should have. So, we define a matrix norm based on these
One proof is: ||v||_1 = Σ_i |v_i| = Σ_i |v_i| · 1 ≤ √(Σ_i |v_i|^2) √(1^2 + ... + 1^2) = √n ||v||_2, using the Cauchy-Schwarz
inequality.


properties. Later on, once we have found the formula for the norm, we check whether it satisfies
all these properties. This is similar to how we defined the determinant of a matrix.

Definition 11.11.10
A norm on the matrix space M_{n×n} is a mapping that associates with each matrix A a real number
||A||, called the norm of A, such that the following properties are satisfied for all matrices A
and B and all scalars c:

(a) (non-negativity) ||A|| ≥ 0, and ||A|| = 0 if and only if A = 0

(b) (scaling) ||cA|| = |c| ||A||.

(c) (triangle inequality) ||A + B|| ≤ ||A|| + ||B||.

(d) (additional inequality) ||AB|| ≤ ||A|| ||B||.

Now we define a matrix norm which is based on a vector norm. Starting with a vector x with
a norm ||·|| defined on it, we consider the norm of the transformed vector, that is ||Ax||. One way
to measure the magnitude of A is to compute the ratio ||Ax||/||x||. We can simplify this ratio as

$$\frac{\|\mathbf{A}x\|}{\|x\|} = \frac{1}{\|x\|}\|\mathbf{A}x\| = \left\|\mathbf{A}\frac{x}{\|x\|}\right\| = \|\mathbf{A}x^*\|, \qquad x^* = \frac{x}{\|x\|}$$

where the scaling property of a vector norm (Definition 11.11.9) was used in the second equality.
A norm is just one single number, so we are interested only in the maximum of the ratio ||Ax||/||x||:

$$\max_{x \neq 0} \frac{\|\mathbf{A}x\|}{\|x\|} = \max_{\|x^*\|=1} \|\mathbf{A}x^*\|$$

Mathematicians then define the operator norm of a matrix, induced by the vector norm ||x||, as:

$$\|\mathbf{A}\| = \max_{\|x\|=1} \|\mathbf{A}x\|$$

We think of ||x||_1, ||x||_2 and ||x||_∞ as the important vector norms. Then, we have three
corresponding matrix norms:

$$\|\mathbf{A}\|_1 = \max_{\|x\|_1=1} \|\mathbf{A}x\|_1, \qquad \|\mathbf{A}\|_2 = \max_{\|x\|_2=1} \|\mathbf{A}x\|_2, \qquad \|\mathbf{A}\|_\infty = \max_{\|x\|_\infty=1} \|\mathbf{A}x\|_\infty$$

The definition looks scary but it turns out that we can actually compute the norms quite straight-
forwardly at least for the 1-norm and the 1 norm. For jjAjj2 we need the singular value
decomposition, so the discussion of that norm is postponed to Section 11.12.3. I want to start
with jjAjj1 for simple 2  2 matrices:
" # " #
a b ax1 C bx2
AD H) y WD Ax D H) kyk1 D jax1 C bx2 j C jcx1 C dx2 j
c d cx1 C dx2
ŽŽ
Of course we have to check the conditions in definition 11.11.10. I skipped that part. Check [56].


Now, to find ||A||_1, we just need to find the maximum of |ax_1 + bx_2| + |cx_1 + dx_2| subject
to |x_1| + |x_2| = 1:

    ||y||_1 ≤ |x_1||a| + |x_2||b| + |x_1||c| + |x_2||d|
            = |x_1|(|a| + |c|) + |x_2|(|b| + |d|)
            ≤ (|x_1| + |x_2|)M = M,      M = max{|a| + |c|, |b| + |d|}

Thus, ||A||_1 is simply the largest absolute column sum of the matrix. Not satisfied with this
simple English, mathematicians write

$$\|\mathbf{A}\|_1 = \max_{j=1,\ldots,n} \sum_{i=1}^{n} |A_{ij}|$$

Following the same steps, it can be shown that ||A||_∞ is the largest absolute row sum of the matrix.
The proof for an n × n matrix is not hard for ||A||_1, but for ||A||_∞ it is a bit harder.
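
A small Julia check of these two formulas (the example matrix is arbitrary; opnorm is the standard
library routine for the induced operator norm):

```julia
using LinearAlgebra

A = [1.0 -2.0; 3.0 4.0]

maximum(sum(abs, A, dims = 1)) ≈ opnorm(A, 1)    # largest absolute column sum = ‖A‖₁
maximum(sum(abs, A, dims = 2)) ≈ opnorm(A, Inf)  # largest absolute row sum    = ‖A‖_∞
```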

11.11.9 The condition number of a matrix


When we solve some systems of linear equations Ax = b, we often see that small changes in
the entries of A or b can produce large changes in the solution x. For example, consider the
following system of equations:

$$\begin{bmatrix} 1 & 1 \\ 1 & 1.0005 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 3 \\ 3.0010 \end{bmatrix} \;\Longrightarrow\; \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$$

If the matrix is slightly changed, we obtain a completely different solution:

$$\begin{bmatrix} 1 & 1 \\ 1 & 1.0010 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 3 \\ 3.0010 \end{bmatrix} \;\Longrightarrow\; \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$$

Now the problem is to study when this happens; in other words, is there any measure of A that
can quantify this behavior? The answer is yes, and that measure is what we call the condition (or
conditioning) number of the matrix. To work out this number, we consider a general Ax = b and
A'x' = b where A' is slightly different from A. As A' is slightly different from A, we can write
it as A' = A + ΔA. Similarly, we write x' = x + Δx. If we can compute the norm of Δx we
will know when this change in the solution is large or small.
Starting with A'x' = b we have:

    A'x' = b  ⟺  (A + ΔA)(x + Δx) = b  ⟺  Δx = -A^{-1} ΔA x'

Now, we can compute the norm of Δx:

    ||Δx|| = ||A^{-1} ΔA x'|| ≤ ||A^{-1}|| ||ΔA x'|| ≤ ||A^{-1}|| ||ΔA|| ||x'||


Thus,

    ||Δx||/||x'|| ≤ ||A^{-1}|| ||ΔA|| = ||A^{-1}|| ||A|| (||ΔA||/||A||)

And the term ||A^{-1}|| ||A|| is defined as the condition number of A, denoted by cond(A). Why did we have to
make ||A|| appear in the above? Because only the relative change in the matrix (i.e. ||ΔA||/||A||)
makes sense. Thus, the condition number gives an upper bound on the relative change in the
solution:

    ||Δx||/||x'|| ≤ cond(A) (||ΔA||/||A||)

It is certain that the condition number of a matrix depends on the choice of the matrix norm
used. The most commonly used norms are ||A||_1 and ||A||_∞. Below is one example.

Example 11.17
Find the condition number of the matrix A given at the beginning of this section. We need
to compute A^{-1}:

$$\mathbf{A} = \begin{bmatrix} 1 & 1 \\ 1 & 1.0005 \end{bmatrix}, \qquad \mathbf{A}^{-1} = \begin{bmatrix} 2001 & -2000 \\ -2000 & 2000 \end{bmatrix}$$

Then, the norms of A and its inverse, and the condition numbers, are:

    ||A||_1 = 2.0005,   ||A^{-1}||_1 = 4001   ==>   cond_1(A) ≈ 8004
    ||A||_∞ = 2.0005,   ||A^{-1}||_∞ = 4001   ==>   cond_∞(A) ≈ 8004

If we compute cond_2(A), it is about 8002. Thus, when the condition number of a matrix is
large for one compatible matrix norm, it will be large for other norms. And that saves us from
having to compute different condition numbers! To appreciate that this matrix A has a large
condition number, consider now the well behaved matrix in Eq. (11.3.1), its condition number
is just three. Matrices such as A with large condition numbers are called ill conditioned
matrices.
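
Both the sensitivity of the solution and the condition numbers above are easy to reproduce in Julia
(a sketch; cond is the standard LinearAlgebra routine, the variable names are mine):

```julia
using LinearAlgebra

A  = [1.0 1.0; 1.0 1.0005];  b = [3.0, 3.0010]
A2 = [1.0 1.0; 1.0 1.0010]   # the slightly perturbed matrix

A  \ b        # (1.0, 2.0)
A2 \ b        # (2.0, 1.0): a completely different solution

cond(A, 1)    # ≈ 8004, matching the hand computation
cond(A, Inf)  # ≈ 8004
cond(A)       # 2-norm condition number, ≈ 8002
```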

11.11.10 The best approximation theorem

A common problem in mathematics is finding the vector w in a subspace W of a vector space


V that best approximates (i.e., is closest to) a given vector v in V . For example, given vector
v in R2 , find the vector w living in the span of vector u that is closest to it (Fig. 11.27a). This
problem gives rise to the following definition of what is the best approximation.


Definition 11.11.11
If W is a subspace of a normed linear space V and if v is a vector in V , then the best
approximation to v in W is the vector v* in W such that

    ||v - v*|| ≤ ||v - w||

for every vector w in W .

Figure 11.27: Best approximation theorem: if W is a subspace of a normed linear space V and if v is a
vector in V , then the best approximation to v in W is the vector proj_W(v).

Now, the question is: what is v*? Actually we already met the answer: the projection of v
onto W . This guess is based on the standard geometry of R^3. Refer to Fig. 11.27b,
and consider the right triangle ABC; the Pythagorean theorem gives us

    ||w - proj_W(v)||^2 + ||v - proj_W(v)||^2 = ||v - w||^2

which immediately results in

    ||v - proj_W(v)||^2 ≤ ||v - w||^2  ==>  ||v - proj_W(v)|| ≤ ||v - w||

And we have just proved the best approximation theorem.

11.12 Singular value decomposition


For a square matrix A, which might not be symmetric, we have the factorization A = VΛV^T... more precisely
A = VΛV^{-1}. When the matrix is symmetric, we have another factorization A = QΛQ^T. All these decompo-
sitions are based on eigenvalues/eigenvectors: Ax = λx. If A is non-square, i.e., A is an m × n
matrix, we cannot have Ax = λx, as the left side is in R^m and the right side is in R^n. The singu-
lar value decomposition fills this gap in a perfect way. This section presents this invaluable
decomposition.


11.12.1 Singular values


For any m × n matrix A, the matrix A^T A is a symmetric n × n matrix with real non-negative
eigenvalues. That's one thing special about A^T A. Let's denote by λ an eigenvalue of A^T A
and by v the corresponding unit eigenvector; then we have

    0 ≤ ||Av||^2 = (Av) · (Av) = (Av)^T (Av) = v^T A^T A v = λ v^T v = λ||v||^2 = λ

Thus, λ is the squared length of the vector Av. So, for a rectangular matrix, we do not have
eigenvalues but we have singular values, which come from the eigenvalues of A^T A:

Definition 11.12.1
If A is an m × n matrix, the singular values of A are the square roots of the eigenvalues of
A^T A and are denoted by σ_1, σ_2, ..., σ_n. It is conventional to arrange the singular values in
descending order: σ_1 ≥ σ_2 ≥ ... ≥ σ_n.

We can find the rank of A by counting the number of non-zero singular values. From Theo-
rem 11.5.5 we have rank(A) = rank(A^T A). And writing the spectral decomposition A^T A = QΛQ^T,

    rank(A^T A) = rank(QΛQ^T) = rank(Λ) = r

since Q is orthogonal and hence invertible, where r is the number of non-zero eigenvalues of Λ.

11.12.2 Singular value decomposition


Now the problem we want to solve is: start with two orthogonal vectors v_1 and v_2. They are
transformed by a matrix A to Av_1 = y_1 and Av_2 = y_2. We want those transformed vectors to
be orthogonal too. Recall from Section 11.12.1 that if v_1 is a unit eigenvector of A^T A, the length
of y_1 is the singular value σ_1 of A. Therefore, we can write y_1 = σ_1 u_1 with u_1 a unit vector.
Now we write

$$\mathbf{A}v_1 = \sigma_1 u_1,\; \mathbf{A}v_2 = \sigma_2 u_2 \;\Longrightarrow\; \mathbf{A}\begin{bmatrix} v_1 & v_2 \end{bmatrix} = \begin{bmatrix} \sigma_1 u_1 & \sigma_2 u_2 \end{bmatrix} = \begin{bmatrix} u_1 & u_2 \end{bmatrix}\begin{bmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{bmatrix}$$

Now, we introduce the matrix V = [v_1 v_2], the matrix U = [u_1 u_2], and Σ, the diagonal matrix
containing σ_1, σ_2. The above equation then becomes

    AV = UΣ  ==>  A = UΣV^T

And the decomposition in the box is the singular value decomposition of A. Why is y_1 orthogonal
to y_2? To see this, suppose v_i is the eigenvector of A^T A corresponding to the eigenvalue λ_i.
Then, for i ≠ j, we have

    (Av_i) · (Av_j) = (Av_i)^T (Av_j) = v_i^T A^T A v_j = λ_j v_i^T v_j = 0

The final equality is due to the fact that the eigenvectors of the symmetric matrix A> A are
orthogonal.
But if you're wondering why we know to consider A^T A in the first place, I do not have a correct historical
answer. However, this might help. Start with a rectangular matrix A; it transforms a vector x into y = Ax. If we ask
what the length of y is, then A^T A appears. Indeed, ||y||^2 = (Ax)^T(Ax) = x^T A^T A x.


Example 11.18
Find a singular value decomposition for the following matrix:

$$\mathbf{A} = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

The first step is to consider the matrix A^T A and find its eigenvalues/eigenvectors:

$$\mathbf{A}^\top\mathbf{A} = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \;\Longrightarrow\; v_1 = \begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \\ 0 \end{bmatrix},\; v_2 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix},\; v_3 = \begin{bmatrix} 1/\sqrt{2} \\ -1/\sqrt{2} \\ 0 \end{bmatrix}$$

with corresponding eigenvalues λ_1 = 2, λ_2 = 1, λ_3 = 0. Note that as rank(A) = 2, we have
rank(A^T A) = 2, thus one eigenvalue must be zero. Note also that as A^T A is symmetric, {v_i}
is an orthogonal set. Thus, V and Σ are given by (σ_i = √λ_i)

$$\mathbf{V} = \begin{bmatrix} 1/\sqrt{2} & 0 & 1/\sqrt{2} \\ 1/\sqrt{2} & 0 & -1/\sqrt{2} \\ 0 & 1 & 0 \end{bmatrix}, \qquad \boldsymbol{\Sigma} = \begin{bmatrix} \sqrt{2} & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$$

To find U, we find the u_i:

$$u_1 = \frac{1}{\sigma_1}\mathbf{A}v_1 = (1, 0), \qquad u_2 = \frac{1}{\sigma_2}\mathbf{A}v_2 = (0, 1)$$

These two vectors already form an orthonormal basis. Now that we have U, V and Σ, the SVD
of A is:

$$\underbrace{\begin{bmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}}_{\mathbf{A}} = \underbrace{\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}}_{\mathbf{U}} \underbrace{\begin{bmatrix} \sqrt{2} & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}}_{\boldsymbol{\Sigma}} \underbrace{\begin{bmatrix} 1/\sqrt{2} & 1/\sqrt{2} & 0 \\ 0 & 0 & 1 \\ 1/\sqrt{2} & -1/\sqrt{2} & 0 \end{bmatrix}}_{\mathbf{V}^\top}$$

Using Julia we can easily verify that the above is correct. Thus, we have computed the singular value
decomposition of a rectangular matrix!
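
For instance, this sketch asks Julia's svd for the decomposition and checks that it reproduces A (the
thin SVD it returns may differ from the hand computation by signs of the singular vectors, which is
allowed):

```julia
using LinearAlgebra

A = [1.0 1.0 0.0; 0.0 0.0 1.0]
F = svd(A)                        # thin SVD: U is 2x2, S has 2 entries, Vt is 2x3

F.S                               # singular values: [√2, 1]
A ≈ F.U * Diagonal(F.S) * F.Vt    # true: A = UΣVᵀ
```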

Hope that this example demonstrates what a SVD is. Now, we give the formal definition of
it and then we need to prove that it is always possible to do a SVD for any matrix.


Definition 11.12.2
Let A be an m × n matrix with singular values σ_1 ≥ σ_2 ≥ ... ≥ σ_n ≥ 0. Let r denote
the number of non-zero singular values of A. A singular value decomposition of A is the
factorization A = UΣV^T, where U is an m × m orthogonal matrix, V is an n × n
orthogonal matrix, and Σ is an m × n diagonal matrix whose i-th diagonal entry is the i-th
singular value σ_i for i = 1, 2, ..., r. All other entries of Σ are zero.

Proof. We now prove that we can always do an SVD for A. The idea of the proof is to show that
for any vector x ∈ R^n, we have Ax = UΣV^T x. If so, then of course A = UΣV^T. To this end,
we start with V^T x, then ΣV^T x:

$$\mathbf{V}^\top x = \begin{bmatrix} v_1 \cdot x \\ v_2 \cdot x \\ \vdots \\ v_n \cdot x \end{bmatrix} \;\Longrightarrow\; \boldsymbol{\Sigma}\mathbf{V}^\top x = \begin{bmatrix} \sigma_1 v_1 \cdot x \\ \vdots \\ \sigma_r v_r \cdot x \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \begin{bmatrix} \sigma_1 v_1^\top x \\ \vdots \\ \sigma_r v_r^\top x \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$

Now, we consider UΣV^T x, noting that U contains the columns u_i = σ_i^{-1} A v_i:

$$\begin{aligned}
\mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^\top x &= u_1 \sigma_1 v_1^\top x + \cdots + u_r \sigma_r v_r^\top x && \text{(matrix times vector = combination of the columns of } \mathbf{U}) \\
&= \sigma_1^{-1}\mathbf{A}v_1\,\sigma_1 v_1^\top x + \cdots + \sigma_r^{-1}\mathbf{A}v_r\,\sigma_r v_r^\top x && \text{(use } u_i = \sigma_i^{-1}\mathbf{A}v_i) \\
&= \mathbf{A}v_1 v_1^\top x + \cdots + \mathbf{A}v_n v_n^\top x && (\mathbf{A}v_i = 0 \text{ for } i > r) \\
&= \mathbf{A}\underbrace{(v_1 v_1^\top + \cdots + v_n v_n^\top)}_{\mathbf{I}}\, x = \mathbf{A}x
\end{aligned}$$

So, in the third equality we just added a bunch of zero vectors. Note that Av_i = 0 for i > r
because we have only r non-zero singular values. The final equality comes from the fact that if
{v_1, ..., v_n} is an orthonormal basis then v_1 v_1^T + ... + v_n v_n^T = I.

Left and right singular vectors. We have A^T A with the eigenvectors v_k. How about the u_k? Are
they the eigenvectors of some matrix? The answer is yes: u_k is an eigenvector of AA^T. Maths is
really nice, isn't it? The proof goes as

$$\mathbf{A}\mathbf{A}^\top u_k = \mathbf{A}\mathbf{A}^\top \frac{\mathbf{A}v_k}{\sigma_k} = \frac{\mathbf{A}[(\mathbf{A}^\top\mathbf{A})v_k]}{\sigma_k} = \frac{\mathbf{A}\,\sigma_k^2 v_k}{\sigma_k} = \sigma_k^2 u_k$$

The key to the proof is the fact that (AA^T)A = A(A^T A). Some new terms: the v_k are called
the right singular vectors and the u_k are called the left singular vectors.

Geometry of the SVD. We have seen in Fig. 11.24 that the linear transformation Ax transform
a circle in R2 into an ellipse in R2 . With the SVD, it can be proved that an m  n matrix A


maps the unit sphere in R^n into an ellipsoid in R^m. Consider a unit vector x ∈ R^n, and its image
y = Ax ∈ R^m:

    x = x_1 v_1 + x_2 v_2 + ... + x_n v_n  ==>  y = Ax = x_1 σ_1 u_1 + ... + x_r σ_r u_r

The image vector has coordinates (y_1, y_2, ..., y_r) = (x_1 σ_1, ..., x_r σ_r), where

$$\left(\frac{y_1}{\sigma_1}\right)^2 + \left(\frac{y_2}{\sigma_2}\right)^2 + \cdots + \left(\frac{y_r}{\sigma_r}\right)^2 = \sum_{i=1}^{r} x_i^2 \le 1$$

The last inequality comes from x being a unit vector. Now, if r = n (i.e., the matrix A has full
column rank), then in the above we have an equality, and thus the image Ax is the surface of an
ellipsoid. On the other hand, if r < n, then the image is a solid ellipsoid in R^m.
We can even give a geometric interpretation of the different matrices in an SVD. For that
we restrict ourselves to a plane. Start with a unit vector x ∈ R^2. Now the transformation Ax
is UΣV^T x. From Section 11.6 on linear transformations we know that we're dealing with a
composite transformation. And we handle it from right to left. So, we start with V^T x, which is
a rotation; thus we get a circle from a circle. But now we see the transformed circle in the plane
in which the axes are v_1 and v_2 (Fig. 11.28). Then comes Σ(V^T x), which simply stretches
(or sometimes shrinks) our circle (the second circle from the left) into an ellipse. Finally, U is a
rotation and we get an oblique ellipse as the final Ax.
A byproduct of this is that we are now able to compute ||A||_2: it is simply σ_1, that is ||A||_2 = σ_1.

 
Figure 11.28: The geometry of the singular value decomposition y = UΣV^T x: the rotation V^T maps the
unit circle to itself (now seen in the v_1, v_2 axes), Σ stretches it by σ_1 and σ_2 along the axes, and the
rotation U produces the final oblique ellipse with semi-axes σ_1 u_1 and σ_2 u_2.

11.12.3 Matrix norms and the condition number


Recall that the 2-norm condition number of a matrix A is defined as

    cond_2(A) = ||A^{-1}||_2 ||A||_2


Now, with the SVD of A, we can compute these norms and thus the 2-norm condition number.
As shown in Fig. 11.28, the norm of A is simply its largest singular value:

    ||A||_2 = max_{||x||_2 = 1} ||Ax||_2 = σ_1

The inverse of A (if it exists) can be determined easily from the SVD A = UΣV^T, namely

    A^{-1} = VΣ^{-1}U^T                                                            (11.12.1)

where Σ^{-1} is a diagonal matrix with 1/σ_i on the diagonal. The reason is simple using the idea
of inverse mapping: undo each of the three operations shown in Fig. 11.28. First, undo the
last rotation by multiplying with U^T; second, un-stretch by multiplying by 1/σ_i along each axis;
third, un-rotate by multiplying by V. If you need to see an algebraic proof, here it is:

    A^{-1}A = (VΣ^{-1}U^T)(UΣV^T) = VΣ^{-1}(U^T U)ΣV^T = V(Σ^{-1}Σ)V^T = VV^T = I

The 2-norm of A^{-1} is its maximum singular value, which is 1/σ_n:

    ||A^{-1}||_2 = max_{||x||_2 = 1} ||A^{-1}x||_2 = 1/σ_n

Now, the 2-norm condition number of A is simply the ratio of the maximum singular value to the
minimum singular value:

    cond_2(A) = σ_1/σ_n                                                            (11.12.2)
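
These relations can be checked directly with Julia's svdvals (a sketch, reusing the ill conditioned
matrix of Example 11.17):

```julia
using LinearAlgebra

A = [1.0 1.0; 1.0 1.0005]
σ = svdvals(A)                 # singular values, largest first

opnorm(A) ≈ σ[1]               # ‖A‖₂ is the largest singular value
cond(A)   ≈ σ[1] / σ[end]      # cond₂(A) = σ_max / σ_min, ≈ 8002
```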

11.12.4 Low rank approximations


We have seen Taylor series and Fourier series, of which the main idea is to expand or decompose
a function into a sum of many pieces. Thus, it is not a surprise that we can do the same thing
with a matrix:

    UΣV^T x = σ_1 u_1 v_1^T x + ... + σ_r u_r v_r^T x  ==>  A = σ_1 u_1 v_1^T + ... + σ_r u_r v_r^T

Similar to what we have done with Taylor series, we truncate the sum on the RHS to get A_k, a
rank k matrix:

    A_k = σ_1 u_1 v_1^T + ... + σ_k u_k v_k^T

And we expect there to be a precise relation between A and A_k. This relation was discovered by Schmidt
in 1907, and was later proved by Eckart and Young in 1936 and by Mirsky in 1955. The
theorem is now called the Eckart-Young-Mirsky theorem, stating that A_k is the closest rank k
matrix to A. Obviously we need to use matrix norms to express this theorem:

Theorem 11.12.1: The Eckart-Young-Mirsky theorem
If B has rank k, then

    ||A - B|| ≥ ||A - A_k||,      A_k = σ_1 u_1 v_1^T + ... + σ_k u_k v_k^T


SVD in image compression. Suppose that the original image is a grayscale image of size (512, 512),
and we rebuild the image with 50 singular values; then we only need to save 2 × 512 × 50 + 50
numbers to rebuild the image, while the original image has 512 × 512 numbers. Hence this gives
us a compression ratio of about 19.55% if we don't consider the storage type. Fig. 11.29 presents one
example and the code to produce it is given in Listing A.23.

Figure 11.29: From left to right: original image, 10, 50 and 100 singular values.
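
The rank-k truncation A_k behind Fig. 11.29 can be sketched in a few lines of Julia (the function
name and the random test matrix are mine; the point is that the 2-norm error of the rank-k
truncation equals σ_{k+1}, as the Eckart-Young-Mirsky theorem says):

```julia
using LinearAlgebra

# Rank-k approximation A_k = σ₁u₁v₁ᵀ + ... + σ_k u_k v_kᵀ from the SVD
function lowrank(A, k)
    F = svd(A)
    return F.U[:, 1:k] * Diagonal(F.S[1:k]) * F.Vt[1:k, :]
end

A  = rand(100, 80)
A5 = lowrank(A, 5)
opnorm(A - A5) ≈ svdvals(A)[6]   # the 2-norm error equals σ₆
```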



Chapter 12
Numerical analysis

Contents
12.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 955
12.2 Numerical differentiation . . . . . . . . . . . . . . . . . . . . . . . . . . . 957
12.3 Interpolation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 959
12.4 Numerical integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 971
12.5 Solving nonlinear equations . . . . . . . . . . . . . . . . . . . . . . . . . 980
12.6 Numerical solution of ordinary differential equations . . . . . . . . . . . 983
12.7 Numerical solution of partial differential equations . . . . . . . . . . . . 993
12.8 Numerical optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1000
12.9 Numerical linear algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . 1005

Numerical analysis is an area of mathematics that creates, analyzes, and implements al-
gorithms for obtaining numerical solutions to problems involving continuous variables. The
Newton-Raphson method to numerically solve the equation tan x = x is one example. The
Gauss quadrature method to numerically evaluate any definite integral ∫_a^b f(x) dx is also one
example. The finite difference method to solve ordinary and partial differential equations is yet
another example.
Numerical solutions are numbers, not closed form expressions. For example, it is possible
to solve the quadratic equation ax^2 + bx + c = 0 exactly to get the well known closed form
solutions x_{1,2} = (-b ± √(b^2 - 4ac))/(2a). Such solutions do not exist for polynomial equations of fifth
order or higher, nor for transcendental equations such as tan x = x. However, the Newton-
Raphson method can solve all these equations efficiently, but it only gives us numerical solutions.
For example, applied to tan x = x, it gives us 4.49340946.
The following books were consulted for the majority of the material presented in this chapter:
• Approximation Theory and Approximation Practice by Lloyd Trefethen [71]

• Numerical Methods for Scientists and Engineers by Richard Hamming [28]

• Analysis of Numerical Methods by Eugene Isaacson and Herbert Keller [32]

• Finite Difference Computing with PDEs: A Modern Software Approach by Hans Petter
  Langtangen and Svein Linge [40]

• Computational Fluid Dynamics: The Basics with Applications by John Anderson [4]

• Computational Physics by Nicholas J. Giordano and Hisao Nakanishi [26]

I strongly recommend the book of Anderson; it is so well written and a joy to read. Even though
it addresses numerical methods to solve the Navier-Stokes equations (which are of interest only
to people fascinated by the behavior of fluids), he explains things so clearly.

12.1 Introduction
Suppose we have to compute the following sum, for many values of α:

$$f(\alpha) = \sum_{k=0}^{3} a_k \cos(k\alpha) \tag{12.1.1}$$

where the a_k are known constants. The first solution is, of course, to compute term by term and add
them up. What would you think if someone told you that there is a much, much better method to
compute f(α)? The secret is that cos α, cos 2α and so on are all related. Recall that we
have derived such a relation in Eq. (3.8.20), re-given here:

$$\cos(k\alpha) = 2\cos\alpha \cos((k-1)\alpha) - \cos((k-2)\alpha), \quad \text{if } k \ge 2 \tag{12.1.2}$$


Lloyd Nicholas Trefethen (born 30 August 1955) is an American mathematician, professor of numerical
analysis and head of the Numerical Analysis Group at the Mathematical Institute, University of Oxford. He is
perhaps best known for his work on pseudospectra of non-normal matrices and operators.
Richard Wesley Hamming (1915 – 1998) was an American mathematician whose work had many implications
for computer engineering and telecommunications. His contributions include the Hamming code (which makes use
of a Hamming matrix), the Hamming window, Hamming numbers, sphere-packing (or the Hamming bound), and the
Hamming distance.
Eugene Isaacson (1919 – 2008) was a US mathematician who pioneered modern numerical analysis. He
was a mathematics and physics graduate of City College in New York; he then entered the graduate program in
mathematics at New York University, gaining a PhD on water waves on sloping beaches in 1949. His academic
career was then spent at the Courant Institute until his retirement.
Herbert Bishop Keller (1925 – 2008) was an American applied mathematician and numerical analyst. He was
professor of applied mathematics, emeritus, at the California Institute of Technology.
Hans Petter Langtangen (1962 – 2016) was a Norwegian scientist trained in mechanics and scientific com-
puting. Langtangen was the director of the Centre for Biomedical Computing, a Norwegian Centre of Excellence
hosted by Simula Research Laboratory. Langtangen promoted the use of Python for scientific computing through
numerous journal papers and conference talks.
John D. Anderson Jr. (born October 1, 1937) is the Curator of Aerodynamics at the National Air and Space
Museum at the Smithsonian Institution in Washington, DC, and Professor Emeritus in the Department of Aerospace
Engineering at the University of Maryland, College Park.


A name was given to such a formula as it occurs quite often in mathematics. This is known as
a three-term recurrence relation because it involves three terms. Even with the hint that this
recurrence is the key to an efficient computation of the mentioned sum, it is really hard to know
where to start. Unless, that is, you know where to look for inspiration; it comes in the form of
Horner's method for polynomial evaluation.

Horner's method. In Section 2.29.6, Horner's method was presented as an efficient way to
evaluate any polynomial at a point $x_0$. As a recap, let's consider a specific cubic polynomial
$p(x) = 2x^3 - 6x^2 + 2x + 1$. In Horner's method, we massage $p(x_0)$ a bit:

p(x_0) = 2x_0^3 - 6x_0^2 + 2x_0 + 1 = x_0[2x_0^2 - 6x_0 + 2] + 1 = x_0[x_0(2x_0 - 6) + 2] + 1

To implement Horner's method, a new sequence of constants is defined recursively as follows:

b_3 = a_3                  \qquad  b_3 = 2
b_2 = x_0 b_3 + a_2        \qquad  b_2 = 2x_0 - 6
b_1 = x_0 b_2 + a_1        \qquad  b_1 = x_0(2x_0 - 6) + 2
b_0 = x_0 b_1 + a_0        \qquad  b_0 = x_0(x_0(2x_0 - 6) + 2) + 1

where the left column is for a general cubic polynomial whereas the right column is for the
specific $p(x) = 2x^3 - 6x^2 + 2x + 1$. Then, $p(x_0) = b_0$. As to finding the consecutive $b$-values,
we start by determining $b_3$, which is simply equal to $a_3$. We then work our way down to the
other $b$'s, using the recursive formula

b_{k-1} = a_{k-1} + b_k x_0

until we arrive at $b_0$. This relation can also be written as

b_k = a_k + b_{k+1} x_0    (12.1.3)
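
In code, the recursion is a short loop. A minimal Julia sketch (my own, not one of the book's listings; I store the coefficients from $a_0$ up to $a_3$):

    # Horner's method for p(x) = 2x^3 - 6x^2 + 2x + 1, coefficients stored a0, a1, a2, a3.
    function horner(coeffs, x0)
        b = coeffs[end]              # b_n = a_n
        for k in length(coeffs)-1:-1:1
            b = coeffs[k] + b * x0   # b_{k-1} = a_{k-1} + b_k * x0
        end
        return b                     # b_0 = p(x0)
    end

    horner([1.0, 2.0, -6.0, 2.0], 3.0)   # 2*27 - 6*9 + 2*3 + 1 = 7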

But what is the relation between the sum in Eq. (12.1.1) and a polynomial? To see that
relation, we need to write an $n$th-degree polynomial using the sum notation:

p_n(x_0) = \sum_{k=0}^{n} a_k x_0^k, \qquad x_0^k = x_0 \cdot x_0^{k-1}

Now, we can see that the sum in Eq. (12.1.1) and a polynomial are of the same form

f(x) = \sum_{k=0}^{n} a_k \phi_k(x)    (12.1.4)

where $\phi_k(x)$ has either a three-term recurrence relation or a two-term recurrence relation (in the
case $\phi_k(x) = x^k$).


Inspired by Eq. (12.1.3), we define the sequence of $b_k$'s as follows, where the only difference is the
extra term $-b_{k+2}$, which comes from the $\cos((k-2)\alpha)$ term in the three-term recurrence (and of course
$2\cos\alpha$ replaces $x$):

b_k = a_k + (2\cos\alpha)\, b_{k+1} - b_{k+2} \;\Longrightarrow\; a_k = b_k + b_{k+2} - 2\cos\alpha\, b_{k+1}    (12.1.5)

To compute the sum in Eq. (12.1.1), we express $a_0, a_1, a_2, a_3$ in terms of the $b_i$'s (with $b_4 = b_5 = 0$):

a_3 = b_3
a_2 = b_2 - 2\cos\alpha\, b_3
a_1 = b_1 + b_3 - 2\cos\alpha\, b_2
a_0 = b_0 + b_2 - 2\cos\alpha\, b_1

Substituting the $a_i$'s into Eq. (12.1.1) and rearranging the terms in the form $b_0 + b_1(\cdots) + b_2(\cdots) + b_3(\cdots)$:

f(\alpha) = \sum_{k=0}^{3} a_k \cos(k\alpha)
         = (b_0 + b_2 - 2\cos\alpha\, b_1) + (b_1 + b_3 - 2\cos\alpha\, b_2)\cos\alpha + (b_2 - 2\cos\alpha\, b_3)\cos 2\alpha + b_3 \cos 3\alpha
         = b_3(\cos 3\alpha + \cos\alpha - 2\cos\alpha\cos 2\alpha) + b_2(\cos 2\alpha + 1 - 2\cos^2\alpha) + b_1(-\cos\alpha) + b_0

Amazingly, the coefficients of $b_3$ and $b_2$ are zero because of Eq. (12.1.2), so the scary sum reduces to
the following simple formula

\sum_{k=0}^{3} a_k \cos(k\alpha) = b_0 - b_1 \cos\alpha    (12.1.6)

This is Clenshaw's algorithm, named after the English mathematician Charles William Clenshaw
(1926–2004), who published this method in 1955.
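
The whole computation fits in a few lines. Here is a minimal Julia sketch (my own, not one of the book's listings); it checks the result against the naive term-by-term sum:

    # Clenshaw's algorithm for f(α) = Σ_{k=0}^{n} a_k cos(kα), using
    # b_k = a_k + 2cos(α) b_{k+1} - b_{k+2} with b_{n+1} = b_{n+2} = 0.
    function clenshaw_cos(a, α)           # a = [a_0, a_1, ..., a_n]
        b1, b2 = 0.0, 0.0                 # hold b_{k+1} and b_{k+2}
        c = 2cos(α)
        for k in length(a):-1:2           # runs over a_n, a_{n-1}, ..., a_1 (Julia is 1-based)
            b1, b2 = a[k] + c*b1 - b2, b1
        end
        return a[1] + c*b1 - b2 - b1*cos(α)   # b_0 - b_1 cos(α)
    end

    a, α = [1.0, 2.0, 3.0, 4.0], 0.3
    clenshaw_cos(a, α) ≈ sum(a[k+1]*cos(k*α) for k in 0:3)   # true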

12.2 Numerical differentiation


Numerical differentiation deals with numerical approximations of derivatives. The first ques-
tion that comes to mind is: why do we need to approximate derivatives at all? After all,
we know how to differentiate every function analytically. Nevertheless, there are several
reasons why we still need to approximate derivatives. The most important application of
numerical differentiation is in numerically solving ordinary and partial differential equations
(Section 12.6). When approximating solutions to ordinary (or partial) differential equations, we
typically represent the solution as a discrete approximation that is defined on a grid. Since we
then have to evaluate derivatives at the grid points, we need to be able to come up with methods
for approximating the derivatives at these points, and, this will typically be done using only
values that are defined on that grid.


12.2.1 First order derivatives


The starting point is Taylor’s theorem (Section 4.15.10):
f(x+h) = f(x) + f'(x)h + \frac{f''(\xi_1)}{2!}h^2, \qquad \xi_1 \in (x, x+h)
f(x-h) = f(x) - f'(x)h + \frac{f''(\xi_2)}{2!}h^2, \qquad \xi_2 \in (x-h, x)    (12.2.1)

From that, by ignoring the second-order remainder terms, we obtain different ways to approximate the first derivative
of $f(x)$:

forward difference:   f'(x) \approx \frac{f(x+h) - f(x)}{h}, \qquad \text{error} \approx \frac{f''(\xi)}{2}h
backward difference:  f'(x) \approx \frac{f(x) - f(x-h)}{h}

Since the approximations are obtained by truncating the term $(f''(\xi)/2!)h^2$ from the exact formula
(12.2.1), this term is the error in our approximations, and is called the truncation error. When
the truncation error is of the order of $O(h)$, we say that the method is a first order method. We
refer to a method as a $p$th-order method if the truncation error is of the order of $O(h^p)$. The
forward difference was used to develop the famous Euler method, which is commonly used to
solve ordinary differential equations.
To develop a 2nd-order method we use more terms in the Taylor series, including $f''(x)$:

f(x+h) = f(x) + f'(x)h + \frac{f''(x)}{2!}h^2 + \frac{f'''(\xi_1)}{3!}h^3, \qquad \xi_1 \in (x, x+h)
f(x-h) = f(x) - f'(x)h + \frac{f''(x)}{2!}h^2 - \frac{f'''(\xi_2)}{3!}h^3, \qquad \xi_2 \in (x-h, x)    (12.2.2)

Subtracting the second from the first, we arrive at

f'(x) = \frac{f(x+h) - f(x-h)}{2h} - \frac{f'''(\xi_1) + f'''(\xi_2)}{12}h^2

which yields the so-called centered difference for the 1st derivative:

f'(x) \approx \frac{f(x+h) - f(x-h)}{2h}    (12.2.3)
This approximation is a 2nd order method by construction, as the error is proportional to $h^2$. To demonstrate
the performance of these approximations, let's consider the function $f(x) = \sin x + \cos x$
and compute $f'(0)$ and the errors (noting that the exact value is 1). The results are shown in
Table 12.1.
The result clearly indicates that as $h$ is halved, the error of the one-sided differences is only
halved (in Table 12.1, starting from the first row and going down, each time $h$ is half of the
previous row), but the error of the centered difference decreases by a factor of four.


Table 12.1: Finite difference approximations of $f'(x)$ for $f(x) = \sin x + \cos x$. Errors of one-sided
differences (forward/backward) versus the two-sided centered difference.

    h        Forward diff.   Backward diff.   Centered diff.
    0.2500   0.1347          0.1140           0.0104
    0.1250   0.0650          0.0598           0.0026
    0.0625   0.0319          0.0306           0.0007
    0.0313   0.0158          0.0155           0.0002
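
These numbers are easy to reproduce. A minimal Julia sketch (my own; errors printed as absolute values, as in the table):

    f(x) = sin(x) + cos(x)               # exact derivative at 0 is cos(0) - sin(0) = 1
    fwd(f, x, h) = (f(x + h) - f(x)) / h
    bwd(f, x, h) = (f(x) - f(x - h)) / h
    ctr(f, x, h) = (f(x + h) - f(x - h)) / (2h)

    for h in (0.25, 0.125, 0.0625, 0.03125)
        println(h, "  ", abs(fwd(f, 0, h) - 1), "  ", abs(bwd(f, 0, h) - 1), "  ", abs(ctr(f, 0, h) - 1))
    end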

12.2.2 Second order derivatives


To get a formula for $f''(x)$, we again start from Taylor expansions as in Eq. (12.2.2), carried one term further:

f(x+h) = f(x) + f'(x)h + \frac{f''(x)}{2!}h^2 + \frac{f'''(x)}{3!}h^3 + \frac{f^{(4)}(\xi_1)}{4!}h^4
f(x-h) = f(x) - f'(x)h + \frac{f''(x)}{2!}h^2 - \frac{f'''(x)}{3!}h^3 + \frac{f^{(4)}(\xi_2)}{4!}h^4    (12.2.4)

This time, however, we add them up to get

f''(x) = \frac{f(x+h) - 2f(x) + f(x-h)}{h^2} + O(h^2)    (12.2.5)

This approximation was used to develop the famous Verlet method, which is commonly used
to solve Newton's equations of motion $F = ma$.

12.2.3 Richardson’s extrapolation

12.3 Interpolation
Assume that we are back in the days of no calculators and no formula for calculating
sine. Luckily, some people made up a table of the sine of $1^\circ, 5^\circ, 10^\circ, 15^\circ, \ldots$ But we need $\sin 2^\circ$.
What are we going to do? We will use a method that has become what we know today as
interpolation. As a first attempt, we assume that the two data points $(1^\circ, \sin 1^\circ)$ and $(5^\circ, \sin 5^\circ)$
are connected by a line. We can determine the equation of this line, let's call it $f(x)$ (because
it is straightforward). Having such an equation, it is a simple task to compute $\sin 2^\circ$: it is $f(2^\circ)$.
Then, we realize that our assumption was too crude. In need of higher accuracy, instead
of a line joining two points, we assume a parabola joining three data points. Generally,

The word "interpolation" originates from the Latin verb interpolare, a contraction of "inter", meaning "be-
tween", and "polare", meaning "to polish". That is to say, to smooth in between given pieces of information.


interpolation is where an approximating function is constructed in such a way as to agree
perfectly with the usually unknown original function at the given measurement/data points.
There exists another situation where we need to do interpolation. Suppose that we have a very
complex function $y = f(x)$ that we do not want to work with directly. So, we can generate
some data points $(x_i, f(x_i))$ and use them to build an interpolating function that matches
$f(x)$ only at the $x_i$. The point is that the interpolating function is usually simple to work with,
e.g. it is a polynomial.

12.3.1 Polynomial interpolations


Given two points $(x_1, y_1)$ and $(x_2, y_2)$, where $x_1 \ne x_2$, there is one and only one line joining
them. And its equation is

y = \left(\frac{y_1 - y_2}{x_1 - x_2}\right) x + \left(\frac{y_2 x_1 - y_1 x_2}{x_1 - x_2}\right)    (12.3.1)
This is straightforward, as it should be. How about finding the parabola passing through three
points $(x_1, y_1)$, $(x_2, y_2)$ and $(x_3, y_3)$? The same approach of $y = ax^2 + bx + c$ and 3 equations
for 3 unknowns to determine $a, b, c$ would work, but it is laborious. The situation is even worse
if we have to find the curve going through 10 points. There is a much better way, and it is hidden
in Eq. (12.3.1).
The idea is to re-write Eq. (12.3.1) as

y = \frac{x - x_2}{x_1 - x_2}\, y_1 + \frac{x_1 - x}{x_1 - x_2}\, y_2    (12.3.2)

Why is this form better than the previous one? Because with this form it is immediately
clear that when $x = x_1$, $y = y_1$, as the coefficient of $y_2$ is zero, and when $x = x_2$, $y = y_2$, as the
coefficient of $y_1$ vanishes. Thus, $y$ has the form $y = u(x)y_1 + v(x)y_2$ with $u(x_1) = 1$, $u(x_2) = 0$ and
$v(x_1) = 0$, $v(x_2) = 1$. Note that $u(x) + v(x) = 1$.
With this, we suspect that for a parabola its equation should have this form:

y = u(x) y_1 + v(x) y_2 + w(x) y_3

where $u(x_1) = 1$, $u(x_2) = 0$ and $u(x_3) = 0$. The following form satisfies the last two conditions

u(x) = k(x - x_2)(x - x_3)

And the first condition gives us $k = 1/[(x_1 - x_2)(x_1 - x_3)]$, thus

u(x) = \frac{(x - x_2)(x - x_3)}{(x_1 - x_2)(x_1 - x_3)}    (12.3.3)

Similarly, we get the expressions for $v(x)$ and $w(x)$:

v(x) = \frac{(x - x_1)(x - x_3)}{(x_2 - x_1)(x_2 - x_3)}, \qquad w(x) = \frac{(x - x_1)(x - x_2)}{(x_3 - x_1)(x_3 - x_2)}


and then the quadratic interpolation is:

y = \frac{(x - x_2)(x - x_3)}{(x_1 - x_2)(x_1 - x_3)} y_1 + \frac{(x - x_1)(x - x_3)}{(x_2 - x_1)(x_2 - x_3)} y_2 + \frac{(x - x_1)(x - x_2)}{(x_3 - x_1)(x_3 - x_2)} y_3    (12.3.4)

At this point, we should check whether what we have observed, $u(x) + v(x) = 1$, continues
to hold; that is, whether $u(x) + v(x) + w(x) = 1$. The algebra might be messy, but the identity holds.
Now, we can write the equation for a 17th degree polynomial passing through 18 points. But
the equation would be so lengthy. We need to introduce some shorthand notation. First, for $n+1$
points $(x_0, y_0), \ldots, (x_j, y_j), \ldots, (x_n, y_n)$ the interpolating polynomial is now given by

y(x) = \sum_{i=0}^{n} l_i(x)\, y_i    (12.3.5)

What is this? It is (AGAIN!) a linear combination of some functions $l_i(x)$ with coefficients
being $y_i$. In this equation, $l_i(x)$ is written as (after examining the form of $u, v, w$, see again
Eq. (12.3.3))

l_i(x) = \prod_{\substack{j=0 \\ j \ne i}}^{n} \frac{x - x_j}{x_i - x_j}    (12.3.6)

and are the so-called Lagrange basis polynomials. Plots of linear and quadratic Lagrange poly-
nomials are given in Fig. 12.1. Although named after Joseph-Louis Lagrange, who published it
in 1795, the method was first discovered in 1779 by the English mathematician Edward Waring
(1736–1798). We mentioned this not to imply that Lagrange is not great. He is one of the greatest
of all time. Just to mention that sometimes credit was not given to the first discoverer. About
this topic, some more examples are: the Lagrange interpolation formula was discovered by
Waring, the Gibbs phenomenon was discovered by Wilbraham, and the Hermite integral formula
is due to Cauchy. These are just some of the instances of Stigler’s LawŽŽ in approximation theory.

Example. There are 7 data points given in Table 12.2, and we use Lagrange interpolation to
find the 6th degree polynomial passing through all these points. As I am lazy (already in my
early 40s when doing this), I did not explicitly compute the $l_i(x)$. Instead I wrote a Julia code
(Listing A.12) and with it obtained Fig. 12.2: a nice curve joining all the points.
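
For reference, a minimal Julia sketch of Eqs. (12.3.5) and (12.3.6) applied to the data of Table 12.2 (my own sketch, not the book's Listing A.12):

    # Evaluate y(x) = Σ_i l_i(x) y_i directly from the Lagrange basis.
    function lagrange_eval(xs, ys, x)
        s = 0.0
        for i in eachindex(xs)
            li = 1.0
            for j in eachindex(xs)
                j == i && continue
                li *= (x - xs[j]) / (xs[i] - xs[j])   # l_i(x)
            end
            s += li * ys[i]
        end
        return s
    end

    xs = 0:6
    ys = sin.(xs)                 # the data of Table 12.2 is sin(0), sin(1), ..., sin(6)
    lagrange_eval(xs, ys, 2.5)    # ≈ 0.60, close to sin(2.5) ≈ 0.5985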

Runge's phenomenon. Consider the Runge function

f(x) = \frac{1}{1 + 25x^2}    (12.3.7)
ŽŽ
Stigler’s law of eponymy, proposed by statistician Stephen Stigler in his 1980 publication Stigler’s law of
eponymy, states that no scientific discovery is named after its original discoverer. Examples include Hubble’s law,
which was derived by Georges Lemaître two years before Edwin Hubble, the Pythagorean theorem, which was
known to Babylonian mathematicians before Pythagoras, and Halley’s Comet, which was observed by astronomers
since at least 240 BC. Stigler himself named the sociologist Robert K. Merton as the discoverer of "Stigler’s law"
to show that it follows its own decree, though the phenomenon had previously been noted by others.


[Figure 12.1 about here: (a) linear $l_0$ and $l_1$; (b) quadratic $l_i(x)$, $i = 0, 1, 2$.]

Figure 12.1: Plots of linear and quadratic Lagrange basis functions in $[-1, 1]$. It is clear that $l_i(x_j) = \delta_{ij}$.

    x    f(x)
    0     0
    1     0.8415
    2     0.9093
    3     0.1411
    4    -0.7568
    5    -0.9589
    6    -0.2794

Table 12.2: Data points.

[Figure 12.2 about here.] Figure 12.2: Lagrange interpolating function.

And we want to use equidistant points $x_i$ between $-1$ and $1$,

x_i = -1 + \frac{2i}{n}, \qquad i = 0, 1, 2, \ldots, n

to construct Lagrange polynomials that can capture this function. We hope that a 5th
degree Lagrange polynomial can fit Runge's function, but it does not do a good job. Well, after
all, just 6 points were used. So we then use 10 points to get a 9th degree Lagrange polynomial,
and this is even worse: there is oscillation at the edges of the interval, even though far from the
edges the approximation is quite good (Fig. 12.3). This is known as Runge's phenomenon.

Interpolation error. Let $f$ be an $m$ times continuously differentiable function on $[a,b]$ (which can
be compactly written as $f \in C^m([a,b])$). Suppose we have $m$ sampling points $x_1, x_2, \ldots, x_m$ and we
construct an $(m-1)$-degree polynomial $p(x)$ going through these points.


[Figure 12.3 about here: the Runge function $1/(1+25x^2)$ together with its 5th and 9th degree Lagrange interpolants on equispaced points.]

Figure 12.3: Runge's phenomenon. This happens only for high order polynomials and equispaced points.
It is named after the German mathematician Carl David Tolmé Runge (1856–1927), who discovered it
in 1901 when exploring the behavior of errors when using polynomial interpolation to approximate
certain functions. The discovery was important because it shows that going to higher degrees does
not always improve accuracy. Note that this phenomenon is similar to the Gibbs phenomenon in
Fourier series (Section 4.19).

[Figure 12.4 about here: (a) the 2nd derivative and (b) the 6th derivative (scale $\times 10^7$) of the Runge function.]

Figure 12.4: Derivatives of the Runge function $f(x) = 1/(1 + 25x^2)$. Note that I used SymPy to automatically
compute $f^{(m)}(x)$ and evaluate the resulting expression at sampling points in $[-1, 1]$ to generate these
plots. We should take advantage of a CAS to focus on other things.

Then for any $x \in [a,b]$, we have

R(x) := f(x) - p(x) = \frac{f^{(m)}(\xi)}{m!}(x - x_1)(x - x_2)\cdots(x - x_m)    (12.3.8)

for some $\xi \in [a,b]$. It follows that

|f(x) - p(x)| \le \frac{|\pi(x)|}{m!} \max_{y \in [a,b]} |f^{(m)}(y)|, \qquad \pi(x) = \prod_{i=1}^{m} (x - x_i)    (12.3.9)


And this theorem explains the Runge phenomenon, in which the derivatives blow up (Fig. 12.4b).
Note that $\pi(x)$ is a monic polynomial, i.e. a single-variable polynomial in which the
leading coefficient (the nonzero coefficient of highest degree) is equal to 1. An $n$-degree monic
polynomial is of the form $x^n + c_{n-1}x^{n-1} + \cdots + c_2 x^2 + c_1 x + c_0$.

Properties. If $f(x)$ is a polynomial of degree less than or equal to $n$ and we use $n+1$ points
$(x_i, f(x_i))$ to construct a Lagrange interpolating function $y(x)$, then $y(x) \equiv f(x)$; in other
words, the Lagrange interpolation is exact. Another property is that the polynomial interpolant is
unique. And this uniqueness allows us to state that $\sum_i l_i(x) = 1$ for all $x$, a fact that we have
observed for $n = 2$ and $n = 3$.
Now, the Lagrange basis functions have two properties, as stated below:

Kronecker delta property:      l_i(x_j) = \delta_{ij}
Partition of unity property:   \sum_{i=0}^{n} l_i(x) = 1  for all x    (12.3.10)

Applications. Lagrange polynomials are used to derive the Newton-Cotes numerical integration
rules. But their most well known application is as the basis of the finite element method, a very
powerful numerical method for solving complex 1D/2D/3D partial differential equations. Civil
engineers use this method in the design of buildings and bridges. Mechanical engineers
use it to design cars. Aerospace engineers also use it, just to name a few. Section 10.12 briefly
presented this method and its early history.

Motivation. If you're wondering what ensures that there exists a polynomial that can approximate
a given function well, rest assured, it is the Weierstrass approximation theorem.

Theorem 12.3.1: Weierstrass approximation theorem
Let $f$ be a continuous real-valued function defined on an interval $[a,b]$ of $\mathbb{R}$. Then, for any $\epsilon > 0$, there
exists a polynomial $p(x)$ such that

|f(x) - p(x)| < \epsilon  for all x \in [a,b]

This theorem does not tell us what the expression of $p(x)$ is; you have to find it for
yourself! But it motivates mathematicians: if you work hard, you can find a polynomial that
approximates any such function well.

Vandermonde matrix. Let's attack the interpolation problem directly, and the Vandermonde
matrix will show up. We use an $n$-degree polynomial of the form

P_n(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n

How to prove this? In maths, to prove something is unique, we can assume that there are two versions of it
and prove that the two are the same.

Consider the case where $f_i = 1$ for all $x_i$, $i = 0, 1, \ldots, n$. Through these points there is the horizontal line
$y(x) = 1$, and this line is the only polynomial that interpolates the points. Thus, Eq. (12.3.5) leads to $\sum_i l_i(x) = 1$.


to interpolate the $n+1$ points $(x_i, y_i)$, $i = 0, 1, 2, \ldots$ We have this system of linear equations
to solve for the coefficients $a_i$:

a_0 + a_1 x_0 + a_2 x_0^2 + \cdots + a_n x_0^n = y_0
a_0 + a_1 x_1 + a_2 x_1^2 + \cdots + a_n x_1^n = y_1
  \vdots
a_0 + a_1 x_n + a_2 x_n^2 + \cdots + a_n x_n^n = y_n

which can be re-written in matrix notation as

\begin{bmatrix} 1 & x_0 & x_0^2 & \cdots & x_0^n \\ 1 & x_1 & x_1^2 & \cdots & x_1^n \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^n \end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{bmatrix}
=
\begin{bmatrix} y_0 \\ y_1 \\ \vdots \\ y_n \end{bmatrix}    (12.3.11)
The beautiful matrix on the left is the Vandermonde matrix, named after Alexandre-Théophile Vander-
monde (1735–1796), a French mathematician, musician and chemist. Now, as an exercise in
determinants, we're going to compute the determinant of the Vandermonde matrix. And from that
determinant we can also prove that the interpolating polynomial is unique.
It's easier to deal with the transpose of the Vandermonde matrix, so we consider the transpose:

V = \begin{bmatrix} 1 & x_0 & x_0^2 & \cdots & x_0^n \\ 1 & x_1 & x_1^2 & \cdots & x_1^n \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^n \end{bmatrix}
\;\Longrightarrow\;
V^\top = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_0 & x_1 & \cdots & x_n \\ x_0^2 & x_1^2 & \cdots & x_n^2 \\ \vdots & \vdots & \ddots & \vdots \\ x_0^n & x_1^n & \cdots & x_n^n \end{bmatrix}
Now, we consider a simpler problem with only 4 points:

\det V^\top = \begin{vmatrix} 1 & 1 & 1 & 1 \\ x_0 & x_1 & x_2 & x_3 \\ x_0^2 & x_1^2 & x_2^2 & x_3^2 \\ x_0^3 & x_1^3 & x_2^3 & x_3^3 \end{vmatrix}
= \begin{vmatrix} 1 & 1 & 1 & 1 \\ 0 & x_1 - x_0 & x_2 - x_0 & x_3 - x_0 \\ 0 & x_1^2 - x_1 x_0 & x_2^2 - x_2 x_0 & x_3^2 - x_3 x_0 \\ 0 & x_1^3 - x_1^2 x_0 & x_2^3 - x_2^2 x_0 & x_3^3 - x_3^2 x_0 \end{vmatrix}

where the second row was replaced by the 2nd row minus $x_0$ times the 1st row, the third row by
the third row minus $x_0$ times the second row, and so on. Now, of course, we expand by the first
column and do some factorizations to get (check Section 11.9.3 if something is unclear):

\det V^\top = \begin{vmatrix} x_1 - x_0 & x_2 - x_0 & x_3 - x_0 \\ x_1^2 - x_1 x_0 & x_2^2 - x_2 x_0 & x_3^2 - x_3 x_0 \\ x_1^3 - x_1^2 x_0 & x_2^3 - x_2^2 x_0 & x_3^3 - x_3^2 x_0 \end{vmatrix}
= (x_1 - x_0)(x_2 - x_0)(x_3 - x_0) \begin{vmatrix} 1 & 1 & 1 \\ x_1 & x_2 & x_3 \\ x_1^2 & x_2^2 & x_3^2 \end{vmatrix}


Now the remaining $3 \times 3$ determinant should not be a problem for us; we can write down the answer immediately:

\begin{vmatrix} 1 & 1 & 1 \\ x_1 & x_2 & x_3 \\ x_1^2 & x_2^2 & x_3^2 \end{vmatrix}
= (x_2 - x_1)(x_3 - x_1) \begin{vmatrix} 1 & 1 \\ x_2 & x_3 \end{vmatrix}
= (x_2 - x_1)(x_3 - x_1)(x_3 - x_2)

The final result is then given by

\begin{vmatrix} 1 & 1 & 1 & 1 \\ x_0 & x_1 & x_2 & x_3 \\ x_0^2 & x_1^2 & x_2^2 & x_3^2 \\ x_0^3 & x_1^3 & x_2^3 & x_3^3 \end{vmatrix}
= (x_1 - x_0)(x_2 - x_0)(x_3 - x_0)(x_2 - x_1)(x_3 - x_1)(x_3 - x_2)
= \prod_{i=0}^{2} \prod_{j=i+1}^{3} (x_j - x_i) = \prod_{0 \le i < j \le 3} (x_j - x_i)

As the $x_i$'s are distinct, the determinant is nonzero, and thus the Vandermonde matrix is
invertible. Hence Eq. (12.3.11) has a unique solution; in other words, there is only one
polynomial passing through all the data points.
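
To make this concrete, here is a minimal Julia sketch (my own, with names chosen for illustration) that builds the Vandermonde system of Eq. (12.3.11) for the data of Table 12.2 and solves it:

    xs = collect(0.0:6.0)
    ys = sin.(xs)
    n  = length(xs) - 1
    V  = [xs[i+1]^j for i in 0:n, j in 0:n]   # V[i+1, j+1] = x_i^j
    a  = V \ ys                               # coefficients a_0, a_1, ..., a_n

    # Evaluate the polynomial with Horner's method; it should agree with the Lagrange sketch above.
    p(x) = foldr((c, acc) -> c + x*acc, a)    # a_0 + x(a_1 + x(a_2 + ...))
    p(2.5)                                    # ≈ 0.60 again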

12.3.2 Chebyshev polynomials


Recall that in Eq. (3.8.20) we derived a recursive formula for $\cos n\alpha$:

\cos(n\alpha) = \begin{cases} 1, & n = 0 \\ \cos\alpha, & n = 1 \\ 2\cos\alpha\cos((n-1)\alpha) - \cos((n-2)\alpha), & n \ge 2 \end{cases}    (12.3.12)

The Chebyshev polynomials are two sequences of polynomials related to the cosine and sine
functions, denoted $T_n(x)$ and $U_n(x)$. They can be defined in several equivalent ways; in this
section the polynomials are defined by starting with trigonometric functions. Note that, from the above
equation, $\cos(n\alpha)$ is a polynomial in terms of $\cos\alpha$, e.g. $\cos 3\alpha = 4\cos^3\alpha - 3\cos\alpha$. For a
fixed counting number $n$, the Chebyshev polynomial of the first kind is defined to be that polynomial of the cosine:

T_n(\cos\alpha) = \cos(n\alpha)

With the change of variable $x = \cos\alpha$, we get

T_n(x) := \cos(n \arccos x)    (12.3.13)


These polynomials were named after Pafnuty Chebyshev. The letter T was used by Bernstein
because of the alternative transliterations of the name Chebyshev as Tchebycheff, Tchebyshev
(French) or Tschebyschow (German). Pafnuty Lvovich Chebyshev (1821–1894) was a Russian
mathematician, considered to be the founding father of Russian mathematics.
The recursive definition of $T_n(x)$ follows from the recursive formula for $\cos n\alpha$:

T_n(x) = \begin{cases} 1, & n = 0 \\ x, & n = 1 \\ 2x T_{n-1}(x) - T_{n-2}(x), & n \ge 2 \end{cases}    (12.3.14)

The first few Chebyshev polynomials, obtained using Eq. (12.3.14), are

T_0(x) = 1
T_1(x) = x               = 2^0 x^1
T_2(x) = 2x^2 - 1        = 2^1 x^2 - 1
T_3(x) = 4x^3 - 3x       = 2^2 x^3 - 3x
T_4(x) = 8x^4 - 8x^2 + 1 = 2^3 x^4 - 8x^2 + 1    (12.3.15)

From this, we can see that $T_n(x)$ is an $n$-degree polynomial. Furthermore, the leading coefficient
of $T_n(x)$ is $2^{n-1}$ (for $n \ge 1$). Plots of the first few $T_n(x)$ are given in Fig. 12.5. We can see that $|T_n(x)| \le 1$
on $[-1,1]$, which is expected as $T_n(\cos\alpha) = \cos(n\alpha)$. And $T_n(x)$ has $n$ real roots, which leads to the
following concept.

[Figure 12.5 about here: the curves $T_0, T_1, T_2, T_3, T_4$ on $[-1, 1]$.]

Figure 12.5: Plots of the first five Chebyshev polynomials $T_n(x)$. Check source file
lagrange-interpolation.jl.

Chebyshev nodes are the roots of the Chebyshev polynomial of the first kind of degree $n$. To
find the roots, just use Eq. (12.3.13):

T_n(x) = 0 \iff \cos(n \arccos x) = 0 \iff n \arccos x = \frac{\pi}{2} + k\pi

Therefore, for a given positive integer $n$ the Chebyshev nodes in the interval $(-1, 1)$ are

x_k = \cos\left(\frac{\pi}{n}\left(k + \frac{1}{2}\right)\right), \qquad k = 0, 1, \ldots, n-1    (12.3.16)

It's a good habit to plot the nodes to see how they are distributed in $[-1, 1]$. Fig. 12.6 is such a plot.
Also plotted are the angles $\frac{\pi}{n}\left(k + \frac{1}{2}\right)$, which correspond to equally spaced points on the upper
half of the unit circle.
[Figure 12.6 about here: the nodes $x_k$ together with the corresponding equally spaced angles on the upper half of the unit circle.]

Figure 12.6: Distribution of Chebyshev nodes for $n = 16$.
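
Generating these nodes takes one line of Julia; a minimal sketch (my own):

    # Chebyshev nodes of Eq. (12.3.16) for n = 16, as plotted in Fig. 12.6.
    chebnodes(n) = [cos(π/n * (k + 1/2)) for k in 0:n-1]
    xk = chebnodes(16)
    # The nodes cluster near ±1; extrema(xk) ≈ (-0.995, 0.995) for n = 16.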

As $T_n(x)$ is an $n$-degree polynomial with leading coefficient $2^{n-1}$, and it has $n$ roots
$x_1, x_2, \ldots, x_n$, which are the Chebyshev nodes, we can write $T_n(x)$ in factored form:

T_n(x) = 2^{n-1}(x - x_1)(x - x_2)\cdots(x - x_n)

Then, due to the fact that $|T_n(x)| \le 1$ on $[-1,1]$, we have

\left|2^{n-1}(x - x_1)(x - x_2)\cdots(x - x_n)\right| \le 1 \;\Longrightarrow\; \left|\prod_{i=1}^{n}(x - x_i)\right| \le \frac{1}{2^{n-1}}

If we use the Chebyshev nodes in a polynomial approximation, then Eq. (12.3.9) gives us

|f(x) - p(x)| \le \frac{1}{n!\,2^{n-1}} \max_{y \in [a,b]} |f^{(n)}(y)|    (12.3.17)

And we hope that the denominator, with $n!$ and $2^{n-1}$, will dominate when $n$ is large (compared
with $|f^{(n)}(y)|$), so that the error $|f(x) - p(x)|$ decreases to zero and we have a better
approximation. Of course, we verify our guess with the Runge function (the one that troubled the Lagrange
polynomial with equally spaced points), and the result shown in Fig. 12.7 confirms our analysis.
Now we discuss the orthogonality of the Chebyshev polynomials. Recall that

I = \int_{0}^{\pi} \cos n\alpha \cos m\alpha \, d\alpha = 0 \quad (m \ne n)    (12.3.18)


[Figure 12.7 about here: the Runge function $1/(1+25x^2)$ with its 9th and 19th degree Lagrange interpolants on Chebyshev nodes.]

Figure 12.7: Approximation of Runge's function using Chebyshev nodes: 10 nodes (red points) and 20
nodes. No more oscillations near $-1$ and $1$.

A change of variable from $\alpha$ to $x$,

x = \cos\alpha \;\Longrightarrow\; dx = -\sin\alpha\,d\alpha = -\sqrt{1 - x^2}\,d\alpha

reveals that $I$ is given by

I = \int_{-1}^{1} \cos(n \arccos x)\cos(m \arccos x)\,\frac{dx}{\sqrt{1 - x^2}} = 0

And that is the orthogonality of the Chebyshev polynomials:

\int_{-1}^{1} T_n(x) T_m(x)\,\frac{1}{\sqrt{1 - x^2}}\,dx = 0 \quad (m \ne n)    (12.3.19)

12.3.3 Lagrange interpolation: efficiency and barycentric forms


The efficiency of the Lagrange interpolation is not good. For example, consider again the case
of a quadratic interpolation going through three points:

y = \frac{(x - x_2)(x - x_3)}{(x_1 - x_2)(x_1 - x_3)} y_1 + \frac{(x - x_1)(x - x_3)}{(x_2 - x_1)(x_2 - x_3)} y_2 + \frac{(x - x_1)(x - x_2)}{(x_3 - x_1)(x_3 - x_2)} y_3    (12.3.20)

For each value of $x$, evaluating $y(x)$ needs 18 multiplications/divisions and 15 additions/subtractions.
The efficiency can be improved just by simple algebraic manipulations of the formula.
First, define the following function (called the node polynomial):

l(x) = (x - x_1)(x - x_2)(x - x_3)


and the following numbers, which are independent of $x$ and thus can be computed once for
all $x$, i.e., outside the loop when computing $y(x)$ (note that the $\lambda_i$ are also independent of the data
values $y_i$, so the same calculation can be used to interpolate different data!):

\lambda_1 = \frac{1}{(x_1 - x_2)(x_1 - x_3)}, \quad \lambda_2 = \frac{1}{(x_2 - x_1)(x_2 - x_3)}, \quad \lambda_3 = \frac{1}{(x_3 - x_1)(x_3 - x_2)}

Then, Eq. (12.3.20) can be re-written as

y = l(x)\left[\frac{\lambda_1}{x - x_1} y_1 + \frac{\lambda_2}{x - x_2} y_2 + \frac{\lambda_3}{x - x_3} y_3\right]

And thus, for the general case, the new form of the Lagrange interpolation is given by (first done
by Jacobi in his PhD thesis)

y(x) = l(x)\sum_{i=0}^{n} \frac{\lambda_i}{x - x_i} y_i, \qquad l(x) = \prod_{i=0}^{n} (x - x_i), \qquad \lambda_i = \frac{1}{\prod_{j \ne i}(x_i - x_j)}    (12.3.21)

It can be seen that, in this form, the Lagrange basis $l_i(x)$ is written as

l_i(x) = l(x)\,\frac{\lambda_i}{x - x_i}    (12.3.22)
To test the efficiency of this new form, one can try to use random data. For example, in
Fig. 12.8, 80 random values $y_i$ in $[-1, 1]$ are generated at 80 Chebyshev nodes. Then,
Eq. (12.3.21) was used to compute $y(x)$ at 2001 drawing points to get the interpolating poly-
nomial (the blue curve in the figure). The new form is about 1.5 times faster than the original
form.
[Figure 12.8 about here.]

Figure 12.8: A Lagrange interpolating polynomial through 80 random values at 80 Chebyshev nodes. The
solid red dots are the data points.

But that's not the end of the story. We can massage the formula to get more out of it. Using the
partition-of-unity property of $l_i(x)$, we can find a formula for $l(x)$:

\sum_i l_i(x) = 1 \;\Longrightarrow\; l(x)\sum_{i=0}^{n}\frac{\lambda_i}{x - x_i} = 1 \;\Longrightarrow\; l(x) = \frac{1}{\sum_{i=0}^{n} \lambda_i/(x - x_i)}    (12.3.23)


With this new form of $l(x)$, Eq. (12.3.21) becomes:

y(x) = \frac{\sum_{i=0}^{n} \dfrac{\lambda_i}{x - x_i} y_i}{\sum_{i=0}^{n} \dfrac{\lambda_i}{x - x_i}}, \qquad \lambda_i = \frac{1}{\prod_{j \ne i}(x_i - x_j)}    (12.3.24)

What's special about this form, besides the fact that it is more efficient than the previous forms?
Actually, this formula has a form that most of us are familiar with. To show that, let's introduce
the symbol

w_i = \frac{\lambda_i}{x - x_i}    (12.3.25)

Eq. (12.3.24) then becomes:

y(x) = \frac{\sum_{i=0}^{n} w_i y_i}{\sum_{i=0}^{n} w_i}    (12.3.26)

This has exactly the same form as the center of mass in physics (see Eq. (7.8.17)) if we think of
the $w_i$ as the masses of particles. Barycenter is the term used in astrophysics for the center of mass
of two or more bodies orbiting each other. Therefore, Eq. (12.3.24) is called the barycentric
form.
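
A minimal Julia sketch of this barycentric evaluation (my own, not tied to the book's source files): the weights $\lambda_i$ are computed once, after which each evaluation of $y(x)$ costs only $O(n)$ operations.

    function bary_weights(xs)
        [1 / prod(xs[i] - xs[j] for j in eachindex(xs) if j != i) for i in eachindex(xs)]
    end

    function bary_eval(xs, ys, λ, x)
        k = findfirst(==(x), xs)
        k === nothing || return ys[k]       # avoid division by zero exactly at a node
        w = λ ./ (x .- xs)
        return sum(w .* ys) / sum(w)
    end

    xs = [cos(π/80 * (k + 1/2)) for k in 0:79]   # 80 Chebyshev nodes, as in Fig. 12.8
    ys = rand(80) .* 2 .- 1                      # 80 random values in [-1, 1]
    λ  = bary_weights(xs)
    bary_eval(xs, ys, λ, 0.1234)                 # interpolant value at an arbitrary point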

12.4 Numerical integration


This section is about the computation of definite integrals (e.g. $I = \int_a^b f(x)\,dx$). The idea is to
compute these integrals without finding anti-derivatives. Note that we are in the field of applied
mathematics: no more exact solutions. It is obvious that we have to use the old idea of chopping
the plane into many small pieces (for instance thin rectangles). But we cannot go to infinity,
not only because no computer is powerful enough for that, but also because we do not need such
extremely high accuracy for applications. Instead, what we need to do is to find a smart way of
chopping so that we have a finite sum and still an accurate evaluation of the integral.
Let's start with the simple linear function $y = x$. The integral $I = \int_0^1 x\,dx$ has a value
of 0.5. As a first option, we divide the region into $n$ equal rectangles (Fig. 12.9). Using
Eq. (4.3.3), the numerical value of this integral is given by (noting that $\Delta = 1/n$)

I(n) = \sum_{i=1}^{n} \Delta\,(i\Delta) = \frac{1}{n^2}\sum_{i=1}^{n} i = \frac{n(n+1)}{2n^2}    (12.4.1)

where we have used the formula for the sum of the first $n$ positive integers, see Eq. (2.6.2). For
various values of $n$, the corresponding values of $I(n)$ are given in Table 12.3. We can observe a
few things from this table. First, $I(n)$ always overestimates $I$; this should be obvious by looking
at Fig. 12.9. Second, we need 500 000 intervals to get an accuracy of 6 decimals. This is not
practically useful.
ŽŽ
You can check this by implementing this form and compare with the others. In Julia you can use the package
BenchmarkTools for measuring the running time of a program.


Note that for a general function it is impossible to have a closed formula for $I(n)$
as in Eq. (12.4.1); instead we have to compute $I(n)$ as a sum, i.e., $\sum_i \Delta f(x_i)$. With $n = 500\,000$
we need that many function evaluations $f(x_i)$ and that many multiplications. That's
a lot of work for a simple function!
[Figure 12.9 about here: the curve $y = x$ on $[0,1]$, three rectangles over $x_0, x_1, x_2, x_3$, and the triangular
overestimates $E_1 = \frac{1}{2}\Delta(y_1 - y_0)$, $E_2 = \frac{1}{2}\Delta(y_2 - y_1)$, $E_3 = \frac{1}{2}\Delta(y_3 - y_2)$.]

Figure 12.9: Numerical integration of $y = x$ from 0 to 1 using three ($n = 3$) equal sub-intervals and the
right endpoints. Obviously the area is overestimated by an amount $E_1 + E_2 + E_3$, which is the area of
the crossed triangles above the curve $y = x$.

Table 12.3: Numerical integration of $\int_0^1 x\,dx$. Exact value is 0.5.

    n          1      10      100     1000    ...   500 000
    Δ          1      0.1     0.01    0.001   ...   2e-6
    I(n)       1      0.55    0.505   0.5005  ...   0.500001
    I(n) - I   0.5    0.05    0.005   0.0005  ...   1e-6

As for any approximation, we need to know the error associated with our numerical integral.
Looking at Fig. 12.9, the error, denoted by $E(3)$, is obviously:

E(3) = E_1 + E_2 + E_3 = \frac{1}{2}\Delta\left[(y_1 - y_0) + (y_2 - y_1) + (y_3 - y_2)\right] = \frac{1}{2}\Delta(y_3 - y_0) = \frac{1}{2}\Delta    (12.4.2)

where $y_3 = f(1) = 1$ and $y_0 = f(0) = 0$, and this error generalizes to $E(n) = 0.5\Delta$.
The data (last row in Table 12.3) confirms this. Now we can understand why $I(n)$
converges so slowly to 0.5: the error is proportional only to $\Delta$. We desperately need
better methods, those for which the error is proportional to $\Delta^2$ or higher powers of $\Delta$.

12.4.1 Trapezoidal and mid-point rule


The poor performance of a mere application of the definition of a definite integral with finitely many
terms is due to the fact that the thin rectangles do not faithfully follow the curve. A better


approximation, known as the mid-point rule, is obtained by also dividing the interval $[a,b]$ into
$n$ equal sub-intervals as before; however, the height of a slice is computed at the mid-point of the
sub-interval (Fig. 12.10a). The corresponding integral is thus given by

M(n) = \sum_{i=0}^{n-1} \Delta \cdot f\left(\frac{\Delta}{2}(2i + 1)\right)    (12.4.3)

We use the symbol $M(n)$ to remind us it is a mid-point rule. It can be seen from Fig. 12.10a that
this mid-point rule gives the exact value of $\int_0^1 x\,dx$. We can also get the same value algebraically.

[Figure 12.10 about here: (a) mid-point rule, (b) trapezoidal rule, with each trapezoid of area $A_i = \frac{\Delta}{2}(y_i + y_{i+1})$.]

Figure 12.10: Second order quadrature rules: mid-point and trapezoidal rule. In the mid-point rule, each
slice is still a rectangle whose height is evaluated at the mid-point, whereas in the trapezoidal rule
each slice is a trapezoid.

Let's see the performance of the mid-point rule for a harder function, $y = x^2$. That is, we're
computing $\int_0^1 x^2\,dx = 1/3$. The results, given in Table 12.4, indicate that it is a 2nd order
method (look at the last column).
Table 12.4: Performance of the right-endpoint rectangle rule $S_1(n)$ of Eq. (12.4.1) versus the mid-point rule $M(n)$ for $\int_0^1 x^2\,dx$ (exact value is 1/3).

    n      Δ       S1(n)        S1(n) - 1/3     M(n)         M(n) - 1/3
    1      1.0     1.00000000   6.666667e-01    0.25000000   -8.333333e-02
    10     0.1     0.38500000   5.166667e-02    0.33250000   -8.333333e-04
    100    0.01    0.33835000   5.016670e-03    0.33332500   -8.333333e-06
    1000   0.001   0.33383350   5.001700e-04    0.33333325   -8.333333e-08

In a similar manner, one can develop a trapezoidal rule where each slice is now a trapezoid,
because we know how to compute the area of a trapezoid. The integral is then given by (the T in
T(n) is for trapezoidal):

T(n) = \frac{\Delta}{2}\left[(y_0 + y_1) + (y_1 + y_2) + \cdots + (y_{n-1} + y_n)\right]
     = \frac{\Delta}{2}\left[y_0 + 2y_1 + 2y_2 + \cdots + y_n\right]    (12.4.4)
In Table 12.5 we compare the mid-point rule and the trapezoidal rule for $\int_0^1 x^2\,dx$. Both are 2nd
order methods, but still not efficient, as we need 100 intervals just for an accuracy of 6 decimals.
We need better methods. To get better methods, we need to change our point of view. All the
methods discussed so far focus on the way the area of each thin slice is computed; the integrand
$y = f(x)$ was not touched!
Table 12.5: Performance of the mid-point rule versus the trapezoidal rule for $\int_0^1 x^2\,dx$.

    n      Δ       T(n)         T(n) - 1/3      M(n)         M(n) - 1/3
    1      1.0     0.50000000   1.6666667e-01   0.25000000   -8.333333e-02
    10     0.1     0.33450000   1.166670e-03    0.33250000   -8.333333e-04
    100    0.01    0.33334950   1.617000e-05    0.33332500   -8.333333e-06
    1000   0.001   0.33333350   1.700000e-07    0.33333325   -8.333333e-08
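
A minimal Julia sketch of both rules (my own; the book's listings may differ):

    function midpoint(f, a, b, n)
        Δ = (b - a) / n
        Δ * sum(f(a + Δ/2 * (2i + 1)) for i in 0:n-1)   # Eq. (12.4.3)
    end

    function trapezoid(f, a, b, n)
        Δ = (b - a) / n
        ys = [f(a + i*Δ) for i in 0:n]
        Δ/2 * (ys[1] + 2sum(ys[2:end-1]) + ys[end])     # Eq. (12.4.4)
    end

    midpoint(x -> x^2, 0, 1, 100) - 1/3    # ≈ -8.3e-6, matching Table 12.5
    trapezoid(x -> x^2, 0, 1, 100) - 1/3   # ≈ +1.7e-5, also second order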

12.4.2 Simpson’s rule


Simpson's rule is based on the approximation of the function $f(x)$ by a quadratic function
$y = a_2 x^2 + a_1 x + a_0$. The idea is similar to the other rules we have discussed: divide the
interval into a number of sub-intervals, but instead of approximating the curve in each sub-interval
by a line as in the trapezoidal rule, Simpson approximated that segment by a parabola. And we
know how to integrate a parabola. Thomas Simpson (1710–1761) was a British mathematician
and inventor known for the eponymous Simpson's rule to approximate definite integrals. The
attribution, as often in mathematics, can be debated: this rule had been found 100 years earlier
by Johannes Kepler, and in German it is called the Keplersche Fassregel.
Assume that we consider the bi-unit interval $[-1, 1]$ and three points on it:
$(-1, f(-1))$, $(0, f(0))$ and $(1, f(1))$, and we find a parabola of the form $g(x) = a_2 x^2 + a_1 x + a_0$
passing through these three points. It is straightforward to find the coefficients $a_i$:

a_2 = \frac{1}{2}\left[f(-1) + f(1)\right] - f(0), \qquad a_1 = \frac{1}{2}\left[f(1) - f(-1)\right], \qquad a_0 = f(0)    (12.4.5)
Ž
Using the method of undetermined coefficients to get three equations for three unknowns a0 ; a1 ; a2 .


Now, we can approximate the integral on $[-1, 1]$ by replacing the function by $g(x)$:

\int_{-1}^{1} f(x)\,dx \approx \int_{-1}^{1} g(x)\,dx = \int_{-1}^{1} (a_2 x^2 + a_1 x + a_0)\,dx
  = \frac{2a_2}{3} + 2a_0 = \frac{1}{3}\left[f(-1) + 4f(0) + f(1)\right] \quad \text{(using Eq. (12.4.5))}    (12.4.6)

Going from $[-1, 1]$ to $[c, d]$ is easy. By using a change of variable, we thus obtain the well-
known Simpson's rule:

\int_c^d f(x)\,dx \approx \frac{d - c}{6}\left[f(c) + 4f\left(\frac{c + d}{2}\right) + f(d)\right]    (12.4.7)

More often we need to break the interval $[a, b]$ into $n$ equal sub-intervals of length $\Delta = (b - a)/n$
and apply Simpson's rule on each sub-interval:

\int_a^b f(x)\,dx \approx \sum_{i=1}^{n} \int_{a + (i-1)\Delta}^{a + i\Delta} f(x)\,dx
  = \sum_{i=1}^{n} \frac{\Delta}{6}\left[f(a + (i-1)\Delta) + 4f\left(a + (i - \tfrac{1}{2})\Delta\right) + f(a + i\Delta)\right]    (12.4.8)

We test the performance of Simpson's rule for $x^2$, $x^3$ and $x^4$. The Julia code is given in
Listing A.10, which is based on Eq. (12.4.8). The error for $y = x^2$ is zero, which is expected.
The error is also zero for $y = x^3$, which is a surprise. And the error for $y = x^4$ is proportional
to $\Delta^4$; Simpson's rule is a 4th order method, which explains its popularity in calculators and
codes.

Table 12.6: Performance of Simpson's rule for the integrals of $x^2, x^3, x^4$ from 0 to 1.

    n                    1          10         100
    Δ                    1.00e+00   1.00e-01   1.00e-02
    error for y = x^2    0.00e+00   0.00e+00   0.00e+00
    error for y = x^3    0.00e+00   0.00e+00   0.00e+00
    error for y = x^4    8.33e-03   8.33e-07   8.33e-11

Ž
If you’re not clear of this change of variable, check Section 12.4.3.


Another derivation. By now we can see that all quadrature rules have this common form

\int_a^b f(x)\,dx \approx \sum_i w_i f(x_i)    (12.4.9)

that is, the sum of $f(x)$ evaluated at some points $x_i$, each multiplied by a weight $w_i$. In other words,
the integral is a weighted sum of function values at specially selected locations. So, we can select
the $x_i$'s a priori (the quadrature points) and determine the corresponding weights $w_i$. The first choice
is to use equally spaced quadrature points. For example, $\int_{-1}^{1} f(x)\,dx$ can be computed with 3
equally spaced points at $-1, 0, 1$:

\int_{-1}^{1} f(x)\,dx = w_1 f(-1) + w_2 f(0) + w_3 f(1)    (12.4.10)

The problem is now how to determine the weights $w_i$. We use Simpson's idea of parabolic
approximation to replace $f(x)$ by $ax^2 + bx + c$. With this $f(x)$, Eq. (12.4.10) becomes:

\frac{2}{3}a + 2c = w_1(a - b + c) + w_2 c + w_3(a + b + c)
               = a(w_1 + w_3) + b(w_3 - w_1) + c(w_1 + w_2 + w_3)

So we have two expressions that are supposed to be identical for all values of $a, b, c$. This can happen
only when:

w_1 + w_3 = 2/3, \quad w_3 - w_1 = 0, \quad w_1 + w_2 + w_3 = 2 \;\Longrightarrow\; w_1 = w_3 = \frac{1}{3}, \; w_2 = \frac{4}{3}
which is the same result we have obtained in Eq. (12.4.6).

Newton-Cotes rule. It can be seen that the mid-point rule can be derived similarly to the
Simpson rule by approximating the function f .x/ with a constant function within each slice.
And the trapezoidal rule is where a linear approximation to the function was used. Actually these
rules are special cases of the so-called Newton-Cotes rules. Note that, in Newton-Cotes rules,
the quadrature points are evenly spaced along the interval and thus known. We just need to find
the quadrature weights wi .

12.4.3 Gauss’s rule


Gauss also considered the integral $\int_{-1}^{1} f(x)\,dx$. But he wanted to beat Newton-Cotes by using
fewer quadrature points. He also used $\int_{-1}^{1} f(x)\,dx \approx \sum_i w_i f(x_i)$, but the quadrature points $x_i$
are not selected a priori; they are also unknowns to be determined together with the weights $w_i$.

Two-point Gauss rule. In the two-point Gauss rule, two quadrature points are used, thus we write

\int_{-1}^{1} f(x)\,dx = w_1 f(x_1) + w_2 f(x_2)    (12.4.11)


To determine the 4 unknowns (i.e., $x_1, x_2, w_1, w_2$), we need 4 equations. Gauss's idea is to
exactly integrate the four functions $1, x, x^2, x^3$. Using Eq. (12.4.11) for these 4 functions, we have

f(x) = 1:    2 = w_1 + w_2
f(x) = x:    0 = w_1 x_1 + w_2 x_2
f(x) = x^2:  2/3 = w_1 x_1^2 + w_2 x_2^2
f(x) = x^3:  0 = w_1 x_1^3 + w_2 x_2^3

Four equations and four unknowns should be ok, but the equations are nonlinear. How to solve
them? Luckily for us, the equations are symmetric: swapping $(x_1, w_1)$ with $(x_2, w_2)$ does not change the
equations! So we know $w_1 = w_2$, and thus from the first equation they are both equal to one.
Symmetry demands that $x_1 = -x_2$. Then, it is straightforward to get $x_1 = -1/\sqrt{3}$ and $x_2 = 1/\sqrt{3}$.
The two-point Gauss rule is thus given by

\int_{-1}^{1} f(x)\,dx \approx 1 \cdot f\left(-\frac{1}{\sqrt{3}}\right) + 1 \cdot f\left(\frac{1}{\sqrt{3}}\right)

So, with two quadrature points (now referred to as Gauss points), Gauss quadrature can integrate
cubic polynomials exactly, by its very definition.

Three-point Gauss rule. In the same manner, we can develop the three-point Gauss rule:

\int_{-1}^{1} f(x)\,dx = w_1 f(x_1) + w_2 f(x_2) + w_3 f(x_3)    (12.4.12)

To determine the 6 unknowns, we need 6 equations. So, the idea is to exactly integrate the six
functions $1, x, x^2, x^3, x^4, x^5$. Using Eq. (12.4.12) for these 6 functions, we have

f(x) = 1:    2 = w_1 + w_2 + w_3
f(x) = x:    0 = w_1 x_1 + w_2 x_2 + w_3 x_3
f(x) = x^2:  2/3 = w_1 x_1^2 + w_2 x_2^2 + w_3 x_3^2
f(x) = x^3:  0 = w_1 x_1^3 + w_2 x_2^3 + w_3 x_3^3
f(x) = x^4:  2/5 = w_1 x_1^4 + w_2 x_2^4 + w_3 x_3^4
f(x) = x^5:  0 = w_1 x_1^5 + w_2 x_2^5 + w_3 x_3^5

Again, symmetry will help us to solve these scary-looking equations:

x_1 = -x,  w_1 = w
x_2 = 0,   w_2 = w_2
x_3 = x,   w_3 = w


where $x > 0$. Now, the system is simplified to

2w + w_2 = 2
2wx^2 = 2/3    \;\Longrightarrow\;    x = \sqrt{3/5}, \quad w = \frac{5}{9}, \quad w_2 = \frac{8}{9}
2wx^4 = 2/5

The three-point Gauss rule is thus given by

\int_{-1}^{1} f(x)\,dx \approx \frac{5}{9} f\left(-\sqrt{3/5}\right) + \frac{8}{9} f(0) + \frac{5}{9} f\left(\sqrt{3/5}\right)    (12.4.13)

So, with three Gauss points, Gauss quadrature can integrate quintic polynomials exactly. We can
generalize this: using $n$ Gauss points, Gauss's rule can integrate exactly polynomials of degree
less than or equal to $2n - 1$.
How are we going to develop 4-point Gaussian quadrature and higher order versions? The
approach we just used would become tedious. But wait: the quadrature points $x_i$ are special. Can
you say what they are? Yes, they are the roots of the Legendre polynomials, see Table 11.4. That's
why Gaussian quadrature is also referred to as Gauss-Legendre (GL) quadrature. While this
is a pleasant surprise, we need to be able to explain why Legendre polynomials appear here.
Then a nice formula will appear, and the derivation of a GL quadrature with any number of points
will be a breeze. Table 12.7 presents values for some GL rules.

Table 12.7: Gauss-Legendre quadrature formulas on $[-1, 1]$.

    n    ξ_i               w_i
    1    0.                2.0000000000
    2    ±0.5773502692     1.0000000000
    3    ±0.7745966692     0.5555555556
         0.                0.8888888889
    4    ±0.8611363116     0.3478548451
         ±0.3399810436     0.6521451549

Arbitrary interval. We need $\int_a^b f(x)\,dx$, not $\int_{-1}^{1} f(\xi)\,d\xi$. A simple change of variable is needed:
$x = 0.5(1 - \xi)a + 0.5(1 + \xi)b$. So the $n$-point GL quadrature is given by


\int_a^b f(x)\,dx = \frac{b - a}{2}\int_{-1}^{1} f(x(\xi))\,d\xi \approx \frac{b - a}{2}\sum_i w_i f\left(\frac{a + b}{2} + \frac{b - a}{2}\xi_i\right)    (12.4.14)

which can accurately integrate any polynomial of degree less than or equal to $2n - 1$.
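
As a quick check, a minimal Julia sketch of Eq. (12.4.14) using the three-point rule of Table 12.7 (my own code, not one of the book's listings):

    const ξ3 = (-sqrt(3/5), 0.0, sqrt(3/5))
    const w3 = (5/9, 8/9, 5/9)

    function gauss3(f, a, b)
        s = 0.0
        for (ξ, w) in zip(ξ3, w3)
            s += w * f((a + b)/2 + (b - a)/2 * ξ)
        end
        return (b - a)/2 * s
    end

    gauss3(x -> x^5, 0, 1)   # 0.16666..., exact since the degree is ≤ 2n - 1 = 5
    gauss3(sin, 0, π)        # ≈ 2.0014, already close to the exact value 2 with one sub-interval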

Derivation of the Gauss-Legendre quadrature rule. Herein we present the derivation of the
Gauss-Legendre quadrature rule using orthogonal Legendre polynomials. Assume that we want
to compute the following integral, where $p_5(x)$ is any 5th degree polynomial:

I = \int_{-1}^{1} p_5(x)\,dx    (12.4.15)

We do not compute this integral directly, but we massage $p_5(x)$ a bit: we divide it by the Legendre
polynomial $L_3(x)$:

p_5(x) = Q_2(x) L_3(x) + R_2(x)    (12.4.16)

where $Q_2(x)$ and $R_2(x)$ are polynomials of degree 2 at most. Now, the integral becomes

I = \int_{-1}^{1} \left[Q_2(x) L_3(x) + R_2(x)\right] dx = \int_{-1}^{1} R_2(x)\,dx    (12.4.17)

We have converted the integral of a 5th degree polynomial into the integral of a 2nd degree polynomial!
(This is so because $Q_2(x)$ and $L_3(x)$ are orthogonal, i.e., $\int_{-1}^{1} Q_2(x) L_3(x)\,dx = 0$; check Sec-
tion 11.11.5 if this is not clear.) Now, we tackle the problem of how to compute the integral of
$R_2(x)$ without knowing its expression. But we do know $p_5(x)$. So, if we use the roots of $L_3(x)$,
denoted by $x_0, x_1, x_2$, we have, from Eq. (12.4.16),

p_5(x_i) = R_2(x_i), \quad \text{or} \quad R_2(x_i) = p_5(x_i)    (12.4.18)

Now the problem is easier. We build a Lagrange polynomial interpolating the points $(x_i, R_2(x_i))$,
or $(x_i, p_5(x_i))$. But this polynomial is exactly $R_2(x)$, so we have

R_2(x) = \sum_{i=0}^{2} l_i(x)\, p_5(x_i)    (12.4.19)

With all these results, the original integral can be computed as

I = \int_{-1}^{1} p_5(x)\,dx = \int_{-1}^{1} R_2(x)\,dx = \int_{-1}^{1} \sum_{i=0}^{2} l_i(x)\, p_5(x_i)\,dx = \sum_{i=0}^{2} p_5(x_i) \int_{-1}^{1} l_i(x)\,dx
  = \sum_{i=0}^{2} p_5(x_i)\, w_i, \qquad w_i := \int_{-1}^{1} l_i(x)\,dx, \quad x_i \text{ are the roots of } L_3(x) = 0

Now we understand why the GL points are the roots of Legendre polynomials. You should double-
check the values in Table 12.7 using this.


12.4.4 Two and three dimensional integrals


It is a straightforward step to go from one dimensional integrals to two dimensional integrals.
We keep one variable fixed and integrate in the other variable using any known 1D quadrature rule.
Then, we apply a 1D rule again for the remaining integral:

\int_{-1}^{1}\int_{-1}^{1} f(\xi,\eta)\,d\xi\,d\eta = \int_{-1}^{1}\left[\int_{-1}^{1} f(\xi,\eta)\,d\xi\right] d\eta \approx \int_{-1}^{1}\left[\sum_{i=1}^{n} w_i f(\xi_i,\eta)\right] d\eta
  = \sum_{i=1}^{n} w_i \int_{-1}^{1} f(\xi_i,\eta)\,d\eta \approx \sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j f(\xi_i,\eta_j)

Sometimes a short notation is used, and we write

\int_{-1}^{1}\int_{-1}^{1} f(\xi,\eta)\,d\xi\,d\eta \approx \sum_{k=1}^{n \times n} \bar{w}_k f(\boldsymbol{\xi}_k)
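
Continuing the earlier Julia sketch (reusing its ξ3 and w3), the tensor-product rule is essentially one line:

    function gauss3_2d(f)
        sum(w3[i] * w3[j] * f(ξ3[i], ξ3[j]) for i in 1:3, j in 1:3)
    end

    gauss3_2d((ξ, η) -> ξ^2 * η^4)   # 2/3 * 2/5 = 4/15 ≈ 0.2667, exact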

12.5 Solving nonlinear equations


Solving nonlinear equations of the form $f(x) = 0$ is a common problem in applied mathematics.
Sometimes it is referred to as root finding. Of course we’re interested in systematic methods
that can solve any equation using some algorithm. Herein I discuss just some of them so that we
know the general principles of these methods. Section 12.5.1 provides an analysis of the fixed
point iteration method.

12.5.1 Analysis of the fixed point iteration


In Section 2.11 we discussed the fixed point iteration method to solve nonlinear equations.
The algorithm is simple, but we do not know why it works. Recall that to solve the equation
$f(x) = 0$, we convert it to a fixed point problem, i.e., one of the form $x = g(x)$. Starting with
the initial guess $x_0$, the fixed point iterations are then given by

x_{n+1} = g(x_n), \qquad n = 0, 1, 2, \ldots

And we continue until we arrive at the solution $\alpha$ (i.e., $\alpha = g(\alpha)$). In practice, this is when the so-
called stopping condition $|x_n - x_{n-1}| < \epsilon$ is met, where $\epsilon$ is a small positive number controlling
the tolerance. Then the iterations stop.
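
A minimal Julia sketch of the iteration and its stopping condition (the function and keyword names are my own):

    function fixed_point(g, x0; ϵ = 1e-6, maxiter = 100)
        x = x0
        for n in 1:maxiter
            xnew = g(x)
            abs(xnew - x) < ϵ && return xnew
            x = xnew
        end
        return x   # no convergence within maxiter iterations
    end

    fixed_point(x -> 1 + 0.5sin(x), 0.0)   # ≈ 1.4987, the solution of E1 in Eq. (12.5.1) below
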
To motivate the discussion, let's use the fixed point method to solve the following equations:

E_1: \; x = 1 + 0.5\sin x
E_2: \; x = 3 + 2.0\sin x    (12.5.1)

One way is to cast $f(x) = 0$ as $x = x + f(x)$. But note that this form might not work. Herein I skip the
discussion on how to choose a $g(x)$ that guarantees the success of the method.


A simple plot shows that both equations have one solution. But if we use the fixed point
iterations, we see that while they work for the first equation (the iterates $x_n$ converge to the correct
solution), they do not work for the second (Fig. 12.11). The question is why.

[Figure 12.11 about here: (a) $g(x) = 1 + 0.5\sin x$, (b) $g(x) = 3 + 2.0\sin x$, (c) $g(x) = 2.8x(1-x)$.]

Figure 12.11: Fixed point iterations for the two functions given in Eq. (12.5.1) and for $g(x) = 2.8x(1-x)$.
Source code: fixed_point_iter.jl.

Now consider the equation $x = g(x)$ with one solution denoted by $\alpha$, and consider an
interval $[a,b]$ that contains $\alpha$. As always, to analyze the method we study the error $e_{n+1} =
\alpha - x_{n+1}$. The error is certainly a sequence (simply a list of numbers), and if this sequence
converges to zero as $n$ gets larger, then the method works. We compute the error now:

\alpha - x_{n+1} = g(\alpha) - g(x_n)

The mean value theorem (Section 4.11.2) gives us a way to understand $g(\alpha) - g(x_n)$:

g(\alpha) - g(x_n) = g'(\xi_n)(\alpha - x_n), \qquad \xi_n \in [\alpha, x_n]

Let's denote by $\gamma$ the maximum of $|g'(x)|$ for $x \in [a,b]$. Thus,

|\alpha - x_{n+1}| \le \gamma\,|\alpha - x_n|

This boxed inequality tells us that (when $\gamma < 1$) the distance to the fixed point $\alpha$ shrinks every iteration. And
that's exactly what we see in Fig. 12.11a and Table 12.8. The boxed inequality can also be used to show that

|\alpha - x_n| \le \gamma^n |\alpha - x_0|

If $\gamma < 1$, then the factor $\gamma^n$ vanishes, thus the error $|\alpha - x_n|$ vanishes too, and voilà,
the iterates $x_n$ converge to the solution.
Is it possible to find $\gamma$? From Table 12.8, we can see that the error ratio settles at about 0.0361. It turns out that
this number is $g'(\alpha) = g'(1.4987011332479)$. The reason is as follows. The MVT gives us

\alpha - x_{n+1} = g(\alpha) - g(x_n) = g'(\xi_n)(\alpha - x_n), \qquad \xi_n \in [\alpha, x_n]


Table 12.8: Fixed point iterations for $g(x) = 1 + 0.5\sin x$ with $x_0 = 0$ and $\epsilon = 10^{-6}$.

    n    x_n                |α - x_n|          |α - x_n|/|α - x_{n-1}|   (x_n - x_{n-1})/(x_{n-1} - x_{n-2})
    0    0.0000000000000    1.4987011335180    -                         -
    1    1.0000000000000    0.4987011335180    0.3327555590402           -
    2    1.4207354924039    0.0779656411141    0.1563374050587           0.4207354924039
    3    1.4943809925643    0.0043201409537    0.0554108308730           0.1750399039063
    6    1.4987009254070    0.0000002081110    0.0360178432695           0.0360577760154
    7    1.4987011260224    0.0000000074956    0.0360171325034           0.0360178698262
    8    1.4987011332479    0.0000000002701    0.0360359025969           0.0360164311995

Thus,

\frac{|\alpha - x_{n+1}|}{|\alpha - x_n|} = |g'(\xi_n)| \;\Longrightarrow\; \lim_{n\to\infty}\frac{|\alpha - x_{n+1}|}{|\alpha - x_n|} = \lim_{n\to\infty} |g'(\xi_n)|    (12.5.2)

As $n \to \infty$, $x_n \to \alpha$, hence $\xi_n \to \alpha$, and as $g'(x)$ is a continuous function, $\lim_{n\to\infty} |g'(\xi_n)| =
|g'(\alpha)|$. With $g(x) = 1 + 0.5\sin x$, $g'(\alpha) = 0.5\cos(1.4987011332479) = 0.0360$. The analysis
confirms the numerical experiment shown in Table 12.8.
If we look at Figs. 12.11a and 12.11c we see that, although the iterates converge to the
solution, the way they approach the solution is different. In Fig. 12.11c the iterates approach the
solution from both sides, thus making a cobweb. On the other hand, in Fig. 12.11a the iterates
approach the solution from only one side, making a staircase picture. Can we explain this? Of
course yes. Implied by Eq. (12.5.2) is that, close to $\alpha$, we have

\alpha - x_{n+1} \approx g'(\alpha)(\alpha - x_n), \quad \text{or} \quad \frac{\alpha - x_{n+1}}{\alpha - x_n} \approx g'(\alpha)

This equation says that the ratio $(\alpha - x_{n+1})/(\alpha - x_n)$ has the same sign as $g'(\alpha)$. When $g'(\alpha)$ is
negative, $\alpha - x_{n+1}$ and $\alpha - x_n$ are of different sign; the iterates oscillate about $\alpha$. Indeed, for
$g(x) = 2.8x(1-x)$, $g'(x) = 2.8 - 5.6x < 0$ at the solution.
Now, we can also explain why the method did not work for $g(x) = 3 + 2\sin x$. This is
because $|g'(\alpha)| > 1$. When this is the case, from $|\alpha - x_{n+1}| \approx |g'(\alpha)|\,|\alpha - x_n|$, we see that
the errors increase from one iteration to the next rather than decrease in size. This is what we see
in Fig. 12.11b. Precisely, $g'(\alpha) = -1.99$, negative: that's why we're seeing a cobweb.



12.6 Numerical solution of ordinary differential equations


This section presents some common numerical methods (such as Euler's method) to solve either
a single ordinary differential equation or a system of ordinary differential equations. It is with
the power of these methods that Katherine Johnson helped to put men on the moon.
We begin with the simplest method, the Euler method (Section 12.6.1), for first order ODEs.
Next, we discuss this method for second order ODEs (e.g. the equations of motion of harmonic
oscillators and of planets orbiting the Sun) in Section 12.6.2. Albeit simple, the Euler method
does not conserve energy; it is therefore bad for modeling the long term behavior of oscillatory
systems. Thus, we need a better method, and one of them is the Euler-Aspel-Cromer method
presented in Section 12.6.3. Having a good numerical method, we then apply it to the Kepler
problem, i.e., we solve the Sun-Earth problem (Section 12.6.4). For what? To rediscover for
ourselves that planets do indeed go around the Sun in elliptical orbits. And high school students
can achieve that, because the maths behind all of this is simple. In a logical development, we
study three-body and N-body problems in Section 12.6.5. Although Euler's method and related
variants are simple and good, they are only first order methods (i.e., the accuracy is low), so I present
a second order method in Section 12.6.6. That is the Verlet method, a very popular method used
to solve Newton's equations of motion, i.e., $F = ma$. Section 12.6.7 presents an analysis of the
Euler method to answer questions such as: what is the accuracy of the method?

12.6.1 Euler’s method: 1st ODE


To introduce Euler's method, let's consider the following 1st order ODE

\dot{x} = f(x, t), \qquad x(0) = x_0    (12.6.1)

We can think of the above equation as the velocity of an object. Suppose that at a given time $t$
the object has a certain position $x(t)$. What is the position at a slightly later time $t + \Delta t$? ($\Delta t$ is
referred to as the time step.) If we can answer this question we have solved Eq. (12.6.1), for then
we can start with the initial position $x_0$ and compute how it changes over the first instant $\Delta t$, the
next instant $2\Delta t$, and so on.
Now, if $\Delta t$ is small, we can compute the velocity as the averaged velocity

\dot{x} = \frac{x(t + \Delta t) - x(t)}{\Delta t}    (12.6.2)

Refer to Chapter 9 for an introduction to these ODEs.
Ž
Katherine Johnson (August 26, 1918 – February 24, 2020) was an American mathematician whose calculations
of orbital mechanics as a NASA employee were critical to the success of the first and subsequent U.S. crewed
spaceflights. During her 33-year career at NASA and its predecessor, she earned a reputation for mastering complex
manual calculations and helped pioneer the use of computers to perform the tasks. The space agency noted her
"historical role as one of the first African-American women to work as a NASA scientist".
ŽŽ
Or if you like you can say that we are using the forward difference formula for the first derivative of x.t /.
They are equivalent.


With that $\dot{x}$ substituted into Eq. (12.6.1), we can get $x(t + \Delta t)$:

\frac{x(t + \Delta t) - x(t)}{\Delta t} = f(x, t) \;\Longrightarrow\; x(t + \Delta t) = x(t) + \Delta t\, f(x, t)    (12.6.3)

The boxed equation, which is the Euler method, enables the solution $x(t)$ to advance, or march,
in time starting from $x(0)$. If you use Euler's method with a small $\Delta t$ you will find that it works
nicely (just try it with some 1st order ODE). We rush now to second order ODEs, which are more fun.
But how small is small for $\Delta t$? Does the numerical solution converge to the exact solution when
$\Delta t$ goes to zero? What is the accuracy of the method? Those are questions that mathematicians
seek answers for. For now, let's have fun first, and in Section 12.6.7 we shall try to answer those
questions. That's how scientists and engineers approach a problem.
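
To make the marching explicit, here is a minimal Julia sketch (my own example, using $\dot{x} = -x$):

    function euler(f, x0, Δt, nsteps)
        xs = [x0]
        for n in 0:nsteps-1
            push!(xs, xs[end] + Δt * f(xs[end], n*Δt))   # Eq. (12.6.3)
        end
        return xs
    end

    # ẋ = -x with x(0) = 1, whose exact solution is exp(-t).
    xs = euler((x, t) -> -x, 1.0, 0.01, 100)
    xs[end]   # ≈ 0.366, close to exp(-1) ≈ 0.368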

12.6.2 Euler’s method: 2nd order ODE


As a typical 2nd order ODE, let's consider the simple harmonic oscillator with mass $m$, spring
stiffness $k$ and damping $b$ (Section 9.8):

m\ddot{x} + b\dot{x} + kx = 0, \qquad x(0) = x_0, \quad \dot{x}(0) = v_0    (12.6.4)

Now, introducing the velocity $v = \dot{x}$, the above equation is re-written as

\dot{v} = -\frac{b}{m}v - \frac{k}{m}x =: F(v, x)    (12.6.5)

Using the Euler method, that is the boxed equation in Eq. (12.6.3), for the position equation
$\dot{x} = v$ and the velocity equation $\dot{v} = F$, we obtain

x(t + \Delta t) = x(t) + \Delta t\, v(t)
v(t + \Delta t) = v(t) + \Delta t\, F(v(t), x(t))    (12.6.6)

The Euler method is easy to program. Usually it works nicely, but for some problems it per-
forms badly, and simple harmonic oscillation is one of them (Fig. 12.12). The input data used:
$k = m = 1$, $x_0 = 1$, $v_0 = 0$ and $b = 0$ (i.e., no damping); the total time is three periods and the
time step is $\Delta t = 0.01$. The plot of $x(t)$ shows that the amplitude of the oscillation keeps increasing
(Fig. 12.12a). This means that the energy also increases, and thus energy conservation is violated.
Thus, the phase portrait is no longer a nice circle (Fig. 12.12b). The orange curve is the exact phase
portrait.
To understand what went wrong, we need a better notation. Instead of writing $x(n\Delta t)$, we
write $x_n$. Thus the subscript $n$ is used to indicate the time at which a certain term is evaluated; the
discrete time instants are $t_n = n\Delta t$ for $n = 0, 1, 2, \ldots$ With the new notation, Eq. (12.6.6) becomes

x_{n+1} = x_n + \Delta t\, v_n
v_{n+1} = v_n + \Delta t\, F_n    (12.6.7)
ŽŽ
Refer to Fig. 9.8 and the related discussion if phase portrait is not clear.


[Figure 12.12 about here: (a) $x(t)$ with growing amplitude, (b) the phase portrait spiraling outwards.]

Figure 12.12: Euler's method does not conserve energy: simple harmonic oscillation problem.

As the total energy is wrong, we analyze it. At two time steps $t_n$ and $t_{n+1}$, the total
energies are (without loss of generality I use $m = k = 1$)

E_n = \frac{1}{2}v_n^2 + \frac{1}{2}x_n^2
E_{n+1} = \frac{1}{2}v_{n+1}^2 + \frac{1}{2}x_{n+1}^2    (12.6.8)

Now, using Eq. (12.6.7), we compute $E_{n+1}$:

E_{n+1} = \frac{1}{2}(v_n + \Delta t F_n)^2 + \frac{1}{2}(x_n + \Delta t v_n)^2 = E_n + \Delta t F_n v_n + \frac{\Delta t^2}{2}F_n^2 + \Delta t\, x_n v_n + \frac{\Delta t^2}{2}v_n^2    (12.6.9)

Noting that $F_n = -x_n$, the change in total energy is

\Delta E_n := E_{n+1} - E_n = \Delta t^2\left(\frac{1}{2}x_n^2 + \frac{1}{2}v_n^2\right) > 0    (12.6.10)

And that's why the numerical total energy keeps increasing and finally blows up the computation.

12.6.3 Euler-Aspel-Cromer’ s method: better energy conservation


The method that we now refer to as the Euler-Cromer method was discovered quite by accident.
Around 1980, Abby Aspel (who at the time was a high school student) correctly coded up the
Euler method for the Kepler problem. Thinking the resulting inaccurate model was caused by
an error in her code, she interchanged two lines in her program, and the model seemed to work.
Abby accidentally stumbled upon the method, given for our problem of SHO as follows:

\[
\begin{aligned}
v_{n+1} &= v_n + \Delta\, F_n\\
x_{n+1} &= x_n + \Delta\, v_{n+1}
\end{aligned} \tag{12.6.11}
\]


The only change is in the velocity term of the position update: instead of using $v_n$, now $v_{n+1}$ is used. If you modify the code
(very slightly) and rerun the SHO problem, you will see that the results are very good. Cromer,
in his paper entitled Stable solutions using the Euler approximation (so Cromer did not call his
method Cromer's method, and he gave credit to Aspel, albeit in a footnote), presented a
mathematical analysis of why the method works.
The change in total energy is now given by
\[ \Delta E_n = \Delta^2\left(\frac12 v_n^2 - \frac12 x_n^2\right) - \Delta^3 v_n x_n \tag{12.6.12} \]
Unlike Eq. (12.6.10), this quantity is not always positive: it alternates in sign over an oscillation period, so the energy error stays bounded instead of growing.
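If you want to see both behaviours yourself, the following minimal Julia sketch (mine, with plotting left out) runs the plain Euler and the Euler-Aspel-Cromer updates for the undamped oscillator with $k = m = 1$, $x_0 = 1$, $v_0 = 0$ and prints the total energy after three periods:

# Simple harmonic oscillator with k = m = 1, so F(v,x) = -x
function sho(Δ, T; cromer=false)
    N = round(Int, T/Δ)
    x, v = 1.0, 0.0                      # x0 = 1, v0 = 0
    for n in 1:N
        if cromer
            v = v - Δ*x                  # Euler-Aspel-Cromer: update v first ...
            x = x + Δ*v                  # ... then use the NEW v for x
        else
            xold = x
            x = x + Δ*v                  # plain Euler: both updates use old values
            v = v - Δ*xold
        end
    end
    return 0.5*v^2 + 0.5*x^2             # total energy (should stay 0.5)
end

println("Euler energy after 3 periods:        ", sho(0.01, 6π))
println("Euler-Cromer energy after 3 periods: ", sho(0.01, 6π, cromer=true))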

12.6.4 Solving Kepler’s problem numerically


Now we have a powerful tool, the Euler-Aspel-Cromer method, to solve ordinary differential equations. Let's use it to analyze
the motion of a planet around the sun. We hope that we would
discover the elliptical orbit that Newton did, but we use a computer to do that. This is known as the 2-body problem. With that
name we certainly have 3-body problems and generally N-body
problems. Note that even though there exist closed form solutions to the 2-body problem, there is no such formula for N-body
problems when N > 2. Numerical methods such as Euler's method can solve N-body problems.
Let's use a Cartesian coordinate system with x being the horizontal and y the vertical axis.
The sun (of mass M) is at the origin of this system, and the position of the planet (of mass m) is $(x(t), y(t))$ at time t; a small sketch shows the gravitational force $F$ on the planet decomposed into its components $F_x$ and $F_y$. Of
course, we are going to use Newton's 2nd law and his theory of gravitation. When the planet is
at a distance $r = \sqrt{x^2 + y^2}$ from the sun, it is pulled by the sun with a gravitational force of
magnitude
\[ F = \frac{GMm}{r^2} \]
We decompose this force into two components $F_x$ and $F_y$ ($F_x = -F\sin\alpha$ with $\sin\alpha = x/r$; see the sketch):
\[ F_x = -F\,\frac{x}{r} = -\frac{GMm}{r^3}\,x, \qquad F_y = -F\,\frac{y}{r} = -\frac{GMm}{r^3}\,y \]
Newton's 2nd law now gives us
\[ m\frac{dv_x}{dt} = -\frac{GMm}{r^3}\,x, \qquad m\frac{dv_y}{dt} = -\frac{GMm}{r^3}\,y \tag{12.6.13} \]
We have two ODEs, not one. But that's no problem. Don't forget that $r = \sqrt{x^2 + y^2}$. Using the
Euler-Aspel-Cromer method, we have (as the mass of the Sun is much bigger than the planet's, the


Sun is assumed to be stationary):
\[
\begin{aligned}
r_n &= \sqrt{x_n^2 + y_n^2}\\
v_{x,n+1} &= v_{x,n} - \Delta\,\frac{GM}{r_n^3}\,x_n\\
v_{y,n+1} &= v_{y,n} - \Delta\,\frac{GM}{r_n^3}\,y_n\\
x_{n+1} &= x_n + \Delta\,v_{x,n+1}\\
y_{n+1} &= y_n + \Delta\,v_{y,n+1}
\end{aligned} \tag{12.6.14}
\]
with the initial conditions $(x_0, y_0)$ and $(v_{x0}, v_{y0})$, to be discussed shortly. Remark: the notation
got a bit ugly now: $v_{x,n+1}$ means the x component of the velocity at time step $n+1$.
Before we can run the code, there is the matter of choice of units. As the radius of Earth's
orbit around the sun is about $1.5\times 10^{11}$ m, a graph showing this orbit would have labels of
$1\times 10^{11}$ m, $2\times 10^{11}$ m etc., which is awkward. It is much more convenient to use astronomical
units, AU, which are defined as follows. One astronomical unit of length (i.e., 1 AU) is the
average distance between the Sun and the Earth, which is about $1.5\times 10^{11}$ m. For time, it is
convenient to measure it in years. What is then the unit of mass?
Recall that the Earth's orbit is, to a very good approximation, circular. Thus, there must be a
centripetal force equal to $M_E v^2/r$ ($r = 1$ AU), where v is the Earth's speed, which is equal to $2\pi r/(1\,\text{yr}) = 2\pi$ AU/yr. Thus, we have
\[ \frac{M_E v^2}{r} = \frac{GMM_E}{r^2} \;\Longrightarrow\; GM = v^2 r = 4\pi^2\ \text{AU}^3/\text{yr}^2 \]
Now, we discuss the initial positions and velocities for
Mercury (as we want to see an ellipse). Using astronomical
data we know that the eccentricity of the elliptical orbit for
Mercury is $e = 0.206$, and the radius (or semi-major axis)
$a = 0.39$ AU. For the simulation, we assume that the initial
position of Mercury is at the aphelion $(x_0, y_0) = (r_1, 0)$
with $r_1 = a(1+e)$ (check Section 4.13.2 if something is not
clear). The initial velocity is $(0, v_1)$. How to compute this $v_1$?
We need two equations: angular momentum conservation
and energy conservation, evaluated at two points; these two
equations involve two unknown velocities $v_1$ and $v_2$. The
angular momentum is $r_x p_y - r_y p_x$, evaluated at the two points
$(r_1, 0)$ and $(0, r_2)$:
\[ v_1 r_1 = v_2 b \;\Longrightarrow\; v_2 = \frac{v_1 r_1}{b}, \qquad b = a\sqrt{1 - e^2} \]

Figure 12.13: Mercury elliptical orbit.

With m being the mass of Mercury and M the mass of the Sun, conservation of total energy
provides us the second equation:
\[ -\frac{GMm}{r_1} + \frac12 m v_1^2 = -\frac{GMm}{r_2} + \frac12 m v_2^2 \]


Solving these two equations for $v_1$, noting that $r_2 = a$, we get
\[ v_1 = \sqrt{\frac{GM}{a}\,\frac{1-e}{1+e}} \]
Now, we can really let Mercury go! And with the Euler-Aspel-Cromer method and Newton's
laws, we are able to get the elliptical orbit of planetary motion (Fig. 12.13). We can determine
the period T (how?) etc. Applying the same method to other planets we can also discover
Kepler's third law: for each planet just compute $T/a^{3/2}$ and you will see that this quantity is
approximately one (recall that Kepler told us that this constant should be $k = 2\pi/\sqrt{GM} = 1$).
We can also discover Kepler's 2nd law.
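A minimal Julia sketch of the march in Eq. (12.6.14) might look as follows; the numbers for Mercury come from the text, while the function name, the time step and the number of steps are my own choices:

# Mercury around the Sun with the Euler-Aspel-Cromer method, in astronomical units
function mercury_orbit(Δ, nsteps)
    GM   = 4π^2                      # AU^3 / yr^2, from GM = v^2 r derived above
    a, e = 0.39, 0.206               # Mercury's semi-major axis and eccentricity
    x, y = a*(1 + e), 0.0            # start at the aphelion
    vx, vy = 0.0, sqrt(GM/a*(1 - e)/(1 + e))
    for n in 1:nsteps
        r   = sqrt(x^2 + y^2)
        vx -= Δ*GM/r^3*x             # Eq. (12.6.14): velocities first ...
        vy -= Δ*GM/r^3*y
        x  += Δ*vx                   # ... then positions with the NEW velocities
        y  += Δ*vy
    end
    return x, y
end

println(mercury_orbit(0.0001, 2400))   # 2400 steps of 0.0001 yr ≈ one Mercury period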

12.6.5 Three body problems and N body problems


It is logical that after solving the two body problem, we solve the three body problem. Newton
did that. In Proposition 66 of Book 1 of the Principia, and its 22 Corollaries, Newton took the
first steps in the definition and study of the problem of the movements of three massive bodies
subject to their mutually perturbing gravitational attractions. In Propositions 25 to 35 of Book
3, Newton also took the first steps in applying his results of Proposition 66 to the lunar theory,
the motion of the Moon under the gravitational influence of Earth and the Sun. Newton did not
succeed, simply because, as Poincaré pointed out many years later, there is no closed-form solution.
If such solutions existed, Newton would have been able to discover them.
Herein, we solve the N-body problem using a numerical
method. We use the three-body problem as an example to set
up the equations, but the code is written for N bodies. It is more
convenient to use vectors now, because one vectorial equation
replaces two scalar equations: we save time. Consider three
bodies of masses $m_1, m_2, m_3$; their positions are $\boldsymbol{r}_i(t)$ and their
velocities are $\boldsymbol{v}_i(t)$, $i = 1, 2, 3$. Now focusing on mass $m_1$, the
forces acting on it are ($\|\boldsymbol{r}\|$ means the Euclidean length of $\boldsymbol{r}$, $\|\boldsymbol{x}\| = \sqrt{x_1^2 + x_2^2}$)
\[ \boldsymbol{F}_1 = \boldsymbol{F}_{12} + \boldsymbol{F}_{13}, \qquad \boldsymbol{F}_{1j} = \frac{G m_1 m_j}{\|\boldsymbol{r}_{1j}\|^3}\,\boldsymbol{r}_{1j}, \qquad \boldsymbol{r}_{1j} = \boldsymbol{r}_j - \boldsymbol{r}_1 \]

Now the dynamical equations for $m_1$ are
\[ m_1\frac{d\boldsymbol{v}_1}{dt} = \boldsymbol{F}_1, \qquad \frac{d\boldsymbol{r}_1}{dt} = \boldsymbol{v}_1 \]
Using the Euler-Aspel-Cromer method, we update the velocity and position for mass $m_1$:
\[ \boldsymbol{v}_{1,n+1} = \boldsymbol{v}_{1,n} + \frac{\Delta}{m_1}\boldsymbol{F}_{1,n}, \qquad \boldsymbol{r}_{1,n+1} = \boldsymbol{r}_{1,n} + \Delta\,\boldsymbol{v}_{1,n+1} \]

Then we do the same thing for the other two masses. That’s it. It’s time to generalize to N bodies.


For body i, we do:
\[
\begin{aligned}
\boldsymbol{F}_{i,n} &= \sum_{j=1,\,j\neq i}^{N} \frac{G m_j}{\|\boldsymbol{r}_{ij}\|^3}\,\boldsymbol{r}_{ij}, \qquad \boldsymbol{r}_{ij} = \boldsymbol{r}_{j,n} - \boldsymbol{r}_{i,n}\\
\boldsymbol{v}_{i,n+1} &= \boldsymbol{v}_{i,n} + \Delta\,\boldsymbol{F}_{i,n}\\
\boldsymbol{r}_{i,n+1} &= \boldsymbol{r}_{i,n} + \Delta\,\boldsymbol{v}_{i,n+1}
\end{aligned} \tag{12.6.15}
\]
(Here $\boldsymbol{F}_{i,n}$ is the force per unit mass, i.e., the acceleration of body i.) Let's have fun with this. From the Wikipedia page on the three-body problem, I obtained the following initial conditions:
\[
\begin{aligned}
\boldsymbol{r}_1(0) &= -\boldsymbol{r}_3(0) = (-0.97000436,\ 0.24308753); \qquad \boldsymbol{r}_2(0) = (0, 0)\\
\boldsymbol{v}_1(0) &= \boldsymbol{v}_3(0) = (0.4662036850,\ 0.4323657300); \qquad \boldsymbol{v}_2(0) = (-0.93240737,\ -0.86473146)
\end{aligned}
\]
And with thatŽ we get the beautiful figure-eight in Fig. 12.14a with equal masses (I used $m_1 =
m_2 = m_3 = 1$ and $G = 1$). You can go to the mentioned Wikipedia page to see the animation.
Now with the initial position of mass $m_2$ slightly changed to $\boldsymbol{r}_2(0) = (0.1, 0)$ instead, we get Fig. 12.14b. How about
solution time? With a time step $\Delta = 0.01$ and a total time of about 6 (whatever unit it is), that is
600 iterations or steps, the code runtime is about 42 seconds including generation of animations
on a 16 GB RAM Mac mini with an Apple M1 chip.
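The complete program is presented in Appendix A.7; as a flavour of it, a minimal sketch of one Euler-Aspel-Cromer step of Eq. (12.6.15) could look like this (the function name and the use of 2-element vectors are my own choices, not those of the appendix code):

using LinearAlgebra   # for norm

# One Euler-Aspel-Cromer step for N bodies; r and v are vectors of 2-element vectors
function nbody_step!(r, v, m, Δ; G=1.0)
    N = length(m)
    for i in 1:N
        F = [0.0, 0.0]                       # force per unit mass (acceleration) on body i
        for j in 1:N
            j == i && continue
            rij = r[j] - r[i]
            F  += G*m[j]/norm(rij)^3 * rij   # Eq. (12.6.15)
        end
        v[i] += Δ*F                          # velocities first ...
    end
    for i in 1:N
        r[i] += Δ*v[i]                       # ... then positions with the new velocities
    end
end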

Figure 12.14: Three-body problems solved with the Euler-Aspel-Cromer method: (a) the figure-eight orbit with equal masses; (b) the orbit obtained after perturbing the initial position of $m_2$.

12.6.6 Verlet’s method


As Euler’s method is only a first-order method, its accuracy is low (see Section 12.6.7 to under-
stand what it means). Now I present a second-order method, the Verlet methodŽŽ . The Verlet
Ž
The program is presented in Appendix A.7.
ŽŽ
It is named after Loup Verlet (1931 – 2019), a French physicist who pioneered the computer simulation of
molecular dynamics models. In a famous 1967 paper he used what is now known as Verlet integration (a method
for the numerical integration of equations of motion) and the Verlet list (a data structure that keeps track of each
molecule’s immediate neighbors in order to speed computer calculations of molecule to molecule interactions).


method is a popular method to integrate Newton's equations of motion $\ddot{x} = f(x,t)$. We begin
with Taylor expansions:
\[
\begin{aligned}
x(t+\Delta) &= x(t) + \dot{x}(t)\Delta + \frac{\ddot{x}(t)}{2}\Delta^2 + \frac{\dddot{x}(t)}{3!}\Delta^3 + \cdots\\
x(t-\Delta) &= x(t) - \dot{x}(t)\Delta + \frac{\ddot{x}(t)}{2}\Delta^2 - \frac{\dddot{x}(t)}{3!}\Delta^3 + \cdots
\end{aligned} \tag{12.6.16}
\]
Adding and subtracting these two equations we obtain
\[
\begin{aligned}
x(t+\Delta) + x(t-\Delta) &= 2x(t) + \ddot{x}(t)\Delta^2\\
x(t+\Delta) - x(t-\Delta) &= 2\dot{x}(t)\Delta
\end{aligned} \tag{12.6.17}
\]
And from that, we obtain the Verlet method
\[
\begin{aligned}
x(t+\Delta) &= 2x(t) - x(t-\Delta) + \ddot{x}(t)\Delta^2\\
\dot{x}(t) &= \frac{x(t+\Delta) - x(t-\Delta)}{2\Delta}
\end{aligned} \tag{12.6.18}
\]
We can see that the position update requires positions at the previous two time steps (i.e., $x(t)$
and $x(t-\Delta)$). Thus the Verlet method is a two-step method, and furthermore it is not self starting: at
$t = 0$, we need $x(-\Delta)$. The velocities are not required in the position update, but often they are
necessary for the calculation of certain physical quantities like the kinetic energy. That is where
the second equation comes in. Because that equation divides a small difference by $2\Delta$, we will have problems
with round-off errors. What is more, we have to store the position at three steps, $x(t-\Delta)$, $x(t)$
and $x(t+\Delta)$.
A mathematically equivalent algorithm known as Velocity Verlet was developed to solve
these issues. The Velocity Verlet method is:
\[
\begin{aligned}
x(t+\Delta) &= x(t) + \dot{x}(t)\Delta + \frac12\ddot{x}(t)\Delta^2\\
\dot{x}(t+\Delta) &= \dot{x}(t) + \frac{\ddot{x}(t) + \ddot{x}(t+\Delta)}{2}\Delta
\end{aligned} \tag{12.6.19}
\]
The first equation is obtained by eliminating $x(t-\Delta)$ in Eq. (12.6.18): substitute that term,
obtained from the second equation, into the first. The derivation of the velocity update is as follows:
\[
\begin{aligned}
\dot{x}(t+\Delta) &= \frac{x(t+2\Delta) - x(t)}{2\Delta}\\
x(t+2\Delta) &= x(t+\Delta) + \dot{x}(t+\Delta)\Delta + \frac12\ddot{x}(t+\Delta)\Delta^2\\
x(t+\Delta) &= x(t) + \dot{x}(t)\Delta + \frac12\ddot{x}(t)\Delta^2
\end{aligned}
\]
Substituting the last two equations into the first and solving for $\dot{x}(t+\Delta)$ gives exactly the velocity update in Eq. (12.6.19).

The algorithm was first used in 1791 by Delambre and has been rediscovered many times since then. It was
also used by Cowell and Crommelin in 1909 to compute the orbit of Halley’s Comet, and by Carl Størmer in 1907
to study the trajectories of electrical particles in a magnetic field (hence it is also called Störmer’s method).

Note that as the velocity update requires the acceleration at t C , the Verlet method cannot be used for
problems in which the force depends on the velocity. For example, it cannot be used to solve damped harmonic
oscillation problems.
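As an illustration (my own, not one of the book's listings), a minimal Julia sketch of the Velocity Verlet update (12.6.19) applied to the undamped oscillator $\ddot{x} = -x$ is:

# Velocity Verlet for ẍ = f(x); here f(x) = -x (undamped oscillator, k = m = 1)
function velocity_verlet(f, x0, v0, Δ, nsteps)
    x, v = x0, v0
    a = f(x)
    for n in 1:nsteps
        x = x + Δ*v + 0.5*Δ^2*a      # position update, first line of Eq. (12.6.19)
        anew = f(x)                  # acceleration at t + Δ
        v = v + 0.5*Δ*(a + anew)     # velocity update, second line of Eq. (12.6.19)
        a = anew
    end
    return x, v
end

x, v = velocity_verlet(x -> -x, 1.0, 0.0, 0.01, 1000)
println("energy after 1000 steps: ", 0.5*v^2 + 0.5*x^2)   # stays very close to 0.5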


12.6.7 Analysis of Euler’s method


Let's forget time and motion; every first order ODE is of the following general form:
\[ y'(x) = f(x, y), \qquad a \le x \le b, \qquad y(a) = y_0 \tag{12.6.20} \]
where the independent variable x lies between a and b. And our goal is to study the accuracy of
Euler's method. Let's start with one example for which we know the exact solution (from that we can
calculate the error in Euler's method):
\[ y' = xy, \qquad y(0) = 0.1 \;\Longrightarrow\; y(x) = 0.1\, e^{x^2/2} \tag{12.6.21} \]

We solve this problem using Euler's method with different
step sizes $h = 0.4, 0.2, 0.1$, and we plot these numerical solutions together with the exact solution in one plot (see the figure). In this way, we can
understand the behavior of the method. From the results shown in
the figure, we observe that the numerical solutions get closer to
the exact one when the step size h gets smaller. The second
observation is that the error gets bigger as x increases. Our
problem is now to quantify the error and show that it gets
smaller and smaller when we reduce h.


Now, using Taylor’s theorem we can write y.x C h/ as (noting that y 0 D f )
y 00 ./ 2 y 00 ./ 2
y.xCh/ D y.x/Cy 0 .x/hC h D y.x/ C f .x; y/hC h ;  2 Œx; xCh (12.6.22)
2 2
Up to now, we have been working with the exact solution y.x/. Now comes Euler, with the
approximate solution. To differentiate the exact and approximate solution, the latter is denoted
by y.x/.
Q At x C h, Euler’s approximate solution is:

Q C h/ D y.x/
y.x Q C f .x; y/h (12.6.23)

Putting the exact solutions and Euler’s solution together, we get:


y 00 ./ 2
ynC1 D yn C f .xn ; yn /h C h
2 (12.6.24)
yQnC1 D yQn C f .xn ; yQn /h
With that we can calculate the error, which is the difference between the exact solution and
the numerical solution, that is EnC1 WD ynC1 yQnC1 . Subtracting the first from the second in
Eq. (12.6.24), we get EnC1 as
1
EnC1 D En C Œf .xn ; yn / f .xn ; yQn / h C y 00 ./h2 (12.6.25)
2
The error consists of two parts (assume that rounding error is zero): the first part is the local
truncation error–occurs when we neglected the red term–this error is O.h2 /–and the second part
is related to the blue term.


Now that we have an expression for the error, we need to find an upper bound for it, i.e.,
$|E_n| \le \epsilon$. Note that for the error we're interested in its magnitude only, thus we need $|E_{n+1}|$.
And the triangle inequality (Eq. (2.20.12)) enables us to write
\[ |E_{n+1}| \le |E_n| + |f(x_n, y_n) - f(x_n, \tilde{y}_n)|\,h + \frac12|y''(\xi)|h^2 \tag{12.6.26} \]
To proceed, we need to introduce some assumptions. The first one is that f is Lipschitz continuous in its second argument,
\[ |f(x, y_1) - f(x, y_2)| \le L|y_1 - y_2| \tag{12.6.27} \]
and the second is that the second derivative of the exact solution is bounded,
\[ \beta = \frac12\max_{x\in[a,b]}|y''(x)| \tag{12.6.28} \]
With these conditions, Eq. (12.6.26) simplifies to
\[ |E_{n+1}| \le \alpha|E_n| + \beta h^2, \qquad \alpha := 1 + hL \tag{12.6.29} \]
What do we do with this equation? Starting with $E_0$, which is assumed to be zero, we bound $E_1$,
then $E_2$ and so on:
\[
\begin{aligned}
n = 0:&\quad |E_1| \le \alpha|E_0| + \beta h^2 = \beta h^2 \quad (|E_0| = 0)\\
n = 1:&\quad |E_2| \le \alpha|E_1| + \beta h^2 \le (1+\alpha)\beta h^2\\
n = 2:&\quad |E_3| \le \alpha|E_2| + \beta h^2 \le (1+\alpha+\alpha^2)\beta h^2
\end{aligned}
\]
We can see a pattern here, and that pattern gives us a bound for $|E_n|$ (recall that $\alpha = 1 + hL$):
\[ |E_n| \le \frac{\alpha^n - 1}{\alpha - 1}\beta h^2 = \frac{(1+hL)^n - 1}{L}\beta h \tag{12.6.30} \]
This equation gives a bound for $|E_n|$ in terms of h, L, $\beta$ and n. Note that for a fixed h, this error
bound increases with increasing n. This is in agreement with the example of $y' = xy$ that we
considered at the beginning of the section.
With the inequality $(1+hL)^n \le e^{nhL}$ and $nh \le b-a$, we then have
\[ |E_n| \le \frac{e^{nhL} - 1}{L}\beta h \le \frac{e^{(b-a)L} - 1}{L}\beta h =: Kh \tag{12.6.31} \]
We have just showed that the error at time step n is proportional to h, with the proportionality
constant K depending on L, $\beta$ and the interval length $b-a$. With this result, we're now able to
talk about the error of Euler's method: it is defined as the maximum of $|E_n|$ over all the time
steps:
\[ E := \max_n|E_n| \le Kh \;\Longrightarrow\; E = O(h) \;\Longrightarrow\; \lim_{h\to 0}E = 0 \tag{12.6.32} \]


12.7 Numerical solution of partial differential equations


As discussed in Chapter 9 engineers and scientists and mathematicians resort to partial differen-
tial equations when they need to describe a complex phenomenon. The problem is that partial
differential equations — as essential and ubiquitous as they are in science and engineering —
are notoriously difficult to solve, if they can be solved at all.
Numerical methods for partial differential equations form the branch of numerical analysis that
studies the numerical solution of partial differential equations (PDEs). Common methods are the
finite difference method (FDM), the finite volume method (FVM), the finite element method (FEM),
spectral methods, meshfree methods etc. The field is simply huge and I do not have time to learn
all of them. The finite difference method is often regarded as the simplest method to learn and
use. This section is a brief introduction to the FDM.

12.7.1 Finite difference for the 1D heat equation: explicit schemes


We now solve the 1D heat equation $\theta_t = \alpha^2\theta_{xx}$ for $\theta(x,t)$, with $0\le x\le L$ and $0\le t\le T$,
using finite differences. The idea is simple: we follow Euler in approximating the
derivatives by some finite difference formula. Thus, we construct a grid (or lattice or mesh or
whatever you want to call it) of points in the 2D xt plane. A point on this grid is labeled by
$(i,n)$, which means that the spatial coordinate is $i\Delta x$ and the temporal coordinate is $n\Delta t$, where
$\Delta x$ is the grid spatial size and $\Delta t$ is the time step. Such a grid is given in Fig. 12.15 (left). Note
that this is a uniform grid in which $\Delta x$ and $\Delta t$ are constant. (But nothing prevents us from
using non-uniform grids.) With such a grid, the temperature is only available at the grid points; for
example at point $(i,n)$, the temperature is $\theta_i^n$: the subscript is for space and the superscript is for
time.
If, by any means, we can transform the PDE $\theta_t = \alpha^2\theta_{xx}$ into a system of algebraic equations
containing all $\theta_i^{n+1}$, given that the temperature at all grid points is known at the previous time
step n, then we're done. This is so because we can start with $\theta_i^0$, $i = 0, 1, 2, \ldots$, compute $\theta_i^1$,
then $\theta_i^2$, marching in time just as we do with Euler's method to solve ODEs.

Figure 12.15: A 2D (uniform) finite difference grid: the space $[0, L]$ is discretized by N points.

To start simple, we use the forward difference for the time partial derivative $\theta_t$ evaluated at

grid point $(i,n)$:
\[ \left(\frac{\partial\theta}{\partial t}\right)_i^n = \frac{\theta_i^{n+1} - \theta_i^n}{\Delta t} + O(\Delta t) \tag{12.7.1} \]
and a central difference for the spatial second order derivative $\theta_{xx}$ evaluated at grid point $(i,n)$:
\[ \left(\frac{\partial^2\theta}{\partial x^2}\right)_i^n = \frac{\theta_{i+1}^n - 2\theta_i^n + \theta_{i-1}^n}{(\Delta x)^2} + O((\Delta x)^2) \tag{12.7.2} \]
Substituting Eqs. (12.7.1) and (12.7.2) into the heat equation (after removing the higher order
terms of course), we get the following equation
\[ \frac{\theta_i^{n+1} - \theta_i^n}{\Delta t} = \alpha^2\,\frac{\theta_{i+1}^n - 2\theta_i^n + \theta_{i-1}^n}{(\Delta x)^2}, \qquad i = 1,\ldots,N-2 \tag{12.7.3} \]
This is called a finite difference equation for the heat equation. Note that there are $N-2$ such
equations for N unknowns $\theta_i^{n+1}$, $i = 0, 1, \ldots, N-1$, as the temperature is known at time step n.
But do not worry: we have two more equations coming from the boundary conditions. Another note:
Eq. (12.7.3) is just one specific type of finite difference equation for the heat equation. We can
develop other FD equations, e.g. if we use the backward difference for $\theta_t$ instead of the forward
difference in Eq. (12.7.1). But there is something nice about Eq. (12.7.3). In that equation, for
any i, there is only one unknown $\theta_i^{n+1}$. So, we can solve for it easily:
\[ \theta_i^{n+1} = \theta_i^n + \alpha^2\frac{\Delta t}{(\Delta x)^2}\left(\theta_{i+1}^n - 2\theta_i^n + \theta_{i-1}^n\right), \qquad i = 1,\ldots,N-2 \tag{12.7.4} \]
This equation is called a computational molecule or stencil and is plotted in Fig. 12.15 (right). And
this finite difference method is known as the Forward Time Centered Space or FTCS method.
What is more, it is an explicit method. It is so called because to determine $\theta_i^{n+1}$ we do not have
to solve any system of equations: Eq. (12.7.4) provides an explicit formula to quickly compute
$\theta_i^{n+1}$. There are explicit methods, just because there are implicit ones. And the next section
presents one implicit method.
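As a concrete illustration, here is a minimal Julia sketch of the FTCS march of Eq. (12.7.4); the diffusivity 0.12, the initial temperature 1 and the zero boundary values anticipate the test problem of Section 12.7.4, while the function name and step sizes are my own choices:

# FTCS for θ_t = κ θ_xx on [0,1] with θ = 0 at both ends (κ plays the role of α²)
function ftcs(κ, Δx, Δt, T)
    N = round(Int, 1/Δx) + 1             # number of grid points
    θ = ones(N); θ[1] = 0.0; θ[end] = 0.0
    s = κ*Δt/Δx^2                        # must be ≤ 1/2 for stability (Eq. (12.7.15))
    for step in 1:round(Int, T/Δt)
        θnew = copy(θ)
        for i in 2:N-1
            θnew[i] = θ[i] + s*(θ[i+1] - 2*θ[i] + θ[i-1])   # Eq. (12.7.4)
        end
        θ = θnew
    end
    return θ
end

θ = ftcs(0.12, 0.1, 0.04, 0.4)    # s = 0.48 ≤ 1/2: stable
println(round.(θ, digits=3))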

12.7.2 Finite difference for the 1D heat equation: implicit schemes


This section presents an implicit FDM using the backward Euler method for $\theta_t$. Let's use a simple ODE
to demonstrate the difference between explicit and implicit methods: solving $\dot{x} = \sin x$ with
$x(0) = x_0$. Using the backward difference formula (Section 12.2.1), we can write
\[ \dot{x} = \frac{x_n - x_{n-1}}{\Delta t} \;\Longrightarrow\; \boxed{\frac{x_n - x_{n-1}}{\Delta t} = \sin x_n} \]
Obviously, to solve for $x_n$ with $x_{n-1}$ known, we have to solve the boxed equation, which is a nonlinear equation. This is an implicit method, which involves the solution of a nonlinear equation.
On the contrary, an explicit method does not need to solve any equation; see Eq. (12.7.4) for


example. So, you might be thinking we should not then use implicit methods. But that’s not the
whole story, otherwise the backward Euler’s method would not have been developed.
Getting back to the heat equation, now we write $\theta_t$ as
\[ \left(\frac{\partial\theta}{\partial t}\right)_i^n = \frac{\theta_i^n - \theta_i^{n-1}}{\Delta t} + O(\Delta t) \tag{12.7.5} \]
Substituting Eqs. (12.7.2) and (12.7.5) into the heat equation, we get the following equation
\[ \frac{\theta_i^n - \theta_i^{n-1}}{\Delta t} = \alpha^2\,\frac{\theta_{i+1}^n - 2\theta_i^n + \theta_{i-1}^n}{(\Delta x)^2}, \qquad i = 1,\ldots,N-2 \tag{12.7.6} \]
And we have obtained the Backward Time Centered Space (BTCS) difference method for the
heat equation. In the above equation only $\theta_i^{n-1}$ is known, and thus we cannot solve it
equation by equation. Instead we have to assemble all the equations into $\mathbf{A}\boldsymbol{x} = \boldsymbol{b}$ and solve
this system of linear equations once for all $\theta_i^n$. To get the matrix $\mathbf{A}$, we just need to rewrite
Eq. (12.7.6) so that the knowns are on the RHS of the equation and the unknowns on the LHSŽŽ:
\[ -s\theta_{i-1}^n + (1+2s)\theta_i^n - s\theta_{i+1}^n = \theta_i^{n-1}, \qquad i = 1,\ldots,N-2, \qquad s := \frac{\alpha^2\Delta t}{(\Delta x)^2} \tag{12.7.7} \]
Noting that each equation involves only three unknowns, at points $i-1$, $i$ and $i+1$, when
we assemble all the equations from all the nodes we get a tridiagonal matrix $\mathbf{A}$. For example,
if we have six points (i.e., N = 6), we will have (the first and last rows come from the boundary
conditions $\theta_0^n = \bar\theta_0$ and $\theta_5^n = \bar\theta_5$, the prescribed boundary temperatures):
\[
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0\\
-s & 1+2s & -s & 0 & 0 & 0\\
0 & -s & 1+2s & -s & 0 & 0\\
0 & 0 & -s & 1+2s & -s & 0\\
0 & 0 & 0 & -s & 1+2s & -s\\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
\theta_0^n\\ \theta_1^n\\ \theta_2^n\\ \theta_3^n\\ \theta_4^n\\ \theta_5^n
\end{bmatrix}
=
\begin{bmatrix}
\bar\theta_0\\ \theta_1^{n-1}\\ \theta_2^{n-1}\\ \theta_3^{n-1}\\ \theta_4^{n-1}\\ \bar\theta_5
\end{bmatrix} \tag{12.7.8}
\]
To see more clearly the pattern of the matrix, we need to have a bigger matrix. For example, with
100 points, we have the matrix shown in Fig. 12.16; the one on the left shows the entire matrix
and the right figure shows only the first ten rows/cols. Eq. (12.7.8) is obviously of the form
$\mathbf{A}\boldsymbol{x} = \boldsymbol{b}$, and without knowing it beforehand we are back to linear algebra! We need techniques
from that field to have a fast method to solve this system. But we do not delve into that topic
here. We just use a linear algebra library to do that, so that we can focus on the PDE (and the
physics we're interested in).
It is obvious that the BTCS finite difference method is an implicit method, as we have to solve
a system of (linear) equations to determine the temperature at all the nodes at a given time. What
are then the pros/cons of implicit methods compared with explicit methods? The next section
gives an answer to that question.
ŽŽ
This finite difference equation appeared for the first time in 1924 in a paper of Erhard Schmidt.


Figure 12.16: A tridiagonal matrix resulting from the FDM for the heat equation: obtained using the function imshow in matplotlib. A tridiagonal matrix is a band matrix that has nonzero elements only on the
main diagonal, the subdiagonal/lower diagonal (the first diagonal below it), and the superdiagonal/upper
diagonal (the first diagonal above the main diagonal). Check the source file heat_btcs.jl for details.
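The text above points to heat_btcs.jl; independently of that file, here is a minimal dense-matrix sketch (my own names, and for clarity the matrix is rebuilt at every call, which a real code would avoid) of assembling and solving the system of Eq. (12.7.8) for one time step:

# One BTCS step: assemble the tridiagonal system of Eq. (12.7.8) and solve it
function btcs_step(θold, s, θleft, θright)
    N = length(θold)
    A = zeros(N, N)
    b = copy(θold)
    A[1,1] = 1.0;  b[1] = θleft             # boundary rows
    A[N,N] = 1.0;  b[N] = θright
    for i in 2:N-1
        A[i,i-1] = -s
        A[i,i]   = 1 + 2*s
        A[i,i+1] = -s
    end
    return A \ b                            # solve for the new temperatures
end

Δx, Δt, κ = 0.1, 0.04, 0.12
s = κ*Δt/Δx^2
θ = ones(11); θ[1] = 0.0; θ[end] = 0.0
θ = btcs_step(θ, s, 0.0, 0.0)
println(round.(θ, digits=3))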

12.7.3 Implicit versus explicit methods: stability analysis


Briefly, explicit methods are easy to code but they are unstable if not used properly. On the
other hand, implicit methods are harder to code but are stable. To see what stability means, let's
consider a simple ODE:
\[ \dot{x} = -kx, \qquad x(0) = x_0, \qquad k > 0 \;\Longrightarrow\; x(t) = x_0\exp(-kt) \]
The exact solution is a decaying exponential function: as $t\to\infty$ the solution approaches zero.
Now, we use the forward Euler method to get
\[ \frac{x_{n+1} - x_n}{\Delta t} = -kx_n \;\Longrightarrow\; x_{n+1} = (1 - k\Delta t)x_n \;\Longrightarrow\; x_n = (1 - k\Delta t)^n x_0 \]
which is an explicit method; nothing can be simpler. However, it is easy to see that the
solution depends heavily on the value of $\Delta t$. In Fig. 12.17 the exact solution is plotted with
two numerical solutions, one with $\Delta t$ such that $|1 - k\Delta t| \le 1$ (blue curve) and one with
$|1 - k\Delta t| > 1$ (red curve). The red curve is absolutely wrong; that solution grows to infinity
and blows up our computer! That is numerical instability. As the method gives a stable solution
only for $\Delta t \le 2/k$, the method is said to be conditionally stable.

von Neumann stability analysis is a procedure used to check the stability of finite difference
schemes as applied to linear partial differential equations. The analysis is based on the Fourier
decomposition of numerical error and was developed at Los Alamos National Laboratory after
having been briefly described in a 1947 article by the British researchers Crank and Nicolson. Later,
the method was given a more rigorous treatment in an article by John von Neumann.
Let's denote by A the exact solution to the heat equation (i.e., $\theta_t = \alpha\theta_{xx}$), and by D the exact
solution to the finite difference equation corresponding to the heat equation. For example, if we
consider the FTCS method, then D is the exact solution to the following equation
\[ \frac{\theta_i^{n+1} - \theta_i^n}{\Delta t} = \alpha\,\frac{\theta_{i+1}^n - 2\theta_i^n + \theta_{i-1}^n}{(\Delta x)^2}, \qquad i = 1,\ldots,N-2 \tag{12.7.9} \]


Figure 12.17: Demonstration of numerical stability in solving ODEs using finite difference methods: the exact solution, a stable numerical solution with $|1 - k\Delta t| \le 1$, and an unstable one that blows up.

This exact solution would be obtained if our computer had no round-off errors (which is not reality).
That's why we have another solution, N, which is the actual solution to Eq. (12.7.9) that we
obtain from our computer. Now, we can define some errors:
\[
\begin{aligned}
\text{discretization error} &= A - D\\
\text{round-off error } \epsilon &= N - D \;\Longrightarrow\; N = \epsilon + D
\end{aligned} \tag{12.7.10}
\]
The stability of numerical schemes is closely associated with numerical error. A finite difference
scheme is stable if the errors made at one time step of the calculation do not cause the errors to be
magnified as the computations are continued. Thus the plan is now to study how $\epsilon$ behaves. We
are going to show that the error is also a solution of Eq. (12.7.9). The proof is simply algebraic.
Indeed, as N is the solution to Eq. (12.7.9), we have
\[ \frac{\epsilon_i^{n+1} - \epsilon_i^n}{\Delta t} + \frac{D_i^{n+1} - D_i^n}{\Delta t} = \alpha\,\frac{\epsilon_{i+1}^n - 2\epsilon_i^n + \epsilon_{i-1}^n}{(\Delta x)^2} + \alpha\,\frac{D_{i+1}^n - 2D_i^n + D_{i-1}^n}{(\Delta x)^2} \]
where the D terms cancel each other (D itself solves Eq. (12.7.9)), leading to
\[ \frac{\epsilon_i^{n+1} - \epsilon_i^n}{\Delta t} = \alpha\,\frac{\epsilon_{i+1}^n - 2\epsilon_i^n + \epsilon_{i-1}^n}{(\Delta x)^2} \tag{12.7.11} \]
Now comes the surprise: the error $\epsilon(x,t)$ is decomposed into a Fourier series in complex
form (check Section 4.19.3 if this is not clear):
\[ \epsilon(x,t) = \sum_{n=-\infty}^{\infty} c_n e^{i\,n2\pi x/L} = \sum_{n=-\infty}^{\infty} c_n e^{ik_n x}, \qquad c_n = e^{at} \tag{12.7.12} \]
Instead of considering the whole series, we focus on just one term. That is $\epsilon(x,t) = e^{at}e^{ik_n x}$.
With that and Eq. (12.7.11), we can obtain the following
\[ \frac{e^{a\Delta t} - 1}{\alpha\Delta t} = \frac{e^{ik_n\Delta x} - 2 + e^{-ik_n\Delta x}}{(\Delta x)^2} \tag{12.7.13} \]


and this allows us to determine the ratio of the error at two consecutive time steps, $\epsilon_i^{n+1}/\epsilon_i^n$:
\[
\begin{aligned}
\frac{\epsilon_i^{n+1}}{\epsilon_i^n} = e^{a\Delta t} &= 1 + \frac{\alpha\Delta t}{(\Delta x)^2}\left(e^{ik_n\Delta x} - 2 + e^{-ik_n\Delta x}\right) \quad\text{(Eq. (12.7.13))}\\
&= 1 + \frac{\alpha\Delta t}{(\Delta x)^2}\left(2\cos k_n\Delta x - 2\right)\\
&= 1 - \frac{4\alpha\Delta t}{(\Delta x)^2}\sin^2\frac{k_n\Delta x}{2}
\end{aligned} \tag{12.7.14}
\]
The last two steps are purely algebraic. It is interesting that trigonometric identities play a role
in the context of numerical solutions of the heat equation, isn't it?
We do not want the error to grow, so we're interested in when the inequality $\left|\epsilon_i^{n+1}/\epsilon_i^n\right| \le 1$ holds. With Eq. (12.7.14), this condition becomes
\[ \left|1 - \frac{4\alpha\Delta t}{(\Delta x)^2}\sin^2\frac{k_n\Delta x}{2}\right| \le 1 \;\Longrightarrow\; \frac{2\alpha\Delta t}{(\Delta x)^2}\sin^2\frac{k_n\Delta x}{2} \le 1 \;\Longrightarrow\; \boxed{\frac{\alpha\Delta t}{(\Delta x)^2} \le \frac12} \tag{12.7.15} \]
The boxed equation gives the stability requirement for the FTCS scheme as applied to the one-dimensional heat equation. It says that for a given $\Delta x$, the allowed value of $\Delta t$ must be small
enough to satisfy the boxed equationŽŽ.

12.7.4 Analytical solutions versus numerical solutions


To test the implementation of various FD schemes for the heat equation and also to demonstrate
the differences between analytical solutions and numerical solutions, let's solve a specific heat
conduction problem:
\[
\begin{aligned}
\frac{\partial\theta}{\partial t} &= 0.12\,\frac{\partial^2\theta}{\partial x^2} & 0 < x < 1,\; t > 0\\
\theta(x,0) &= 1 & 0 \le x \le 1\\
\theta(0,t) &= 0, \quad \theta(1,t) = 0 & t > 0
\end{aligned}
\]
The exact solution in Eq. (9.9.14) for this specific problem is
\[ \theta(x,t) = \sum_{n=1,3,\ldots}^{\infty} B_n\, e^{-0.12\,(n\pi)^2 t}\sin(n\pi x), \qquad B_n = \frac{4}{n\pi} \tag{12.7.16} \]

whereas a (part of a) numerical solution is shown in Table 12.9. An analytical solution allows
us to compute the solution at any point (in the domain). On the other hand, we only have the
numerical solution at some points (at the nodes). The analytical solution can tell us how the
parameters (e.g. the diffusion coefficient here) affect the solution. The numerical solutions are obtained only for a
specific value of the parameters.
Now is the time for code verification. The results in Fig. 12.18 indicate that the implementation is correct, and they also confirm the von Neumann stability analysis.
ŽŽ
One example to see how small the time step must be: $\alpha = 1$, $\Delta x = 0.1$, then $\Delta t \le 0.005$.


Table 12.9: Numerical solutions are given in a tabular format: each row corresponds to a time step (first column: time; remaining columns: the grid points x = 0.0, 0.1, ..., 1.0).

t      0.0   0.1     0.2     0.3     0.4     0.5     0.6     0.7     0.8     0.9     1.0
0.0    1.0   1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0     1.0
4.0    0.0   0.816   0.973   0.996   0.999   0.999   0.999   0.996   0.973   0.816   0.0
7.5    0.0   0.188   0.358   0.492   0.578   0.607   0.578   0.492   0.358   0.188   0.0
Figure 12.18: Analytical versus numerical solution of the heat equation: (a) BTCS; (b) FTCS. Ten terms are used in
Eq. (12.7.16). For the FTCS scheme, a time step slightly larger than the upper limit in Eq. (12.7.15)
was used; thus the solution shows instability. For later time steps, the numerical solution blew up.

12.7.5 Finite difference for the 1D wave equation


We now move on to solving the 1D wave equation $u_{tt} = c^2 u_{xx}$ for $u(x,t)$, also using finite
differences. One simple explicit method is to adopt a central finite difference for both $u_{tt}$ and
$u_{xx}$. Using Eq. (12.2.5), we can approximate $u_{tt}$ and $u_{xx}$ as
\[
\begin{aligned}
\left(\frac{\partial^2 u}{\partial t^2}\right)_i^n &= \frac{u_i^{n+1} - 2u_i^n + u_i^{n-1}}{(\Delta t)^2} + O((\Delta t)^2)\\
\left(\frac{\partial^2 u}{\partial x^2}\right)_i^n &= \frac{u_{i+1}^n - 2u_i^n + u_{i-1}^n}{(\Delta x)^2} + O((\Delta x)^2)
\end{aligned} \tag{12.7.17}
\]
Substituting these into the wave equation we obtain, with $r := c\,\Delta t/\Delta x$,
\[ u_i^{n+1} = 2(1 - r^2)u_i^n - u_i^{n-1} + r^2\left(u_{i+1}^n + u_{i-1}^n\right) \tag{12.7.18} \]
which allows us to determine $u_i^{n+1}$ explicitly, i.e., without solving any system of equations. This
is referred to as the Centered in time and space (CTCS) FD scheme. Now that we know of the
von Neumann stability analysis, we can carry out such an analysis to see the condition on the


time step $\Delta t$:
\[ \frac{\epsilon_i^{n+1}}{\epsilon_i^n} = g, \qquad \boxed{g^2 - 2\beta g + 1 = 0}, \qquad \beta = 1 - 2r^2\sin^2\frac{k_n\Delta x}{2} \]
Solving the boxed equation we obtain g as
\[ g_{1,2} = \beta \pm \sqrt{\beta^2 - 1} \]
Note that g can be a complex number and we need $|g| \le 1$ so that our method is stable. And
this requires that $|\beta| \le 1$. In this case, we can write g as
\[ g_{1,2} = \beta \pm i\sqrt{1 - \beta^2} \;\Longrightarrow\; |g| = 1 \]
So, we conclude that the method is stable as long as $|\beta| \le 1$, or
\[ \left|1 - 2r^2\sin^2\frac{k_n\Delta x}{2}\right| \le 1 \;\Longrightarrow\; \boxed{r := \frac{c\,\Delta t}{\Delta x} \le 1} \tag{12.7.19} \]

Figure 12.19: Waves propagating on a string with fixed ends. The data are: $c = 300$ m/s, $L = 1$ m, $\Delta x =
0.01$ m, $\Delta t = \Delta x/c$. The initial string shape is given at the top, which is a Gaussian pluck $u(x,0) =
\exp(-k(x - x_0)^2)$ with $x_0 = 0.3$ m and $k = 1000$ 1/m$^2$. The wave is split into two wavepackets
(pulses) which travel in opposite directions (second and third figs). This is consistent with the d'Alembert
solution in Eq. (9.10.6). The left pulse reaches the left end and is reflected; this reflection inverts the pulse,
so that its displacement is now negative (fourth fig). Meanwhile the right pulse keeps going to the right,
reaches the fixed end, and is reflected and inverted.
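To make the scheme concrete, here is a minimal Julia sketch of the CTCS march of Eq. (12.7.18) with the Gaussian pluck of Fig. 12.19 as the initial shape; taking the previous-step displacement equal to the initial one (zero initial velocity) is a simplification of mine, as are the names and the number of steps.

# CTCS for u_tt = c² u_xx on a string with fixed ends, Eq. (12.7.18)
function wave_ctcs(c, L, Δx, Δt, nsteps)
    x  = 0:Δx:L
    r2 = (c*Δt/Δx)^2                       # r must satisfy r ≤ 1 (Eq. (12.7.19))
    u  = exp.(-1000 .* (x .- 0.3).^2)      # Gaussian pluck, as in Fig. 12.19
    uold = copy(u)                         # zero initial velocity: u(x,-Δt) ≈ u(x,0)
    for n in 1:nsteps
        unew = similar(u)
        unew[1] = 0.0; unew[end] = 0.0     # fixed ends
        for i in 2:length(u)-1
            unew[i] = 2*(1 - r2)*u[i] - uold[i] + r2*(u[i+1] + u[i-1])
        end
        uold, u = u, unew
    end
    return x, u
end

x, u = wave_ctcs(300.0, 1.0, 0.01, 0.01/300, 60)   # Δt = Δx/c, so r = 1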

12.7.6 Solving ODE using neural networks

12.8 Numerical optimization


"Optimization” comes from the same root as “optimal”, which means best. When you optimize
something, you are “making it best”. Of course when we write optimization we mean mathe-
matical optimization. And we only discuss continuous optimization in the sense that we can
use calculus. And numerical optimization refers to algorithms that are used to solve continuous
optimization problems numerically (approximately). Those algorithms are developed to solve
large scale industry optimization problems.


Basically we have an objective function $f(\boldsymbol{x})$ of multiple variables $\boldsymbol{x} = (x_1, \ldots, x_n)$ where
n is really big (e.g. millions). These variables are the inputs, things that you can control. Usually
the inputs are subject to some constraints, which are equations that place limits on how big or small
some variables can get. There are two types of constraints: equality and inequality constraints.
Equality constraints are usually denoted $h_n(\boldsymbol{x})$ and inequality constraints are denoted $g_n(\boldsymbol{x})$.
When there are constraints we are dealing with a constrained optimization problem. Other-
wise, we have an unconstrained optimization problem.
Optimization is now a big branch of applied mathematics with a wide range of applications.
This section is just a very brief introduction to some numerical algorithms commonly used to
solve optimization problems. Section 12.8.1 is devoted to the gradient descent method.

12.8.1 Gradient descent method


Consider the problem of finding the minimum of the function $f(x_1, \ldots, x_n)$. The gradient
descent method is an iterative method to solve this optimization problem. The idea is simple:
starting at an initial point $\boldsymbol{x}_0$ we find the next point $\boldsymbol{x}_1$ that decreases the function as much as
possible. Knowing that the gradient of f, $\nabla f$, is the direction of steepest ascent (check Section 7.5 if this is not clear), the direction
of steepest descent is simply $-\nabla f$. But that is only the direction; we also need a step size $\gamma$,
which tells us how far to go in that $-\nabla f$ direction. Thus, the new solution is $\boldsymbol{x}_1 = \boldsymbol{x}_0 - \gamma\nabla f$,
and from that we continue with $\boldsymbol{x}_2, \boldsymbol{x}_3, \ldots$ The algorithm is then (very simple, easy to code)
\[ \boldsymbol{x}_{k+1} = \boldsymbol{x}_k - \gamma_k\nabla f(\boldsymbol{x}_k), \qquad k = 0, 1, 2, \ldots \]
Let's consider one example to see how the value of $\gamma$ affects the performance of the method.

Example 12.1
We're going to minimize the following quadratic function:
\[ f(x_1, x_2) = \left(\frac34 x_1 - \frac32\right)^2 + (x_2 - 2)^2 + \frac{x_1 x_2}{4}, \qquad \nabla f = \left(\frac98 x_1 - \frac94 + \frac14 x_2,\; 2x_2 - 4 + \frac14 x_1\right) \]
The exact solution is $(1.6, 1.8)$. The source code is in gradient_descent_example.jl. The
initial $\boldsymbol{x}_0$ is $(5, 4)$ and various values of $\gamma$ are used.
[Six plots show the iterates on contours of f for $\gamma$ = 0.01, 0.1, 0.2, 0.3, 0.5 and 0.75; the norm of the final gradient is about 5.76, 1.30, 0.34, 0.092, 0.0033 and 0.028 respectively.]

Two observations can be made: (1) each step indeed takes us towards the solution (i.e., decreases the function f) and (2) we need to find a good value for $\gamma$ to have a fast method.
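The plots above come from gradient_descent_example.jl; a stripped-down Julia sketch of the same experiment (fixed step size, with my own stopping rule and names) is:

using LinearAlgebra   # for norm

# Gradient descent with a fixed step size γ for the function of Example 12.1
∇f(x) = [9/8*x[1] - 9/4 + x[2]/4,  2*x[2] - 4 + x[1]/4]

function descend(x0, γ; tol=1e-6, maxiter=10_000)
    x = copy(x0)
    for k in 1:maxiter
        g = ∇f(x)
        norm(g) < tol && return x, k
        x -= γ*g
    end
    return x, maxiter
end

for γ in (0.01, 0.1, 0.3, 0.5, 0.75)
    x, k = descend([5.0, 4.0], γ)
    println("γ = $γ: x = $(round.(x, digits=4)) after $k iterations")
end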

The specific function considered in the above example belongs to a general quadratic function
of the following form
\[ f(\boldsymbol{x}) = \frac12\boldsymbol{x}^\top\mathbf{A}\boldsymbol{x} - \boldsymbol{b}^\top\boldsymbol{x} + c \]
where $\mathbf{A}$ is symmetric and positive definite (see Section 11.10.6). In what follows we consider $c = 0$ for simplicity,
as it does not affect the solution but only the minimum value of f. Note that due to the positive
definiteness of $\mathbf{A}$, the shape of f is like a bowl (think of the simple function $0.5ax^2 - bx$ with
$a > 0$, or Fig. 7.9). Thus, there is only one minimum. The gradient of f is $\nabla f = \mathbf{A}\boldsymbol{x} - \boldsymbol{b}$ (see Section 12.9.2 for a proof of this). For
this case, we can find $\gamma_k$ exactly. The idea is: choose $\gamma_k$ such that $f(\boldsymbol{x}_{k+1})$ is minimized. This
is simply a one dimensional optimization problem. Let's consider the following function
\[ g(\gamma_k) = f(\boldsymbol{x}_{k+1}) = \frac12\left(\boldsymbol{x}_k - \gamma_k\nabla f(\boldsymbol{x}_k)\right)^\top\mathbf{A}\left(\boldsymbol{x}_k - \gamma_k\nabla f(\boldsymbol{x}_k)\right) - \boldsymbol{b}^\top\left(\boldsymbol{x}_k - \gamma_k\nabla f(\boldsymbol{x}_k)\right) \]
which is nothing but a quadratic function $g(\gamma_k) = a\gamma_k^2 + d\gamma_k + e$ with
\[ a = \frac12\nabla f(\boldsymbol{x}_k)^\top\mathbf{A}\nabla f(\boldsymbol{x}_k), \qquad d = \left(\boldsymbol{b}^\top - \boldsymbol{x}_k^\top\mathbf{A}\right)\nabla f(\boldsymbol{x}_k) = -\nabla f(\boldsymbol{x}_k)^\top\nabla f(\boldsymbol{x}_k) \]
Thus, $\gamma_k$ is given by
\[ \gamma_k = -\frac{d}{2a} = \frac{\nabla f(\boldsymbol{x}_k)^\top\nabla f(\boldsymbol{x}_k)}{\nabla f(\boldsymbol{x}_k)^\top\mathbf{A}\nabla f(\boldsymbol{x}_k)} \tag{12.8.1} \]


Now, we solve the example again, with the step size determined by Eq. (12.8.1). The complete algorithm is in Algorithm 2,
together with a Julia implementation (noting the striking resemblance of the Julia code with the algorithm). The solution is
shown in the accompanying figure: very good performance was obtained, as only a few
iterations were needed to get convergence. But we notice something special: the path generated by $\{\boldsymbol{x}_k\}$ is zig-zag. We have
the so-called zig-zag theorem. It goes like this: let $\{\boldsymbol{x}_k\}$ be the
sequence generated by the steepest descent algorithm. Then, for
all k, $\boldsymbol{x}_{k+1} - \boldsymbol{x}_k$ is orthogonal to $\boldsymbol{x}_{k+2} - \boldsymbol{x}_{k+1}$.
Proof. Of course to prove the orthogonality of two vectors, we need to show that their dot product is
zero. We have
\[
\begin{cases}
\boldsymbol{x}_{k+1} = \boldsymbol{x}_k - \gamma_k\nabla f(\boldsymbol{x}_k)\\
\boldsymbol{x}_{k+2} = \boldsymbol{x}_{k+1} - \gamma_{k+1}\nabla f(\boldsymbol{x}_{k+1})
\end{cases}
\;\Longrightarrow\;
\left(\boldsymbol{x}_{k+1} - \boldsymbol{x}_k,\ \boldsymbol{x}_{k+2} - \boldsymbol{x}_{k+1}\right) = \gamma_k\gamma_{k+1}\left(\nabla f(\boldsymbol{x}_k), \nabla f(\boldsymbol{x}_{k+1})\right)
\]
As $\gamma_k$ is the minimizer of $f(\boldsymbol{x}_k - \gamma\nabla f(\boldsymbol{x}_k))$, we have $df/d\gamma = 0$. Using the chain rule, this
derivative is computed as
\[ \frac{df}{d\gamma} = -\nabla f\left(\boldsymbol{x}_k - \gamma_k\nabla f(\boldsymbol{x}_k)\right)\cdot\nabla f(\boldsymbol{x}_k) = -\left(\nabla f(\boldsymbol{x}_{k+1}), \nabla f(\boldsymbol{x}_k)\right) \]
This derivative being zero leads to the dot product $\left(\nabla f(\boldsymbol{x}_{k+1}), \nabla f(\boldsymbol{x}_k)\right)$ being zero, which results
in $\left(\boldsymbol{x}_{k+1} - \boldsymbol{x}_k,\ \boldsymbol{x}_{k+2} - \boldsymbol{x}_{k+1}\right) = 0$. $\square$
Obviously a zig-zag path is not the shortest path, so the gradient descent is not a very fast
method. This will be proved later using a convergence analysis. And if we zoom in to look
more closely at the path we see that we follow some direction that was taken earlier. In other
words, there exist $\nabla f(\boldsymbol{x}_i)$ and $\nabla f(\boldsymbol{x}_j)$ which are parallel. This observation will lead to a better
method: the conjugate gradient method, to be presented in Section 12.9.2.

Algorithm 2 Gradient descent algorithm (exact line search).
1: Inputs: $\mathbf{A}$, $\boldsymbol{b}$, $\boldsymbol{x}_0$, the tolerance $\epsilon$
2: Outputs: the solution $\boldsymbol{x}$
3: $\boldsymbol{x} = \boldsymbol{x}_0$
4: $\nabla f = \mathbf{A}\boldsymbol{x} - \boldsymbol{b}$                (gradient of f)
5: while $\|\nabla f\| > \epsilon$ do
6:   $\gamma = \dfrac{\nabla f^\top\nabla f}{\nabla f^\top\mathbf{A}\nabla f}$       (step size)
7:   $\boldsymbol{x} = \boldsymbol{x} - \gamma\nabla f$               (update x)
8:   $\nabla f = \mathbf{A}\boldsymbol{x} - \boldsymbol{b}$
9: end while
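A Julia rendering of Algorithm 2 could look as follows (this sketch is mine, not the book's listing; A must be symmetric and positive definite, and the small test system is also my choice):

using LinearAlgebra   # for norm and dot

# Gradient descent with exact line search for f(x) = ½xᵀAx - bᵀx (Algorithm 2)
function steepest_descent(A, b, x0; ϵ=1e-8, maxiter=10_000)
    x  = copy(x0)
    ∇f = A*x - b                              # gradient of f
    k  = 0
    while norm(∇f) > ϵ && k < maxiter
        γ  = dot(∇f, ∇f) / dot(∇f, A*∇f)      # exact step size, Eq. (12.8.1)
        x -= γ*∇f                             # update x
        ∇f = A*x - b
        k += 1
    end
    return x, k
end

A = [2.0 -1.0; -1.0 2.0];  b = [4.0, -2.0]
x, k = steepest_descent(A, b, [0.0, 0.0])
println("x = $x after $k iterations")          # should approach [2, 0]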
Convergence analysis. The gradient descent method generates a sequence $\{\boldsymbol{x}_k\}$ that converges
towards $\boldsymbol{x}$, the solution. We have seen one numerical evidence of that. And we need a proof.


Then, what is the convergence rate (of the method) that tells us how fast we go from $\boldsymbol{x}_0$ to $\boldsymbol{x}$?
Certainly, this rate of convergence is evaluated using an error function $E(\boldsymbol{x})$:
\[ E : \mathbb{R}^n \to \mathbb{R} \ \text{ such that } \ E(\boldsymbol{x}) \ge 0 \ \ \forall\boldsymbol{x}\in\mathbb{R}^n \]
One choice is to define the error $\boldsymbol{e}_k = \boldsymbol{x}_k - \boldsymbol{x}$, then define E as the following energy norm:
\[ E(\boldsymbol{x}_k) = \left(\boldsymbol{e}_k^\top\mathbf{A}\boldsymbol{e}_k\right)^{1/2} \]
We need the formula for updating $\boldsymbol{e}_k$; it satisfies the same equation as $\boldsymbol{x}_k$:
\[ \boldsymbol{x}_{k+1} - \boldsymbol{x} = (\boldsymbol{x}_k - \boldsymbol{x}) - \gamma_k\nabla f(\boldsymbol{x}_k) \;\Longleftrightarrow\; \boldsymbol{e}_{k+1} = \boldsymbol{e}_k - \gamma_k\nabla f(\boldsymbol{x}_k) \]
Now, we can compute $E(\boldsymbol{x}_{k+1})$ by considering its square, and relating it to $E(\boldsymbol{x}_k)$:
\[
\begin{aligned}
[E(\boldsymbol{x}_{k+1})]^2 &= \boldsymbol{e}_{k+1}^\top\mathbf{A}\boldsymbol{e}_{k+1}\\
&= \left(\boldsymbol{e}_k^\top - \gamma_k\nabla f(\boldsymbol{x}_k)^\top\right)\mathbf{A}\left(\boldsymbol{e}_k - \gamma_k\nabla f(\boldsymbol{x}_k)\right)\\
&= \boldsymbol{e}_k^\top\mathbf{A}\boldsymbol{e}_k - 2\gamma_k\nabla f(\boldsymbol{x}_k)^\top\mathbf{A}\boldsymbol{e}_k + \gamma_k^2\nabla f(\boldsymbol{x}_k)^\top\mathbf{A}\nabla f(\boldsymbol{x}_k)\\
&= [E(\boldsymbol{x}_k)]^2 - \frac{\left[\nabla f(\boldsymbol{x}_k)^\top\nabla f(\boldsymbol{x}_k)\right]^2}{\nabla f(\boldsymbol{x}_k)^\top\mathbf{A}\nabla f(\boldsymbol{x}_k)}\\
&= [E(\boldsymbol{x}_k)]^2\left[1 - \frac{\left[\nabla f(\boldsymbol{x}_k)^\top\nabla f(\boldsymbol{x}_k)\right]^2}{\left(\nabla f(\boldsymbol{x}_k)^\top\mathbf{A}\nabla f(\boldsymbol{x}_k)\right)\left(\boldsymbol{e}_k^\top\mathbf{A}\boldsymbol{e}_k\right)}\right]
\end{aligned} \tag{12.8.2}
\]
Now comes the magic of eigenvectors and eigenvalues. As $\mathbf{A}$ is a real symmetric matrix, it has
n independent orthonormal eigenvectors $\boldsymbol{v}_i$ and n positive real eigenvalues $\lambda_i$, and we use them
as a basis of $\mathbb{R}^n$ to express the error (which is a vector in $\mathbb{R}^n$) $\boldsymbol{e}_k$ as
\[ \boldsymbol{e}_k = \sum_{i=1}^{n}\xi_i\boldsymbol{v}_i \tag{12.8.3} \]
With that it is possible to compute the different terms in the last expression of Eq. (12.8.2). We start
with
\[ \nabla f(\boldsymbol{x}_k) = \mathbf{A}\boldsymbol{x}_k - \boldsymbol{b} = \mathbf{A}\boldsymbol{e}_k = \mathbf{A}\sum_{i=1}^{n}\xi_i\boldsymbol{v}_i = \sum_{i=1}^{n}\xi_i\lambda_i\boldsymbol{v}_i \tag{12.8.4} \]
Thus,
\[
\begin{aligned}
\nabla f(\boldsymbol{x}_k)^\top\nabla f(\boldsymbol{x}_k) &= \left(\sum_{i=1}^{n}\xi_i\lambda_i\boldsymbol{v}_i\right)\cdot\left(\sum_{j=1}^{n}\xi_j\lambda_j\boldsymbol{v}_j\right) = \sum_{i=1}^{n}\xi_i^2\lambda_i^2\\
\nabla f(\boldsymbol{x}_k)^\top\mathbf{A}\nabla f(\boldsymbol{x}_k) &= \left(\sum_{i=1}^{n}\xi_i\lambda_i\boldsymbol{v}_i\right)\cdot\left(\sum_{j=1}^{n}\xi_j\lambda_j^2\boldsymbol{v}_j\right) = \sum_{i=1}^{n}\xi_i^2\lambda_i^3
\end{aligned} \tag{12.8.5}
\]


And in the same manner, we can compute $\boldsymbol{e}_k^\top\mathbf{A}\boldsymbol{e}_k$ as
\[ \boldsymbol{e}_k^\top\mathbf{A}\boldsymbol{e}_k = \left(\sum_{i=1}^{n}\xi_i\boldsymbol{v}_i\right)\cdot\left(\sum_{j=1}^{n}\xi_j\lambda_j\boldsymbol{v}_j\right) = \sum_{i=1}^{n}\xi_i^2\lambda_i \tag{12.8.6} \]
With all the intermediate results, from Eq. (12.8.2) we can finally get
\[ [E(\boldsymbol{x}_{k+1})]^2 = [E(\boldsymbol{x}_k)]^2\left[1 - \frac{\left(\sum_i\xi_i^2\lambda_i^2\right)^2}{\left(\sum_i\xi_i^2\lambda_i^3\right)\left(\sum_i\xi_i^2\lambda_i\right)}\right] \tag{12.8.7} \]

12.9 Numerical linear algebra


12.9.1 Iterative methods to solve a system of linear equations
In Section 3.7 we have met the Persian astronomer al-Kashi, who computed $x = \sin 1^\circ$ iteratively from a trigonometric identity:
\[ \sin 3^\circ = 3\sin 1^\circ - 4\sin^3 1^\circ \;\Longrightarrow\; x = \frac{\sin 3^\circ + 4x^3}{3} \]
We are going to do the same thing but for $\mathbf{A}\boldsymbol{x} = \boldsymbol{b}$: we split the matrix into two matrices,
$\mathbf{A} = \mathbf{S} - \mathbf{T}$; then the system becomes $(\mathbf{S} - \mathbf{T})\boldsymbol{x} = \boldsymbol{b}$, or $\mathbf{S}\boldsymbol{x} = \mathbf{T}\boldsymbol{x} + \boldsymbol{b}$. Then, following al-Kashi,
we solve this system iteratively: starting from $\boldsymbol{x}_0$ we get $\boldsymbol{x}_1$, and from $\boldsymbol{x}_1$ we obtain $\boldsymbol{x}_2$ and so
on:
\[ \mathbf{S}\boldsymbol{x}_{k+1} = \mathbf{T}\boldsymbol{x}_k + \boldsymbol{b}, \qquad k = 0, 1, 2, \ldots \tag{12.9.1} \]
Thus, instead of solving $\mathbf{A}\boldsymbol{x} = \boldsymbol{b}$ directly using e.g. the Gaussian elimination method, we're adopting
an iterative method.
It is obvious that we need to select $\mathbf{S}$ in a way that
(a) Eq. (12.9.1) is solved easily (or fast), and
(b) the difference (or error) $\boldsymbol{x} - \boldsymbol{x}_k$ goes quickly to zero. To get an expression for this
difference, subtract Eq. (12.9.1) from $\mathbf{S}\boldsymbol{x} = \mathbf{T}\boldsymbol{x} + \boldsymbol{b}$:
\[ \mathbf{S}\boldsymbol{e}_{k+1} = \mathbf{T}\boldsymbol{e}_k \;\Longrightarrow\; \boldsymbol{e}_{k+1} = \mathbf{S}^{-1}\mathbf{T}\boldsymbol{e}_k \]
The matrix $\mathbf{B} = \mathbf{S}^{-1}\mathbf{T}$ controls the convergence rate of the method.
To demonstrate iterative methods for $\mathbf{A}\boldsymbol{x} = \boldsymbol{b}$, we first consider the following methods:
Jacobi method: $\mathbf{S}$ is the diagonal part of $\mathbf{A}$
Gauss-Seidel method: $\mathbf{S}$ is the lower triangular part of $\mathbf{A}$ (diagonal included)


Example 12.2
Consider the following system and its solution:
\[ \begin{bmatrix} 2 & -1\\ -1 & 2 \end{bmatrix}\begin{bmatrix} x\\ y \end{bmatrix} = \begin{bmatrix} 4\\ -2 \end{bmatrix} \quad\text{has the solution}\quad \begin{bmatrix} x\\ y \end{bmatrix} = \begin{bmatrix} 2\\ 0 \end{bmatrix} \]
The Jacobi iterations are:
\[ \begin{bmatrix} 2 & 0\\ 0 & 2 \end{bmatrix}\boldsymbol{x}_{k+1} = \begin{bmatrix} 0 & 1\\ 1 & 0 \end{bmatrix}\boldsymbol{x}_k + \begin{bmatrix} 4\\ -2 \end{bmatrix}; \qquad \begin{cases} 2x_{k+1} = y_k + 4\\ 2y_{k+1} = x_k - 2 \end{cases} \]
And the Gauss-Seidel iterations are
\[ \begin{bmatrix} 2 & 0\\ -1 & 2 \end{bmatrix}\boldsymbol{x}_{k+1} = \begin{bmatrix} 0 & 1\\ 0 & 0 \end{bmatrix}\boldsymbol{x}_k + \begin{bmatrix} 4\\ -2 \end{bmatrix}; \qquad \begin{cases} 2x_{k+1} = y_k + 4\\ 2y_{k+1} = x_{k+1} - 2 \end{cases} \]

        Jacobi method              Gauss-Seidel method
        x_k      y_k               x_k            y_k
        0.000    0.000000          0.000000000    -1.000000000
        2.000    -1.000000         1.500000000    -0.250000000
        1.500    0.000000          1.875000000    -0.062500000
        2.000    -0.250000         1.968750000    -0.015625000
12.9.2 Conjugate gradient method


It is easy to see that the minimum point of the quadratic function $f(x) = \frac12 ax^2 - bx + c$
(with $a > 0$) is the x such that $ax = b$. By analogy, the minimum point of the following quadratic
function, where $\boldsymbol{x}, \boldsymbol{b}\in\mathbb{R}^n$,
\[ f(\boldsymbol{x}) = \frac12\boldsymbol{x}^\top\mathbf{A}\boldsymbol{x} - \boldsymbol{b}^\top\boldsymbol{x} + c \]
is the solution to the linear system $\mathbf{A}\boldsymbol{x} = \boldsymbol{b}$ when $\mathbf{A}$ is symmetric and positive definite (similar
to the fact that $a > 0$).
Proof. We need to prove that $\frac{d}{d\boldsymbol{x}}\left(\boldsymbol{x}^\top\mathbf{A}\boldsymbol{x}\right) = (\mathbf{A} + \mathbf{A}^\top)\boldsymbol{x}$. Indeed, $g(\boldsymbol{x}) = \boldsymbol{x}^\top\mathbf{A}\boldsymbol{x} = A_{ij}x_i x_j$;
now take the derivative of $g(\boldsymbol{x})$ with respect to $x_k$:
\[ \frac{\partial g(\boldsymbol{x})}{\partial x_k} = A_{ij}\frac{\partial x_i}{\partial x_k}x_j + A_{ij}x_i\frac{\partial x_j}{\partial x_k} = A_{ij}\delta_{ik}x_j + A_{ij}x_i\delta_{jk} = A_{kj}x_j + A_{ik}x_i \]
which is the kth component of $(\mathbf{A} + \mathbf{A}^\top)\boldsymbol{x}$. Doing something similar, we get that the derivative of
$\boldsymbol{b}^\top\boldsymbol{x}$ is $\boldsymbol{b}$. Hence, noting that $\mathbf{A} = \mathbf{A}^\top$ as $\mathbf{A}$ is symmetric, the derivative of $f(\boldsymbol{x})$ is $\mathbf{A}\boldsymbol{x} - \boldsymbol{b}$. Setting this derivative to zero, we get
the linear system $\mathbf{A}\boldsymbol{x} = \boldsymbol{b}$. $\square$
So, facing a large sparse linear system $\mathbf{A}\boldsymbol{x} = \boldsymbol{b}$, we do not solve it directly; instead we find the
minimum of the function $f(\boldsymbol{x})$. Why do we do that? Intuitively, finding a minimum of a nice function such as $f(\boldsymbol{x})$, which geometrically is a bowl, seems easy. We just need to start somewhere
on the bowl and keep moving downhill; more often than not we will hit home: the bottom of the bowl. We
have actually seen such a method: the gradient descent method in Section 12.8.1.
Why can't we just use the gradient descent method?

12.9.3 Iterative methods to solve eigenvalue problems




Appendix A
Codes

Coding is to programming as typing is to writing. (Leslie Lamport)

To encourage young students to learn coding and also to demonstrate the important role of
coding in mathematics, engineering and sciences, in this book I have used many small programs
to do some tedious (or boring) calculations. In this appendix, I provide some snippets of these
programs so that young people can learn programming while learning maths/physics.
There are so many programming languages and I have selected Julia for two main reasons.
First, it is open source (so we can use it for free and we can see its source code if we find that
needed). Second, it is easy to use. For young students, the fact that a programming language is
free is obviously important. The second reason–being easy to use–is more important as we use
a programming language just as a tool; our main purpose is doing mathematics (or physics). Of
course you can use Python; it is also free and easy to use and popular. The reason I have opted
for Julia was to force me to learn this new language; I forced myself to go outside of my comfort
zone; only then could I find something unexpected. There is actually another reason, although
irrelevant here: Julia code runs faster than Python code. Moreover, it is possible to use
Python and RŽŽ in Julia.
It is worth noting that our aim is to learn coding to use it to solve mathematical problems.
We do not want to learn coding to write software for general use; that is a completely different story.
And that is why I do not spend time (for time is limited) learning how to make graphical user
interfaces (GUI), and do not learn coding with languages such as Visual Basic, Delphi and so
on.
In the text, if there is a certain amount of boring calculation (e.g. a table of partial sums of an
infinite series), certainly I have used a small Julia program to do that job. And I have provided
links to the code given in this appendix. Now, in the code snippets, I provide the link back to the
associated text in the book.
To reduce the thickness of the book, all other codes, which are not given in the text, are put
in githubŽ at this address.
ŽŽ
R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety
of UNIX platforms, Windows and MacOS.
Ž
GitHub is a website and cloud-based service that helps developers store and manage their code, as well as


As this appendix is not a tutorial to Julia, I just provide the codes with little explanation. I would like to emphasize
that programming or coding is not hard. This is true, as kids can
do it (Fig. A.1). A program is simply a set of instructions written
by you to demand that a computer perform a certain task. These
instructions are written in a programming language with vocabulary such as for, while, if, function etc. All we need to do is: (1)
write down what we need to achieve in clear steps (in your own
mother language), and (2) translate these steps to the programming language we've selected. That's it!

Figure A.1: Baby programming.
And by the way, we do not have to memorize anything (such
as should I write for i=1 to 10 or for i=1:10?). Google is our best friend. If you’re not sure about
anything, just google. And with time all these for/while/if... will reside in your brain without you
knowing it! Albert Einstein once said ‘Never memorize something that you can look up’.
Having said that, programming becomes, however, harder when we have to write optimized
codes; codes that are very fast to execute and codes that can be easy to maintain (i.e., easy to
modify to add new codes). But that is beyond the scope of this book.
When you know a language and enjoy the process, learn a new one, or even several. And
it is beneficial to read lots of code written by others; it is one of the best ways to become a better
programmer. This is similar to writers who read Shakespeare, Mark Twain, and Hemingway.

A.1 Algebra and calculus


In this section, I present some codes used in Chapter 2. Not all programs are presented as a few
representative ones are sufficient to demonstrate the basics of programming.
Listing A.1 is a code to compute the square root of any positive number S using the formula $x_{n+1} = 0.5(x_n + S/x_n)$ starting with an initial value $x_0$; see Section 2.9.3. This code
is representative for any iteration based formula. Note that x is replaced by the new value, so
intermediate x’s are not recorded.

Listing A.1: Computing the square root of a positive real number S . Julia built in functions are in blue
heavy bold font.
1 function square_root(S,x0,epsilon)
2 x = x0 # x is for x_{n+1} in our formula
3 while (true) # do the iterations, a loop without knowing the # of iterations
4 x = 0.5 * ( x + S/x )
5 if (abs(x*x-S) < epsilon) break end # if x is accurate enough, stop
6 end
7 return x
8 end

track and control changes to their code.


Listing A.2 is the code to compute the partial sums of a geometric series $\sum_{i=1}^{n} 1/2^i$. The code
is typical for calculating a sum of n terms. We initialize the sum to zero, and use a for loop to
add one term to the sum each time. Listing A.3 is a similar code, but for the Taylor series of the
sine function $\sin x = \sum_{i=1}^{\infty}(-1)^{i-1}\frac{1}{(2i-1)!}x^{2i-1}$; see Section 4.15.6. The code introduces the
use of the factorial(n) function to compute $n!$. Note that we have to use big numbers as $n!$ is
very large for large n.

Listing A.2: Partial sum of the geometric series $\sum_{i=1}^{n} 1/2^i$. Also produces directly Table 2.12.
1=2i . Also produces directly Table 2.12.
1 using PrettyTables # you have to install this package first
2 function geometric_series(n) # make a function named ‘geometric_series’ with 1 input
3 S = 0.
4 for k=1:n # using ’for’ for loops with known number of iterations
5 S += 1/2^k # S += ... is short for S = S + ...
6 end
7 return S
8 end
9 data = zeros(20,2) # this is an array of 20 rows and 2 columns
10 for i=1:20 # produce 20 rows in Table 2.10
11 S = geometric_series(i)
12 data[i,1] = i # row ‘i’, first col is ‘i’
13 data[i,2] = S # second col is S
14 end
15 pretty_table(data, ["n", "S"]) # print the table to terminal

Listing A.3: Calculating sin x using the sine series $\sin x = \sum_{i=1}^{\infty}(-1)^{i-1}\frac{1}{(2i-1)!}x^{2i-1}$.

1 using PrettyTables # you have to install this package first


2 function sinx_series(x,n)
3 A = 0.
4 for i=1:n
5 A += (-1)^(i-1) * x^(2*i-1) / factorial(big(2*i-1))
6 end
7 return A
8 end
9 # compute sin series with n=1,2,...,10 terms
10 data = zeros(10,2)
11 for i=1:size(data,1)
12 S1 = sinx_series(pi/4,i)
13 data[i,1] = i
14 data[i,2] = S1
15 end
16 pretty_table(data, ["n", "S1"], backend = :latex,formatters = ft_printf("%5.8f",[2]))

Listing A.4 is the program to check whether a natural number is a factorion. Having such a
function, we just need to sweep over, let say the first 100 000 numbers and check every number
if it is a factorion. We provide two solutions: one using the built in Julia‘s function digits to


get the digits of an integer. This solution is a lazy one. The second solution does not use that
function. Only then are we forced to work out how to get the digits of a number. Let's say the
number is 3258; we can get the digits starting from the first one (and get 3, 2, 5, 8) or we can
start from the last digit (to get 8, 5, 2, 3). The second option is easier because 8 = 3258 % 10 (the
last digit is the remainder of the division of the given number by 10). Once we have got the last digit, we do not need it any more, so we just remove it: 325 = div(3258, 10); that is,
325 is the result of the integer division of 3258 by 10.

Listing A.4: Checking if an integer is a factorion (Section 2.26.2)


1 function is_factorion_version1(n)
2 s = 0
3 digits_n = digits(n)
4 for i=1:length(digits_n)
5 s += factorial(big(digits_n[i]))
6 end
7 return ( s == n )
8 end
9 function is_factorion_version2(n)
10 s = 0
11 n0 = n # keep the original number as we modify n
12 while ( n > 0 ) # the loop stops when n = 0
13 x = n % 10 # get the last digit
14 s += factorial(big(x))
15 n = div(n,10) # remove the last digit
16 end
17 return ( s == n0 )
18 end

Listing A.5 is the code for the calculation of $s_n = \prod_{k=0}^{n}\binom{n}{k}$, that is, the product of all the
binomial coefficients. The idea is the same as the calculation of a sum, but we need to initialize
the result to 1 (instead of 0). We use the Julia built-in function binomial to compute $\binom{n}{k}$.

Listing A.5: $s_n = \prod_{k=0}^{n}\binom{n}{k} = \prod_{k=0}^{n} n!/((n-k)!\,k!)$. See Pascal triangle and number e, ??.
nŠ=.n k/ŠkŠ. See Pascal triangle and number e, ??.
1 function sn(n)
2 product=1.0
3 for k=0:n
4 product *= binomial(big(n),k)
5 end
6 return product
7 end

In Listing A.6 I present a code that implements the Newton-


Raphson method for solving f .x/ D 0. (See Section 4.5.4.) As
it uses iterations, the code is similar to Listing A.1. Instead of
calculating the first derivative of f .x/ directly, I used a central


difference for this. I also introduced an increment variable i to


count the number of iterations required to get the solution. The function was then applied to
solving the equation $\cos x - x = 0$.

Listing A.6: Newton-Raphson method to solve $f(x) = 0$ using a central difference for the derivative.
1 function newton_raphson(f,x0,epsilon)
2 x = x0
3 i = 0
4 while ( true )
5 i += 1
6 derx = (f(x0+1e-5)-f(x0-1e-5)) / (2e-5)
7 x = x0 - f(x0)/derx
8 @printf "%i %s %0.8f\n" i " iteration," x
9 if ( abs(x-x0) < epsilon ) break end
10 x0 = x
11 end
12 end
13 f(x) = cos(x) - x # short functions
14 newton_raphson(f,0.1,1e-6)

Listing A.7 implements three functions used to generate the Newton fractals shown in Fig. 1.3.
The first function is the standard Newton-Raphson method, but the input is a function of a single
complex variable. The second function get_root_index is to return the position of a root r in
the list of all roots of the equation $f(z) = 0$. This function uses the built-in function isapprox
to check the equality of two numbers. The final function plot_newton_fractal loops over a
grid of $n\times n$ points within the domain $[x_{\min}, x_{\max}]^2$; for each point $(x, y)$, a complex variable
$z_0 = x + iy$ is made and passed to the function newton to find a root r. Then, it finds the
position of r in the list roots. And finally it updates the matrix m accordingly. We used the code
with the function $f(z) = z^4 - 1$, but you're encouraged to play with $f(z) = z^{12} - 1$.

A.2 Recursion
In Section 2.10 we have met the Fibonacci numbers:
\[ F_n = F_{n-1} + F_{n-2}, \qquad n \ge 2, \qquad F_0 = F_1 = 1 \tag{A.2.1} \]
To compute $F(n)$, we need to use the recursive relation in Eq. (A.2.1). Listing A.8 is the Julia
implementation of Eq. (A.2.1). What is special about this “fibonacci” function? Inside the
definition of that function we call it (with smaller values of n). The process in which a function
calls itself directly or indirectly is called recursion and the corresponding function is called a
recursive function.

We should never check the equality of real/complex numbers by checking a == b; instead we should check
$|a - b| < \epsilon$, where $\epsilon$ is a small positive number. In other words, 0.99998 = 1.00001 = 1 according to a computer.
The built-in function is an optimal implementation of this check.


Listing A.7: Newton fractals.


1 function newton(z0,f,fprime;max_iter=1000)
2 z = z0
3 for i=1:max_iter
4 dz = f(z)/fprime(z)
5 if abs(dz) < TOL return z end
6 z -= dz
7 end
8 return 0
9 end
10 function get_root_index(roots,r)
11 i = 0
12 for i=1:length(roots)
13 if isapprox(roots[i],r) # if r equals roots[i]
14 return i
15 end
16 end
17 if i == 0 # if root r is not yet found, add to roots
18 append!(roots,r) # equivalent to: roots = [roots;r]
19 return length(roots)
20 end
21 end
22 function plot_newton_fractal(f,fprime;n=200,domain=(-1,1,-1,1))
23 roots = [] # initialize roots to be empty
24 m = zeros(n,n)
25 xmin,xmax,ymin,ymax = domain
26 xarrays = range(xmin,xmax,length=n) # range(0,1,5) => {0,0.25,0.5,0.75,1}
27 yarrays = range(ymin,ymax,length=n)
28 for (ix,x) in enumerate(xarrays) # ix=1,2,... and x = xarrays[1], xarrays[2]...
29 for (iy,y) in enumerate(yarrays)
30 z0 = x + y * im # ‘im’ is Julia for i
31 r = newton(z0,f,fprime)
32 if r != 0
33 ir = get_root_index(roots,r) # roots be updated
34 m[iy,ix] = ir
35 end
36 end
37 end
38 return m
39 end
40 # a concrete function f(z)
41 f(z) = z^4 - 1; fprime(z) = 4*z^3
42 domain = (0.414,0.445,0.414,0.445)
43 m = plot_newton_fractal(f,fprime,n=500,domain=domain)
44 myplot = spy(m,Scale.ContinuousColorScale(p -> get(ColorSchemes.rainbow, p)))

The case $n = 0$ or $n = 1$ is called the base case of a recursive function. This is the case whose answer
we already know, so it can be computed without any further recursive calls. The base case is
what stops the recursion from continuing forever (i.e., an infinite loop). Every recursive function
must have at least one base case (many have more than one).

Listing A.8: Fibonacci numbers implemented as a recursive function.


1 function fibonacci(n)
2 if ( n==0 || n==1 )
3 return 1
4 else
5 return fibonacci(n-2) + fibonacci(n-1)
6 end
7 end

Sometimes a problem does not appear to be recursive at first sight; to master recursion we must
first learn how to think recursively. For example, consider the problem of computing the sum
of the first $n$ positive integers. Using recursion, we write

$$S(n) = 1 + 2 + \cdots + n = \underbrace{1 + 2 + \cdots + (n-1)}_{S(n-1)} + \; n$$

We also need the base case, which is obviously $S(1) = 1$. Now we can implement this in Julia
as in Listing A.9.

Listing A.9: Sum of the first n integers implemented as a recursive function.


1 function sum_first_integers(n)
2 if ( n==1 )
3 return 1
4 else
5 return sum_first_integers(n-1) + n
6 end
7 end
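As a quick check, the recursion reproduces the closed form $S(n) = n(n+1)/2$; for example, sum_first_integers(100) returns $100 \cdot 101/2 = 5050$.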

A.3 Numerical integration


Listing A.10 presents an implementation of Simpson's quadrature rule for $\int_a^b f(x)\,dx$; the
integration interval is divided into $n$ equal parts. Using this listing as a template, you can program
other quadrature rules such as the trapezoidal rule or a Gauss rule (see the sketch after Listing A.10).
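Concretely, with $\Delta x = (b-a)/n$ and $m_i$ the midpoint of the $i$-th sub-interval $[x_{i-1}, x_i]$, the code in Listing A.10 accumulates

$$\int_a^b f(x)\,dx \approx \sum_{i=1}^{n} \frac{\Delta x}{6}\Bigl( f(x_{i-1}) + 4 f(m_i) + f(x_i) \Bigr).$$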

A.4 Harmonic oscillations


Listing A.11 is used to generate Fig. 9.9 of a weakly damped simple harmonic oscillator.
This code is presented to demonstrate how to generate graphs from a formula. Assuming that the
formula is $x(t) = f(t)$ with $0 \le t \le 50$, we generate a large number of points $t_i$ in this
interval and, for each $t_i$, we compute $x(t_i)$. Then we plot the points $(t_i, x(t_i))$; these points are
joined by line segments and thus we obtain a smooth-looking curve. This is achieved using the Plots package.


Listing A.10: Simpson's quadrature for $\int_a^b f(x)\,dx$.
1 using PrettyTables
2 function simpson_quad(f,a,b,n)
3 A = 0.
4 deltan = (b-a)/n
5 deltax6 = deltan/6
6 for i=1:n
7 fa = f(a+(i-1)*deltan)
8 fb = f(a+i*deltan)
9 fm = f(a+i*deltan-deltan/2)
10 A += fa + 4*fm + fb
11 end
12 return A*deltax6
13 end
14 fx4(x) = x^4
15 I = simpson_quad(fx4,0,1,10)

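As suggested in Section A.3, other rules follow the same template as Listing A.10. Below is a minimal sketch of the composite trapezoidal rule (the name trapezoid_quad is mine, not from the book):

function trapezoid_quad(f,a,b,n)
    A      = 0.
    deltan = (b-a)/n                 # width of each sub-interval
    for i=1:n
        fa = f(a+(i-1)*deltan)       # f at the left end of sub-interval i
        fb = f(a+i*deltan)           # f at the right end
        A += 0.5*(fa+fb)             # trapezoid area, divided by deltan
    end
    return A*deltan
end
fx4(x) = x^4
I = trapezoid_quad(fx4,0,1,100)      # compare with the exact value 1/5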

A.5 Polynomial interpolation


Listing A.12 implements Lagrange interpolation (Section 12.3.1). First, a function is coded
to compute the Lagrange basis function $l_i(x)$:

$$l_i(x) = \prod_{\substack{j=0 \\ j \ne i}}^{n} \frac{x - x_j}{x_i - x_j}$$

Then the Lagrange interpolation formula is programmed:

$$y(x) = \sum_{i=0}^{n} l_i(x)\, y_i$$

To generate Fig. 12.2, many points in $[0, 6]$ are generated; for each point $x_i$ we compute $y(x_i)$,
and then we plot the points $(x_i, y(x_i))$.

A.6 Probability

Monte Carlo for $\pi$. I show in Listing A.13 the code that implements the Monte-Carlo method
for calculating $\pi$: points are drawn uniformly in the unit square, and the fraction landing inside the
quarter of the unit circle estimates the area ratio $\pi/4$, so $\pi \approx 4 \cdot \mathrm{inside}/n$. This is the code
used to generate Table 5.3 and Fig. 5.2 (that part of the code is not shown for brevity). It also shows
how to work with arrays of unknown size (line 4 for the array points2, as we do not know in advance
how many points will fall inside the circle). In line 12 we add one row at a time to this array. A final
note: this function returns multiple values packed in a tuple (line 15).


Listing A.11: Weakly damped oscillation, Fig. 9.9.


1 using Plots
2 using LaTeXStrings
3
4 omega0, beta, x0, v0 = 1.0, 0.05, 1.0, 3.0
5
6 omega0d = sqrt(omega0^2-beta^2)
7 theta = atan(-(v0+beta*x0)/(omega0d*x0))
8 C = x0 / cos(theta)
9
10 ta = 0:0.1:50 # time domain divided in many points: 0,0.1,0.2,...
11 xt = zeros(length(ta)) # length(ta) returns the size of the vector ‘ta’
12 b1 = zeros(length(ta)) # zeros(10) => a vector of 10 elems, all are 0.
13 b2 = zeros(length(ta))
14
15 for i=1:length(ta) # loop over t_i, compute x(t_i), ...
16 t = ta[i]
17 xt[i] = C*exp(-beta*t)*cos(omega0d*t+theta)
18 b1[i] = C*exp(-beta*t)
19 b2[i] = -C*exp(-beta*t)
20 end
21 # generating the graphs, plot!() is to add another plot on the existing plot
22 pyplot()
23 p=plot(ta,xt,legend=false,size=(250,250))
24 plot!(ta,b1,legend=false,color="red", linestyle = :dash)
25 plot!(ta,b2,legend=false,color="red", linestyle = :dash)
26 xlabel!(L"t"), ylabel!(L"x(t)")
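For reference, the formula evaluated in lines 17–19 of Listing A.11 is the weakly (under-) damped solution and its exponential envelope:

$$x(t) = C\, e^{-\beta t}\cos(\omega_{0d}\, t + \theta), \qquad \omega_{0d} = \sqrt{\omega_0^2 - \beta^2},$$

the two dashed curves in Fig. 9.9 being $\pm C\, e^{-\beta t}$.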

In Listing A.14, I present another, much shorter implementation using a list comprehension.
In one line (line 3) all $n$ points in $[0, 1]^2$ are generated. In line 4, we keep only
the points inside the unit circle, using the filter function and an anonymous predicate (x ->
norm(x) <= 1). The norm function, from the LinearAlgebra package, computes $\sqrt{x^2 + y^2}$.

Computer experiment of tossing a coin. When we toss a coin we get either a head or a tail.
In our virtual coin-tossing experiment, we generate a random integer in $\{1, 2\}$ and
assign 1 to head and 2 to tail. We repeat this $n$ times and count the number of heads
and tails. Listing A.15 is the resulting code; it introduces the rand function to generate
random numbers.
Using a list comprehension we obtain the shorter implementation shown in Listing A.16.

A list comprehension is a syntactic construct for creating a list based on existing lists. It follows the form of
the mathematical set-builder notation (set comprehension); for example, $S = \{2x : x \in \mathbb{N},\; x^2 > 3\}$.

Filter is a higher-order function that processes a data structure (usually a list) and produces a new
data structure containing exactly those elements of the original for which a given predicate returns
true.
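In Julia, the set-builder example above becomes a comprehension with an if clause (restricted here to a finite range so that the list is finite):

S = [2*x for x in 1:10 if x^2 > 3]   # => [4, 6, 8, 10, 12, 14, 16, 18, 20]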


Listing A.12: Lagrange interpolation.


1 # ith Lagrange basis, i =1,2,3,...
2 function lagrange_basis_i(i,data_x,x) # l_i(x)
3 li = 1.0
4 xi = data_x[i]
5 for j=1:length(data_x)
6 if j != i
7 xj = data_x[j]
8 li *= (x-xj)/(xi-xj)
9 end
10 end
11 return li
12 end
13 function lagrange_interpolation(data_x,data_y,x)
14 fx = 0.
15 for i=1:length(data_x)
16 li = lagrange_basis_i(i,data_x,x)
17 fx += li * data_y[i]
18 end
19 return fx
20 end
21 data_x=[0 1 2 3 4 5 6] # data points
22 data_y=[0 0.8415 0.9093 0.1411 -0.7568 -0.9589 -0.2794]
23 ta = 0:0.1:6 # points where Lagrange interpolating function is drawn
24 func = zeros(length(ta))
25 for i=1:length(ta)
26 func[i] = lagrange_interpolation(data_x,data_y,ta[i])
27 end

Listing A.13: Monte-Carlo method for calculating $\pi$.


1 function monte_carlo_pi(n)
2 inside = 0 # counter for points inside the circle
3 points1 = zeros(n,2) # all generated points inside/outside
4 points2 = Array{Float64}(undef, 0, 2) # points inside the circle. points1/2 for plot
5 # 0 row, 2 cols, initial value: anything
6 for i=1:n
7 x = rand() # random number in [0,1]
8 y = rand()
9 points1[i,:] = [x,y] # points1[i,1] = x, points1[i,2] = y
10 if ( x^2 + y^2 <= 1. )
11 inside += 1
12 points2 = [points2;[x y]] # add [x y] to points2, one row a time
13 end
14 end
15 return (4*(inside/n),points1,points2) # return a tuple (pi,points1,points2)
16 end


Listing A.14: Monte-Carlo method for calculating $\pi$: shorter version.


1 using LinearAlgebra
2 function monte_carlo_pi(n)
3 points1 = [(rand(),rand()) for _ in 1:n]
4 points2 = filter(x -> norm(x) <= 1., points1) # filter out elements : x^2+y^2 <=1
5 return (4*(length(points2)/n),points1,points2)
6 end
7 # note that points1/2 are not matrices of nx2, but vectors of tuples
8 # to get the x-coords, we have to do: first.(points1)
9 # for example, to plot the points, do:
10 plot(first.(points1),last.(points1),"ro")

Listing A.15: Virtual experiment of tossing a coin in Julia.


1 function tossing_a_coin(n)
2 head_count = 0
3 tail_count = 0
4 for i=1:n
5 result = rand(1:2) # 1: Head and 2: Tail
6 if ( result == 1 ) head_count += 1 end
7 if ( result == 2 ) tail_count += 1 end
8 end
9 return (head_count, tail_count)
10 end
11 data = zeros(5,3)
12 data[:,1] = [10,100,1000,2000,10000]
13 for i=1:size(data,1)
14 h,t = tossing_a_coin(data[i,1])
15 data[i,2] = h/data[i,1]
16 data[i,3] = t/data[i,1]
17 end

Listing A.16: Virtual experiment of tossing a coin in Julia: list comprehension based implementation.
1 function tossing_a_coin(n)
2 coin=[ rand(1:2) for _ in 1:n]
3 return (sum(coin .== 1), sum(coin .== 2))
4 end

Birthday problem. Now we present an implementation of the birthday problem. The procedure
is to repeat the following steps $N$ times, where $N$ is a large number:

• collect the birthdays of $n$ persons; this can be done with [rand(1:365) for _ in 1:n];


• count the occurrences of the birthdays collected above; for example, with 3 persons we may get
$\{1, 2, 2\}$, whose counts are $\{1, 2\}$ (there is a shared birthday), whereas for $\{4, 5, 6\}$, with no
duplicated elements, the counts are $\{1, 1, 1\}$ and there is no shared birthday;

• if there is a shared birthday, return true.

Refer to Listing A.17 for the code.

Listing A.17: List-comprehension-based implementations: dice rolling and the birthday problem.


1 N = 10^5                         # number of virtual experiments
2 # roll 2 six-sided dice N times, compute P(sum is even)
3 faces = 1:6
4 dice = sum([ iseven(rand(faces) + rand(faces)) for _ in 1:N ]) / N
5 # birthday problem
6 using StatsBase                  # for the counts function
7 function birthday_event(n)
8     birthday_a_year = 1:365
9     birthdays_n_pers = [rand(birthday_a_year) for _ in 1:n]
10    birthdays_occurences = counts(birthdays_n_pers)
11    return maximum(birthdays_occurences) > 1
12 end
13 # counts(x): count the number of times each value in x occurs
14 function birthday_experiment(n)
15    return sum([birthday_event(n) for _ in 1:N])/N
16 end
17 println("Probability for 23: $(birthday_experiment(23))")
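For comparison (this is not computed in the listing), the exact probability of at least one shared birthday among $n$ people is

$$P(n) = 1 - \prod_{k=0}^{n-1} \frac{365 - k}{365},$$

which for $n = 23$ gives $P \approx 0.507$, consistent with what the simulation above returns.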

Distributions.jl is a Julia package for probability distributions and associated functions; Listing
A.18 presents a brief summary of some common functions.
The code in Listing A.19 is used to illustrate the central limit theorem graphically. The code
generates $n$ uniformly distributed variables $X_1, X_2, \ldots, X_n$ and computes their mean
$Y = (X_1 + \cdots + X_n)/n$. This is repeated a large number of times ($N = 2 \times 10^4$,
for example). Then a histogram of the vector of these $N$ means is plotted (lines 7–8). What we
get is Fig. 5.19a.

A.7 N-body problems

This section presents the program used to solve the three-body problem discussed in Section 12.6.5.
The program works for an arbitrary number of bodies. It consists of three parts: the input part
is given in Listing A.20, the solution part in Listing A.21 and the post-processing in Listing A.22.
The code for the solution phase mirrors the maths (Eq. (12.6.15)) closely. Note that I do
not care about efficiency here: for example, the force between body i and body j is computed twice.


Listing A.18: Illustration of the Distributions package.


1 using Distributions
2 xGrid = -5:.01:5 # where the PDF evaluated
3 normal_dist = Normal(0,1) # make a normal RV X: mu=0,sigma=1
4 normalPDF(z) = pdf(normal_dist,z) # return the PDF of X at z
5 plot(xGrid,normalPDF.(xGrid),color="red") # plot the PDF of X
6 xbar = mean(normal_dist) # get the mean of X
7 sig = std(normal_dist) # get the SD of X
8 P = cdf(normal_dist,0.6) # get the CDF of X at 0.6: 0.7257

Listing A.19: Illustration of the central limit theorem (Fig. 5.19)


1 using Distributions, Random
2 n, N = 5, 20000
3 dist = Uniform(1.,2.) # X: uniform RV a=1,b=2
4 data = [mean(rand(dist,n)) for _ in 1:N] # means of X_i, i=1:n
5 lb, ub = minimum(data), maximum(data)       # range of the simulated means
6 nb = Int( floor( (ub-lb)/width ) ) # width=bin width
7 fig , ax = plt.subplots(1, 1, figsize=set_size())
8 ax.hist(data, bins=nb,align="left", rwidth=0.9,density=1)

Listing A.20: N body problem solved with Euler-Cromer’s method: part I.


1 using LinearAlgebra, Printf, Plots
2 time = 6. # total simulation time
3 dt = 0.01 # time step
4 stepCount = Int32(floor(time/dt)) # number of time steps
5 N = 3 # number of bodies
6 mass = zeros(N) # mass vector
7 pos = zeros(2,N,stepCount) # position matrix (all bodies, all steps)
8 vel = zeros(2,N,stepCount) # velocity matrix
9 mass[1] = 1.; mass[2] = 1.; mass[3] = 1.;
10 # initial conditions
11 pos[:,1,1] = ... pos[:,3,1] = ... pos[:,2,1] = ...
12 vel[:,1,1] = ... vel[:,3,1] = ... vel[:,2,1] = ...
13 function force(ri,rj,mj)
14 rij = rj - ri
15 d = norm(rij)
16 return (G*mj/d^3)* rij    # acceleration of body i due to body j (assumes G is defined with the inputs)
17 end

A.8 Working with images

Images.jl is the package you need to manipulate images programmatically in Julia. When testing
ideas or just following along with the documentation of Images.jl, it can be useful to have
some images to work with; the TestImages.jl package bundles several standard test images for you.


Listing A.21: N body problem solved with Euler-Cromer’s method: part II.
1 for n=1:stepCount-1
2 for i = 1: N # loop over the bodies
3 ri = pos[:,i,n] # position vector of body ’i’ at time n
4 fi = zeros(2) # compute force acting on ’i’
5 for j = 1:N
6 if ( j != i )
7 rj = pos[:,j,n] # position vector of body ’j’ at time n
8 mj = mass[j] # mass of body ’j’
9 fij = force(ri,rj,mj) # call the force function
10 fi += fij # add force of ’j’ on ’i’
11 end
12 end
13 vel[:,i,n+1] = vel[:,i,n]+dt*fi # update velocity of body ’i’
14 pos[:,i,n+1] = pos[:,i,n]+dt*vel[:,i,n+1] # update position of body ’i’
15 end
16 end
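Written out, the update implemented in Listing A.21 reads (superscripts denote the time step, $\Delta t$ is dt in the code; $\boldsymbol{f}_i$ is the gravitational acceleration of body $i$, the mass $m_i$ having cancelled from Newton's second law):

$$\boldsymbol{f}_i^{\,n} = \sum_{j \ne i} \frac{G\, m_j}{\lVert \boldsymbol{r}_j^{\,n} - \boldsymbol{r}_i^{\,n} \rVert^3}\,\bigl(\boldsymbol{r}_j^{\,n} - \boldsymbol{r}_i^{\,n}\bigr), \qquad \boldsymbol{v}_i^{\,n+1} = \boldsymbol{v}_i^{\,n} + \Delta t\, \boldsymbol{f}_i^{\,n}, \qquad \boldsymbol{r}_i^{\,n+1} = \boldsymbol{r}_i^{\,n} + \Delta t\, \boldsymbol{v}_i^{\,n+1}.$$

Using the newly computed velocity $\boldsymbol{v}_i^{\,n+1}$ in the position update is what distinguishes the Euler-Cromer method from the plain Euler method.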

Listing A.22: N body problem solved with Euler-Cromer’s method: part III.
1 colors = [:blue,:orange,:red,:yellow]
2 anim = @animate for n in 1:stepCount
3 plot(;size=(400,400), axisratio=:equal, legend=false)
4 xlims!(-1.1,1.1)
5 ylims!(-1.1,1.1)
6 scatter!(pos[1,:,n],pos[2,:,n],axisratio=:equal) # plot three masses
7 # plot the trajectory of three masses upto time n
8 plot!(pos[1,1,1:n],pos[2,1,1:n],axisratio=:equal,color=colors[1])
9 plot!(pos[1,2,1:n],pos[2,2,1:n],axisratio=:equal,color=colors[2])
10 plot!(pos[1,3,1:n],pos[2,3,1:n],axisratio=:equal,color=colors[3])
11 end
12 gif(anim, "three-body.gif", fps=30) # fps = frames per second

Listing A.23 is the code used to perform an SVD image compression; the result was shown in
Fig. 11.29. In the code, a list comprehension computes the three low-rank approximations; the same
can be written with the map function. In many programming languages, map is
the name of a higher-order function that applies a given function to each element of a collection,
e.g. a list or set, returning the results in a collection of the same type. Listing A.24 demonstrates
the use of map.

A.9 Reinventing the wheel


Reinventing the wheel means 'to work on something that already exists'. Thus, I have been
reinventing the wheel, as I have reimplemented Simpson's quadrature, the Newton-Raphson
method, Lagrange interpolation and so on. There are excellent implementations of all of these,
much better than mine, provided as 'packages' or 'libraries'. When we are learning something,
reinventing the wheel is usually the best way to understand it; but for real work, use libraries.
Go to https://julialang.org for a list of packages available in Julia.
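To make this concrete, here is a brief sketch of mine (not from the book) of how two of the tasks above can be done with registered Julia packages, assuming QuadGK.jl and Roots.jl are installed:

using QuadGK, Roots
I, err = quadgk(x -> x^4, 0, 1)        # adaptive quadrature, cf. Listing A.10
r = find_zero(x -> cos(x) - x, 0.1)    # root of cos(x) = x, cf. Listing A.6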


Listing A.23: Image compression using SVD in Julia.


1 using Images, TestImages, LinearAlgebra
2 img0 = float.(testimage("mandrill")) # load the image (512,512) matrix
3 img = Gray.(img0) # convert to grayscale
4 function rank_approx(F::SVD, k)
5 U, S, V = F
6 M = U[:, 1:k] * Diagonal(S[1:k]) * V[:, 1:k]'
7 clamp01!(M)
8 end
9 # svd(img) returns SVD{Float32, Float32, Matrix{Float32}}
10 imgs = [rank_approx(svd(img), k) for k in (10,50,100)]
11 imgs = mosaicview(img, imgs...; nrow=1, npad=10)
12 save("compression-svd.png",imgs)

Listing A.24: Use of the Julia map function.


1 map((x) -> x * 2, [1, 2, 3]) # => [2,4,6]
2 map(+, [1, 2, 3], [10, 20, 30]) # => [11,22,33]
3 imgs = map((10, 50, 100)) do k
4 rank_approx(svd(img), k)
5 end


A.10 Computer algebra system


Herein a summary of the SymPy library is given in Listing A.25 for reference; for more complete
documentation, refer to the SymPy website.

A.11 Computer graphics with processing

We have seen some stunning fractals in Figs. 1.2, 1.5 and 1.6. Herein I present the code used to
make them, using processing. We will create Fig. 1.5.
The first thing is to define the canvas (think of it as the paper on which you draw, except that
this paper is part of your computer screen). This is done with the setup function shown in
Listing A.26. On this canvas, a Cartesian coordinate system is set up with the origin at the top
left corner, the x-axis going to the right and the y-axis going down (Fig. A.2), which is a bit unusual. Then, we
select the side length l of our equilateral triangle (the biggest triangle); an l a bit smaller than the width
of the canvas is fine. The center of this triangle is $(x_c, y_c)$. Next, we determine $(x_1, y_1)$, the
coordinates of the lower left vertex of the triangle. We need a function to draw a triangle given its
lower left corner and its side length, so we write the function tri in Listing A.27.


Listing A.25: SymPy.


1 using SymPy
2 @vars x
3 # derivatives and plot
4 f = 1 / ( 1 + 25 * x * x)
5 f2 = diff(f,x,2) # 2nd derivative of f
6 f6 = diff(f,x,6) # 6th derivative of f
7 xgrid=-1:0.01:1.0
8 yh6=[f6.subs(x,xi) for xi in xgrid] # evaluate f6 at xgrid
9 plt.plot(xgrid,yh6,"-",color="black",linewidth=1.,label="6th derivative")
10 # integrals
11 J = integrate(f, (x, a, b)) # integral of f, from a to b
12 # limits
13 limit(sin(x)/x, x, 0)
14 limit( (1+1/x)^x, x, oo ) # oo stands for infinity; this limit is e
15 # series expansion
16 f.series(x, x0, n) # Taylor series around x0 n terms
17 # partial fraction decomposition
18 apart(f)


Figure A.2: Canvas and its coordinate system in processing.

Listing A.26: setup in processing.


1 void setup() {
2 size(900, 900); // size of the canvas
3 noStroke(); // no stroke (outline) around triangles
4 fill(50); // fill color for the triangles, black
5 }


Listing A.27: Draw a triangle with the lower left corner and side.
1 void tri(float x, float y, float l) { // (x, y): lower-left vertex, l: side length
2 triangle(x, y, x + l/2, y - sin(PI/3) * l, x + l, y); // lower left, apex, lower right
3 }

Now, we study the problem carefully. The process is: start with an equilateral triangle; subdivide
it into four smaller congruent equilateral triangles and remove the central one; repeat this step with
each of the remaining smaller triangles, infinitely. Of course we do not subdivide the
triangles infinitely, but only a finite number of times, denoted by n. Note also that subdividing the
biggest triangle into four smaller triangles and removing the central one is equivalent to drawing three
smaller triangles.
Now, if n = 1 we just draw the biggest triangle, which is straightforward. For n = 2 we
need to draw three triangles; this is illustrated in Fig. A.3. We're now ready to write the main
function, called divide; the code is in Listing A.28. The base case is n = 1. If n = 2 we
call the function again with l replaced by l/2 (smaller triangles) and n replaced by n - 1, which
is one, and thus three l/2 sub-triangles are drawn. Finally, we put the divide function inside the
processing built-in function draw, as shown in Listing A.29.

Figure A.3

Listing A.28: The function "divide" as a recursive function.


1 void divide(float x, float y, float l, int n) {
2 if(n == 1) {
3 tri(x, y, l);
4 } else {
5 divide(x, y, l/2, n-1);
6 divide(x + l/2, y, l/2, n-1);
7 divide(x + l/4, y - sin(PI/3) * l/2, l/2, n-1);
8 }
9 }

For more on processing, you can check out this youtube channel.


Listing A.29: Put the drawing functions inside the draw function.
1 void draw() {
2 background(255); // background color
3 divide(x1, y1, l, 3);
4 }



Appendix B
Data science with Julia

B.1 Introduction to DataFrames.jl


Data comes mostly in a tabular format. By tabular, we mean that the data consists of a table
containing rows and columns: the rows denote observations while the columns denote variables.
Comma-separated values (CSV) files are a very effective way to store such tables; a CSV file
does exactly what its name indicates, namely storing values separated by commas.
We are going to use two packages, CSV.jl and DataFrames.jl: the former reads
CSV files (which contain the data we want to analyse) and the latter stores this data in a tabular
format. See line 10 of Listing B.1.


Listing B.1: Reading a CSV file and creating a DataFrame.


1 using DataFrames
2 using CSV
3 using PyCall
4
5 plt = pyimport("matplotlib.pyplot")
6 sns = pyimport("seaborn")
7

8 sns.set_style("ticks")
9
10 train = DataFrame(CSV.File("Pearson.csv"))
11 size(train)                   # => (1078, 2)
12 names(train)                  # => 2-element Vector{String}: "Father", "Son"
13 first(train,5)                # show the first 5 rows
14 train[!,:Father]              # the column Father, without copying
15 col = train[:,:Father]        # copy the column Father into col
16 train[train.Father .> 70,:]   # sub-table of rows where the father's height > 70
17
18 fig , ax = plt.subplots(1, 1, figsize=(5,5))
19 ax.hist(train[!,:Father],bins=18,density=true)
20 plt.xlabel("Height")
21 plt.ylabel("Proportion of observations per unit bin")
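As a small extension (my own sketch, not from the book), a few more operations one might try on the same DataFrame:

using Statistics
describe(train)                        # summary statistics of each column
mean(train.Father), mean(train.Son)    # column means
sort(train, :Father)                   # rows sorted by the fathers' heights
first(sort(train, :Son, rev=true), 5)  # the five tallest sons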

