All The Math
Contents
1 Inequalities
    1.1 Technique
        1.1.1 Choosing auxiliary parameter α
        1.1.2 Other random methods
    1.2 Famous
    1.3 Involving trig functions
    1.4 Involving e
    1.5 Random
2 Recurrence Relations
    2.1 Techniques
4 Triangles
    4.1 Ravi substitution
5 Integral Tricks
6 Calculus of Variations
7 Linear Algebra
    7.1 General
        7.1.1 Skew symmetric
        7.1.2 Skew orthogonal
    7.2 Quadratic form
    7.3 Invertibility
    7.4 Square matrices
    7.5 Orthogonality
    7.6 Symmetry
        7.6.1 Proving A is symmetric
    7.7 QR Decomposition
    7.8 SVD Decomposition
    7.9 Polar Decomposition
    7.10 Gram–Schmidt algorithm
    7.11 Schur complement condition
8 Basis
    8.1 Span
9 Eigenvalues/vectors
    9.0.1 Diagonalization
10 Invertibility
    10.1 Positive semi-definite matrices
1 Inequalities
1.1 Technique
1.1.1 Choosing auxiliary parameter α
1.
1.1.2 Other random methods
1. Breaking odd moments into even moments only, via (i) Cauchy–Schwarz and then (ii) AM–GM:
E[|λX|^{2k+1}] ≤ (E[|λX|^{2k}])^{1/2} (E[|λX|^{2k+2}])^{1/2} ≤ (1/2)(λ^{2k} E[X^{2k}] + λ^{2k+2} E[X^{2k+2}])
2. Using the fundamental theorem of calculus to show h(t) ≥ g(t):
If h(0) ≥ g(0) and the first derivatives satisfy h′(t) ≥ g′(t) for all t ≥ 0, then integrating both sides from 0 to t gives h(t) − h(0) ≥ g(t) − g(0), and hence
h(t) ≥ g(t)
for all t ≥ 0.
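A quick worked instance of this technique (an added example): take h(t) = e^t and g(t) = 1 + t. Then h(0) = g(0) = 1 and h′(t) = e^t ≥ 1 = g′(t) for t ≥ 0, which recovers
e^t ≥ 1 + t for all t ≥ 0.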
1.2 Famous
1. Conjugate duals and the Fenchel–Young inequality:
f^*(y) ≥ x^⊤ y − f(x)
2. Cauchy–Schwarz: (∑_{i=1}^{n} u_i v_i)^2 ≤ (∑_{i=1}^{n} u_i^2)(∑_{i=1}^{n} v_i^2)
(a) When a sum is squared, think about decomposing it into a product of sums of squares.
(b) Reverse triangle: ∥u − v∥ ≥ |∥u∥ − ∥v∥|
(c) Expectation without assuming independence:
E[e^{λ(X_1 + X_2)}] ≤ (E[e^{2λX_1}])^{1/2} (E[e^{2λX_2}])^{1/2}
4. Arithmetic–geometric mean inequality: (x_1 + x_2 + ··· + x_n)/n ≥ (x_1 x_2 ··· x_n)^{1/n}
5. Brunn–Minkowski:
[vol(λC + (1 − λ)D)]^{1/n} ≥ λ[vol(C)]^{1/n} + (1 − λ)[vol(D)]^{1/n} for all λ ∈ [0, 1], where
λC + (1 − λ)D := {λc + (1 − λ)d | c ∈ C, d ∈ D}
6.
7.
8.
1.4 Involving e
1. sup_z z e^{−z} = e^{−1}
2. e^u ≤ u + e^{au^2/b} for appropriately chosen a and b (such as a = 9, b = 16)
3. x(1 − x)^n ≤ x e^{−nx} ≤ 1/(en) for 0 ≤ x ≤ 1
4. 1/√(1 − s) ≤ e^s for all s ∈ [0, 1/2]
5. Convexity of e^x, where Z is a convex combination Z = αb + (1 − α)a, so α = (Z − a)/(b − a):
e^{tZ} ≤ α e^{tb} + (1 − α) e^{ta} = ((Z − a)/(b − a)) e^{tb} + ((b − Z)/(b − a)) e^{ta}
6.
1.5 Random
1. (a + b + c)^2 ≤ 3a^2 + 3b^2 + 3c^2
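One way to see this (an added derivation): it is Cauchy–Schwarz with the all-ones vector,
(a + b + c)^2 = (1·a + 1·b + 1·c)^2 ≤ (1^2 + 1^2 + 1^2)(a^2 + b^2 + c^2) = 3(a^2 + b^2 + c^2).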
2 Recurrence Relations
2.1 Techniques
1. Constant term involved =⇒ see if it needs to be removed.
2. Think about re-indexing the terms.
More generally, for the arithmetico-geometric series,
S_n = ∑_{k=1}^{n} [a + (k − 1)d] r^{k−1} = (a − [a + (n − 1)d] r^n)/(1 − r) + d r (1 − r^{n−1})/(1 − r)^2
and, letting n → ∞ (for |r| < 1),
S_∞ = a/(1 − r) + d r/(1 − r)^2
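A sketch of where the closed form comes from (added; the standard shift-and-subtract argument): subtracting r S_n from S_n telescopes the arithmetic increments,
S_n − r S_n = a + d(r + r^2 + ··· + r^{n−1}) − [a + (n − 1)d] r^n = a − [a + (n − 1)d] r^n + d r (1 − r^{n−1})/(1 − r),
and dividing by (1 − r) gives the formula above.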
2. Maximum: the maximum function is convex! You can also move a monotonically increasing function inside or outside a max (for the exponential with λ > 0: e^{λ max_i X_i} = max_i e^{λX_i}), which gives
E[exp(λ max_{i∈[n]} X_i)] = E[max_{i∈[n]} e^{λX_i}] ≤ ∑_{i=1}^{n} E[e^{λX_i}]
(see the numerical sanity check after this list).
3.
4.
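A minimal Monte Carlo sanity check of the max–exp bound above (an added sketch; the Gaussian distribution, λ = 0.7, and n = 5 are illustrative choices, not from the notes):

import numpy as np

rng = np.random.default_rng(0)
n, lam, n_samples = 5, 0.7, 200_000

# n_samples draws of n independent standard normals (illustrative choice)
X = rng.standard_normal((n_samples, n))

# Left side: E[exp(lam * max_i X_i)], estimated by Monte Carlo
lhs = np.exp(lam * X.max(axis=1)).mean()

# Right side: sum_i E[exp(lam * X_i)], estimated the same way
rhs = np.exp(lam * X).mean(axis=0).sum()

print(lhs, "<=", rhs)  # the bound E exp(lam max X_i) <= sum_i E exp(lam X_i)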
4 Triangles
4.1 Ravi substitution
Ravi substitution transforms a problem about the side lengths a, b, c of a triangle into one about positive real numbers x, y, z.
Since
a = y + z, b = x + z, c = x + y,
it follows that
x = (b + c − a)/2, y = (a + c − b)/2, z = (a + b − c)/2.
Here x, y, z > 0 precisely because a, b, c satisfy the triangle inequality.
Memorizing the order of a, b, and c may be difficult, but there is usually no need, since most inequalities requiring Ravi substitution are symmetric. A mnemonic can nevertheless help: in the first set of equations, notice that "a doesn't have x, b doesn't have y, and c doesn't have z"; in the second, notice that "x has −a, y has −b, and z has −c".
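A standard worked example (added for illustration): for triangle sides, abc ≥ (a + b − c)(b + c − a)(c + a − b). Under the substitution, a + b − c = 2z, b + c − a = 2x, and c + a − b = 2y, so the claim becomes
(x + y)(y + z)(z + x) ≥ 8xyz,
which follows from AM–GM applied to each factor: x + y ≥ 2√(xy), y + z ≥ 2√(yz), z + x ≥ 2√(zx).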
5 Integral Tricks
1. (Gaussian) ∫_{−∞}^{∞} exp(−λ^2 t/2) dλ = √(2π/t) (see the derivation sketch after this list)
2.
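Derivation sketch for item 1 (added): substituting u = λ√t, so dλ = du/√t, reduces this to the standard Gaussian integral ∫_{−∞}^{∞} e^{−u^2/2} du = √(2π):
∫_{−∞}^{∞} e^{−λ^2 t/2} dλ = (1/√t) ∫_{−∞}^{∞} e^{−u^2/2} du = √(2π/t).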
6 Calculus of Variations
7 Linear Algebra
7.1 General
7.1.1 Skew symmetric
A^⊤ + A = 0 (equivalently, A^⊤ = −A)
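For instance (added), in the 2 × 2 case A = [0 a; −a 0] satisfies A^⊤ = −A; note that the diagonal of any skew-symmetric matrix is forced to zero, since a_{ii} = −a_{ii}.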
7.3 Invertibility
1. Positive definite
(a) Show X is invertible by showing X^⊤X is positive definite: for square X, X^⊤X ≻ 0 if and only if X is invertible.
2. Full rank
3. Uniqueness (well-definedness):
(a) Every square matrix A decomposes uniquely into a symmetric and a skew-symmetric part; this is an orthogonal decomposition, since Sym and Skew are orthogonal complements (see 7.5).
The symmetric part S is given by S = (A + A^⊤)/2.
The skew-symmetric part T is given by T = (A − A^⊤)/2.
(A numerical check follows this list.)
4.
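A minimal numpy check of the decomposition (an added sketch; the random 4 × 4 matrix is illustrative):

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))          # arbitrary square matrix

S = (A + A.T) / 2                        # symmetric part
T = (A - A.T) / 2                        # skew-symmetric part

assert np.allclose(A, S + T)             # the parts recover A
assert np.allclose(S, S.T)               # S is symmetric
assert np.allclose(T, -T.T)              # T is skew-symmetric
assert np.isclose(np.trace(S.T @ T), 0)  # parts are orthogonal in the trace inner product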
7.5 Orthogonality
1. Considering only square matrices, the orthogonal complement of Skew(p) in R^{p×p} is Sym(p).
(a) Assume ⟨A, Ω⟩ = 0 for every skew-symmetric Ω, where A ∈ R^{p×p} is arbitrary.
i. Decompose A = S + T, where S is symmetric and T is skew-symmetric.
ii. Then ⟨A, Ω⟩ = ⟨S + T, Ω⟩ = ⟨S, Ω⟩ + ⟨T, Ω⟩ = 0 + ⟨T, Ω⟩, since symmetric and skew-symmetric matrices are orthogonal in the trace inner product (see the derivation after this list). Taking Ω = T gives ∥T∥^2 = 0, so T = 0 and A = S; in other words, A is symmetric.
2.
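Derivation of the key identity used above (added): for symmetric S and skew-symmetric Ω,
⟨S, Ω⟩ = tr(S^⊤Ω) = tr(SΩ), and tr(SΩ) = tr((SΩ)^⊤) = tr(Ω^⊤S^⊤) = tr(−ΩS) = −tr(SΩ),
so tr(SΩ) = 0.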
7.6 Symmetry
7.6.1 Proving A is symmetric
1. Try to show (A − A^⊤)X = 0 for X ≠ 0.
2. Try proof by contradiction, writing A = S + T.
7.7 QR Decomposition
In particular, the matrix A can be written as:
A = QR
where:
Q is orthogonal, meaning Q^⊤Q = I, the identity matrix;
R is an upper triangular matrix, meaning all entries below the diagonal are zero.
7.8 SVD Decomposition
A = UΣV^⊤
where:
U and V are orthogonal matrices;
Σ is a diagonal matrix containing the singular values of A.
7.9 Polar Decomposition
A = UH
where U is orthogonal and H is positive semi-definite. In terms of the SVD A = U_{svd}ΣV^⊤ above: the orthogonal factor is U = U_{svd}V^⊤ and H = VΣV^⊤, since A = (U_{svd}V^⊤)(VΣV^⊤).
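A minimal numpy sketch tying the three decompositions together (added; the random matrix is illustrative):

import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))

# QR: Q orthogonal, R upper triangular
Q, R = np.linalg.qr(A)
assert np.allclose(Q.T @ Q, np.eye(4)) and np.allclose(A, Q @ R)

# SVD: U, V orthogonal, s the singular values
U, s, Vt = np.linalg.svd(A)
assert np.allclose(A, U @ np.diag(s) @ Vt)

# Polar: orthogonal factor U @ Vt, PSD factor Vt.T @ diag(s) @ Vt
Up, H = U @ Vt, Vt.T @ np.diag(s) @ Vt
assert np.allclose(A, Up @ H)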
8 Basis
8.1 Span
1. Say G generates a vector space, V:
(a) V = span(G)
(b) The span of any subset of a vector space is a subspace
(c) Spans are monotone: the span of a subset is contained in the span of any larger subset.
i. If L ⊆ G, then span(L) ⊆ span(G) = V, where G is a generating set of V.
9 Eigenvalues/vectors
If v_i is an eigenvector of a matrix V, there is a corresponding eigenvalue λ_i such that
V v_i = λ_i v_i
9.0.1 Diagonalization
A = ∑_{i=1}^{n} λ_i v_i w_i^⊤
where, writing the eigendecomposition A = PΛP^{−1}, the v_i are the columns of P and the w_i^⊤ are the corresponding rows of P^{−1}.
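A minimal numpy check of this rank-one expansion (an added sketch; a random 4 × 4 matrix is used for illustration):

import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))

lam, P = np.linalg.eig(A)   # columns of P are the eigenvectors v_i
W = np.linalg.inv(P)        # rows of W are the w_i^T

# rebuild A as a sum of rank-one terms; the eigenvalues may be complex,
# but the imaginary parts cancel in the sum
A_rebuilt = sum(lam[i] * np.outer(P[:, i], W[i, :]) for i in range(4))
assert np.allclose(A, A_rebuilt)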
10 Invertibility
10.1 Positive semi-definite matrices
The matrix D_x^⊤ W_x D_x is symmetric and positive semi-definite. Its smallest eigenvalue, λ_min(x), controls the stability of the inversion. Specifically:
1. If λ_min(x) is close to 0, the matrix is ill-conditioned or even non-invertible, and f̂(x) becomes unstable or undefined.
2. If λ_min(x) is bounded away from 0 uniformly over x, then the matrix is uniformly well-conditioned, and f̂(x) behaves nicely.
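A small numpy illustration of this conditioning check (an added sketch; D and W below are stand-ins for the design and weight matrices D_x and W_x, which the notes do not define here):

import numpy as np

rng = np.random.default_rng(4)
n, p = 50, 3
D = rng.standard_normal((n, p))          # stand-in for D_x
W = np.diag(rng.uniform(0.1, 1.0, n))    # stand-in positive diagonal weights W_x

M = D.T @ W @ D                          # symmetric PSD matrix to be inverted
lam_min = np.linalg.eigvalsh(M).min()    # smallest eigenvalue

# invert only when the matrix is safely conditioned
if lam_min > 1e-8:
    M_inv = np.linalg.inv(M)
else:
    print("lambda_min ~ 0: ill-conditioned, inversion unstable")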