Study Guide ML Math PDF

This document provides a guide for developing a strong mathematical foundation for machine learning work. It outlines a progression of subject areas to study, including linear algebra, analysis, statistics, probability, and optimization. The guide recommends specific textbooks for each subject. The goal is to gain a deep theoretical and applied understanding of machine learning algorithms by mastering the underlying mathematics. A solid grasp of proofs, linear algebra, and analysis is presented as essential prior knowledge before delving into more advanced statistical concepts.

Uploaded by

jerry

The Mathematical Foundations of Machine Learning

Version 1

5/9/2020

This is a map of subjects and the corresponding textbooks that one should study in order to
have a very solid mathematical foundation for doing machine learning work. This is a guide for
learning the math that you will use, not for learning the machine learning algorithms themselves.
Obviously not all of this is necessary, and you can find work in the machine learning field without
knowing all of this. However, if your goal is to have a deep understanding, on both an applied
and theoretical level, of algorithms that are typically used in this field, then following this guide
will enable you to do so. None of the topics here cover the software aspect of machine learning.
There are many applied machine learning courses available online that cover that aspect of the
field. Websites like Coursera or Udemy are a good place to start.
The only prerequisite knowledge this guide will assume is a year of calculus. There are
many resources for learning calculus, so they will not be covered here. The textbook by Stewart and
Khan Academy are both perfectly fine ways to learn the subject. My goal in writing this guide is to
give someone who wants to read The Elements of Statistical Learning (ESL) and the Deep Learning book enough mathematical
maturity to do so. Covering the required material will put you in good shape for that. These
subjects are difficult and require a serious level of dedication. I’ve done my best to choose
textbooks that have solutions available. I will not link to the solution manuals, but they can be found
with a little bit of searching. In fact, all required texts have solution manuals available aside from
the linear models text by Christensen. Also, my goal isn’t to provide resources that are free. I’ve
picked what I believe to be the best-quality texts for each subject, the ones that will give you the deepest
understanding. With that said, it shouldn’t be hard to find electronic copies of them.
You will need to be very comfortable with proving things; this is non-negotiable. So step
0 would be to go through a book like How to Prove It by Velleman or Discrete Mathematics
and Its Applications by Rosen. From there we start with linear algebra and introductory analysis. You need a
good grasp of these in order to understand statistics at the level that is required for machine
learning. It is impossible to understand things like linear models and the central limit theorem
without having a good grasp of linear algebra and analysis. For these topics I recommend Linear
Algebra by Friedberg, Insel, and Spence and Understanding Analysis by Abbott. These are
both great textbooks. The linear algebra one is great because it blends applied and theoretical
understanding, and Understanding Analysis helps build intuition for doing further work in
analysis. From here we move on to statistics, and then things branch out. As a machine learning
practitioner, having a working knowledge of probability can be very helpful, but a rigorous
understanding of probability cannot be had unless we first learn some measure theory. Thus
I’ve included that in our path as well. Below is the flowchart of subjects to study and their
corresponding texts. A blue node is a required subject/text and an orange node is an optional
subject/text.
[Flowchart: subject nodes and their texts. Node colors and arrows are not reproduced here.]

Proofs: How to Prove It, Velleman
Linear Algebra: Linear Algebra, Friedberg, Insel, Spence
Introductory Analysis: Understanding Analysis, Abbott
Analysis: Principles of Mathematical Analysis, Rudin
Topology: Topology, Munkres
Statistics: Introduction to Mathematical Statistics, Hogg, McKean, Craig
Optimization: Convex Optimization, Boyd, Vandenberghe
Functional Analysis: Introductory Functional Analysis with Applications, Kreyszig
Measure Theory: Measures, Integrals, and Martingales, Schilling
Introductory Linear Models: Applied Linear Statistical Models, Kutner et al.
Advanced Statistics: Statistical Inference, Casella, Berger
Introductory Probability: A First Look at Rigorous Probability Theory, Rosenthal
Linear Models: Plane Answers to Complex Questions, Christensen
GLMs: Generalized, Linear, and Mixed Models, McCulloch, Searle, Neuhaus
Probability: Probability and Measure Theory, Ash, Doléans-Dade
Advanced Linear Models: Advanced Linear Modeling, Christensen
Further Advanced Statistics: Theoretical Statistics: Topics for a Core Course, Keener
Asymptotics: Asymptotic Statistics, van der Vaart

As stated before, if you are more comfortable with a standard discrete math text then you can
replace Velleman with the textbook by Rosen. But Velleman is great and it has many solutions
in the back of the book. If you are finding linear algebra to be difficult then maybe backtrack a
bit and try working through Introduction to Linear Algebra by Strang (with its corresponding
MIT OpenCourseWare videos) or Linear Algebra and Its Applications by Lay. For introductory
analysis, Abbott is as good as it gets. If you want more references though, Introduction to Real
Analysis by Bartle and Sherbert and The Way of Analysis by Strichartz are also good. The latter is
very wordy but the author focuses heavily on building intuition so it’s great if you’re not getting
that from Abbott’s text.
Once you’ve worked your way through linear algebra and analysis you should have enough
maturity to work through Hogg’s intro to statistics textbook. It is a great text when paired with
Casella and Berger. I learned from both of these texts and I still reference them from time to
time. If you need some supplemental texts to go along with Hogg try Mathematical Statistics
with Applications by Wackerly, Mendenhall, and Scheaffer, All of Statistics by Wasserman, and
Mathematical Statistics and Data Analysis by Rice. If you can get through the problems in Casella
and Berger (you really only need through chapter 10) then you are more than prepared for doing
work as a data scientist or machine learning engineer (in terms of probability and statistics).
Applied Linear Statistical Models by Kutner et al. is a great text when it comes to learning
linear models. It is incredibly long, clocking in at a little over 1400 pages. However, you can
skip the second half of the book if you are pressed for time because it covers basic design and
analysis of experiments (ANOVA and the like). Once you’ve finished that you can move on to the
more theoretical aspects of linear models, like distributions of quadratic forms. This is covered
in the book by Christensen. It’s a great textbook but unfortunately there is no solution manual
available. If you are self-studying and need to be able to check your solutions then I would
recommend replacing this text with Linear Models in Statistics by Rencher and Schaalje. The
solutions are in the back. There are also many supplemental texts here. Some notable ones are:
A Primer on Linear Models by Monahan, Linear Models by Searle (solutions are available on the
text’s website), and Linear Statistical Models by Stapleton (solutions are in the back of the book).
Do not forget Convex Optimization! Knowing your optimization algorithms is incredibly
important as a machine learning practitioner, and the text by Boyd and Vandenberghe is
considered the bible. It can be a difficult text though. You should have a very solid foundation of
linear algebra, calculus, introductory analysis, and even some topology when working your way
through it. Supplement with Munkres (the first few chapters on point set topology) if needed
because I’m not sure if Abbott does any topology outside of R.
Once you’ve covered all that you are in good shape! If you are interested in really
understanding probability then you will need a much better understanding of analysis. To do this
you should start by working through the first 7 chapters of Rudin (ignore the rest; they’re not
great). From here you can skip to introductory functional analysis by Kreyszig if you want. This
is functional analysis without measure theory, so you’re still taking some baby steps here. To do
proper functional analysis we need a working knowledge of measure theory so we have more
to work with than sequence spaces. After you’ve completed the measure theory book by Schilling you can look
into the functional analysis texts by Rudin, Conway, or even Stein and Shakarchi. To do proper
probability we also need our measure theory. The text by Schilling is a great introduction, and the
author provides a full solution manual on his website. It’s a very thorough textbook with great
proofs. It covers a few bits and pieces of probability, though not enough for our liking. If you
want a supplement here, or maybe even a gentler introduction, try Measure, Integration,
and Real Analysis by Axler. From here we move on to the study of rigorous probability. Start
with the gentle introduction by Rosenthal. It’s a very short text but it has lots of great problems
to work through. The introductory chapter explaining why we need measure theory to properly
define how probability works gives good motivation. Finally, move on to a full-blown probability textbook.
Probability and Measure Theory by Ash and Doléans-Dade was chosen over Probability and Measure by Billingsley because it covers
roughly the same material and it has many solutions in the back of the text. Both are equally
good textbooks though, so they can be interchanged. If you’d like further references then the
texts by Chung, Resnick, Durrett, Athreya, and Pollard are good. If you still can’t get enough
probability then your next steps would be: Convergence of Probability Measures by Billingsley,
Real Analysis and Probability by Dudley, and Uniform Central Limit Theorems by Dudley.
Once we have a solid foundation of rigorous probability theory then we can work on even
more advanced statistics. The recommended text by Keener is a great book with some solutions
provided in the back. I think it is the text used for Stanford’s PhD level theoretical statistics class.
A good supplement here is the book Mathematical Statistics by Jun Shao (and its accompanying
solutions manual), and if you’re looking for a more Bayesian viewpoint then I would recommend
Theory of Statistics by Schervish. Finally, the most widely used text for asymptotic statistics is
the one by van der Vaart. A knowledge of measure theory might not be needed for this text but
it can never hurt. A supplementary text here would be Elements of Large-Sample Theory by
Lehmann.
Lastly there is the topic of GLMs and advanced linear models. The mentioned text for
GLMs is good but it is heavily theoretical. If you’d prefer something more applied then look
into Foundations of Linear and Generalized Linear Models by Agresti and An Introduction to
Generalized Linear Models by Dobson and Barnett. The classic text Generalized Linear Models by
McCullagh and Nelder is recommended as well. If you are just interested in categorical data then
Categorical Data Analysis (not his introductory text) by Agresti cannot be beat. Advanced Linear
Modeling by Christensen is really just a survey text covering a lot of advanced techniques, like
penalized estimation and reproducing kernel Hilbert spaces. If you are interested in any of the
specific topics covered in it then there are references provided at the end of each chapter.
A few other recommended textbooks that did not fit into the flowchart fall under econometrics and time
series analysis. For econometrics, Econometrics by Hayashi, Econometric Analysis by Greene,
Econometric Analysis of Cross Section and Panel Data by Wooldridge, and Econometric Theory
and Methods by Davidson and MacKinnon are great texts. For time series analysis I would
recommend Time Series Analysis by Hamilton (very dense) and Time Series Analysis and Its
Applications by Shumway and Stoffer.
