ML-2-Expectation Maximization
The EM algorithm was explained and given its name in a classic 1977 paper by
Arthur Dempster, Nan Laird, and Donald Rubin.
1. We start with a set of initial parameters, given a set of incomplete (observed) data, and we
assume that the observed data come from a specific model.
2. We then use the model to "estimate" the missing data. In other words, after formulating
parameters from the observed data to build a model, we use this model to guess the missing
values. This step is called the expectation step (E-step).
3. We then use the "complete" data (the observed data together with the estimated missing data)
to update the parameters: we find the most likely modified parameters and build the modified
model. This is called the maximization step (M-step).
4. We repeat steps 2 and 3 until convergence, that is, until the parameters of the model stop
changing and the estimated model fits the observed data. A minimal code sketch of this loop
follows the list.
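To make the loop concrete, here is a minimal sketch of EM for a two-component one-dimensional
Gaussian mixture in Python. The model choice, the function name em_gmm_1d, the initialization,
and the stopping rule are our own illustrative assumptions, not something prescribed by these notes.

```python
import numpy as np

def em_gmm_1d(x, n_iter=200, tol=1e-6, seed=0):
    """EM for a two-component 1-D Gaussian mixture (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Step 1: initial guesses for the weights, means, and variances.
    w = np.array([0.5, 0.5])
    mu = rng.choice(x, size=2, replace=False)
    var = np.array([x.var(), x.var()])
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        pdf = np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
        joint = w * pdf
        resp = joint / joint.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from the "completed" data.
        nk = resp.sum(axis=0)
        w = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        # Step 4: stop when the observed-data log-likelihood stops improving.
        ll = np.log(joint.sum(axis=1)).sum()
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return w, mu, var

# Usage: data drawn from two Gaussians, parameters recovered by EM (invented data).
rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(0, 1, 300), rng.normal(5, 1.5, 200)])
print(em_gmm_1d(data))
```

Each pass computes responsibilities (the E-step), re-estimates the weights, means, and variances
from them (the M-step), and stops once the log-likelihood improvement falls below a tolerance.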
The major strength of the EM algorithm is its numerical stability: in every iteration of the
EM algorithm, the likelihood of the observed data increases (or stays the same), so we are always
heading towards a solution. In addition, EM handles parameter constraints gracefully.
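In symbols (our own notation, stating this standard monotonicity property): if x is the observed
data and θ^(t) the parameter estimate after iteration t, each EM update satisfies

\[
\log p(\mathbf{x} \mid \theta^{(t+1)}) \;\ge\; \log p(\mathbf{x} \mid \theta^{(t)}),
\]

so the observed-data log-likelihood can never decrease from one iteration to the next.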
On the other hand, the EM algorithm can converge very slowly on some problems, and this
convergence rate is intimately related to the amount of missing information. EM is guaranteed
only to improve the likelihood of the training corpus, which is different from reducing errors
directly. It also cannot guarantee reaching the global maximum and can sometimes get stuck at
local maxima, saddle points, etc. Essentially, the guess we make for the initial parameter values
is very important: it can determine both the time to converge and the quality of the solution
found. A common remedy, sketched below, is to run EM several times from different random
initializations and keep the best result.
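The following sketch illustrates random restarts. It reuses the em_gmm_1d function from the
earlier example (so it assumes that definition is in scope); the restart count and the selection
by final log-likelihood are our own choices, not part of the algorithm itself.

```python
import numpy as np

def log_likelihood(x, w, mu, var):
    # Observed-data log-likelihood under a two-component 1-D Gaussian mixture.
    pdf = np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    return np.log((w * pdf).sum(axis=1)).sum()

def em_with_restarts(x, n_restarts=10):
    # Run EM from several random initializations and keep the best fit;
    # this mitigates (but does not eliminate) the local-maxima problem.
    best, best_ll = None, -np.inf
    for seed in range(n_restarts):
        w, mu, var = em_gmm_1d(x, seed=seed)  # sketch defined earlier
        ll = log_likelihood(x, w, mu, var)
        if ll > best_ll:
            best, best_ll = (w, mu, var), ll
    return best, best_ll
```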
Worked example (the accompanying coin-tossing figures are not reproduced here): two biased coins
are tossed repeatedly. At an earlier stage, the estimates are that when we toss coin A there is an
80% chance of heads, and when we toss coin B a 45% chance of heads. At a later stage, the estimate
for coin A stays at 80% while the estimate for coin B moves to 52%. A code sketch of this two-coin
setting follows.
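To make this concrete, here is a hedged Python sketch of EM for the classic two-coin setting:
each trial of tosses comes from coin A or coin B, but the coin identity is hidden. The trial
data, the initial guesses (0.6 and 0.5), and the equal prior over coins are invented for
illustration; they are not the numbers behind the figures above.

```python
import numpy as np

def coin_em(heads, n_tosses, theta_a=0.6, theta_b=0.5, n_iter=50):
    """EM for two coins when the coin used in each trial is hidden (sketch)."""
    for _ in range(n_iter):
        # E-step: posterior probability that each trial came from coin A,
        # assuming each coin is chosen with equal prior probability.
        like_a = theta_a ** heads * (1 - theta_a) ** (n_tosses - heads)
        like_b = theta_b ** heads * (1 - theta_b) ** (n_tosses - heads)
        p_a = like_a / (like_a + like_b)
        # M-step: weighted maximum-likelihood update of each coin's bias.
        theta_a = (p_a * heads).sum() / (p_a * n_tosses).sum()
        theta_b = ((1 - p_a) * heads).sum() / ((1 - p_a) * n_tosses).sum()
    return theta_a, theta_b

# Five trials of 10 tosses each; the number of heads per trial is observed,
# but which coin produced each trial is hidden (invented data).
heads = np.array([5, 9, 8, 4, 7])
print(coin_em(heads, n_tosses=10))
```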
Applications of EM Algorithm
The EM algorithm is often used for data clustering in machine learning and computer vision.
It is also used for parameter estimation in mixed models and in quantitative genetics.
It is used in psychometrics for estimating item parameters and latent abilities in item
response theory models.
Other applications include medical image reconstruction, structural engineering, and more.