UNIT 4 - EM Alg
For variables that are observed in some instances and unobserved in
others, we can learn from the instances in which the variable is observed
and then predict its value in the instances in which it is not
observed.
The EM algorithm alternates between two updates:
• Update the variables (the estimates of the unobserved values).
• Update the hypothesis (the parameters).
Algorithm:
1. Given a set of incomplete data, choose a set of starting
parameters.
2. Expectation step (E-step): Using the observed data and the
current parameters, estimate (guess) the values of the missing
data.
3. Maximization step (M-step): Use the completed data generated
by the E-step to update the parameters, that is, to update the
hypothesis.
4. Repeat steps 2 and 3 until convergence.
Usage of EM algorithm –
• It can be used to fill the missing data in a sample.
• It can be used as the basis of unsupervised learning of clusters.
• It is also used to estimate the parameters of a Gaussian mixture density.
• It can be used for the purpose of estimating the parameters of
Hidden Markov Model (HMM).
• It can be used for discovering the values of latent variables.
• EM algorithm finds plenty of use in natural language processing
(NLP), computer vision, and quantitative genetics.
• Other important applications of the EM algorithm include image
reconstruction in the field of medicine and structural engineering.
Advantages of EM algorithm –
• The likelihood is guaranteed not to decrease from one iteration
to the next.
• For many problems, the E-step and M-step are fairly easy to
implement.
• Solutions to the M-step often exist in closed form.
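As one illustration of a closed-form M-step, consider a one-dimensional Gaussian mixture with K components. This example and its symbols are assumptions introduced here, not taken from the notes: x_i are the n data points, \pi_k, \mu_k and \sigma_k^2 are the mixing weight, mean and variance of component k, and \gamma_{ik} is the E-step estimate of the probability that point i came from component k.

```latex
% E-step: responsibilities under the current parameters
\gamma_{ik} = \frac{\pi_k \, \mathcal{N}(x_i \mid \mu_k, \sigma_k^2)}
                   {\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_i \mid \mu_j, \sigma_j^2)},
\qquad
N_k = \sum_{i=1}^{n} \gamma_{ik}

% M-step: closed-form parameter updates
\pi_k = \frac{N_k}{n},
\qquad
\mu_k = \frac{1}{N_k} \sum_{i=1}^{n} \gamma_{ik} \, x_i,
\qquad
\sigma_k^2 = \frac{1}{N_k} \sum_{i=1}^{n} \gamma_{ik} \, (x_i - \mu_k)^2
```

Each update is a weighted average, so no inner numerical optimisation is needed in this case.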
Disadvantages of EM algorithm –
• Convergence can be slow.
• It converges only to a local optimum, which is not necessarily the
global one.