Gaussian mixture models (GMMs) perform "soft" clustering, in which each point probabilistically
"belongs" to every cluster. This is different from k-means, where each point belongs to exactly one cluster.
The Gaussian mixture model is a probabilistic model that assumes all the data points are
generated from a mixture of Gaussian distributions with unknown parameters.
For example, in modeling human height data, height is typically modeled as a normal
distribution for each gender, with a mean of approximately 5'10" for males and 5'5" for
females. Given only the height data and not the gender assignment of each data point, the
distribution of all heights would follow the sum of two scaled (different variance) and
shifted (different mean) normal distributions.
A model making this assumption is an example of a Gaussian mixture model.
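As a minimal sketch of this example (assuming heights in inches, so roughly 70" for males and 65" for females, and using scikit-learn's GaussianMixture; the sample sizes and standard deviations are illustrative):

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Simulate pooled, unlabeled heights: two shifted (different mean) and
# scaled (different variance) normal distributions.
male = rng.normal(loc=70.0, scale=3.0, size=500)
female = rng.normal(loc=65.0, scale=2.5, size=500)
heights = np.concatenate([male, female]).reshape(-1, 1)

# Fit a two-component Gaussian mixture to the height data alone.
gmm = GaussianMixture(n_components=2, random_state=0).fit(heights)

print("estimated means:", gmm.means_.ravel())    # close to 70 and 65
print("estimated weights:", gmm.weights_)        # close to 0.5 and 0.5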
Gaussian mixture models do not rigidly classify each instance into one class or the other.
The algorithm attempts to fit K Gaussian distributions that together account for the entire
training space. Every point can be associated with one or more of these distributions; the
deciding factor is the probability that the point belongs to each Gaussian distribution.
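For a single point, these membership probabilities (often called responsibilities) can be sketched as follows, using assumed component parameters rather than fitted ones:

import numpy as np
from scipy.stats import norm

# Assumed two-component model: mixing weights, means, standard deviations.
weights = np.array([0.5, 0.5])
means = np.array([70.0, 65.0])
stds = np.array([3.0, 2.5])

x = 67.0  # a single height observation

# Weighted density of the point under each component, normalized so the
# memberships sum to one.
densities = weights * norm.pdf(x, loc=means, scale=stds)
responsibilities = densities / densities.sum()

print(responsibilities)  # roughly [0.41, 0.59]: the point partly belongs to both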
GMMs have a variety of real-world applications. Some of them are listed below.
4.5.1 Expectation-maximization
The Expectation (E) step uses the current parameter values to compute, for each data point,
its expected (probabilistic) assignment to each component of the Gaussian mixture model. The
Maximization (M) step then re-estimates the parameters of each Gaussian component from those
expected assignments.
The Expectation-Maximization (EM) algorithm is used in maximum likelihood estimation
where the problem involves two sets of random variables of which one, X, is observable
and the other, Z, is hidden.
The goal of the algorithm is to find the parameter vector Φ that maximizes the likelihood
of the observed values of X, L(Φ | X).
But in cases where this is not feasible, we introduce the extra hidden variables Z and
express the underlying model using both, maximizing the likelihood of the joint distribution
of X and Z, the complete likelihood L_C(Φ | X, Z).
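Written out in the same notation (assuming a discrete latent variable Z), the two likelihoods are:

L(Φ | X) = p(X | Φ) = Σ_Z p(X, Z | Φ)      (observed-data likelihood, with Z summed out)
L_C(Φ | X, Z) = p(X, Z | Φ)                (complete-data likelihood, with Z treated as observed)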
Expectation-maximization (EM) is an iterative method used to find maximum likelihood
estimates of parameters in probabilistic models, where the model depends on unobserved,
also called latent, variables.
EM alternates between an expectation (E) step, which computes an expectation of the
likelihood by treating the latent variables as if they were observed, and a maximization (M)
step, which computes maximum likelihood estimates of the parameters by maximizing the
expected likelihood found in the E step.
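In the notation above, one iteration t of this alternation can be sketched as:

E step:  Q(Φ | Φ^(t)) = E_Z[ log L_C(Φ | X, Z) | X, Φ^(t) ]
M step:  Φ^(t+1) = argmax_Φ Q(Φ | Φ^(t))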
The parameters found in the M step are then used to start another E step, and the process
is repeated until some convergence criterion is satisfied. EM is frequently used for data
clustering, for example in Gaussian mixture models.
In the Expectation step, find the expected values of the latent variables, using the current
parameter values.
In the Maximization step, first plug the expected values of the latent variables into the
log-likelihood of the augmented (complete) data. Then maximize this log-likelihood to
re-estimate the parameters.
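A minimal from-scratch sketch of these two steps for a one-dimensional, two-component Gaussian mixture (the data, initialization, and iteration count are illustrative assumptions):

import numpy as np
from scipy.stats import norm

def em_gmm_1d(x, n_iter=50):
    # Initialize mixing weights, means, and standard deviations.
    w = np.array([0.5, 0.5])
    mu = np.array([x.min(), x.max()], dtype=float)
    sigma = np.array([x.std(), x.std()])

    for _ in range(n_iter):
        # E step: expected values of the latent assignments (responsibilities),
        # computed with the current parameter values.
        dens = w * norm.pdf(x[:, None], loc=mu, scale=sigma)   # shape (n, 2)
        resp = dens / dens.sum(axis=1, keepdims=True)

        # M step: maximize the expected log-likelihood, which reduces to
        # responsibility-weighted updates of the parameters.
        nk = resp.sum(axis=0)
        w = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)

    return w, mu, sigma

# Example: recover the two height components from unlabeled data.
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(70.0, 3.0, 500), rng.normal(65.0, 2.5, 500)])
print(em_gmm_1d(x))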
Expectation-Maximization (EM) is a technique used in point estimation. Given a set of
observable variables X and unknown (latent) variables Z, we want to estimate the parameters
θ in a model.