
CS3491/ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING

UNIT IV
ENSEMBLE TECHNIQUES AND UNSUPERVISED LEARNING

UNIVERSITY QUESTIONS

1. Assume an image has a pixel size of 240*180. Elaborate on how K-means clustering can be
used to achieve lossy data compression of that image.
2. Explain in detail about combining multiple classifiers by voting.
3. [i] What are bagging and boosting? Give examples.
[ii] Outline the steps in the AdaBoost algorithm with an example.
4. Elaborate on the steps in the expectation-maximization algorithm.

COMBINING MULTIPLE LEARNERS


 Combining multiple learners means building a model composed of multiple learners that
complement each other, so that by combining them we attain higher accuracy.
 The No Free Lunch Theorem states that there is no single learning algorithm that in
any domain always induces the most accurate learner. The usual approach is to try
many and choose the one that performs best on a separate validation set.
 The most common way to combine models is to average multiple models, where taking a
weighted average improves the accuracy.
GENERATING DIVERSE LEARNERS
Different Algorithms
We can use different learning algorithms to train different base-learners. Different algorithms
make different assumptions about the data and lead to different classifiers.
Different Hyperparameters
We can use the same learning algorithm but with different hyperparameters.
Different Input Representations
Separate base-learners may use different representations of the same input object or event,
making it possible to integrate different types of sensors/measurements/modalities. Different
representations make different characteristics explicit, allowing better identification.
Different Training Sets
Another possibility is to train different base-learners on different subsets of the training set.
This can be done by drawing random training sets from the given sample; this is called bagging.

Diversity vs. Accuracy


The required accuracy and diversity of the learners depend on how their decisions are to be
combined. In a voting scheme, a learner is consulted for all inputs, so it should be accurate
everywhere and diversity should be enforced everywhere.
Voting
The simplest way to combine multiple classifiers is by voting, which corresponds to taking a
linear combination of the learners.

Model Combination Schemes


There are also different ways the multiple base-learners are combined to generate
the final output:
Multiexpert combination
Multiexpert combination methods have base-learners that work in parallel. These
methods can in turn be divided into two:
A) The global approach, also called learner fusion: given an input, all base-learners
generate an output and all these outputs are used. Examples are voting and stacking.
B) The local approach, or learner selection: for example, in a mixture of experts, there is
a gating model which looks at the input and chooses one (or very few) of the learners as
responsible for generating the output.
Multistage combination
Multistage combination methods use a serial approach where the next base-learner is trained
with or tested on only the instances where the previous base-learners are not accurate enough.
VOTING
• The simplest way to combine multiple classifiers is by voting, which corresponds to
taking a linear combination of the learners. Voting is an ensemble machine learning algorithm.
• For regression, a voting ensemble involves making a prediction that is the average of
the predictions of multiple other regression models.
• In classification, a hard voting ensemble involves summing the votes for crisp class
labels from the other models and predicting the class with the most votes. A soft voting
ensemble involves summing the predicted probabilities for class labels and predicting
the class label with the largest summed probability.

• The first step is to create multiple classification/regression models using some training
dataset. Each base model can be created using different splits of the same training
dataset with the same algorithm, using the same dataset with different algorithms, or by
any other method.
• When combining multiple independent and diverse decisions, each of which is at least
more accurate than random guessing, random errors cancel each other out and correct
decisions are reinforced. Human ensembles are demonstrably better in the same way.
• Alternatively, use a single, arbitrary learning algorithm but manipulate the training data
to make the algorithm learn multiple models.
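A minimal sketch of hard and soft voting using scikit-learn's VotingClassifier; the dataset and
base models below are illustrative assumptions, not part of the original notes.

# Hard vs. soft voting over three different base classifiers (illustrative sketch).
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
base = [("lr", LogisticRegression(max_iter=1000)),
        ("knn", KNeighborsClassifier()),
        ("tree", DecisionTreeClassifier())]

hard = VotingClassifier(estimators=base, voting="hard")  # majority vote over predicted labels
soft = VotingClassifier(estimators=base, voting="soft")  # sum of predicted class probabilities

print(cross_val_score(hard, X, y, cv=5).mean())
print(cross_val_score(soft, X, y, cv=5).mean())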
CLASSIFIER COMBINATION RULES

EXAMPLE:

ENSEMBLE LEARNING

• The idea of ensemble learning is to employ multiple learners and combine their
predictions.
• Ensemble methods combine several decision tree classifiers to produce better
predictive performance than a single decision tree classifier. The main principle behind the
ensemble model is that a group of weak learners come together to form a strong learner, thus
increasing the accuracy of the model.
SIMPLE ENSEMBLE TRAINING METHODS
Simple ensemble training methods typically just involve the application of statistical
summary techniques, such as determining the mode [max voting], mean [averaging], or
weighted average of a set of predictions.
ADVANCED ENSEMBLE TRAINING METHODS
There are three primary advanced ensemble training techniques, each of which is
designed to deal with a specific type of machine learning problem.
 "Bagging" techniques are used to decrease the variance of a model's predictions, with
variance referring to how much the outcome of predictions differs when based on the
same observation.
 "Boosting" techniques are used to combat the bias of models.
 Finally, "stacking" is used to improve predictions in general.
BAGGING/BOOTSTRAP AGGREGATING
Bagging is a voting method whereby base-learners are made different by training them over
slightly different training sets.

 It is a machine learning ensemble meta-algorithm designed to improve the stability
and accuracy of machine learning algorithms used in statistical classification and
regression.
 It decreases the variance and helps to avoid overfitting. It is usually applied to
decision tree methods. Bagging is a special case of the model averaging approach.
 Bootstrapping is a sampling technique where samples are derived from the whole
population (set) using the replacement procedure. The sampling with replacement
method helps make the selection procedure randomized. The base learning algorithm
is run on the samples to complete the procedure.
 Aggregation in bagging is done to incorporate all possible outcomes of the prediction
and to randomize the outcome. The aggregation is based either on the probabilities from the
bootstrapping procedure or on all outcomes of the predictive models.

• Given a training set of size n, create m samples of size n by drawing n examples
from the original data, with replacement. Each bootstrap sample will on average contain
63.2% of the unique training examples; the rest are replicates. Bagging combines the m
resulting models using a simple majority vote.
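The 63.2% figure equals 1 - 1/e: each example is missed by a single draw with probability
1 - 1/n, so it is missed by all n draws with probability roughly e^-1. A quick numerical check
follows; the sample size used here is an arbitrary illustration.

# Fraction of unique examples in one bootstrap sample of size n.
import numpy as np
rng = np.random.default_rng(0)
n = 10_000
sample = rng.integers(0, n, size=n)   # draw n examples with replacement
print(len(np.unique(sample)) / n)     # about 0.632 = 1 - 1/e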

PSEUDOCODE

1. Given training data (x1, y1), ..., (xm, ym).
2. For t = 1, ..., T:
   a. Form a bootstrap replicate dataset St by selecting m random examples from the
      training set with replacement.
   b. Let ht be the result of training the base learning algorithm on St.
3. Output the combined classifier: H(x) = majority(h1(x), ..., hT(x)).
IMPLEMENTATION STEPS OF BAGGING
Step 1: Multiple subsets are created from the original data set with equal tuples, selecting
observations with replacement.
Step 2: A base model is created on each of these subsets.
Step 3: Each model is learned in parallel with each training set and independent of each
other.
Step 4: The final predictions are determined by combining the predictions from all the
models.
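A minimal bagging sketch using scikit-learn's BaggingClassifier with decision trees as the base
learners; the dataset and parameter values are illustrative assumptions.

# Bagging: train many trees on bootstrap samples and combine them by voting.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

bag = BaggingClassifier(DecisionTreeClassifier(),  # base learner
                        n_estimators=50,           # number of bootstrap samples / models
                        bootstrap=True,            # sample the training set with replacement
                        random_state=0)

print(cross_val_score(bag, X, y, cv=5).mean())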
ADVANTAGES:
1. Reduce overfitting of the model.
2. Handles higher dimensionality data very well.
3. Maintains accuracy for missing data
DISADVANTAGE:
Since the final prediction is based on the mean prediction from the subset trees, it won't give
precise values for the classification and regression models.
BOOSTING
 Boosting is an ensemble learning method that combines a set of weak learners into a
strong learner to minimize training errors. In boosting, a random sample of data is
selected, fitted with a model and then trained sequentially—that is, each model tries to
compensate for the weaknesses of its predecessor.
 Boosting takes many forms, including gradient boosting, Adaptive Boosting
(AdaBoost), and XGBoost (Extreme Gradient Boosting).

 It is done by building a model by using weak models in series.

 Firstly, a model is built from the training data.

 Then the second model is built which tries to correct the errors present in the first
model.

 This procedure is continued and models are added until either the complete training
data set is predicted correctly or the maximum number of models is added.

STEPS
• Draw a random subset of training samples d1 without replacement from the training
set D to train a weak learner C1.
• Draw a second random training subset d2 without replacement from the training set and
add 50 percent of the samples that were previously misclassified, to train a weak
learner C2.
• Find the training samples d3 in the training set D on which C1 and C2 disagree, to
train a third weak learner C3.
• Combine all the weak learners via majority voting.
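The steps above describe boosting by filtering with three classifiers; in practice, boosting is
usually applied through library implementations of the forms named earlier, such as gradient
boosting. A minimal scikit-learn sketch follows, with an illustrative dataset and settings.

# Gradient boosting: each new shallow tree is fitted to the errors of the current ensemble.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

gb = GradientBoostingClassifier(n_estimators=100,   # number of sequential weak learners
                                learning_rate=0.1,  # how strongly each learner corrects its predecessors
                                max_depth=3,
                                random_state=0)

print(cross_val_score(gb, X, y, cv=5).mean())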
ADVANTAGES OF BOOSTING
• Improved accuracy
• Robustness to overfitting
• Better interpretability
DISADVANTAGES OF BOOSTING
• Prone to overfitting.
• Requires careful tuning of different hyperparameters.
ADABOOST ALGORITHM
• It is used to learn weak classifiers

– final classification based on weighted vote of weak classifiers


• Initially, all weights are set equally,
– in each round, the weights of incorrectly classified examples are increased,
– so that observations that the previous classifier predicted poorly receive greater
weight on the next iteration.
STEPS IN ADABOOST ALGORITHM
1. Initialise the dataset and assign equal weight to each data point.
2. Provide this as input to the model and identify the wrongly classified data points.
3. Increase the weight of the wrongly classified data points.
4. If (required results obtained)
Go to step 5
Else
Go to step 2
5. End
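A from-scratch sketch of the weight-update loop above, assuming a two-class problem with labels
coded as +1/-1 and decision stumps as the weak learners; the function and variable names are
illustrative, not from the notes.

# AdaBoost sketch: reweight misclassified points each round, combine stumps by weighted vote.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, n_rounds=10):
    n = len(y)
    w = np.full(n, 1.0 / n)                     # step 1: equal weight for every data point
    stumps, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)        # step 2: fit a weak learner on weighted data
        pred = stump.predict(X)
        err = np.sum(w[pred != y]) / np.sum(w)  # weighted error of this round
        alpha = 0.5 * np.log((1 - err) / (err + 1e-10))
        w = w * np.exp(-alpha * y * pred)       # step 3: raise weights of wrongly classified points
        w = w / w.sum()
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(stumps, alphas, X):
    # Final classification is a weighted vote of the weak classifiers.
    scores = sum(alpha * stump.predict(X) for stump, alpha in zip(stumps, alphas))
    return np.sign(scores)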
ADVANTAGES OF ADABOOST
• Very simple to implement
• fairly good generalization
• prior error need not be known ahead of time
DISADVANTAGES OF ADABOOST
• Suboptimal solution
• Can over fit in presence of noise.

STACKING
• It is a popular ensemble machine learning technique that combines the predictions of
multiple models to build a new model and improve performance.
• Stacking enables us to train multiple models to solve similar problems and, based on
their combined output, build a new model with improved performance.
• It can ensemble classification or regression models.
• It consists of two layers of estimators:
• the first layer consists of all the baseline models that are used to predict the
outputs on the test datasets.
• the second layer consists of a Meta-Classifier or Regressor which takes all the
predictions of the baseline models as input and generates new predictions.
STEPS TO IMPLEMENT STACKING MODELS:


• Split the training dataset into n folds using RepeatedStratifiedKFold, as this is the
most common approach to preparing training datasets for meta-models.
• Now the base model is fitted on the first n-1 folds and makes predictions for the
nth fold.
• The predictions made in the above step are added to the x1_train list.
• Repeat steps 2 & 3 for the remaining folds, so that x1_train becomes an array of
size n.
• Now the model is trained on all the n parts and makes predictions for the sample
(test) data.
• Add these predictions to the y1_test list.
• In the same way, we can find x2_train, y2_test, x3_train, and y3_test by using
Models 2 and 3 for training, respectively, to get Level 2 predictions.
• Now train the Meta model on the level 1 predictions, where these predictions will
be used as features for the model.
• Finally, the Meta learner can now be used to make predictions on test data in
the stacking model.
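A minimal stacking sketch using scikit-learn's StackingClassifier, which performs the fold-wise
procedure above internally; the base models, meta-model, and dataset are illustrative assumptions.

# Stacking: level-0 base models feed their out-of-fold predictions to a meta-classifier.
from sklearn.datasets import load_iris
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

stack = StackingClassifier(
    estimators=[("knn", KNeighborsClassifier()),
                ("tree", DecisionTreeClassifier())],
    final_estimator=LogisticRegression(max_iter=1000),  # the meta-classifier
    cv=5)  # out-of-fold predictions of the base models become the meta-features

print(cross_val_score(stack, X, y, cv=5).mean())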

COMPARISON OF VARIOUS ENSEMBLE LEARNING

                        BAGGING                   BOOSTING                  STACKING
PURPOSE                 Reduce variance           Reduce bias               Improve accuracy
                        (overfitting)             (underfitting)
BASE LEARNER TYPES      Homogeneous               Homogeneous               Heterogeneous
BASE LEARNER TRAINING   Parallel                  Sequential                Meta-model
AGGREGATION             Max voting / averaging    Weighted averaging        Weighted averaging

UNSUPERVISED LEARNING : K MEANS

• K-Means Clustering is an unsupervised learning algorithm that is used to solve
clustering problems in machine learning or data science.
• Here K defines the number of pre-defined clusters that need to be created in
the process: if K=2, there will be two clusters; for K=3, there will be three
clusters; and so on.
• It is an iterative algorithm that divides the unlabeled dataset into k different
clusters in such a way that each data point belongs to only one group with
similar properties.
• It allows us to cluster the data into different groups and is a convenient way to
discover the categories of groups in an unlabeled dataset on its own, without
the need for any training.
• It is a centroid-based algorithm, where each cluster is associated with a
centroid. The main aim of this algorithm is to minimize the sum of distances
between the data point and their corresponding clusters.
• The algorithm takes the unlabeled dataset as input, divides the dataset into k
clusters, and repeats the process until it finds the best clusters. The value of k
should be predetermined in this algorithm.
• The k-means clustering algorithm mainly performs two tasks:
• Determines the best value for K center points or centroids by an iterative
process.
• Assigns each data point to its closest k-center.
• Those data points which are near to the particular k-center, create a cluster.

• Hence each cluster has datapoints with some commonalities, and it is away
from other clusters.

HOW DOES THE K-MEANS ALGORITHM WORK?


• Step-1: Select the number K to decide the number of clusters.
• Step-2: Select K random points or centroids (they need not be points from the input dataset).
• Step-3: Assign each data point to its closest centroid, which will form the
predefined K clusters.
• Step-4: Calculate the variance and place a new centroid for each cluster.
• Step-5: Repeat the third step, which means reassign each data point to the new
closest centroid of each cluster.
• Step-6: If any reassignment occurs, then go to step-4; else go to FINISH.
• Step-7: The model is ready.
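A minimal K-means sketch with scikit-learn; the data points are illustrative, and K=2 matches
the walkthrough that follows.

# K-means: assign points to the nearest centroid, recompute centroids, repeat.
import numpy as np
from sklearn.cluster import KMeans

# Illustrative 2-D points standing in for the (M1, M2) scatter plot below.
X = np.array([[1.0, 2.0], [1.5, 1.8], [1.2, 2.2],
              [8.0, 8.0], [8.5, 7.5], [7.8, 8.3]])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.cluster_centers_)   # final centroids
print(km.labels_)            # cluster index assigned to each point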
EXAMPLE OF K MEANS :
• Suppose we have two variables M1 and M2. The x-y axis scatter plot of these
two variables is given below:

• Let's take number k of clusters, i.e., K=2, to identify the dataset and to put them into
different clusters. It means here we will try to group these datasets into two different clusters.
• We need to choose some random k points or centroids to form the clusters. These points
can be either points from the dataset or any other points. So, here we are selecting the
below two points as k points, which are not part of our dataset. Consider the below image:

• Now we will assign each data point of the scatter plot to its closest K-point or
centroid.
• To do this, we calculate the distance between points; here we will draw a median line
between both the centroids. Consider the below image:

• From the above image, it is clear that points on the left side of the line are near the
K1 or blue centroid, and points to the right of the line are close to the yellow
centroid. Let's color them blue and yellow for clear visualization.

• As we need to find the closest clusters, we will repeat the process by choosing new
centroids. To choose the new centroids, we will compute the center of gravity of each
cluster and place the new centroids there, as below:

• Next, we will reassign each datapoint to the new centroid. For this, we will repeat the
same process of finding a median line. The median will be like below image:

• From the above image, we can see that one yellow point is on the left side of the
line, and two blue points are to the right of the line. So, these three points will be
assigned to the new centroids.

• As reassignment has taken place, we will again go to step-4, which is finding new
centroids or K-points.
• We will repeat the process by finding the center of gravity of centroids, so the new
centroids will be as shown in the below image:

• As we got the new centroids, we will again draw the median line and reassign the data
points. So, the image will be:

• We can see in the above image that there are no dissimilar data points on either side of
the line, which means our model is formed. Consider the below image:

• As our model is ready, so we can now remove the assumed centroids, and the
two final clusters will be as shown in the below image:

INSTANCE BASED LEARNING: KNN
[K-NEAREST NEIGHBOUR]

• The K-NN algorithm assumes the similarity between the new case/data and the
available cases and puts the new case into the category that is most similar to
the available categories.
• K-NN algorithm can be used for Regression as well as for Classification but
mostly it is used for the Classification problems.
• K-NN is a non-parametric algorithm, which means it does not make any
assumption on underlying data.
• It is also called a lazy learner algorithm because it does not learn from the
training set immediately instead it stores the dataset and at the time of
classification, it performs an action on the dataset.
• The KNN algorithm at the training phase just stores the dataset, and when it gets
new data, it classifies that data into a category that is most similar to the
new data.
STEPS:
Step-1: Select the number K of the neighbors
Step-2: Calculate the Euclidean distance of K number of neighbors
Step-3: Take the K nearest neighbors as per the calculated Euclidean distance.
Step-4: Among these k neighbors, count the number of the data points in each category.
Step-5: Assign the new data point to the category for which the number of neighbors is
maximum.
Step-6: Our model is ready.

EXAMPLE:
Suppose there are two categories, i.e., Category A and Category B, and we have a
new data point x1; in which of these categories will this data point lie? To solve this type
of problem, we need a K-NN algorithm. With the help of K-NN, we can easily identify the
category or class of a particular data point. Consider the below diagram:

Suppose we have a new data point and we need to put it in the required category.

• Firstly, we will choose the number of neighbors, so we will choose k=5.
• Next, we will calculate the Euclidean distance between the data points. The
Euclidean distance is the distance between two points, which we have already
studied in geometry. It can be calculated as:
d = sqrt((x2 - x1)^2 + (y2 - y1)^2)
• By calculating the Euclidean distance, we get the nearest neighbors: three
nearest neighbors in category A and two nearest neighbors in category B.
Consider the below image:

• As we can see, the 3 nearest neighbors are from category A in Figure 4.9;
hence this new data point must belong to category A.
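A minimal K-NN sketch with scikit-learn using k=5, mirroring the example above; the toy points
and labels are illustrative assumptions.

# KNN: classify a new point by a majority vote among its 5 nearest neighbours.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Illustrative training points: label 0 = Category A, label 1 = Category B.
X = np.array([[1, 1], [1, 2], [2, 1], [2, 2],    # Category A
              [6, 6], [6, 7], [7, 6], [7, 7]])   # Category B
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

knn = KNeighborsClassifier(n_neighbors=5, metric="euclidean")
knn.fit(X, y)

new_point = np.array([[2, 3]])
print(knn.predict(new_point))   # most of the 5 nearest neighbours are in Category A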
ADVANTAGES OF KNN ALGORITHM:
• It is simple to implement.
• It is robust to the noisy training data
• It can be more effective if the training data is large.
DISADVANTAGES OF KNN ALGORITHM:
• Always needs to determine the value of K, which may sometimes be complex.
• The computation cost is high because of calculating the distance between the
data points for all the training samples.

COMPARISON BETWEEN K MEANS AND KNN

GAUSSIAN MIXTURE MODEL[GM MODEL]

• This model is a soft probabilistic clustering model that allows us to describe
the membership of points to a set of clusters using a mixture of Gaussian
densities.
• It is a soft classification (in contrast to a hard one) because it assigns
probabilities of belonging to a specific class instead of a definitive choice. In
essence, each observation will belong to every class but with different
probabilities.
ADVANTAGE OF GM OVER K MEANS
• While Gaussian mixture models are more flexible, they can be more difficult
to train than K-means. K-means is typically faster to converge and so may be
preferred in cases where the runtime is an important consideration.
• In general, K-means will be faster and more accurate when the data set is
large and the clusters are well-separated. Gaussian mixture models will be
more accurate when the data set is small or the clusters are not well-separated.
• Gaussian mixture models take into account the variance of the data, whereas
K-means does not.
• Gaussian mixture models are more flexible in terms of the shape of the
clusters, whereas K-means is limited to spherical clusters.
• Gaussian mixture models can handle missing data, whereas K-means cannot.
EXAMPLE OF GM MODEL:
For example, in modeling human height data, height is typically modeled as a normal
distribution for each gender, with a mean of approximately 5'10" for males and 5'5" for
females. Given only the height data and not the gender assignments for each data point, the
distribution of all heights would follow the sum of two scaled (different variance) and shifted
(different mean) normal distributions. A model making this assumption is an example of a
Gaussian mixture model.
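A minimal Gaussian mixture sketch with scikit-learn for a two-component height model like the
one described; the synthetic heights (in inches) and parameters are illustrative assumptions.

# Fit a 2-component Gaussian mixture to 1-D height data and read off soft memberships.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
heights = np.concatenate([rng.normal(70, 3, 500),    # group around 5'10" (70 inches)
                          rng.normal(65, 3, 500)])   # group around 5'5" (65 inches)
X = heights.reshape(-1, 1)

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
print(gmm.means_.ravel())            # estimated component means, roughly 70 and 65
print(gmm.predict_proba([[67.0]]))   # soft membership probabilities for one height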
APPLICATIONS OF GM:
a) Used for signal processing
b) Used for customer churn analysis
c) Used for language identification
d) Used in video game industry
e) Genre classification of songs

EXPECTATION-MAXIMIZATION ALGORITHM

• In Gaussian mixture models, the expectation-maximization method is a powerful tool
for estimating the parameters of the model. The expectation step is termed E and the
maximization step is termed M.
• The Expectation-Maximization (EM) algorithm is used in maximum likelihood
estimation where the problem involves two sets of random variables, of which one, X, is
observable and the other, Z, is hidden.
• The goal of the algorithm is to find the parameter vector Φ that maximizes the
likelihood of the observed values of X, L(Φ | X).
• EM alternates between performing an expectation (E) step, which computes the
expectation of the likelihood by including the latent variables as if they were observed, and a
maximization (M) step, which computes the maximum likelihood estimates of the parameters
by maximizing the expected likelihood found in the E step.
• The parameters found in the M step are then used to start another E step, and the
process is repeated until some criterion is satisfied. EM is frequently used for data clustering,
for example in Gaussian mixtures.
• In the Expectation step, find the expected values of the latent variables (here you
need to use the current parameter values).
• In the Maximization step, first plug the expected values of the latent variables into
the log-likelihood of the augmented data. Then maximize this log-likelihood to re-estimate the
parameters.
FLOW CHART OF EM

STEPS
1. Given a set of incomplete data, consider a set of starting parameters
2. Expectation step (E-step): using the observed available data of the dataset, estimate
the values of the missing data.
3. Maximization step (M-step): the complete data generated after the E-step is used to
update the parameters.
4. Repeat steps 2 and 3 until convergence.
FORMULA FOR E STEP AND M STEP

For the coin-tossing example, each observation is one set of coin flips, where
k = no. of heads and n = no. of flips in the set.

E step: using the current estimates Θ1 (coin A) and Θ2 (coin B), compute for each set of flips
the probability that it came from coin A or coin B,
    P(k heads | Θ) = C(n, k) * Θ^k * (1 - Θ)^(n - k)
    P(A) = P(k | Θ1) / [P(k | Θ1) + P(k | Θ2)],   P(B) = 1 - P(A)
and then the expected head/tail counts for each coin:
For coin A
    L(H) = P(A) * No. of heads
    L(T) = P(A) * No. of tails
For coin B
    L(H) = P(B) * No. of heads
    L(T) = P(B) * No. of tails

M step: re-estimate each coin's head probability from the expected counts,
    Θ = expected heads / (expected heads + expected tails)

After 10 iterations the estimates converge to approximately
Θ1 = 0.8
Θ2 = 0.52
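A minimal sketch of these E and M steps in code, assuming the widely used toy dataset of five
sets of ten flips; the data values and starting parameters below are illustrative assumptions,
not taken from the notes.

# EM for the two-coin example: alternate soft assignment (E) and re-estimation (M).
import numpy as np
from scipy.stats import binom

heads = np.array([5, 9, 8, 4, 7])   # assumed data: heads in each of 5 sets of 10 flips
n = 10                              # flips per set
theta_A, theta_B = 0.6, 0.5         # assumed starting parameters

for _ in range(10):
    # E step: responsibility of each coin for each set, under the current parameters.
    like_A = binom.pmf(heads, n, theta_A)
    like_B = binom.pmf(heads, n, theta_B)
    p_A = like_A / (like_A + like_B)
    p_B = 1.0 - p_A
    # Expected counts: L(H) = P(A) * no. of heads, L(T) = P(A) * no. of tails, likewise for B.
    heads_A, tails_A = np.sum(p_A * heads), np.sum(p_A * (n - heads))
    heads_B, tails_B = np.sum(p_B * heads), np.sum(p_B * (n - heads))
    # M step: re-estimate each coin's head probability from the expected counts.
    theta_A = heads_A / (heads_A + tails_A)
    theta_B = heads_B / (heads_B + tails_B)

print(theta_A, theta_B)   # approaches 0.80 and 0.52 for this dataset, matching the notes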

USE/APPLICATION
• Used to fill the missing data in sample
• Uses as the basis of unsupervised learning of clusters
• Used for estimating the parameters of Hidden Markov Models (HMM)
• Used for discovering the values of latent variables
ADVANTAGES
• Always guaranteed that likelihood will increase with each iteration
• E step and M step are easy for many problems in terms of implementation
DISADVANTAGES
• Has slow convergence
• It converges to a local optimum only
• It requires both forward and backward probabilities

PART A
1. What is Ensemble method?
Ensemble methods are techniques that aim at improving the accuracy of results in
models by combining multiple models instead of using a single model. The combined
models increase the accuracy of the results significantly. This has boosted the
popularity of ensemble methods in machine learning
2. Which are the performance factors that influence KNN algorithm?
1. The distance function or distance metric used to determine the nearest neighbors
2. The Decision rule used to derive a classification from the K-Nearest neighbors.
3. The number of neighbors used to classify the new example.
3. List the properties of K-Means algorithm.
1. There are always K clusters
2. There is always at least one item in each cluster.
3. The clusters are non-hierarchical and they do not overlap
4. How do GMMs differentiate from K-means clustering?
GMMs and K-means are both clustering algorithms used for unsupervised learning
tasks. However, the basic difference between them is that K-means is a distance-based
clustering method while GMM is a distribution-based clustering method.
5. What is 'overfitting' in machine learning?
In machine learning, 'overfitting' occurs when a statistical model describes random error
or noise instead of the underlying relationship. Overfitting is normally observed when a
model is excessively complex, having too many parameters with respect to the number of
training examples. A model that has been overfit exhibits poor predictive performance.
6. What are the two paradigms of ensemble methods?
The two paradigms of ensemble methods are
 Sequential ensemble methods
 Parallel ensemble methods
7. What is Error-Correcting Output Codes?
The main classification task is defined in terms of a number of subtasks that are
implemented by the base learners. The idea is that the original task of separating
one class from all other classes may be a difficult problem. We want to define a set of
simpler classification problems, each specializing in one aspect of the task, and by
combining these simpler classifiers, we get the final classifier.

8. When does an algorithm become unstable?


In machine learning, an algorithm is said to be unstable if a small change in the training
data causes a large change in the learned classifier.
9. Why does the smoothing parameter h need to be optimal?
The smoothing parameter should be optimal in order to obtain an estimator that fits the
data well.
10. What is the significance of gaussian mixture model?
 It is used to determine the probability that a given data point belongs to a
cluster
 It has the ability to model a wide range of probability distributions
