Unit 4 Notes
UNIT IV
ENSEMBLE TECHNIQUES AND UNSUPERVISED LEARNING
UNIVERSITY QUESTIONS
1. Assume an image has a pixel size of 240*180. Elaborate on how K-means clustering can be
used to achieve lossy data compression of that image.
2. Explain in detail about combining multiple classifiers by voting.
3. [i] What are bagging and boosting? Give an example.
[ii] Outline the steps in the AdaBoost algorithm with an example.
4. Elaborate on the steps in the expectation-maximization algorithm.
• The first step is to create multiple classification/regression models using some training
dataset. Each base model can be created using different splits of the same training
dataset with the same algorithm, using the same dataset with different algorithms, or by
other methods.
• When combining multiple independent and diverse decisions each of which is at least
more accurate than random guessing, random errors cancel each other out, and correct
decisions are reinforced. Human ensembles are demonstrably better.
• Use a single, arbitrary learning algorithm but manipulate training data to make it learn
multiple models.
CLASSIFIER COMBINATION RULES
Common rules for combining the outputs of the base classifiers include the sum, weighted sum,
median, minimum, maximum, and product rules applied to the base learners' outputs.
ENSEMBLE LEARNING
• The idea of ensemble learning is to employ multiple learners and combine their
predictions.
• Ensemble methods combine several decision tree classifiers to produce better
predictive performance than a single decision tree classifier. The main principle behind the
ensemble model is that a group of weak learners come together to form a strong learner, thus
increasing the accuracy of the model.
SIMPLE ENSEMBLE TRAINING METHODS
Simple ensemble training methods typically just involve the application of statistical
summary techniques, such as determining the mode (max voting), the mean (averaging), or a
weighted average of a set of predictions.
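A minimal sketch of these combination rules, assuming three hypothetical base models whose predictions (the arrays below) are made-up values for illustration:

import numpy as np

# Hypothetical 0/1 class predictions from three base classifiers for 5 samples
pred_1 = np.array([0, 1, 1, 0, 1])
pred_2 = np.array([0, 1, 0, 0, 1])
pred_3 = np.array([1, 1, 1, 0, 0])

# Max voting: a sample is labelled 1 if the majority of the three models say 1
votes = pred_1 + pred_2 + pred_3
print("Max voting:", (votes >= 2).astype(int))          # [0 1 1 0 1]

# Hypothetical class-1 probabilities from the same three models
prob_1 = np.array([0.6, 0.9, 0.7, 0.2, 0.8])
prob_2 = np.array([0.5, 0.8, 0.4, 0.3, 0.7])
prob_3 = np.array([0.7, 0.7, 0.6, 0.1, 0.4])

# Averaging: simple mean of the predicted probabilities
print("Averaging:", (prob_1 + prob_2 + prob_3) / 3)

# Weighted average: weights (summing to 1) reflect how much each model is trusted
w = [0.5, 0.3, 0.2]
print("Weighted average:", w[0] * prob_1 + w[1] * prob_2 + w[2] * prob_3)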
ADVANCED ENSEMBLE TRAINING METHODS
There are three primary advanced ensemble training techniques, each of which is
designed to deal with a specific type of machine learning problem.
"Bagging" techniques are used to decrease the variance of a model's predictions, with
variance referring to how much the model's predictions change when it is trained on
different samples of the same data.
"Boosting" techniques are used to combat the bias of models.
Finally, "stacking" is used to improve predictions in general.
BAGGING/BOOTSTRAP AGGREGATING
Bagging is a voting method whereby base-learners are made different by training them over
slightly different training sets.
• Given a training set of size n, create m samples of size n by drawing n examples
from the original data, with replacement. Each bootstrap sample will on average contain
about 63.2% of the unique training examples; the rest are replicates. Bagging then combines
the m resulting models using a simple majority vote.
PSEUDOCODE
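The pseudocode itself is not reproduced in these notes; the following is a minimal Python sketch of the bagging procedure just described, using decision trees as base learners (names such as bagging_fit and n_models are illustrative, not from the notes):

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging_fit(X, y, n_models=10, random_state=0):
    # Train n_models trees, each on a bootstrap sample: n draws with replacement
    rng = np.random.RandomState(random_state)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = rng.choice(n, size=n, replace=True)
        models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return models

def bagging_predict(models, X):
    # Combine the m resulting models by simple majority vote
    # (assumes integer class labels 0..K-1)
    all_preds = np.array([m.predict(X) for m in models])   # shape (n_models, n_samples)
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), axis=0, arr=all_preds)

# usage: models = bagging_fit(X_train, y_train); y_pred = bagging_predict(models, X_test)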
BOOSTING
In boosting, models are built sequentially. A first model is trained on the training data,
and then a second model is built which tries to correct the errors made by the first
model. This procedure is continued and models are added until either the complete training
data set is predicted correctly or the maximum number of models is added.
STEPS
• Draw a random subset of training samples d1 without replacement from the training
set D to train a weak learner C1
• Draw a second random training subset d2 without replacement from the training set D and
add 50 percent of the samples that were previously misclassified by C1, in order to
train a weak learner C2
• Find the training samples d3 in the training set D on which C1 and C2 disagree to
train a third weak learner C3
• Combine all the weak learners via majority voting.
ADVANTAGES OF BOOSTING
• Improved accuracy
• Robustness to overfitting
• Better interpretability
DISADVANTAGES OF BOOSTING
• Sensitive to noisy data and outliers, which can lead to overfitting.
• Requires careful tuning of different hyperparameters.
ADABOOST ALGORITHM
• AdaBoost (Adaptive Boosting) learns a sequence of weak classifiers. After each round, the
training samples that were misclassified are given higher weights so that the next weak
learner focuses on them; the weak learners are finally combined by a weighted majority vote.
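A brief illustrative sketch of AdaBoost with scikit-learn's AdaBoostClassifier, using decision stumps as the weak learners; the synthetic dataset and parameter values are assumptions made purely for demonstration:

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic two-class data for illustration
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Weak learner: a decision stump (tree of depth 1)
stump = DecisionTreeClassifier(max_depth=1)

# 50 boosting rounds; misclassified samples are re-weighted after each round
# (note: scikit-learn versions before 1.2 use base_estimator= instead of estimator=)
model = AdaBoostClassifier(estimator=stump, n_estimators=50, random_state=42)
model.fit(X_train, y_train)
print("Test accuracy:", model.score(X_test, y_test))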
STACKING
• It is a popular ensemble machine learning technique used to combine the predictions of
multiple models in order to build a new model and improve model performance.
• Stacking enables us to train multiple models to solve similar problems and, based on
their combined output, build a new model with improved performance.
• It can ensemble classification or regression models.
• It consists of two layers of estimators:
• the first layer consists of all the baseline models that are used to predict the
outputs on the test dataset.
• the second layer consists of a meta-classifier or meta-regressor which takes all the
predictions of the baseline models as input and generates new predictions.
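A minimal sketch of such a two-layer stack with scikit-learn's StackingClassifier; the choice of baseline models (random forest, SVC) and meta-classifier (logistic regression) is an assumption made for illustration:

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# First layer: baseline models whose predictions feed the meta-classifier
base_models = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("svc", SVC(probability=True, random_state=0)),
]

# Second layer: meta-classifier trained on the baseline models' predictions
stack = StackingClassifier(estimators=base_models, final_estimator=LogisticRegression())
stack.fit(X_train, y_train)
print("Test accuracy:", stack.score(X_test, y_test))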
K-MEANS CLUSTERING
• K-Means groups the data into K clusters. Hence each cluster has data points with some
commonalities, and it is well separated from the other clusters.
• Let's take the number of clusters as K=2; that is, we will try to group the dataset into
two different clusters.
• We need to choose some random K points, or centroids, to form the clusters. These points
can be either points from the dataset or any other points. So, here we are selecting the
below two points as centroids, which are not part of our dataset. Consider the below image:
• Now we will assign each data point of the scatter plot to its closest K-point or
centroid.
• To do this, we calculate the distance between each data point and the two centroids;
equivalently, we can draw a median line between the two centroids. Consider the below image:
• From the above image, it is clear that the points on the left side of the line are nearer
to the K1 or blue centroid, and the points to the right of the line are closer to the yellow
centroid. Let's color them blue and yellow for clear visualization.
• As we need to refine the clusters, we will repeat the process by choosing new centroids.
To choose the new centroids, we will compute the center of gravity of the points in each
cluster and move each centroid there, as below:
• Next, we will reassign each datapoint to the new centroid. For this, we will repeat the
same process of finding a median line. The median will be like below image:
• From the above image, we can see that one yellow point is on the left side of the
line, and two blue points are to the right of the line. So, these three points will be
reassigned to the new centroids.
• As we got the new centroids so again will draw the median line and reassign the data
points. So, the image will be:
• We can see in the above image that there are no points on the wrong side of the
line, which means our model has converged. Consider the below image:
• As our model is ready, so we can now remove the assumed centroids, and the
two final clusters will be as shown in the below image:
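A minimal from-scratch sketch of the procedure walked through above (assign each point to its nearest centroid, then move each centroid to the center of gravity of its points); the small 2-D dataset and K=2 are assumptions for illustration, and empty-cluster handling is omitted:

import numpy as np

def kmeans(X, k=2, n_iters=10, random_state=0):
    rng = np.random.RandomState(random_state)
    centroids = X[rng.choice(len(X), size=k, replace=False)]   # random initial centroids
    for _ in range(n_iters):
        # Assignment step: each point goes to its closest centroid (Euclidean distance)
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: new centroid = mean ("center of gravity") of its assigned points
        centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centroids

# Six 2-D points that form two well-separated groups
X = np.array([[1.0, 2.0], [1.5, 1.8], [1.2, 2.2],
              [8.0, 8.0], [8.5, 7.5], [7.8, 8.3]])
labels, centroids = kmeans(X, k=2)
print(labels)        # e.g. [0 0 0 1 1 1]
print(centroids)

For the image-compression question listed at the start of this unit, the same idea applies: run K-means on the pixel colour values, store only the K centroid colours plus each pixel's cluster index, and reconstruct every pixel from its centroid; the compression is lossy because each pixel is replaced by its nearest centroid colour.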
K-NEAREST NEIGHBOUR (K-NN) ALGORITHM
• The K-NN algorithm assumes similarity between the new case/data and the
available cases and puts the new case into the category that is most similar to
the available categories.
• K-NN algorithm can be used for Regression as well as for Classification but
mostly it is used for the Classification problems.
• K-NN is a non-parametric algorithm, which means it does not make any
assumption on underlying data.
• It is also called a lazy learner algorithm because it does not learn from the
training set immediately; instead it stores the dataset and, at the time of
classification, performs an action on the dataset.
• The KNN algorithm at the training phase just stores the dataset, and when it gets
new data, it classifies that data into the category that is most similar to the
new data.
STEPS:
Step-1: Select the number K of neighbors.
Step-2: Calculate the Euclidean distance from the new data point to the training samples.
Step-3: Take the K nearest neighbors as per the calculated Euclidean distances.
Step-4: Among these K neighbors, count the number of data points in each category.
Step-5: Assign the new data point to the category for which the number of neighbors is
maximum.
Step-6: Our model is ready.
EXAMPLE:
Suppose there are two categories, i.e., Category A and Category B, and we have a
new data point x1; in which of these categories will this data point lie? To solve this type
of problem, we need the K-NN algorithm. With the help of K-NN, we can easily identify the
category or class of a particular data point. Consider the below diagram:
Suppose we have a new data point and we need to put it in the required category.
• Firstly, we will choose the number of neighbors, so we will choose k=5.
• Next, we will calculate the Euclidean distance between the new data point and the
existing data points. The Euclidean distance is the distance between two points, which we
have already studied in geometry; for points (x1, y1) and (x2, y2) it is
d = sqrt((x2 - x1)^2 + (y2 - y1)^2).
• As we can see in Figure 4.9, 3 of the 5 nearest neighbors are from Category A and 2 are
from Category B; hence this new data point must belong to Category A.
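A short sketch of the same idea using scikit-learn's KNeighborsClassifier with k=5 and the default Euclidean metric; the 2-D points and labels below are made-up stand-ins for Category A and Category B:

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical training points labelled with their category
X = np.array([[1, 2], [2, 3], [3, 3], [2, 2], [6, 5], [7, 7], [8, 6], [7, 5]])
y = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])

# k = 5 neighbours, Euclidean distance (the default metric)
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X, y)

# New data point x1 to classify: majority class among its 5 nearest neighbours
x1 = np.array([[3, 4]])
print("Predicted category:", knn.predict(x1))
print("Distances to the 5 nearest neighbours:", knn.kneighbors(x1)[0])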
ADVANTAGES OF KNN ALGORITHM:
• It is simple to implement.
• It is robust to the noisy training data
• It can be more effective if the training data is large.
DISADVANTAGES OF KNN ALGORITHM:
• It always needs to determine the value of K, which may be complex at times.
• The computation cost is high because of calculating the distance between the
data points for all the training samples.
EXPECTATION-MAXIMIZATION ALGORITHM
STEPS
1. Given a set of incomplete data, consider a set of starting parameters
2. Expectation step(E-step) – Using the observed available data of the dataset, estimate
the values of the missing data
3. Maximization step(M step)-complete data generated after E step is used in order to
update the parameters
4. Repeat steps 2 and 3 until convergence
FORMULA FOR E STEP AND M STEP (two-coin example)
k - no. of heads observed in a set of flips
n - no. of flips in the set
E step: for each set of flips, compute the likelihood of the observed heads under each coin,
L_A = Θ1^k (1 - Θ1)^(n - k) and L_B = Θ2^k (1 - Θ2)^(n - k), and the responsibilities
P(A) = L_A / (L_A + L_B) and P(B) = 1 - P(A).
For coin A: expected heads = P(A) * no. of heads, expected tails = P(A) * no. of tails
For coin B: expected heads = P(B) * no. of heads, expected tails = P(B) * no. of tails
M step: Θ1 = (total expected heads for A) / (total expected heads + expected tails for A),
and similarly for Θ2.
After 10 iterations
Θ1 = 0.80
Θ2 = 0.52
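A minimal numerical sketch of the E and M steps above, assuming the commonly used two-coin dataset (5, 9, 8, 4 and 7 heads out of 10 flips per set) and initial guesses Θ1 = 0.6 and Θ2 = 0.5, which reproduce the values quoted above after 10 iterations:

import numpy as np

# Observed data: number of heads in each of 5 sets of 10 flips;
# which coin (A or B) produced each set is the hidden/latent variable
heads = np.array([5, 9, 8, 4, 7])
n = 10
theta_1, theta_2 = 0.6, 0.5          # initial guesses for coin A and coin B

def likelihood(k, theta):
    # Likelihood of k heads in n flips for a coin with head probability theta
    return theta ** k * (1 - theta) ** (n - k)

for _ in range(10):
    # E step: responsibility of each coin for each set of flips
    lik_A, lik_B = likelihood(heads, theta_1), likelihood(heads, theta_2)
    p_A = lik_A / (lik_A + lik_B)
    p_B = 1 - p_A
    # Expected heads/tails attributed to each coin
    heads_A, tails_A = p_A * heads, p_A * (n - heads)
    heads_B, tails_B = p_B * heads, p_B * (n - heads)
    # M step: re-estimate each coin's head probability from the expected counts
    theta_1 = heads_A.sum() / (heads_A.sum() + tails_A.sum())
    theta_2 = heads_B.sum() / (heads_B.sum() + tails_B.sum())

print(round(theta_1, 2), round(theta_2, 2))   # approximately 0.8 and 0.52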
USE/APPLICATION
• Used to fill in the missing data in a sample
• Used as the basis of unsupervised learning of clusters
• Used for estimating the parameters of Hidden Markov Models (HMMs)
• Used for discovering the values of latent variables
ADVANTAGES
• The likelihood is guaranteed to increase (or at least not decrease) with each iteration
• E step and M step are easy for many problems in terms of implementation
DISADVANTAGES
• Has slow convergence
• It makes convergence to the local optima only
• It requires both forward and backward probabilities
PART A
1. What is Ensemble method?
Ensemble methods are techniques that aim at improving the accuracy of results in
models by combining multiple models instead of using a single model. The combined
models increase the accuracy of the results significantly. This has boosted the
popularity of ensemble methods in machine learning
2. Which are the performance factors that influence KNN algorithm?
1. The distance function or distance metric used to determine the nearest neighbors
2. The Decision rule used to derive a classification from the K-Nearest neighbors.
3. The number of neighbors used to classify the new example.
3. List the properties of K-Means algorithm.
1. There are always K clusters
2. There is always at least one item in each cluster.
3. The clusters are non-hierarchical and they do not overlap
4. How do GMMs differentiate from K-means clustering?
GMMs and K-means are both clustering algorithms used for unsupervised learning
tasks. However, the basic difference between them is that K-means is a distance-based
clustering method while a GMM is a distribution-based clustering method.
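A small sketch contrasting the two on the same synthetic data; the blob shapes and parameters are assumptions chosen so that the distance-based and distribution-based assignments can differ:

import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

rng = np.random.RandomState(0)
# Two overlapping, elongated blobs (illustrative synthetic data)
X = np.vstack([rng.normal(loc=[0, 0], scale=[3.0, 0.5], size=(100, 2)),
               rng.normal(loc=[4, 2], scale=[0.5, 1.5], size=(100, 2))])

# Distance-based: hard assignment of each point to its nearest centroid
km_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Distribution-based: each cluster is a Gaussian with its own mean and covariance;
# points are assigned to the component that gives them the highest probability
gmm_labels = GaussianMixture(n_components=2, random_state=0).fit_predict(X)

# Cluster indices are arbitrary, so compare the assignments up to a label swap
disagreements = min(np.sum(km_labels != gmm_labels), np.sum(km_labels != 1 - gmm_labels))
print("Points assigned differently by the two methods:", disagreements)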
5. What is 'Overfitting' in Machine learning?
In machine learning, overfitting occurs when a statistical model describes random error or
noise instead of the underlying relationship. Overfitting is normally observed when a model
is excessively complex, because it has too many parameters with respect to the amount of
training data. A model that has been overfit exhibits poor performance on new data.
6. What are the two paradigms of ensemble methods?
The two paradigms of ensemble methods are
Sequential ensemble methods
Parallel ensemble methods
7. What is Error-Correcting Output Codes?
The main classification task is defined in terms of a number of subtasks that are
implemented by the base learners. The idea is that the original task of separating
one class from all other classes may be a difficult problem. We want to define a set of
simpler classification problems, each specializing in one aspect of the task, and by
combining these simpler classifiers, we get the final classifier.
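A brief sketch with scikit-learn's OutputCodeClassifier, which implements this idea by giving each class a binary code word and training one binary base learner per code bit; the base learner and code_size used here are assumptions made for illustration:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OutputCodeClassifier

X, y = load_iris(return_X_y=True)

# Each class receives a random binary code word; a new sample is assigned the
# class whose code word is closest to the vector of base-learner outputs
ecoc = OutputCodeClassifier(LogisticRegression(max_iter=1000), code_size=2.0, random_state=0)
ecoc.fit(X, y)
print("Training accuracy:", ecoc.score(X, y))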