"An ensembled model is a machine learning model that combines the predictions from two or
more models.”
There are 3 most common ensemble learning methods in machine learning. These are as follows:
• Bagging
• Boosting
• Stacking
The idea of ensemble learning is to employ multiple learners and combine their predictions. If we have a committee of M models with uncorrelated, zero-mean errors, then simply by averaging their predictions the average error of a single model can be reduced by a factor of M.
• Unfortunately, the key assumption that the errors of the individual models are uncorrelated is unrealistic; in practice the errors are typically highly correlated, so the reduction in overall error is generally small.
Ensemble modeling is the process of running two or more related but different analytical models
and then synthesizing the results into a single score or spread in order to improve the accuracy
of predictive analytics and data mining applications.
• An ensemble of classifiers is a set of classifiers whose individual decisions are combined in some way to classify new examples.
• Ensemble methods combine several decision tree classifiers to produce better predictive
performance than a single decision tree classifier. The main principle behind the ensemble model
is that a group of weak learners come together to form a strong learner, thus increasing the
accuracy of the model.
1. Variance reduction: If the training sets are completely independent, it always helps to average an ensemble, because averaging reduces variance without affecting bias (e.g. bagging) and reduces sensitivity to individual data points.
2. Bias reduction: For simple models, an average of models has much greater capacity than a single model. Averaging models can therefore reduce bias substantially by increasing capacity, while variance is controlled by fitting one component at a time.
4.2.1 Bagging
Bagging is also called Bootstrap Aggregating. Bagging and boosting are meta-algorithms that pool decisions from multiple classifiers. Bagging creates ensembles by repeatedly and randomly resampling the training data.
Bagging is an ensemble learning technique that helps to improve the performance and accuracy of machine learning algorithms.
The meta-algorithm, which is a special case of model averaging, was originally designed for classification and is usually applied to decision tree models, but it can be used with any type of model for classification or regression.
Bootstrapping is the method of randomly creating samples of data out of a population with
replacement to estimate a population parameter.
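A minimal sketch of bootstrapping (NumPy-based; the sample data and the number of resamples are illustrative assumptions): resample the data with replacement many times and use the resamples to estimate a parameter such as the mean and its standard error.

# Sketch: bootstrap resampling to estimate the mean and its standard error.
import numpy as np

rng = np.random.default_rng(0)
data = rng.exponential(scale=2.0, size=200)      # hypothetical observed sample

boot_means = []
for _ in range(1000):
    # Draw a resample of the same size as the data, with replacement.
    resample = rng.choice(data, size=data.size, replace=True)
    boot_means.append(resample.mean())

print("bootstrap estimate of the mean:", np.mean(boot_means))
print("bootstrap standard error:      ", np.std(boot_means))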
Bagging Steps:
1. Suppose there are N observations and M features in the training data set. A sample from the training data set is taken randomly with replacement.
2. A subset of the M features is selected randomly, and whichever feature gives the best split is used to split the node iteratively.
3. The above steps are repeated n times, and the prediction is given based on the aggregation of the predictions from the n trees.
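The steps above correspond closely to scikit-learn's BaggingClassifier, sketched here on a synthetic dataset; the dataset and the parameter values are illustrative assumptions, and the default base learner is a decision tree.

# Sketch: bagging decision trees with scikit-learn (illustrative parameters).
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each tree is fit on a bootstrap sample and the predictions are aggregated.
bagging = BaggingClassifier(
    n_estimators=50,    # number of bootstrap samples / trees
    max_samples=1.0,    # each bootstrap sample is as large as the training set
    bootstrap=True,     # sample with replacement
    random_state=0,
)
bagging.fit(X_train, y_train)
print("bagging test accuracy:", bagging.score(X_test, y_test))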
Advantages of Bagging:
1. Since the final prediction is an average over many bootstrapped trees, bagging reduces variance and the model's sensitivity to individual data points.
Disadvantages of Bagging:
1. Since the final prediction is based on the mean of the predictions from the subset trees, it may not give precise values for the classification and regression models.
4.2.2 Boosting
Boosting is an ensemble modeling technique that attempts to build a strong classifier from a number of weak classifiers. It does this by building weak models in series. First, a model is built from the training data. Then a second model is built which tries to correct the errors present in the first model. This procedure is continued, and models are added until either the complete training data set is predicted correctly or the maximum number of models is reached.
Boosting is a bias reduction technique. It typically improves the performance of a single tree model.
To begin, we define an algorithm for finding the rules of thumb, which we call a weak learner. The boosting algorithm repeatedly calls this weak learner, each time feeding it a different distribution over the training data. Each call generates a weak classifier, and we must combine all of these into a single classifier that, hopefully, is much more accurate than any one of the rules.
Train a set of weak hypotheses h1, ..., hT. The combined hypothesis H is a weighted majority vote of the T weak hypotheses. During training, focus on the examples that are misclassified.
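A minimal sketch of this weighted majority vote, H(x) = sign(sum_t alpha_t * h_t(x)); the arrays of weak predictions and weights below are made-up values for illustration.

# Sketch: combining T weak hypotheses h_1..h_T by a weighted majority vote.
import numpy as np

def weighted_majority_vote(weak_preds, alphas):
    # weak_preds: (T, n) array of +/-1 predictions, one row per weak hypothesis.
    # alphas: (T,) array of hypothesis weights.
    scores = np.dot(alphas, weak_preds)    # weighted sum over the T hypotheses
    return np.where(scores >= 0, 1, -1)    # sign of the weighted vote

# Three weak hypotheses voting on four examples (illustrative values).
weak_preds = np.array([[ 1, -1,  1,  1],
                       [ 1,  1, -1,  1],
                       [-1,  1,  1, -1]])
alphas = np.array([0.9, 0.6, 0.3])
print(weighted_majority_vote(weak_preds, alphas))   # -> [1 1 1 1]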
Boosting Steps:
1. Draw a random subset of training samples d1 without replacement from the training set D to train a weak learner C1.
2. Draw a second random training subset d2 without replacement from the training set and add 50 percent of the samples that were previously misclassified by C1, then train a weak learner C2.
3. Find the training samples d3 in the training set D on which C1 and C2 disagree, and use them to train a third weak learner C3.
4. Combine the weak learners C1, C2, and C3 via majority voting.
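The four steps can be sketched directly with scikit-learn decision trees; the subset sizes, tree depth, and dataset below are illustrative assumptions rather than part of the original notes.

# Sketch of the three-learner boosting procedure described above.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Step 1: train C1 on a random subset d1 drawn without replacement.
d1 = rng.choice(len(X), size=600, replace=False)
C1 = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X[d1], y[d1])

# Step 2: build d2 from a fresh subset plus samples misclassified by C1, then train C2.
d2 = rng.choice(len(X), size=600, replace=False)
mis = np.where(C1.predict(X) != y)[0]
if mis.size:
    d2 = np.concatenate([d2, rng.choice(mis, size=min(300, mis.size), replace=False)])
C2 = DecisionTreeClassifier(max_depth=2, random_state=1).fit(X[d2], y[d2])

# Step 3: train C3 on the samples d3 where C1 and C2 disagree.
d3 = np.where(C1.predict(X) != C2.predict(X))[0]
if d3.size == 0:    # fall back to a fresh random subset if they never disagree
    d3 = rng.choice(len(X), size=600, replace=False)
C3 = DecisionTreeClassifier(max_depth=2, random_state=2).fit(X[d3], y[d3])

# Step 4: combine C1, C2, and C3 by majority vote.
votes = np.stack([c.predict(X) for c in (C1, C2, C3)])
majority = (votes.sum(axis=0) >= 2).astype(int)
print("majority-vote training accuracy:", (majority == y).mean())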
AdaBoost:
AdaBoost was the first really successful boosting algorithm developed for the purpose of
binary classification. AdaBoost is short for Adaptive Boosting and is a very popular boosting
technique that combines multiple “weak classifiers” into a single “strong classifier”. It was
formulated by Yoav Freund and Robert Schapire. They also won the 2003 Gödel Prize for their
work.
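A short sketch with scikit-learn's AdaBoostClassifier on a synthetic dataset (the dataset and parameters are illustrative assumptions); by default it boosts depth-1 decision trees, giving misclassified examples a higher weight before each new round.

# Sketch: AdaBoost in scikit-learn (illustrative dataset and parameters).
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each weak classifier is, by default, a depth-1 decision tree ("decision stump").
ada = AdaBoostClassifier(n_estimators=100, learning_rate=1.0, random_state=0)
ada.fit(X_train, y_train)
print("AdaBoost test accuracy:", ada.score(X_test, y_test))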
Advantages of AdaBoost:
1. It is simple to apply and turns a collection of weak classifiers into a single, much stronger classifier.
Disadvantages of AdaBoost:
1. AdaBoost builds its ensemble greedily, one weak classifier at a time, so the final combination can be a suboptimal solution.
4.2.3 Stacking
There are many ways to ensemble models in machine learning, such as bagging, boosting, and stacking. Stacking is one of the most popular ensemble machine learning techniques: it combines the predictions of multiple base models to build a new model and improve performance. Stacking enables us to train multiple models to solve similar problems and, based on their combined output, to build a new model with improved performance.
Stacking Steps:
1. Train several base level classifiers on the training data.
2. Train a meta level classifier to combine the outputs of the base level classifiers.
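These two steps map onto scikit-learn's StackingClassifier, sketched below; the choice of base learners, the logistic-regression meta-learner, and the dataset are illustrative assumptions.

# Sketch: stacking base classifiers under a meta-level classifier (illustrative choices).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

base_learners = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("svc", SVC(probability=True, random_state=0)),
]
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(),  # the meta-level classifier
    cv=5,  # base-level outputs for the meta-learner come from cross-validation
)
stack.fit(X_train, y_train)
print("stacking test accuracy:", stack.score(X_test, y_test))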