Bagging and Boosting
Bias-Variance trade-off
The next chart might be familiar to some of you, but it represents quite
well the relationship and trade-off between bias and variance with
respect to the test error rate.
The relationship between the variance and bias of a model is such that a
reduction in variance results in an increase in bias, and vice versa. To
achieve optimal performance, the model must be positioned at an
equilibrium point, where the test error rate is minimized, and the variance
and bias are appropriately balanced.
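For reference, this trade-off is often summarized by the standard textbook decomposition of the expected test error (a general result, not specific to the chart above):

$$\mathbb{E}\big[(y - \hat{f}(x))^2\big] = \mathrm{Bias}\big[\hat{f}(x)\big]^2 + \mathrm{Var}\big[\hat{f}(x)\big] + \sigma^2$$

Here the last term is the irreducible error, which no model can remove.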
#2: Bagging
With this purpose in mind, the idea is to train multiple models of the same
learning algorithm on random subsets of the original training data. Those
random subsets are called bags and can contain any combination of the data.
Each of those datasets is then used to fit an individual model, which
produces individual predictions for the given data. Those predictions are
then aggregated into one final classifier. The idea of this method is really
close to our initial toy example with the cats and dogs.
The calculation of the final ensemble aggregation uses either a simple
average for regression problems or a simple majority vote for
classification problems. For that, each model trained on one of the random
samples produces a prediction for the given data. For the average, those
predictions are simply summed up and divided by the number of created
bags.
A simple majority vote works similarly but uses the predicted classes
instead of numeric values. The algorithm identifies the class with the most
votes and takes that majority as the final aggregation. This is again very
similar to our toy example, where two out of three algorithms predicted a
picture to be a dog, and the final aggregation was therefore a dog
prediction.
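As a minimal sketch in plain NumPy (the prediction arrays are made up purely for illustration), both aggregation rules could look like this:

```python
import numpy as np

# Hypothetical predictions of three bagged models for four samples.
# Regression: the numeric predictions are simply averaged.
reg_preds = np.array([
    [2.1, 0.5, 3.3, 1.0],   # model 1
    [1.9, 0.7, 3.1, 1.2],   # model 2
    [2.0, 0.6, 3.5, 0.8],   # model 3
])
final_regression = reg_preds.mean(axis=0)   # simple average over the bags

# Classification: the class predicted most often wins (majority vote).
clf_preds = np.array([
    ["dog", "cat", "dog", "cat"],   # model 1
    ["dog", "dog", "cat", "cat"],   # model 2
    ["cat", "dog", "dog", "cat"],   # model 3
])
final_classification = [
    max(set(column), key=list(column).count)   # most frequent class per sample
    for column in clf_preds.T
]

print(final_regression)       # [2.0, 0.6, 3.3, 1.0]
print(final_classification)   # ['dog', 'dog', 'dog', 'cat']
```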
Random Forest
A famous extension to the bagging method is the random forest
algorithm, which uses the idea of bagging but also draws random subsets of
the features, not only subsets of the entries. Plain bagging, on the other
hand, takes all given features into account.
base_estimator: In the first parameter, you have to provide the underlying
algorithm that should be fitted on the random subsets in the bagging
procedure (see the sketch below). This could be, for example, Logistic
Regression, Support Vector Classification, Decision Trees, and many more.
After setting the scene, this model object works like many other models
and can be trained using the fit() method with the X and y data from the
training set. The corresponding predictions on test data can then be made
using predict().
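Putting this together, a minimal sketch with scikit-learn's BaggingClassifier could look as follows. The synthetic data and the chosen base estimator are illustrative assumptions, and note that newer scikit-learn versions (1.2+) name the first parameter estimator instead of base_estimator:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Illustrative synthetic data, just to make the sketch runnable.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# The underlying algorithm trained on each random bag; this parameter is
# called `estimator` in scikit-learn >= 1.2 and `base_estimator` in older versions.
bagging = BaggingClassifier(
    estimator=DecisionTreeClassifier(),
    n_estimators=50,        # number of bags / individual models
    max_samples=0.8,        # fraction of the training set drawn for each bag
    random_state=42,
)

bagging.fit(X_train, y_train)          # train on the training set
predictions = bagging.predict(X_test)  # aggregated (majority-vote) predictions
print(bagging.score(X_test, y_test))   # accuracy on the test set
```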
#3: Boosting
In contrast to bagging, boosting trains its models sequentially, where each
new model tries to correct the errors of the previous, weaker one. For that,
it uses the misclassified entries of the weak model together with some
other random data to create a new model. Therefore, the different training
sets are not purely randomly chosen but are mainly influenced by the
wrongly classified entries of the previous model. The steps for this
technique are the following:
1. Train an initial weak model on a random subset of the training data.
2. Identify the entries that this model misclassified.
3. Build the next model on a subset that emphasizes those misclassified entries, together with some other random data.
4. Repeat the previous two steps for the desired number of iterations.
5. Combine the individual models into one final classifier.
In the following, we will look at a similar code example, but for boosting.
Obviously, there exist multiple boosting algorithms. Besides the gradient
boosting methodology, AdaBoost is one of the most popular.
base_estimator: Similar to Bagging, you need to define which
underlying algorithm you would like to use.
learning_rate: Finally, the learning rate controls how much each new
model contributes to the ensemble. Normally there is a trade-off between
the number of iterations and the value of the learning rate. In other words:
when taking smaller values of the learning rate, you should consider more
estimators, so that the ensemble of weak classifiers continues to improve
(see the sketch after this list).
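A minimal sketch with scikit-learn's AdaBoostClassifier, under the same assumptions as the bagging example above (illustrative synthetic data and base estimator, and the estimator/base_estimator naming depending on the scikit-learn version):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Illustrative synthetic data, as in the bagging sketch above.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# A shallow tree ("decision stump") is the classic weak learner for AdaBoost.
boosting = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),  # `base_estimator` in older versions
    n_estimators=200,       # more iterations pair with a smaller learning rate
    learning_rate=0.5,      # contribution of each new model to the ensemble
    random_state=42,
)

boosting.fit(X_train, y_train)
predictions = boosting.predict(X_test)
print(boosting.score(X_test, y_test))
```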
Now that we have briefly learned how bagging and boosting work, I would
like to put the focus on comparing both methods against each other.
Similarities
Ensemble methods
From a general point of view, the similarities between both techniques
start with the fact that both are ensemble methods that aim to use
multiple learners instead of a single model to achieve better results.
Purpose
Finally, it is fair to say that both aim to produce higher stability and
better predictions for the data.
Differences
Implications
It was shown that the main idea of both methods is to use multiple models
together to achieve better predictions than single learning models.
However, there is no general rule for choosing between bagging and
boosting, since both have advantages and disadvantages.
While bagging decreases the variance and reduces overfitting, it will only
rarely produce a better bias. Boosting, on the other hand, decreases the
bias but might produce models that are more overfitted than bagged ones.
Bagging and boosting both use all given features and only select the
entries randomly. Random forest, on the other hand, is an extension of
bagging that also creates random subsets of the features. As a result,
random forest is used more often in practice than plain bagging.
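To make this difference concrete, here is a minimal sketch contrasting the two estimators in scikit-learn (the parameter values are illustrative assumptions): plain bagging keeps every feature available, while a random forest additionally samples a random subset of features at each split.

```python
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

# Plain bagging: random subsets of the entries (rows), all features are kept.
bagging = BaggingClassifier(
    estimator=DecisionTreeClassifier(),
    n_estimators=100,
    max_samples=0.8,      # random subset of the entries for each bag
    max_features=1.0,     # every feature is available to every model (the default)
    random_state=42,
)

# Random forest: bagging of trees plus a random subset of features at each split.
forest = RandomForestClassifier(
    n_estimators=100,
    max_features="sqrt",  # only sqrt(n_features) candidate features per split
    random_state=42,
)
```

Both objects are then trained with fit() and used with predict(), exactly as shown in the earlier sketches.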