The document provides an overview of Ensemble Learning, highlighting its motivation, basic and advanced techniques, including Bagging and Boosting. It explains how combining multiple models can enhance predictive accuracy and outlines various methods such as Max Voting, Averaging, Stacking, and Blending. Additionally, it details specific algorithms like AdaBoost, Gradient Boosting, and XGBoost, emphasizing their applications and differences in handling data.
Ensemble Learning
Instructor: Dr. Umara Zahid
MSCS Fall 2022

Agenda
• Motivation
• Introduction to Ensemble Learning
• Basic Ensemble Techniques
  • Max Voting
  • Averaging
  • Weighted Average
• Advanced Ensemble Techniques
  • Stacking
  • Blending
  • Bagging
  • Boosting
• Algorithms based on Bagging and Boosting
  • Bagging meta-estimator
  • Random Forest
  • AdaBoost
  • GBM
  • XGB
  • Light GBM
  • CatBoost

Motivation
• You have to buy something. How do you do it?
• Example: You want to buy a new car (two approaches):
  1. You walk up to the first car shop and purchase one based on the advice of the dealer. Is it so?
  2. You would likely browse a few web portals where people have posted their reviews and compare different car models, checking their features and prices. You would also probably ask your friends and colleagues for their opinion. (In short, you would not directly reach a conclusion, but would instead make a decision considering the opinions of other people as well.)
• These are review- or opinion-based decisions.

Another Example (it is not just a buying/selling problem)
• You are a movie director and you have created a short movie.
• Now, you want to take preliminary feedback (ratings) on the movie before making it public.
• What are the possible ways by which you can do that?
  1. Ask one of your friends to rate the movie (a biased review).
  2. Ask 5 colleagues to rate the movie (unbiased, but a small number of people / subject matter experts).
  3. Ask 50 people to rate the movie (a more generalized review).
• Inferences:
  1. A diverse group of people is likely to make better decisions than individuals.
  2. Similarly, a diverse set of machine learning models is likely to make better decisions than single models.
  • This diversification in Machine Learning is achieved by a technique called Ensemble Learning.

What is Ensemble Learning?
• Ensemble methods are machine learning techniques that combine several base models in order to produce one optimal predictive model (What?)
• Ensemble learning techniques attempt to make predictive models perform better by improving their accuracy (Why?)
• Ensemble Learning is a process by which multiple machine learning models (such as classifiers) are strategically constructed to solve a particular problem (How?)
• In another way (problem symptom): to reduce the variance of certain ML models (such as neural networks), multiple models are trained instead of a single model, and the predictions from these models are combined.

Basic Ensemble Techniques
1. Max Voting
2. Averaging
3. Weighted Average

Max Voting
• In this technique, multiple models are used to make predictions for each data point. The prediction from each model is counted as a 'vote'. The prediction we get from the majority of the models is used as the final prediction.
• Considering the previous example: you asked 5 of your colleagues to rate your movie (out of 5); suppose three of them rated it as 4 while two of them gave it a 5. Since the majority gave a rating of 4, the final rating is taken as 4.
• The max voting method is generally used for classification problems
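As a minimal illustration of max voting, the sketch below combines three base classifiers with scikit-learn's VotingClassifier using hard (majority) voting. The dataset and the choice of base models are assumptions for demonstration only, not part of the original slides.

# Max voting sketch: three classifiers vote, the majority class wins.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import VotingClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

voting = VotingClassifier(
    estimators=[
        ("tree", DecisionTreeClassifier(max_depth=3, random_state=42)),
        ("lr", LogisticRegression(max_iter=5000)),
        ("knn", KNeighborsClassifier()),
    ],
    voting="hard",  # majority vote over the predicted class labels
)
voting.fit(X_train, y_train)
print("Max-voting accuracy:", voting.score(X_test, y_test))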
Averaging
• Similar to the max voting technique, multiple predictions are made for each data point in averaging.
• In this method, we take the average of the predictions from all the models and use it as the final prediction.
• For example, the averaging method would take the average of all the rating values:
  (5 + 4 + 5 + 4 + 4) / 5 = 4.4
• Averaging can be used for making predictions in regression problems or while calculating probabilities for classification problems.

Weighted Average
• This is an extension of the averaging method.
• All models are assigned different weights defining the importance of each model for the prediction.
• For instance, if two of your colleagues are critics, while the others have no prior experience in this field, then the answers given by these two friends are given more importance than those of the other people.
• With weights of 0.23 for each critic and 0.18 for each of the other three colleagues (the weights sum to 1, so no further division is needed), the result is calculated as
  (5*0.23) + (4*0.23) + (5*0.18) + (4*0.18) + (4*0.18) = 4.41
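To make the arithmetic above concrete, here is a short Python sketch that reproduces the simple average (4.4) and the weighted average (4.41) of the five ratings; the ratings and weights are the ones from the slide.

# Simple average vs. weighted average of the five movie ratings.
import numpy as np

ratings = np.array([5, 4, 5, 4, 4])                   # ratings from 5 colleagues
weights = np.array([0.23, 0.23, 0.18, 0.18, 0.18])    # critics weighted higher; weights sum to 1

print("Average:", ratings.mean())                                  # 4.4
print("Weighted average:", np.average(ratings, weights=weights))   # 4.41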
Advanced Ensemble Techniques
• Stacking
• Blending
• Bagging
• Boosting

Stacking
• Train multiple base models (Level-0 models) on your training data.
• Then use their predictions to create a new dataset (called meta-features).
• A new model, called the meta-model (Level-1 model), is trained on these meta-features to make the final prediction.
• Important detail: to avoid data leakage, stacking typically uses k-fold cross-validation when generating predictions for the meta-model.
• Example:
  • Base models: Decision Tree, SVM, Logistic Regression
  • Meta-model: Random Forest trained on the predictions from the base models

Stacking (step by step)
• Stacking uses predictions from multiple models (for example decision tree, knn or svm) to build a new model. This model is used for making predictions on the test set.
• Step 1: The train set is split into 10 parts.
• Step 2: A base model (suppose a decision tree) is fitted on 9 parts and predictions are made for the 10th part. This is done for each part of the train set.
• Step 3: Using this model, predictions are made on the test set.
• Steps 2 and 3 are repeated for another base model (say knn), resulting in another set of predictions for the train set and the test set.
• The predictions on the train set are used as features to build a new model (the meta-model).
• This meta-model is used to make the final predictions on the test set.

Blending
• Split the training set into two parts:
  • Train set (e.g., 70%)
  • Holdout set (e.g., 30%)
• Train base models on the train set.
• Use these trained models to predict on the holdout set.
• Train the meta-model on these holdout predictions and use it for the final predictions.
• Important detail: blending is simpler and faster, but it wastes some data (the holdout set is not used to train the base models).

Blending (step by step)
• Blending follows the same approach as stacking but uses only a holdout (validation) set from the train set to make predictions.
• In other words, unlike stacking, the predictions are made on the holdout set only.
• The holdout set and the predictions on it are used to build a model, which is then run on the test set.
• Step 1: The train set is split into training and validation sets.
• Step 2: Model(s) are fitted on the training set, and predictions are made on the validation set and the test set.
• Step 3: The validation set and its predictions are used as features (meta-features) to build a new model. This model is used to make the final predictions on the test set.
• A typical split: training set 70%, validation set 10% (tuning), test set 20%.
• A minimal code sketch of stacking and blending is shown below.
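The following sketch illustrates both approaches with scikit-learn. The base models and meta-model follow the example in the slides (Decision Tree, SVM, Logistic Regression with a Random Forest meta-model); the dataset and split sizes are illustrative assumptions. StackingClassifier generates the k-fold meta-features internally, while the blending variant is built by hand with a single holdout set.

# Stacking vs. blending: a minimal sketch with assumed data and hyperparameters.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, StackingClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

base_models = [
    ("tree", DecisionTreeClassifier(max_depth=3, random_state=42)),
    ("svm", SVC(probability=True, random_state=42)),
    ("lr", LogisticRegression(max_iter=5000)),
]

# Stacking: meta-features are produced with k-fold cross-validation (cv=10,
# matching the "10 parts" in the slide), then the meta-model is trained on them.
stack = StackingClassifier(estimators=base_models,
                           final_estimator=RandomForestClassifier(random_state=42),
                           cv=10)
stack.fit(X_train, y_train)
print("Stacking accuracy:", stack.score(X_test, y_test))

# Blending: base models are trained on part of the training data, and their
# predictions on a holdout set become the meta-model's training features.
X_fit, X_hold, y_fit, y_hold = train_test_split(X_train, y_train,
                                                test_size=0.3, random_state=42)
holdout_preds, test_preds = [], []
for _, model in base_models:
    model.fit(X_fit, y_fit)
    holdout_preds.append(model.predict_proba(X_hold)[:, 1])
    test_preds.append(model.predict_proba(X_test)[:, 1])

meta_model = RandomForestClassifier(random_state=42)
meta_model.fit(np.column_stack(holdout_preds), y_hold)
print("Blending accuracy:",
      meta_model.score(np.column_stack(test_preds), y_test))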
Key Differences Between Stacking and Blending
• Use blending if you are quickly experimenting or working with large datasets.
• Use stacking for more robust, high-performance models, especially in competitions or production.

Bagging
• The idea behind bagging is to combine the results of multiple models (for instance, all decision trees) to get a generalized result.
• Here is a question: if you create all the models on the same set of data and combine them, will it be useful? There is a high chance that these models will give the same result, since they are getting the same input. So how can we solve this problem? One of the techniques is bootstrapping.
• Bootstrapping is a sampling technique in which we create subsets of observations from the original dataset, with replacement. The size of each subset is the same as the size of the original set.
• The Bagging (Bootstrap Aggregating) technique uses these subsets (bags) to get a fair idea of the distribution of the complete set. The size of the subsets created for bagging may be less than that of the original set.

Steps of Bagging
• Multiple subsets are created from the original dataset, selecting observations with replacement.
• A base model (weak model) is created on each of these subsets.
• The models run in parallel and are independent of each other.
• The final predictions are determined by combining the predictions from all the models.

Boosting
• Boosting is an ensemble learning technique that can be used to solve complex, data-driven, real-world problems.
• Boosting uses a set of machine learning algorithms to convert weak learners into strong learners in order to increase the accuracy of the model.

Difference between Bagging and Boosting
• Parallel ensemble, popularly known as bagging:
  • The weak learners are produced in parallel during the training phase.
  • The performance of the model can be increased by training a number of weak learners in parallel on bootstrapped data sets.
  • Examples: Random Forest algorithm, Bagging meta-estimator
• Sequential ensemble, popularly known as boosting:
  • The weak learners are produced sequentially during the training phase.
  • The performance of the model is improved by assigning a higher weight to the previously misclassified samples.
  • Examples: AdaBoost, Gradient Boosting, XGBoost

How Boosting Works
• Step 1: The base algorithm reads the data and assigns equal weight to each sample observation.
• Step 2: False predictions made by the base learner are identified. In the next iteration, the next base learner is trained with a higher weight on these incorrect predictions.
• Step 3: Repeat step 2 until the algorithm can correctly classify the output.
• The main aim of boosting is to focus more on misclassified predictions, but it can be used for regression problems as well.

Adaptive Boosting (AdaBoost)
• AdaBoost is implemented by combining several weak learners into a single strong learner.
• The weak learners in AdaBoost take into account a single input feature and draw out a single-split decision tree called a decision stump.
• Each observation is weighed equally while drawing out the first decision stump.
• The results from the first decision stump are analyzed, and if any observations are wrongly classified, they are assigned higher weights.
• After this, a new decision stump is drawn by considering the observations with higher weights as more significant.
• Again, if any observations are misclassified, they are given higher weight, and this process continues until all the observations fall into the right class.
• AdaBoost can be used for both classification and regression problems; however, it is more commonly used for classification.
• A minimal bagging and AdaBoost sketch is shown below.
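As a rough illustration of the bagging vs. boosting contrast above, the sketch below trains a BaggingClassifier (independent trees on bootstrap samples, combined by voting) and an AdaBoostClassifier (decision stumps added sequentially with sample reweighting). The dataset and hyperparameters are assumptions, and the keyword name estimator follows recent scikit-learn versions.

# Bagging (parallel, bootstrapped trees) vs. AdaBoost (sequential decision stumps).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import BaggingClassifier, AdaBoostClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Bagging: each tree sees a bootstrap sample of the training data and the
# trees are fitted independently; predictions are combined by majority vote.
bagging = BaggingClassifier(estimator=DecisionTreeClassifier(random_state=42),
                            n_estimators=100, bootstrap=True, random_state=42)
bagging.fit(X_train, y_train)
print("Bagging accuracy:", bagging.score(X_test, y_test))

# AdaBoost: decision stumps (max_depth=1) are added one after another, with
# misclassified samples reweighted so later stumps focus on them.
adaboost = AdaBoostClassifier(estimator=DecisionTreeClassifier(max_depth=1),
                              n_estimators=100, random_state=42)
adaboost.fit(X_train, y_train)
print("AdaBoost accuracy:", adaboost.score(X_test, y_test))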
Gradient Boosting
• The difference in this type of boosting is that the weights for misclassified outcomes are not incremented; instead, gradient boosting tries to optimize the loss function of the previous learner by adding a new model that adds weak learners in order to reduce the loss.
• The main idea is to overcome the errors in the previous learner's predictions. This type of boosting has three main components:
  • A loss function that needs to be optimized.
  • A weak learner for computing predictions and forming strong learners.
  • An additive model that adds weak learners to minimize the loss function.
• Like AdaBoost, gradient boosting can be used for both classification and regression problems.

XGBoost
• Motivation: XGBoost is an advanced version of the gradient boosting method; it literally means eXtreme Gradient Boosting. XGBoost, developed by Tianqi Chen, falls under the Distributed Machine Learning Community (DMLC) project.
• The main aim of this algorithm is to increase the speed and efficiency of computation. The gradient boosting algorithm computes its output at a slower rate since it analyzes the data set sequentially; XGBoost is therefore used to boost, or "extremely boost", the performance of the model.
• XGBoost is designed to focus on computational speed and model efficiency. The main features provided by XGBoost are:
  • Creating decision trees in parallel.
  • Implementing distributed computing methods for evaluating large and complex models.
  • Using out-of-core computing to analyze huge datasets.
  • Implementing cache optimization to make the best use of resources.
• A minimal gradient boosting and XGBoost sketch is shown below.
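To round off, here is a minimal sketch comparing scikit-learn's GradientBoostingClassifier with XGBClassifier from the third-party xgboost package; the dataset and hyperparameters are assumptions, and the second part assumes xgboost is installed.

# Gradient boosting (scikit-learn) and XGBoost (xgboost package) on the same data.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier
from xgboost import XGBClassifier  # assumes the xgboost package is installed

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Classic gradient boosting: trees are added sequentially, each one fitted to
# the gradient of the loss of the current ensemble.
gbm = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
                                 max_depth=3, random_state=42)
gbm.fit(X_train, y_train)
print("Gradient Boosting accuracy:", gbm.score(X_test, y_test))

# XGBoost: the same boosting idea with engineering optimizations such as
# parallel tree construction, cache awareness, and out-of-core computation.
xgb = XGBClassifier(n_estimators=100, learning_rate=0.1,
                    max_depth=3, random_state=42)
xgb.fit(X_train, y_train)
print("XGBoost accuracy:", xgb.score(X_test, y_test))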