ML Lecture 7 - Ensemble Learning

Ensemble Learning in Machine Learning

Prof. Dr. Dewan Md. Farid

Dept. of CSE, UIU

July 04, 2023

Outline

Ensemble Learning

Random Forest

Bagging

Boosting

Ensemble of Trees


Ensemble Classifier

- It is the process of combining different classification techniques to build a powerful composite model from the data.
- It returns a class label prediction for new instances based on the individual classifiers' votes.
- It improves the classification accuracy of class-imbalanced data.


Ensemble Classifier (con.)

Figure: An example of an ensemble classifier.


Ensemble Classifier (con.)


It is often advantageous to take the training data and derive several sub-datasets from it, learn a classifier from each, and combine them to produce an ensemble model.

Figure: Ensemble model to improve classification accuracy. (The data, D, is split into sub-datasets D1, ..., Dk; models M1, ..., Mk are learned from them, and their votes are combined to predict the class of new instances.)
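
A minimal sketch of the combine-votes step, assuming a list of already trained scikit-learn-style models and integer class labels; the helper name majority_vote is a placeholder, not part of the lecture.

# Hypothetical helper: combine the votes of k trained models (each must expose .predict).
import numpy as np

def majority_vote(models, X_new):
    votes = np.array([m.predict(X_new) for m in models])            # shape (k, n_instances)
    # For each instance (one column of votes), pick the class label that occurs most often.
    return np.array([np.bincount(v).argmax() for v in votes.T])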


Random Forest

The Random Forest, also known as Random Decision Forest, is an ensemble learning method for classification and regression that is able to classify large amounts of data with high accuracy.
- It constructs a number of decision trees from randomly selected subsets of features at training time, and outputs the class for which the ensemble of trees votes most often.
- The selection of a random subset of features is an example of the random subspace method.

The random subspace method (or attribute bagging) is also an ensemble classifier: it consists of several classifiers, each operating in a subspace of the original feature space, and it outputs the class based on the outputs of these individual classifiers. It is an attractive choice for classifying high-dimensional data.
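
As a hedged usage sketch, scikit-learn's RandomForestClassifier implements this idea; max_features controls the size of the random feature subset considered at each split, and the dataset and parameter values below are arbitrary choices, not from the lecture.

# Sketch: a random forest in scikit-learn (dataset and parameter values are arbitrary).
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 100 trees; each split considers a random subset of sqrt(n_features) features.
rf = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)
rf.fit(X_train, y_train)
print("test accuracy:", rf.score(X_test, y_test))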


Bagging
- Combining the decisions of different data mining models means merging the various outputs into a single prediction. The simplest way to do this in the case of classification is to take a vote (perhaps a weighted vote); in the case of numeric prediction it is to calculate the average (perhaps a weighted average).
- Bagging and boosting both adopt this approach, but they derive the individual models in different ways. In bagging the models receive equal weight, whereas in boosting weighting is used to give more influence to the more successful ones.
- To introduce bagging, suppose that several training datasets of the same size are chosen at random from the problem domain. Imagine using a particular machine learning technique to build a decision tree for each dataset. We can combine the trees by having them vote on each test instance. If one class receives more votes than any other, it is taken as the correct one. Predictions made by voting become more reliable as more votes are taken into account.

Bagging (con.)
Bagging, also known as Bootstrap Aggregation, combines different classifiers into a single prediction model. It uses a voting technique (perhaps a weighted vote) to classify a new instance.

Algorithm 1 Bagging Algorithm

Input: Training data, D, number of iterations, k, and a learning scheme.
Output: Ensemble model, M∗
Method:
1: for i = 1 to k do
2:   create a bootstrap sample, Di, by sampling D with replacement;
3:   use Di and the learning scheme to derive a model, Mi;
4: end for
To use M∗ to classify a new instance, xNew:
each Mi ∈ M∗ classifies xNew, and the majority vote is returned;
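
A minimal Python sketch of Algorithm 1, assuming decision trees as the learning scheme and integer class labels; an illustrative implementation under those assumptions, not a definitive one.

# Sketch of Algorithm 1 (bagging): k bootstrap samples, one tree per sample, majority vote.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging_fit(X, y, k=10, seed=0):
    rng = np.random.default_rng(seed)
    ensemble = []
    for _ in range(k):
        idx = rng.choice(len(X), size=len(X), replace=True)            # bootstrap sample Di
        ensemble.append(DecisionTreeClassifier().fit(X[idx], y[idx]))  # model Mi
    return ensemble                                                    # ensemble model M*

def bagging_predict(ensemble, X_new):
    votes = np.array([m.predict(X_new) for m in ensemble])
    return np.array([np.bincount(v).argmax() for v in votes.T])        # majority vote per instance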


Boosting

AdaBoost (short for Adaptive Boosting) is a popular boosting algorithm, which considers a series of classifiers and combines the votes of each individual classifier to classify known or unknown instances.
- In boosting, weights are assigned to each training instance.
- A series of k classifiers is iteratively learned.
- After a classifier, Mi, is learned, the weights are updated to allow the subsequent classifier, Mi+1, to pay more attention to the instances that were misclassified by Mi.
- The final boosted classifier, M∗, combines the votes of each individual classifier.


Boosting (con.)

In boosting, weights are assigned to each training instance. A series of k classifiers is iteratively learned. After a classifier, Mi, is learned, the weights are updated to allow the subsequent classifier, Mi+1, to "pay more attention" to the training instances that were misclassified by Mi. The final boosted classifier, M∗, combines the votes of each individual classifier, where the weight of each classifier's vote is a function of its accuracy.


Boosting (con.)

AdaBoost (short for Adaptive Boosting) is a popular boosting algorithm. Suppose we want to boost the accuracy of a learning method. We are given D, a data set of d class-labeled instances, (x1, y1), (x2, y2), ..., (xd, yd), where yi is the class label of instance xi. Initially, AdaBoost assigns each training instance an equal weight of 1/d. Generating k classifiers for the ensemble requires k rounds through the rest of the algorithm. We can sample to form any sized training set, not necessarily of size d. Sampling with replacement is used: the same instance may be selected more than once. Each instance's chance of being selected is based on its weight. A classifier model, Mi, is derived from the training instances of Di. Its error is then calculated using Di as a test set. The weights of the training instances are then adjusted according to how they were classified.


Boosting (con.)

If an instance was incorrectly classified, its weight is increased. If an instance was correctly classified, its weight is decreased. An instance's weight reflects how difficult it is to classify: the higher the weight, the more often it has been misclassified. These weights will be used to generate the training samples for the classifier of the next round. The basic idea is that when we build a classifier, we want it to focus more on the misclassified instances of the previous round. Some classifiers may be better at classifying some "difficult" instances than others. In this way, we build a series of classifiers that complement each other.


Error Rate

To compute the error rate of model Mi, we sum the weights of each of the instances in Di that Mi misclassified. That is,

error(Mi) = Σ_{j=1}^{d} wj · err(xj)    (1)

where err(xj) is the misclassification error of instance xj: if the instance xj was misclassified, then err(xj) is 1; otherwise, it is 0. If the performance of classifier Mi is so poor that its error exceeds 0.5, then we abandon it. Instead, we try again by generating a new training set, Di, from which we derive a new Mi.
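
Eq. (1) written as a short NumPy sketch; the variable names are assumptions, and the weights are assumed to sum to 1.

# Weighted error of model Mi (Eq. 1): sum of the weights of the misclassified instances.
import numpy as np

def weighted_error(weights, y_true, y_pred):
    err = (y_pred != y_true).astype(float)   # err(xj): 1 if xj is misclassified, else 0
    return float(np.sum(weights * err))      # assumes the weights sum to 1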


Normalising Weight

If an instance in round i was correctly classified, its weight is multiplied by error(Mi)/(1 - error(Mi)). Once the weights of all of the correctly classified instances are updated, the weights for all instances (including the misclassified instances) are normalised so that their sum remains the same as it was before. To normalise a weight, we multiply it by the sum of the old weights, divided by the sum of the new weights. As a result, the weights of misclassified instances are increased and the weights of correctly classified instances are decreased.
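
A short NumPy sketch of this update (the names are placeholders): correctly classified instances are down-weighted by error(Mi)/(1 - error(Mi)) and the whole weight vector is rescaled so that its sum is unchanged.

# Sketch of the AdaBoost weight update followed by renormalisation.
import numpy as np

def update_weights(weights, correct_mask, error):
    old_sum = weights.sum()
    new_w = weights.copy()
    new_w[correct_mask] *= error / (1.0 - error)   # shrink weights of correctly classified instances
    new_w *= old_sum / new_w.sum()                 # renormalise: total weight stays the same as before
    return new_w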


AdaBoost Algorithm
Algorithm 2 AdaBoost Algorithm
Input: Training data, D, number of iterations, k, and a learning scheme.
Output: Ensemble model, M∗
Method:
1: initialise the weight of each xi ∈ D to 1/d;
2: for i = 1 to k do
3:   sample D with replacement according to the instance weights to obtain Di;
4:   use Di and the learning scheme to derive a model, Mi;
5:   compute error(Mi);
6:   if error(Mi) ≥ 0.5 then
7:     go back to step 3 and try again;
8:   end if
9:   for each correctly classified xi ∈ D do
10:    multiply the weight of xi by error(Mi)/(1 - error(Mi));
11:  end for
12:  normalise the weights of the instances;
13: end for
To use M∗ to classify a new instance, xNew:
1: initialise the weight of each class to zero;
2: for i = 1 to k do
3:   wi = log((1 - error(Mi))/error(Mi)); // weight of the classifier's vote
4:   c = Mi(xNew); // class prediction by Mi
5:   add wi to the weight for class c;
6: end for
7: return the class with the largest weight;
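
A compact Python sketch of Algorithm 2, assuming decision stumps as the learning scheme and integer class labels; it is a rough illustration under those assumptions (for simplicity the weighted error is measured on D rather than on the sample Di), not a reference implementation.

# Sketch of Algorithm 2 (AdaBoost) with decision stumps as the base learner.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, k=10, seed=0):
    rng = np.random.default_rng(seed)
    d = len(X)
    w = np.full(d, 1.0 / d)                            # step 1: equal initial weights 1/d
    models, alphas = [], []
    for _ in range(k):
        while True:                                    # steps 6-7: retry if error(Mi) >= 0.5
            idx = rng.choice(d, size=d, replace=True, p=w / w.sum())   # weighted sampling for Di
            m = DecisionTreeClassifier(max_depth=1).fit(X[idx], y[idx])
            error = float(np.sum(w * (m.predict(X) != y)))             # weighted error (simplified: on D)
            if error < 0.5:
                break
        error = max(error, 1e-10)                      # guard the ratio/log when error is zero
        correct = (m.predict(X) == y)
        w[correct] *= error / (1.0 - error)            # down-weight correctly classified instances
        w /= w.sum()                                   # normalise the weights
        models.append(m)
        alphas.append(np.log((1.0 - error) / error))   # weight of this classifier's vote
    return models, alphas

def adaboost_predict(models, alphas, X_new):
    classes = np.unique(np.concatenate([m.classes_ for m in models]))
    scores = np.zeros((len(X_new), len(classes)))
    for m, a in zip(models, alphas):
        preds = m.predict(X_new)
        for ci, c in enumerate(classes):
            scores[preds == c, ci] += a                # add the vote weight to the predicted class
    return classes[np.argmax(scores, axis=1)]          # class with the largest total weight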


Ensemble of Trees

- It combines a number of decision trees in order to reduce the risk of overfitting.
- It creates several sub-datasets, D1, ..., Di, ..., Dk, from the original training data, D.
- It groups the features of each sub-dataset, Di, and builds a tree, DTj, on each group.
- It calculates the error rate of DTj on the sub-dataset Di. If the error rate of DTj is less than or equal to a threshold value, then the tree is considered for the ensemble.
- To make a prediction on a new instance, each tree's prediction is counted as a vote for one class. The label is predicted to be the class that receives the most votes (majority voting).


Algorithm 3 Ensemble of Trees

Input: Training data, D, and the C4.5 learning algorithm.
Output: A set of trees, DT∗
Method:
1: DT∗ = ∅;
2: create sub-datasets, D1, ..., Di, ..., Dk, from the training data, D;
3: for i = 1 to k do
4:   group the features in Di into m groups;
5:   for j = 1 to m do
6:     build a tree, DTj, with the jth feature group;
7:     compute error(DTj) on Di;
8:     if error(DTj) ≤ threshold value then
9:       DT∗ = DT∗ ∪ DTj;
10:    end if
11:  end for
12: end for
To use DT∗ to classify a new instance, xNew:
each DTj ∈ DT∗ classifies xNew, and the majority vote is returned;
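
A rough Python sketch of Algorithm 3, with scikit-learn's CART trees standing in for C4.5; the values of k, the number of feature groups m, and the error threshold are assumptions.

# Sketch of Algorithm 3 (ensemble of trees); a CART tree stands in for C4.5 here.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def ensemble_of_trees_fit(X, y, k=5, m=3, threshold=0.3, seed=0):
    rng = np.random.default_rng(seed)
    ensemble = []                                                # DT* starts empty
    for _ in range(k):                                           # sub-datasets D1..Dk
        idx = rng.choice(len(X), size=len(X), replace=True)
        Xi, yi = X[idx], y[idx]
        groups = np.array_split(rng.permutation(X.shape[1]), m)  # m feature groups
        for cols in groups:
            tree = DecisionTreeClassifier(max_depth=3).fit(Xi[:, cols], yi)  # DTj on group j
            error = 1.0 - tree.score(Xi[:, cols], yi)            # error(DTj) on Di
            if error <= threshold:
                ensemble.append((tree, cols))                    # keep DTj for the ensemble
    return ensemble

def ensemble_of_trees_predict(ensemble, X_new):
    votes = np.array([tree.predict(X_new[:, cols]) for tree, cols in ensemble])
    return np.array([np.bincount(v).argmax() for v in votes.T])  # majority voting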


*** THANK YOU ***
