ML Lecture 7 - Ensemble Learning

Ensemble Learning in Machine Learning

Prof. Dr. Dewan Md. Farid

Dept. of CSE, UIU

July 04, 2023

Outline

Ensemble Learning

Random Forest

Bagging

Boosting

Ensemble of Trees


Ensemble Classifier

- It is the process of combining different classification techniques to build a powerful composite model from the data.
- It returns a class label prediction for new instances based on the individual classifiers' votes.
- It improves the classification accuracy of class-imbalanced data.


Ensemble Classifier (con.)

Figure: An example of an ensemble classifier.


Ensemble Classifier (con.)


It is often advantageous to take the training data and derive several sub-datasets from it, learn a classifier from each, and combine them to produce an ensemble model.

Figure: Ensemble model to improve classification accuracy. (The data, D, is split into sub-datasets D1, ..., Dk; models M1, ..., Mk are learned from them, and their votes are combined to predict the class of new instances.)
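
A minimal sketch of the combine-votes step, assuming a list of already trained scikit-learn-style models and integer class labels; the helper name majority_vote is a placeholder, not part of the lecture.

# Hypothetical helper: combine the votes of k trained models (each must expose .predict).
import numpy as np

def majority_vote(models, X_new):
    votes = np.array([m.predict(X_new) for m in models])            # shape (k, n_instances)
    # For each instance (one column of votes), pick the class label that occurs most often.
    return np.array([np.bincount(v).argmax() for v in votes.T])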


Random Forest

The Random Forest, also known as Random Decision Forest, is an ensemble learning method for classification and regression that is able to classify large amounts of data with high accuracy.
- It constructs a number of decision trees from randomly selected subsets of features at training time, and outputs the class for which the ensemble of trees votes most often.
- The selection of a random subset of features is an example of the random subspace method.

The random subspace method (or attribute bagging) is also an ensemble classifier: it consists of several classifiers, each operating in a subspace of the original feature space, and it outputs the class based on the outputs of these individual classifiers. It is an attractive choice for classifying high-dimensional data.
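
As a hedged usage sketch, scikit-learn's RandomForestClassifier implements this idea; max_features controls the size of the random feature subset considered at each split, and the dataset and parameter values below are arbitrary choices, not from the lecture.

# Sketch: a random forest in scikit-learn (dataset and parameter values are arbitrary).
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 100 trees; each split considers a random subset of sqrt(n_features) features.
rf = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)
rf.fit(X_train, y_train)
print("test accuracy:", rf.score(X_test, y_test))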


Bagging
- Combining the decisions of different data mining models means merging the various outputs into a single prediction. The simplest way to do this in the case of classification is to take a vote (perhaps a weighted vote); in the case of numeric prediction it is to calculate the average (perhaps a weighted average).
- Bagging and boosting both adopt this approach, but they derive the individual models in different ways. In bagging the models receive equal weight, whereas in boosting weighting is used to give more influence to the more successful ones.
- To introduce bagging, suppose that several training datasets of the same size are chosen at random from the problem domain. Imagine using a particular machine learning technique to build a decision tree for each dataset. We can combine the trees by having them vote on each test instance. If one class receives more votes than any other, it is taken as the correct one. Predictions made by voting become more reliable as more votes are taken into account.

Bagging (con.)
Bagging, also known as Bootstrap Aggregation, combines different classifiers into a single prediction model. It uses a voting technique (perhaps a weighted vote) to classify a new instance.

Algorithm 1 Bagging Algorithm

Input: Training data, D, number of iterations, k, and a learning scheme.
Output: Ensemble model, M∗
Method:
1: for i = 1 to k do
2:   create a bootstrap sample, Di, by sampling D with replacement;
3:   use Di and the learning scheme to derive a model, Mi;
4: end for
To use M∗ to classify a new instance, xNew:
each Mi ∈ M∗ classifies xNew, and the majority vote is returned;
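
A minimal Python sketch of Algorithm 1, assuming decision trees as the learning scheme and integer class labels; an illustrative implementation under those assumptions, not a definitive one.

# Sketch of Algorithm 1 (bagging): k bootstrap samples, one tree per sample, majority vote.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging_fit(X, y, k=10, seed=0):
    rng = np.random.default_rng(seed)
    ensemble = []
    for _ in range(k):
        idx = rng.choice(len(X), size=len(X), replace=True)            # bootstrap sample Di
        ensemble.append(DecisionTreeClassifier().fit(X[idx], y[idx]))  # model Mi
    return ensemble                                                    # ensemble model M*

def bagging_predict(ensemble, X_new):
    votes = np.array([m.predict(X_new) for m in ensemble])
    return np.array([np.bincount(v).argmax() for v in votes.T])        # majority vote per instance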


Boosting

AdaBoost (short for Adaptive Boosting) is a popular boosting algorithm, which considers a series of classifiers and combines the votes of each individual classifier to classify known or unknown instances.
- In boosting, weights are assigned to each training instance.
- A series of k classifiers is iteratively learned.
- After a classifier, Mi, is learned, the weights are updated to allow the subsequent classifier, Mi+1, to pay more attention to the instances that were misclassified by Mi.
- The final boosted classifier, M∗, combines the votes of each individual classifier.


Boosting (con.)

In boosting, weights are assigned to each training instance. A series of k classifiers is iteratively learned. After a classifier, Mi, is learned, the weights are updated to allow the subsequent classifier, Mi+1, to "pay more attention" to the training instances that were misclassified by Mi. The final boosted classifier, M∗, combines the votes of each individual classifier, where the weight of each classifier's vote is a function of its accuracy.


Boosting (con.)

AdaBoost (short for Adaptive Boosting) is a popular boosting algorithm. Suppose we want to boost the accuracy of a learning method. We are given D, a data set of d class-labeled instances, (x1, y1), (x2, y2), ..., (xd, yd), where yi is the class label of instance xi. Initially, AdaBoost assigns each training instance an equal weight of 1/d. Generating k classifiers for the ensemble requires k rounds through the rest of the algorithm. We can sample to form any sized training set, not necessarily of size d. Sampling with replacement is used: the same instance may be selected more than once. Each instance's chance of being selected is based on its weight. A classifier model, Mi, is derived from the training instances of Di. Its error is then calculated using Di as a test set. The weights of the training instances are then adjusted according to how they were classified.


Boosting (con.)

If an instance was incorrectly classified, its weight is increased. If an instance was correctly classified, its weight is decreased. An instance's weight reflects how difficult it is to classify: the higher the weight, the more often it has been misclassified. These weights will be used to generate the training samples for the classifier of the next round. The basic idea is that when we build a classifier, we want it to focus more on the misclassified instances of the previous round. Some classifiers may be better at classifying some "difficult" instances than others. In this way, we build a series of classifiers that complement each other.


Error Rate

To compute the error rate of model Mi, we sum the weights of each of the instances in Di that Mi misclassified. That is,

error(Mi) = Σ_{j=1}^{d} wj · err(xj)    (1)

where err(xj) is the misclassification error of instance xj: if the instance xj was misclassified, then err(xj) is 1; otherwise, it is 0. If the performance of classifier Mi is so poor that its error exceeds 0.5, then we abandon it. Instead, we try again by generating a new training set, Di, from which we derive a new Mi.
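
Eq. (1) written as a short NumPy sketch; the variable names are assumptions, and the weights are assumed to sum to 1.

# Weighted error of model Mi (Eq. 1): sum of the weights of the misclassified instances.
import numpy as np

def weighted_error(weights, y_true, y_pred):
    err = (y_pred != y_true).astype(float)   # err(xj): 1 if xj is misclassified, else 0
    return float(np.sum(weights * err))      # assumes the weights sum to 1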


Normalising Weight

If an instance in round i was correctly classified, its weight is multiplied by error(Mi)/(1 - error(Mi)). Once the weights of all of the correctly classified instances are updated, the weights for all instances (including the misclassified instances) are normalised so that their sum remains the same as it was before. To normalise a weight, we multiply it by the sum of the old weights, divided by the sum of the new weights. As a result, the weights of misclassified instances are increased and the weights of correctly classified instances are decreased.
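
A short NumPy sketch of this update (the names are placeholders): correctly classified instances are down-weighted by error(Mi)/(1 - error(Mi)) and the whole weight vector is rescaled so that its sum is unchanged.

# Sketch of the AdaBoost weight update followed by renormalisation.
import numpy as np

def update_weights(weights, correct_mask, error):
    old_sum = weights.sum()
    new_w = weights.copy()
    new_w[correct_mask] *= error / (1.0 - error)   # shrink weights of correctly classified instances
    new_w *= old_sum / new_w.sum()                 # renormalise: total weight stays the same as before
    return new_w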


AdaBoost Algorithm
Algorithm 2 AdaBoost Algorithm
Input: Training data, D, number of iterations, k, and a learning scheme.
Output: Ensemble model, M∗
Method:
1: initialise the weight of each xi ∈ D to 1/d;
2: for i = 1 to k do
3:   sample D with replacement according to the instance weights to obtain Di;
4:   use Di and the learning scheme to derive a model, Mi;
5:   compute error(Mi);
6:   if error(Mi) ≥ 0.5 then
7:     go back to step 3 and try again;
8:   end if
9:   for each correctly classified xi ∈ D do
10:    multiply the weight of xi by error(Mi)/(1 - error(Mi));
11:  end for
12:  normalise the weights of the instances;
13: end for
To use M∗ to classify a new instance, xNew:
1: initialise the weight of each class to zero;
2: for i = 1 to k do
3:   wi = log((1 - error(Mi))/error(Mi)); // weight of the classifier's vote
4:   c = Mi(xNew); // class prediction by Mi
5:   add wi to the weight for class c;
6: end for
7: return the class with the largest weight;
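
A compact Python sketch of Algorithm 2, assuming decision stumps as the learning scheme and integer class labels; it is a rough illustration under those assumptions (for simplicity the weighted error is measured on D rather than on the sample Di), not a reference implementation.

# Sketch of Algorithm 2 (AdaBoost) with decision stumps as the base learner.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, k=10, seed=0):
    rng = np.random.default_rng(seed)
    d = len(X)
    w = np.full(d, 1.0 / d)                            # step 1: equal initial weights 1/d
    models, alphas = [], []
    for _ in range(k):
        while True:                                    # steps 6-7: retry if error(Mi) >= 0.5
            idx = rng.choice(d, size=d, replace=True, p=w / w.sum())   # weighted sampling for Di
            m = DecisionTreeClassifier(max_depth=1).fit(X[idx], y[idx])
            error = float(np.sum(w * (m.predict(X) != y)))             # weighted error (simplified: on D)
            if error < 0.5:
                break
        error = max(error, 1e-10)                      # guard the ratio/log when error is zero
        correct = (m.predict(X) == y)
        w[correct] *= error / (1.0 - error)            # down-weight correctly classified instances
        w /= w.sum()                                   # normalise the weights
        models.append(m)
        alphas.append(np.log((1.0 - error) / error))   # weight of this classifier's vote
    return models, alphas

def adaboost_predict(models, alphas, X_new):
    classes = np.unique(np.concatenate([m.classes_ for m in models]))
    scores = np.zeros((len(X_new), len(classes)))
    for m, a in zip(models, alphas):
        preds = m.predict(X_new)
        for ci, c in enumerate(classes):
            scores[preds == c, ci] += a                # add the vote weight to the predicted class
    return classes[np.argmax(scores, axis=1)]          # class with the largest total weight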


Ensemble of Trees

- It combines a number of decision trees in order to reduce the risk of overfitting.
- It creates several sub-datasets, D1, ..., Di, ..., Dk, from the original training data, D.
- It groups the features of each sub-dataset, Di, and builds a tree, DTj, on each group.
- It calculates the error rate of DTj on the sub-dataset Di. If the error rate of DTj is less than or equal to a threshold value, then the tree is considered for the ensemble.
- To make a prediction on a new instance, each tree's prediction is counted as a vote for one class. The label is predicted to be the class that receives the most votes (majority voting).


Algorithm 3 Ensemble of Trees

Input: Training data, D, and the C4.5 learning algorithm.
Output: A set of trees, DT∗
Method:
1: DT∗ = ∅;
2: create sub-datasets, D1, ..., Di, ..., Dk, from the training data, D;
3: for i = 1 to k do
4:   group the features in Di into m groups;
5:   for j = 1 to m do
6:     build a tree, DTj, with the jth feature group;
7:     compute error(DTj) on Di;
8:     if error(DTj) ≤ threshold value then
9:       DT∗ = DT∗ ∪ DTj;
10:    end if
11:  end for
12: end for
To use DT∗ to classify a new instance, xNew:
each DTj ∈ DT∗ classifies xNew, and the majority vote is returned;
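
A rough Python sketch of Algorithm 3, with scikit-learn's CART trees standing in for C4.5; the values of k, the number of feature groups m, and the error threshold are assumptions.

# Sketch of Algorithm 3 (ensemble of trees); a CART tree stands in for C4.5 here.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def ensemble_of_trees_fit(X, y, k=5, m=3, threshold=0.3, seed=0):
    rng = np.random.default_rng(seed)
    ensemble = []                                                # DT* starts empty
    for _ in range(k):                                           # sub-datasets D1..Dk
        idx = rng.choice(len(X), size=len(X), replace=True)
        Xi, yi = X[idx], y[idx]
        groups = np.array_split(rng.permutation(X.shape[1]), m)  # m feature groups
        for cols in groups:
            tree = DecisionTreeClassifier(max_depth=3).fit(Xi[:, cols], yi)  # DTj on group j
            error = 1.0 - tree.score(Xi[:, cols], yi)            # error(DTj) on Di
            if error <= threshold:
                ensemble.append((tree, cols))                    # keep DTj for the ensemble
    return ensemble

def ensemble_of_trees_predict(ensemble, X_new):
    votes = np.array([tree.predict(X_new[:, cols]) for tree, cols in ensemble])
    return np.array([np.bincount(v).argmax() for v in votes.T])  # majority voting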


*** THANK YOU ***
