
Machine Learning

Dr. Sunil Saumya
IIIT Dharwad

Ensemble Methods: Bagging and Boosting
Ensemble Learning
● A common problem in machine learning is that individual models tend to perform poorly.
○ In other words, they tend to have low prediction accuracy.
● To mitigate this problem, we combine multiple models to obtain one with better performance. This process is known as ensemble learning.
○ The individual models that we combine are known as weak learners.
○ We call them weak learners because they have either high bias or high variance.
Ensemble Learning
● Ensemble learning improves a model's performance in mainly three ways:
○ By reducing the variance of weak learners
○ By reducing the bias of weak learners
○ By improving the overall accuracy of strong learners
● Approaches in ensemble learning (a side-by-side sketch of the three follows below):
○ Bagging: used to reduce the variance of weak learners
○ Boosting: used to reduce the bias of weak learners
○ Stacking: used to improve the overall accuracy of strong learners
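The sketch below shows the three approaches side by side on a synthetic dataset, assuming scikit-learn is available; the base estimators and parameter values are illustrative choices, not prescribed by these slides.

```python
# Bagging, boosting, and stacking on one synthetic dataset (illustrative sketch).
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, AdaBoostClassifier, StackingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Bagging: average many high-variance trees to reduce variance
bagging = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50, random_state=0)
# Boosting: fit weak learners sequentially to reduce bias
boosting = AdaBoostClassifier(n_estimators=50, random_state=0)
# Stacking: use base-model predictions as features for a meta-model
stacking = StackingClassifier(
    estimators=[("tree", DecisionTreeClassifier(max_depth=3)),
                ("lr", LogisticRegression(max_iter=1000))],
    final_estimator=LogisticRegression())

for name, model in [("bagging", bagging), ("boosting", boosting), ("stacking", stacking)]:
    model.fit(X_train, y_train)
    print(name, model.score(X_test, y_test))
```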
Bagging
● Voting or averaging of the predictions of multiple independently trained models.
Boosting
● Models are trained sequentially, each one focusing on the mistakes of the previous ones, and their predictions are combined with weights.
Stacking
● Use the predictions of multiple models as "features" to train a new model, and use the new model to make predictions on test data.
Random Forest
● Random Forest is one of the most popular and widely used algorithms among data scientists.
● Random Forest is a supervised machine learning algorithm that is used widely in classification and regression problems.
● It builds decision trees on different samples and takes their majority vote for classification or their average for regression.
● Random Forest uses the bagging approach of ensemble learning.
○ Bagging is also known as Bootstrap Aggregation.
Random Forest: algorithm
● Step 1: In the Random Forest model, a subset of data points and a subset of features is selected to construct each decision tree. This sampling step is called bootstrapping.
○ Simply put, n random records and m random features are taken from a data set having k records.
● Step 2: An individual decision tree is constructed for each sample.
● Step 3: Each decision tree generates an output.
● Step 4: The final output is based on majority voting for classification and on averaging for regression (see the sketch below).
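A minimal sketch of the four steps, assuming NumPy and scikit-learn; the sample sizes (50 rows and 3 features per tree) are illustrative. For simplicity this sketch samples the feature subset once per tree; a true Random Forest re-samples features at every split (node level), as noted later in these slides, which scikit-learn's RandomForestClassifier does via its max_features parameter.

```python
# Hand-rolled bootstrap + aggregation, following Steps 1-4 above.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=100, n_features=6, random_state=0)
rng = np.random.default_rng(0)
n_trees, n_rows, m_feats = 25, 50, 3
trees, feat_sets = [], []

for _ in range(n_trees):
    # Step 1: bootstrap rows (with replacement) and sample a feature subset
    rows = rng.choice(len(X), size=n_rows, replace=True)
    feats = rng.choice(X.shape[1], size=m_feats, replace=False)
    # Step 2: build one decision tree per bootstrapped sample
    trees.append(DecisionTreeClassifier(random_state=0).fit(X[rows][:, feats], y[rows]))
    feat_sets.append(feats)

# Step 3: every tree generates an output; Step 4: take the majority vote
all_preds = np.array([t.predict(X[:, f]) for t, f in zip(trees, feat_sets)])
majority_vote = (all_preds.mean(axis=0) >= 0.5).astype(int)
print("ensemble training accuracy:", (majority_vote == y).mean())
```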
Random Forest: Bootstrapping
Bootstrapping:
● It is the process of sampling the dataset.
● The sampling can be done:
○ Row wise: with replacement or without replacement
○ Column wise
○ A combination of row and column sampling
● Running example: the original dataset has shape (100 × 6), i.e. 100 rows with features x1–x5 and a Vote label.
Random Forest: Bootstrapping (row-wise sampling)
From the original dataset of shape (100 × 6), 50 rows are sampled, giving a bootstrapped dataset of shape (50 × 6).

Without replacement (every sampled row is distinct):

  x1   x2   x3   x4   x5     Vote
  27   8    2    7    0.16   0
  40   42   2    9    0.26   1
  ...  ...  ...  ...  ...    ...
  14   13   32   6    0.87   0

With replacement (the same row can be picked more than once, e.g. (40, 42, 2, 9, 0.26, 1) below):

  x1   x2   x3   x4   x5     Vote
  2    25   8    5    0.34   0
  40   42   2    9    0.26   1
  ...  ...  ...  ...  ...    ...
  40   42   2    9    0.26   1
Random Forest: Bootstrapping (column-wise sampling)
Each bootstrapped dataset keeps all 100 rows but only a subset of the features plus the Vote label, e.g. shape (100 × 4) when 3 of the 5 features are sampled.

Dataset 1 (features x1, x3, x4):

  x1   x3   x4   Vote
  27   2    7    0
  40   2    9    1
  ...  ...  ...  ...
  14   32   6    0

Dataset 2 (features x2, x3, x5):

  x2   x3   x5     Vote
  25   8    0.34   0
  42   2    0.26   1
  ...  ...  ...    ...
  42   2    0.26   1
Random Forest: Bootstrapping (row + column wise sampling)
Here both strategies are combined: 50 rows and 3 of the 5 features (plus the Vote label) are sampled, giving bootstrapped datasets of shape (50 × 4).

Without replacement (features x1, x3, x4):

  x1   x3   x4   Vote
  27   2    7    0
  40   2    9    1
  ...  ...  ...  ...
  14   32   6    0

With replacement (features x1, x2, x5; the row (40, 42, 0.26, 1) repeats):

  x1   x2   x5     Vote
  2    25   0.34   0
  40   42   0.26   1
  ...  ...  ...    ...
  40   42   0.26   1
Random Forest: Aggregation
Aggregation:
● We aggregate the predictions of all individual decision trees.
○ In classification: we take a majority vote.
○ In regression: we take the average.
Random Forest: How does it perform so well?
● Each individual weak learner is exposed to only a subset of the instances (in the illustrated example, 1k out of 3k).
● The other 2k instances remain unseen by that decision tree.
● In general, roughly 30% of the data is kept unseen for every weak learner.
● RF performs well because it yields a low-bias, low-variance (LB-LV) model.
Random Forest Vs Bagging
Are Random Forest and Bagging same?
● Random Forest employs the bagging approach, but differs from bagging in
two ways:
○ Bagging is an ensemble of any ML model in general, but Random forest
is strictly an ensemble of decision trees.
○ Bagging employs a tree-level sampling strategy, whereas Random Forest
employs a node-level sampling strategy.
Random Forest: Important features
Example dataset: digit images of size 28 × 28, flattened into pixel features (one column per pixel; the middle columns are elided).

  Label  p0  p1  p2  p3  ...  ...  p782  p783  p784
  4      0   0   0   0   20   11   0     0     0
  9      0   0   0   0   186  0    0     0     0
  0      0   0   0   0   90   0    0     0     0
  6      0   0   0   0   54   90   0     0     0
  0      0   0   0   0   255  0    0     0     0
  4      0   0   0   0   0    87   0     0     0
Random Forest: Important features
● A trained random forest provides an importance value for every column in the dataset.
● Plotting these importances as a 28 × 28 heat map highlights the pixels the forest relies on most (a sketch of how to produce both follows below).
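A hedged sketch of how such an importance array and heat map can be produced, assuming scikit-learn and matplotlib; it uses the bundled 8 × 8 load_digits data in place of the 28 × 28 images shown on the slide.

```python
# Fit a forest on digit images and visualize per-pixel importances.
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier

X, y = load_digits(return_X_y=True)            # 64 pixel features per image
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

importances = forest.feature_importances_      # one importance value per pixel column
plt.imshow(importances.reshape(8, 8), cmap="hot")
plt.title("Pixel importance heat map")
plt.colorbar()
plt.show()
```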
Random Forest: Hyperparameters
● n_estimators: the number of trees the algorithm builds before averaging the predictions.
● max_features: the maximum number of features considered when splitting a node.
● min_samples_leaf: the minimum number of samples required to be at a leaf node.
● criterion: how to split the node in each tree (entropy / Gini impurity / log loss).
● max_leaf_nodes: the maximum number of leaf nodes in each tree.
(A usage sketch follows below.)
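A usage sketch of these hyperparameters with scikit-learn's RandomForestClassifier; the concrete values are illustrative assumptions, not recommendations from the slides.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

forest = RandomForestClassifier(
    n_estimators=200,      # number of trees built before aggregating
    max_features="sqrt",   # features considered when splitting a node
    min_samples_leaf=2,    # minimum samples required at a leaf
    criterion="gini",      # split criterion: "gini", "entropy", or "log_loss"
    max_leaf_nodes=50,     # maximum leaf nodes per tree
    random_state=0,
).fit(X, y)
print("training accuracy:", forest.score(X, y))
```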
Boosting: AdaBoost
● AdaBoost is a stage-wise additive method.
● Three important points to understand before starting the AdaBoost algorithm:
○ Weak Learners
○ Decision stumps
○ +1 and -1
● Weak Learners:
○ A weak learner produces a classifier which is only slightly more accurate
than random classification.
■ k-Nearest Neighbors, with k=1
■ Multi-Layer Perceptron, with a single node
■ Naive Bayes, operating on a single input variable.
Boosting: AdaBoost
● Three important points to understand before starting the AdaBoost algorithm:
○ Weak Learners
○ Decision stumps
○ +1 and -1
● Decision stumps:
○ A decision tree with a single split (one node) operating on one input variable; its output is used directly as the prediction (a one-line sketch follows below).
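A decision stump can be obtained, for example, as a scikit-learn decision tree capped at depth 1 (a minimal sketch with synthetic data):

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=5, random_state=0)
stump = DecisionTreeClassifier(max_depth=1).fit(X, y)   # one split, two leaves
print("stump accuracy (only slightly better than random):", stump.score(X, y))
```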
Boosting: AdaBoost
● Three important points to understand before starting the AdaBoost algorithm:
○ Weak Learners
○ Decision stumps
○ +1 and -1
● +1 and -1:
○ In AdaBoost:
■ For the positive class we use +1.
■ For the negative class we use -1.
● We do not use 0 for the negative class.
AdaBoost: geometric intuition
● The final classifier is a weighted combination of the individual weak learners h1, h2, h3 (with weights α1, α2, α3):

  h(x) = α1·h1(x) + α2·h2(x) + α3·h3(x)

● Each hi(x) outputs +1 or -1, and the sign of h(x) gives the final predicted class.
● Example: with weights α1 = 2, α2 = 3, α3 = 2 and individual predictions -1, +1, -1:

  h(x) = 2·(-1) + 3·(+1) + 2·(-1) = -2 + 3 - 2 = -1

so the combined prediction is the negative class.
AdaBoost: working example
Consider the initial dataset:

  X1  X2  Y
  3   7   1
  2   9   0
  1   4   1
  9   8   0
  3   7   0

(Original Dataset)
AdaBoost: working example
Give each row an initial weight of 1/n. Here the number of rows is n = 5, so every row starts with weight 0.2:

  X1  X2  Y  initial weight (= 1/n)
  3   7   1  0.2
  2   9   0  0.2
  1   4   1  0.2
  9   8   0  0.2
  3   7   0  0.2
AdaBoost: working example
Now we start with Stage 1 (Model 1):

  X1  X2  Y  weight
  3   7   1  0.2
  2   9   0  0.2
  1   4   1  0.2
  9   8   0  0.2
  3   7   0  0.2

● Create a decision stump for the given dataset.
● Suppose the decision stump we have created is: X1 > 5.
○ Only 1 data point (X1 = 9) falls on the X1 > 5 side; the other 4 data points fall on the X1 ≤ 5 side.
● Now make a prediction with this decision stump for every data point.
AdaBoost: working example
Stage 1 (Model 1):

  X1  X2  Y  Y-pred  weight
  3   7   1  1       0.2
  2   9   0  1       0.2
  1   4   1  0       0.2
  9   8   0  0       0.2
  3   7   0  0       0.2

● Looking at the Y-pred column, all predictions are correct except two: row 2 (Y = 0 but Y-pred = 1) and row 3 (Y = 1 but Y-pred = 0).
● Find α for Model 1.
○ α is the weight of Model 1 in the final prediction.
○ α depends on the error rate of Model 1.
■ The error rate measures how many mistakes Model 1 has made on the given dataset.
● If the error rate is low, α will be high, and vice versa.
AdaBoost: working example
Stage 1 (Model 1):
● Consider three possible error rates:
○ error rate (model a) = 0%
○ error rate (model b) = 100%
○ error rate (model c) = 50%
● Which of the above models is reliable?
○ Model a is clearly reliable: its error rate is 0%, so α will be large and positive.
○ Model b is also informative: its error rate is 100%, so α will be large and negative (its predictions can simply be inverted).
○ Model c is not reliable: its error rate is 50%, no better than random guessing, so α will be 0.
AdaBoost: working example
Stage 1 (Model 1): α is therefore computed from the error rate as

  α = ½ · ln((1 − error rate) / error rate)

which behaves as required:
  error rate (model a) = 0%   → α large and positive
  error rate (model b) = 100% → α large and negative
  error rate (model c) = 50%  → α = 0
AdaBoost: working example
Stage 1 (Model 1): error rate
● The error rate is the weighted sum of all misclassified data points.
● Two points (rows 2 and 3, each with weight 0.2) are misclassified, therefore:

  error rate (Model 1) = 0.2 + 0.2 = 0.4

  α1 = ½ · ln((1 − 0.4) / 0.4) = ½ · ln(0.6 / 0.4) = ½ · ln(1.5) ≈ ½ · 0.40 ≈ 0.20

● Therefore, Stage 1 gives α1 ≈ 0.20 (a numeric check follows below).

Stage 1 completes here.
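A quick numeric check of the Stage 1 numbers above, in plain Python:

```python
import math

weights = [0.2, 0.2, 0.2, 0.2, 0.2]
misclassified = [False, True, True, False, False]   # rows 2 and 3 are wrong

error_rate = sum(w for w, m in zip(weights, misclassified) if m)   # 0.4
alpha_1 = 0.5 * math.log((1 - error_rate) / error_rate)            # ~0.2027
print(error_rate, round(alpha_1, 2))
```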
AdaBoost: working example
● Next, we increase the weights of the misclassified data points before sending the data to Stage 2.
○ We use a technique called upsampling to increase the weight of specific data points.
○ By upsampling we mean boosting the weights of a few data points.

Weight update formula:
  For misclassified points:        new_weight = current_weight · e^(α1)
  For correctly classified points: new_weight = current_weight · e^(−α1)
AdaBoost: working example
Applying the weight update formula with α1 = 0.20:
  correctly classified rows: 0.2 · e^(−0.2) ≈ 0.16
  misclassified rows:        0.2 · e^(+0.2) ≈ 0.24

  X1  X2  Y  Y-pred  weight  updated weight
  3   7   1  1       0.2     0.16
  2   9   0  1       0.2     0.24
  1   4   1  0       0.2     0.24
  9   8   0  0       0.2     0.16
  3   7   0  0       0.2     0.16
AdaBoost: working example
The updated weights no longer sum to 1 (3 · 0.16 + 2 · 0.24 = 0.96), so each weight is normalized by dividing it by the total (a numeric sketch follows below):

  X1  X2  Y  Y-pred  weight  updated weight  normalized weight
  3   7   1  1       0.2     0.16            0.166
  2   9   0  1       0.2     0.24            0.25
  1   4   1  0       0.2     0.24            0.25
  9   8   0  0       0.2     0.16            0.166
  3   7   0  0       0.2     0.16            0.166
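A numeric sketch of the weight update and normalization above (plain Python; the exact normalized values come out as about 0.167 and 0.249, which the slide rounds to 0.166 and 0.25):

```python
import math

alpha_1 = 0.20
weights = [0.2] * 5
misclassified = [False, True, True, False, False]

updated = [w * math.exp(alpha_1 if m else -alpha_1)
           for w, m in zip(weights, misclassified)]
normalized = [w / sum(updated) for w in updated]
print([round(w, 3) for w in normalized])   # ≈ [0.167, 0.249, 0.249, 0.167, 0.167]
```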
AdaBoost: working example
Upsampling:
● Assign each row a range on [0, 1] whose width equals its normalized weight:

  X1  X2  Y  new weight  range
  3   7   1  0.166       0 – 0.166
  2   9   0  0.25        0.166 – 0.416
  1   4   1  0.25        0.416 – 0.666
  9   8   0  0.166       0.666 – 0.832
  3   7   0  0.166       0.832 – 1.0

● Generate n (= 5) random numbers between 0 and 1. Say the numbers are 0.13, 0.43, 0.62, 0.50 and 0.8.
● For every random number generated, choose the corresponding row (based on the range column) for the new stage's dataset.
AdaBoost: working example
Upsampling (continued):
● Map each random number to the row whose range contains it:
○ 0.13 → row 1
○ 0.43 → row 3
○ 0.62 → row 3
○ 0.50 → row 3
○ 0.8 → row 4
● Therefore, the new dataset for Stage 2 contains the rows: Row 1, Row 3, Row 3, Row 3, Row 4.
AdaBoost: working example
● Upsampling picks rows with larger ranges more often; hence those rows are boosted.

New dataset for Stage 2 (row 3 is dominant):

  X1  X2  Y
  3   7   1
  1   4   1
  1   4   1
  1   4   1
  9   8   0

● Repeat all of these steps for each of the n_estimators (a sketch of the resampling step follows below):
○ Create a decision stump on the new stage's data.
○ Calculate α2 for the new decision stump.
○ Find the new weight of each data point.
○ Find the new range based on the new weights.
○ Use upsampling to create the next dataset.
● At the end we have α1, α2, α3, …, αn (one per estimator).
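A minimal sketch of the upsampling (weighted resampling) step, assuming NumPy; here the random numbers come from a seeded generator rather than the slide's hand-picked values.

```python
import numpy as np

rows = np.array([[3, 7, 1],
                 [2, 9, 0],
                 [1, 4, 1],
                 [9, 8, 0],
                 [3, 7, 0]])                       # X1, X2, Y
norm_weights = np.array([0.166, 0.25, 0.25, 0.166, 0.166])
norm_weights = norm_weights / norm_weights.sum()   # make them sum exactly to 1

rng = np.random.default_rng(0)
cutoffs = np.cumsum(norm_weights)                  # the "range" column boundaries
randoms = rng.random(len(rows))                    # n random numbers in [0, 1)
chosen = np.searchsorted(cutoffs, randoms)         # map each number to a row index
new_dataset = rows[chosen]                         # heavier rows repeat more often
print(new_dataset)
```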
AdaBoost: working example
● At the end we have α1, α2, α3, …, αn (for n estimators).
● For a new data point, the final prediction is made with the following formula:

  h(x) = α1·h1(x) + α2·h2(x) + α3·h3(x) + … + αn·hn(x)

where h1(x), h2(x), …, hn(x) are the predictions (+1 or −1) of estimators h1, h2, …, hn respectively on the test data, and the sign of h(x) gives the final class (a library usage sketch follows below).
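For comparison, this whole procedure is what scikit-learn's AdaBoostClassifier automates; a minimal usage sketch follows (its internal AdaBoost variant may differ in detail from the hand computation above, and the parameter values are illustrative).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=5, random_state=0)
ada = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1),   # decision stumps as weak learners
    n_estimators=50,
    learning_rate=1.0,
    random_state=0,
).fit(X, y)
print("training accuracy:", ada.score(X, y))
```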
Gradient Boosting
● It is a boosting algorithm.
● It works by sequential, stage-wise addition of models.
● Consider the following dataset; we will create three estimators for this simple dataset:

  iq   cgpa  salary
  90   8     3
  100  7     4
  110  6     8
  120  9     6
  80   5     3

● Model 1 is simply the average of the output variable, also known as the leaf.
● Therefore, Model 1 prediction = (3 + 4 + 8 + 6 + 3) / 5 = 4.8.
● Hence, whatever the iq and cgpa, the Model 1 prediction will always be 4.8.
Gradient Boosting
● Model 1: the average of the output variable (the leaf).
● Therefore, Model 1 prediction = (3 + 4 + 8 + 6 + 3) / 5 = 4.8.
● Next, we calculate the loss of Model 1 using: pseudo_residual = actual − prediction

  iq   cgpa  salary  Pred1
  90   8     3       4.8
  100  7     4       4.8
  110  6     8       4.8
  120  9     6       4.8
  80   5     3       4.8
Gradient Boosting
● Calculate: pseudo_residual = actual − prediction

  iq   cgpa  salary  Pred1  res1
  90   8     3       4.8    -1.8
  100  7     4       4.8    -0.8
  110  6     8       4.8    3.2
  120  9     6       4.8    1.2
  80   5     3       4.8    -1.8

● Next, we transfer these errors to Model 2.
● We build Model 2 as a decision tree on the following dataset:
○ Input: iq and cgpa
○ Output: res1
Gradient Boosting
● Model 1 prediction = 4.8
● Model 2: construct a decision tree on (iq, cgpa) → res1:

  iq   cgpa  res1
  90   8     -1.8
  100  7     -0.8
  110  6     3.2
  120  9     1.2
  80   5     -1.8

● Model 2 decision tree:
○ Root split: iq <= 105
○ Left branch (iq <= 105): split on iq <= 95 → leaf -1.8 (iq = 90, 80) and leaf -0.8 (iq = 100)
○ Right branch (iq > 105): split on cgpa <= 7.5 → leaf 3.2 (iq = 110) and leaf 1.2 (iq = 120)
Gradient Boosting
● Model 1 prediction = 4.8
● Model 2 (the decision tree above) predicts each residual exactly:

  iq   cgpa  res1  Pred2
  90   8     -1.8  -1.8
  100  7     -0.8  -0.8
  110  6     3.2   3.2
  120  9     1.2   1.2
  80   5     -1.8  -1.8

● Now we are ready with two models. Let's do the gradient boosting predictions with these two models:

  Prediction = Model 1 prediction + Model 2 prediction
Gradient Boosting
● Combining the two models directly (Prediction = Model 1 prediction + Model 2 prediction):

  Row# (iq, cgpa)  Pred = M1 + M2
  1                4.8 - 1.8 = 3
  2                4.8 - 0.8 = 4
  3                4.8 + 3.2 = 8
  4                4.8 + 1.2 = 6
  5                4.8 - 1.8 = 3

● These are exactly the training targets: the model is overfitting. To control this, each new model's contribution is scaled by a learning rate (0.1 below).
Gradient Boosting
● Model 1 prediction = 4.8
● Model 2: the decision tree above, now scaled by a learning rate of 0.1.
● PredBoost = M1 + 0.1·M2, and pseudo_residual res2 = actual − PredBoost:

  iq   cgpa  salary  res1  Pred2  PredBoost (M1 + 0.1·M2)  res2
  90   8     3       -1.8  -1.8   4.62                     -1.62
  100  7     4       -0.8  -0.8   4.72                     -0.72
  110  6     8       3.2   3.2    5.12                     2.88
  120  9     6       1.2   1.2    4.92                     1.08
  80   5     3       -1.8  -1.8   4.62                     -1.62

● Here, every res2 is smaller in magnitude than the corresponding res1.
● This process is repeated: by adding a new model at each stage, the residuals approach zero (a code sketch of this loop follows below).
● Therefore, let's now build Model 3.
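A minimal sketch of this stage-wise loop, assuming scikit-learn's DecisionTreeRegressor for each stage's tree; the depth cap of 2 is an illustrative choice that happens to match the hand-built trees above.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

X = np.array([[90, 8], [100, 7], [110, 6], [120, 9], [80, 5]])  # iq, cgpa
y = np.array([3, 4, 8, 6, 3], dtype=float)                      # salary

learning_rate = 0.1
pred = np.full_like(y, y.mean())        # Model 1: the average (4.8)

for stage in range(2):                  # Models 2 and 3
    residual = y - pred                 # pseudo-residuals
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residual)
    pred = pred + learning_rate * tree.predict(X)

print(np.round(pred, 2))   # close to the slide's final PredBoost column
```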
Gradient Boosting
● Model 1 prediction = 4.8
● Model 2: our decision tree is ready.
● Model 3: construct a decision tree on (iq, cgpa) → res2:

  iq   cgpa  res2
  90   8     -1.62
  100  7     -0.72
  110  6     2.88
  120  9     1.08
  80   5     -1.62

● The Model 3 decision tree has the same structure as the Model 2 tree (root split iq <= 105, then iq <= 95 and cgpa <= 7.5), with leaves -1.62, -0.72, 2.88 and 1.08.
Gradient Boosting
● Model 1 prediction = 4.8
● Model 2: decision tree (ready).
● Model 3: its decision tree predicts each res2 exactly:

  iq   cgpa  res2   pred3
  90   8     -1.62  -1.62
  100  7     -0.72  -0.72
  110  6     2.88   2.88
  120  9     1.08   1.08
  80   5     -1.62  -1.62
Gradient Boosting
● Model 1 prediction = 4.8
● Final PredBoost = M1 prediction + 0.1 · (Model 2 prediction) + 0.1 · (Model 3 prediction)

  iq   cgpa  salary  res1  Pred2  PredBoost      res2   pred3  Final PredBoost            res3
                                  (M1 + 0.1·M2)                (M1 + 0.1·M2 + 0.1·M3)
  90   8     3       -1.8  -1.8   4.62           -1.62  -1.62  4.62 - 0.1·1.62 = 4.45     -1.45
  100  7     4       -0.8  -0.8   4.72           -0.72  -0.72  4.72 - 0.1·0.72 = 4.64     -0.64
  110  6     8       3.2   3.2    5.12           2.88   2.88   5.12 + 0.1·2.88 = 5.40     2.6
  120  9     6       1.2   1.2    4.92           1.08   1.08   4.92 + 0.1·1.08 = 5.02     0.98
  80   5     3       -1.8  -1.8   4.62           -1.62  -1.62  4.62 - 0.1·1.62 = 4.45     -1.45
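For comparison, scikit-learn's GradientBoostingRegressor automates the same stage-wise procedure; a hedged sketch on the toy dataset (its split choices may differ slightly from the hand-built trees, and the parameter values are illustrative).

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

X = np.array([[90, 8], [100, 7], [110, 6], [120, 9], [80, 5]])  # iq, cgpa
y = np.array([3, 4, 8, 6, 3], dtype=float)                      # salary

gbr = GradientBoostingRegressor(n_estimators=2, learning_rate=0.1,
                                max_depth=2, random_state=0).fit(X, y)
print(np.round(gbr.predict(X), 2))   # starts from the mean, then shrinks residuals
```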
