
Ensemble Models

Bagging and Random Forests

Presented by:
Chandru
Dawood Ifzal
Bharathi
ensemble /ɑːnˈsɑːm.bəl/
- a small group of musicians, dancers, or actors who perform together
- a number of things considered as a group
- a group producing a single effect


Ensemble Models
Ensemble learning is a machine learning technique that
enhances accuracy and resilience in forecasting by
merging predictions from multiple models. It aims to
mitigate errors or biases that may exist in individual
models by leveraging the collective intelligence of the
ensemble.

It is much like seeking advice from multiple sources
before making a significant decision, such as purchasing
a car. Just as you wouldn’t rely solely on one opinion,
ensemble models combine predictions from multiple
base models to enhance overall performance.
Ensemble Models
Bias-Variance Trade-off
Single models often have to compromise between
bias and variance. For example, decision trees
might have high variance but low bias, while linear
models may have low variance but high bias.

Complex models, such as deep neural networks or
individual decision trees, can overfit the training
data, meaning they perform well on the training
set but poorly on unseen data.
Ensemble Learning
Let us ask one of our friends to
rate the movie

Ensemble Learning
Let us ask one of our friends to
rate the movie

High Bias: Our friend (model) oversimplifies
the problem, failing to capture important
patterns, leading to poor predictions. It
underfits the data.
Ensemble Learning
Let us ask one of our friends to
rate the movie

High Variance: On the other hand, our friend
(model) is too complex, paying attention to tiny details
that might not be relevant in the future,
leading to overfitting.
Ensemble Learning
How about asking 50 people to
rate the movie?

The responses, in this case, would be
more generalized and diversified since
now we have people with different sets
of skills. And as it turns out – this is a
better approach to get honest ratings
than the previous cases we saw.
Ensemble Learning
An ensemble model combines several
individual models to produce more accurate
predictions than a single model alone.
Simple Ensemble Techniques

 Max Voting
 Averaging
 Weighted Averaging
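The three simple techniques can be sketched in a few lines of plain Python. This is a minimal illustration, assuming three hypothetical base models have already produced predictions for a single sample; the values and weights are made up for demonstration.

```python
# A minimal sketch of the three simple ensemble techniques.
from collections import Counter

# Hypothetical class predictions from three classifiers for one sample
class_preds = ["Yes", "Yes", "No"]

# Max voting: the class predicted most often wins
majority = Counter(class_preds).most_common(1)[0][0]
print(majority)  # Yes

# Hypothetical numeric predictions from three regressors (e.g., movie ratings)
reg_preds = [4.0, 5.0, 4.5]

# Averaging: the simple mean of all predictions
average = sum(reg_preds) / len(reg_preds)
print(average)  # 4.5

# Weighted averaging: models believed to be better get larger weights
# (here the weights are arbitrary and sum to 1)
weights = [0.5, 0.3, 0.2]
weighted = sum(w * p for w, p in zip(weights, reg_preds))
print(weighted)  # 4.4
```

Max voting suits classification, while (weighted) averaging suits regression, where predictions are numeric.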
Advanced Ensemble techniques
 Stacking
 Blending
 Bagging
 Boosting
Bagging
Bagging
Bagging, also known as
Bootstrap Aggregating, is a
potent ensemble learning
strategy that has been
shown to improve model
performance. Bagging can
be used to solve a variety
of machine learning problems,
such as classification,
regression, and clustering.
Bagging
Bagging is combining the results of
multiple models (for instance, all
decision trees) to get a generalized
result.
Here’s a question: if we create all the
models on the same set of data and
combine them, will it be useful?
There is a high chance that these
models will give the same result since
they are getting the same input.
We solve this problem by one of the
techniques called “bootstrapping”.
Bootstrapping
This new dataset, created by sampling with
replacement, has the same number of values as the
original dataset and is called a Bootstrapped
Dataset.

We calculate the mean of the new Bootstrapped Dataset. We
get a new mean, as this dataset is different from the original
dataset.
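Bootstrapping is simple to sketch in code. The snippet below is a minimal illustration using Python's standard library and a made-up list of values: it draws a sample of the same size with replacement, so some values may repeat and others may be left out.

```python
# A minimal sketch of bootstrapping: sample with replacement to build a
# new dataset of the same size, then compare its mean to the original's.
import random

random.seed(42)  # fixed seed so the sketch is reproducible

original = [2.5, 3.0, 3.5, 4.0, 4.5, 5.0]

# Sampling with replacement: the same value may appear more than once,
# and some original values may be left out entirely.
bootstrapped = [random.choice(original) for _ in range(len(original))]

# The bootstrapped dataset has the same number of values as the original
assert len(bootstrapped) == len(original)

original_mean = sum(original) / len(original)
bootstrap_mean = sum(bootstrapped) / len(bootstrapped)
# The bootstrapped mean generally differs from the original mean,
# because the bootstrapped dataset is a different (resampled) dataset.
print(original_mean, bootstrap_mean)
```

Repeating this resampling many times, and training one model per sample, is the "bootstrap" half of Bootstrap Aggregating.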
Implementation Steps of Bagging
•Dataset Preparation: Clean and preprocess your
dataset. Split it into training and test sets.

•Bootstrap Sampling: Randomly sample from the
training data with replacement to create multiple
bootstrap samples. Each sample typically has the
same size as the original dataset.

•Model Training: Train a base model (e.g., decision
tree, neural network) on each bootstrap sample.
Each model is trained independently.

•Prediction Generation: Use each trained model to
predict the test data.
Implementation Steps of Bagging
•Combining Predictions: Aggregate the predictions
from all models using methods like majority voting
for classification or averaging for regression.

•Evaluation: Assess the ensemble’s performance on
the test data using metrics like accuracy, F1 score, or
mean squared error.

•Hyperparameter Tuning: Adjust the
hyperparameters of the base models or the
ensemble as needed, using techniques like
cross-validation.

•Deployment: Once satisfied with the ensemble’s
performance, deploy it to make predictions on new
data.
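The sampling, training, and voting steps above can be sketched end to end in plain Python. This is an illustrative toy, not a production implementation: the base model is a hypothetical one-dimensional "decision stump" (a single threshold), and the data is a made-up set of ten points.

```python
# A sketch of bagging: bootstrap sampling, independent model training,
# and majority voting, using a 1-D decision stump as the base model.
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Dataset preparation: toy 1-D data, labeled 1 when x >= 5
X = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]

def train_stump(xs, ys):
    """Pick the threshold (from the sample) that misclassifies fewest points."""
    best_t, best_err = None, float("inf")
    for t in xs:
        err = sum((x >= t) != label for x, label in zip(xs, ys))
        if err < best_err:
            best_t, best_err = t, err
    return best_t

def bagging_fit(X, y, n_models=25):
    models = []
    n = len(X)
    for _ in range(n_models):
        # Bootstrap sampling: draw n indices with replacement
        idx = [random.randrange(n) for _ in range(n)]
        xs = [X[i] for i in idx]
        ys = [y[i] for i in idx]
        # Model training: fit one stump per bootstrap sample, independently
        models.append(train_stump(xs, ys))
    return models

def bagging_predict(models, x):
    # Combining predictions: majority vote across all stumps
    votes = sum(x >= t for t in models)
    return 1 if votes > len(models) / 2 else 0

models = bagging_fit(X, y)
print(bagging_predict(models, 2))  # 0 (below most learned thresholds)
print(bagging_predict(models, 9))  # 1 (above most learned thresholds)
```

Because each stump sees a different bootstrap sample, the learned thresholds vary slightly, and the majority vote averages out that variance.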
Random Forests
Random Forest
Random Forest is another ensemble machine
learning algorithm that follows the bagging
technique. It is an extension of the bagging
estimator algorithm. The base estimators in
random forest are decision trees.

Random forest randomly selects a set of
features which are used to decide the best split
at each node of the decision tree.
1. Create a “bootstrapped”
dataset
2. Create a decision tree using the
bootstrapped dataset, only using a
random subset of variables (columns) at
each step

Let us select two random variables,
Good Blood Circ. and Blocked
Arteries, as candidates for the root
node.
Since we used Good Blood
Circ., let us grey it out and
focus on the remaining
variables.
We just build the tree as usual, but
only considering a random subset of
variables at each step
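The feature-subsampling step described above can be sketched in a few lines. This is a minimal illustration of the selection logic only; the column names follow the slide's heart-disease example, and the subset size of √p is a common default, not something fixed by the algorithm.

```python
# A sketch of random feature selection at each split of a random forest tree.
import math
import random

random.seed(1)  # fixed seed so the sketch is reproducible

features = ["Chest Pain", "Good Blood Circ.", "Blocked Arteries", "Weight"]

def candidate_features(available, k=None):
    """Pick a random subset of the available columns for one split."""
    if k is None:
        # A common default is to consider about sqrt(p) of the p columns
        k = max(1, int(math.sqrt(len(available))))
    return random.sample(available, k)

# Root node: a random subset of the four columns (here, two candidates)
root_candidates = candidate_features(features)
print(root_candidates)

# Suppose "Good Blood Circ." gave the best split; grey it out and
# repeat the random selection on the remaining columns at the next node
remaining = [f for f in features if f != "Good Blood Circ."]
next_candidates = candidate_features(remaining)
print(next_candidates)
```

Repeating this at every node, in every bootstrapped tree, is what decorrelates the trees and makes the forest's vote more reliable than any single tree.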
How do we use the
Random Forest?
We have a new patient’s
data, and we want to know
if he has heart disease or
not.

We take the data and run
it down the first tree.
We repeat this for every tree in the forest
and tally the votes. In this case, "Yes" received
the most votes, so we conclude that this
patient has heart disease.
