ML Unit 3-1
Ensemble learning
1. Bagging
2. Boosting
3. Stacking
4. Generalization
5. Voting
6. Random forest
Voting classifier
A voting classifier combines the predictions of several different models and outputs the class chosen by a vote. There are two types of voting:
1. Hard Voting
2. Soft Voting
Hard Voting
Hard voting is also known as majority voting. Each model predicts the output class independently of the others, and the final output is the class that receives the majority of the votes.
EXAMPLE:
Suppose three classifiers predict the output classes (A, A, B). The majority predicted A, so A will be the final prediction.
Soft Voting
In soft voting, the output class is chosen based on the average probability assigned to each class. Soft voting combines the predicted probabilities from each model and picks the class with the highest total (or average) probability.
EXAMPLE:
Suppose three classifiers assign class A the probabilities 0.4, 0.5, and 0.9 (average 0.6) and class B the probabilities 0.6, 0.5, and 0.1 (average 0.4). Class A has the higher average probability, so A will be the final prediction.
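A minimal sketch of both schemes, assuming scikit-learn and a small toy dataset (the iris data and the chosen classifiers are only illustrations):

from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Three independent base classifiers.
estimators = [
    ("lr", LogisticRegression(max_iter=1000)),
    ("knn", KNeighborsClassifier()),
    ("dt", DecisionTreeClassifier(random_state=0)),
]

# Hard voting: each model casts one vote and the majority class wins.
hard_vote = VotingClassifier(estimators=estimators, voting="hard").fit(X, y)

# Soft voting: the predicted class probabilities are averaged and the
# class with the highest average probability wins.
soft_vote = VotingClassifier(estimators=estimators, voting="soft").fit(X, y)

print(hard_vote.predict(X[:2]), soft_vote.predict(X[:2]))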
Random Forest
Random Forest is an ensemble method that builds a number of decision trees on different subsets of the training data. Each tree makes a prediction, and then we take a vote of all the trees to make the final prediction.
Why use Random Forest?
o It takes less training time compared to other algorithms.
o It predicts output with high accuracy, and it runs efficiently even on large datasets.
How does Random Forest work?
Step-1: Select K random data points from the training set.
Step-2: Build the decision trees associated with the selected data points (subsets).
Step-3: Choose the number N of decision trees that you want to build.
Step-4: Repeat Step-1 and Step-2 until you have N trees.
Step-5: For new data points, find the prediction of each decision tree, and assign the new data point to the category that wins the majority of the votes.
EXAMPLE:
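A minimal scikit-learn sketch of the steps above; the iris dataset, the 70/30 split, and the choice of 100 trees are assumptions made only for illustration:

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# n_estimators is the number N of decision trees (Step-3).
forest = RandomForestClassifier(n_estimators=100, random_state=42)

# Steps 1, 2 and 4: grow each tree on a bootstrapped subset of the data.
forest.fit(X_train, y_train)

# Step-5: each tree votes and the majority class is returned.
print("Test accuracy:", forest.score(X_test, y_test))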
Applications of Random Forest
1. Banking: The banking sector mostly uses this algorithm for the identification of loan risk.
2. Medicine: With the help of this algorithm, disease trends and the risks of diseases can be identified.
3. Land Use: We can identify areas of similar land use with this algorithm.
Bagging
Bootstrap Aggregating, also known as bagging, is a machine learning ensemble meta-
algorithm designed to improve the stability and accuracy of machine learning algorithms used
in statistical classification and regression.
It creates multiple instances of the same model by training each instance on a randomly
drawn subset of the training data with replacement (bootstrap sampling). The final
prediction is made by aggregating the outputs of all models.
It decreases the variance and helps to avoid overfitting. It is usually applied to decision tree
methods.
How It Works
1. Bootstrap Sampling: Create multiple random subsets of the training data with replacement.
2. Model Training: Train a separate model independently on each subset.
3. Aggregation: Combine the predictions of all models, using majority voting for classification or averaging for regression.
Example of Bagging
The Random Forest model uses bagging with decision trees, which are high-variance models. It also selects a random subset of features when growing each tree. Several such random trees together make a Random Forest.
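A minimal sketch of bagging with scikit-learn's BaggingClassifier; the base estimator, the number of models, and the synthetic dataset are illustrative assumptions (in scikit-learn releases before 1.2 the estimator parameter is named base_estimator):

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

# Bagging: each tree is trained on a bootstrap sample drawn WITH
# replacement, and the predictions are aggregated by majority vote.
bagging = BaggingClassifier(
    estimator=DecisionTreeClassifier(),
    n_estimators=50,
    max_samples=1.0,   # each bootstrap sample is as large as the training set
    bootstrap=True,    # sampling with replacement
    random_state=0,
).fit(X, y)

print("Training accuracy:", bagging.score(X, y))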
Pasting
Pasting is similar to bagging but differs in one key aspect: it uses sampling without replacement, so each subset of data contains no repeated samples.
How It Works
1. Random Subset Selection: Create multiple random subsets of the training data without replacement.
2. Model Training: Train a separate model independently on each subset.
3. Aggregation: Combine the predictions of all models, using majority voting for classification or averaging for regression.
Advantages of Pasting
✅ Reduces variance, though it is less effective than bagging for high-variance models.
✅ Uses the full dataset more efficiently as there are no repeated samples.
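Scikit-learn has no separate pasting class; a common way to obtain pasting is to set bootstrap=False in BaggingClassifier, as in this illustrative sketch (the dataset and parameter values are assumptions):

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

# Pasting: bootstrap=False draws each subset WITHOUT replacement,
# so no sample is repeated within a subset.
pasting = BaggingClassifier(
    estimator=DecisionTreeClassifier(),
    n_estimators=50,
    max_samples=0.8,   # each model sees 80% of the data, with no repeats
    bootstrap=False,
    random_state=0,
).fit(X, y)

print("Training accuracy:", pasting.score(X, y))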
Boosting
Boosting is an ensemble modelling technique designed to create a strong classifier by
combining multiple weak classifiers. The process involves building models sequentially,
where each new model aims to correct the errors made by the previous ones.
Subsequent models are then trained to address the mistakes of their predecessors:
o Higher weights: Instances that were misclassified by the previous model receive higher weights.
o Lower weights: Instances that were correctly classified receive lower weights.
o Training on weighted data: The subsequent model learns from the weighted dataset, focusing its attention on the harder-to-learn examples (those with higher weights).
Boosting Algorithms
There are several boosting algorithms. The original ones, proposed by Robert Schapire and Yoav Freund, were not adaptive and could not take full advantage of the weak learners.
Schapire and Freund then developed AdaBoost, an adaptive boosting algorithm that won the
prestigious Gödel Prize. AdaBoost was the first really successful boosting algorithm
developed for the purpose of binary classification.
AdaBoost is short for Adaptive Boosting and is a very popular boosting technique that
combines multiple “weak classifiers” into a single “strong classifier”.
Algorithm:
1. Initialise the dataset and assign an equal weight to each data point.
2. Provide this as input to the model and identify the wrongly classified data points.
3. Increase the weights of the wrongly classified data points, decrease the weights of the correctly classified data points, and then normalize the weights of all data points.
4. If the required results have been achieved, go to step 5; otherwise, go back to step 2 and train the next weak classifier on the re-weighted data.
5. End
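A minimal AdaBoost sketch with scikit-learn's AdaBoostClassifier; decision stumps are used as the weak classifiers, and the dataset and hyperparameters are assumptions made for illustration (older scikit-learn releases call the estimator parameter base_estimator):

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

# Each weak classifier is a decision stump (a tree of depth 1). After every
# round, misclassified points receive higher weights, so the next stump
# focuses on the harder examples.
ada = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),
    n_estimators=50,
    learning_rate=1.0,
    random_state=0,
).fit(X, y)

print("Training accuracy:", ada.score(X, y))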
Stacking
Stacking is an ensemble technique in which several base models are trained and a meta-model learns to combine their predictions into the final output. However, a stacking ensemble faces an overfitting problem because we use the same training dataset to train the base models and then also use their predictions on that same training dataset to train the meta-model. To solve this problem, stacking comes with two methods:
1. Blending
2. k-fold
Blending
In this method, we divide the training dataset into two parts. The first part is used to train the base models; the base models then predict the outcomes on the second part, and those predictions are used to train the meta-model.
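A minimal sketch of blending, assuming scikit-learn and NumPy; the choice of base models, the 50/50 split, and the use of predicted probabilities as meta-features are illustrative assumptions:

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Part 1 trains the base models, part 2 generates the meta-model's features.
X_base, X_blend, y_base, y_blend = train_test_split(
    X, y, test_size=0.5, random_state=0
)

base_models = [DecisionTreeClassifier(random_state=0), KNeighborsClassifier()]
for model in base_models:
    model.fit(X_base, y_base)

# The base models' predictions on part 2 become the meta-model's inputs.
blend_features = np.column_stack([m.predict_proba(X_blend) for m in base_models])
meta_model = LogisticRegression(max_iter=1000).fit(blend_features, y_blend)

# At prediction time, stack the base-model probabilities in the same way.
new_features = np.column_stack([m.predict_proba(X_blend[:3]) for m in base_models])
print(meta_model.predict(new_features))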
K-Folds
In this method, we divide the training dataset into k parts (folds). The base models are trained on k-1 folds and predict the outcomes on the remaining fold; these out-of-fold predictions are used to train the meta-model. This is repeated k times so that an out-of-fold prediction is obtained for every training sample.
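A minimal sketch of the k-fold approach using scikit-learn's StackingClassifier; the base models, the logistic-regression meta-model, and cv=5 are assumptions made for illustration:

from sklearn.datasets import load_iris
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Base models whose predictions become the features for the meta-model.
base_models = [
    ("dt", DecisionTreeClassifier(random_state=0)),
    ("knn", KNeighborsClassifier()),
]

# cv=5 splits the training data into 5 folds: the base models are fit on
# 4 folds and predict on the held-out fold, so the meta-model is trained
# only on out-of-fold predictions.
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
).fit(X, y)

print("Training accuracy:", stack.score(X, y))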