ML Unit 3-1
Ensemble learning
1. Bagging
2. Boosting
3. Stacking
4. Generalization
5. Voting
6. Random forest
Voting classifier
A voting classifier combines the predictions of several different models and outputs the class chosen by a vote. There are two types of voting:
1. Hard Voting
2. Soft Voting
Hard Voting
Hard voting is also known as majority voting. Each model predicts the output class independently of the others, and the final output is the class that receives the majority of the votes.
EXAMPLE:
Suppose three classifiers predict the output classes (A, A, B). The majority predicted A, so A will be the final prediction.
Soft Voting
In soft voting, the output class is chosen based on the average probability assigned to each class. Soft voting combines the predicted probabilities from each model and picks the class with the highest total (or average) probability.
EXAMPLE:
Suppose three classifiers assign class A the probabilities 0.4, 0.5, and 0.9 (average 0.6) and class B the probabilities 0.6, 0.5, and 0.1 (average 0.4). Class A has the higher average probability, so A will be the final prediction.
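A minimal sketch of both schemes, assuming scikit-learn and a small toy dataset (the iris data and the chosen classifiers are only illustrations):

from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Three independent base classifiers.
estimators = [
    ("lr", LogisticRegression(max_iter=1000)),
    ("knn", KNeighborsClassifier()),
    ("dt", DecisionTreeClassifier(random_state=0)),
]

# Hard voting: each model casts one vote and the majority class wins.
hard_vote = VotingClassifier(estimators=estimators, voting="hard").fit(X, y)

# Soft voting: the predicted class probabilities are averaged and the
# class with the highest average probability wins.
soft_vote = VotingClassifier(estimators=estimators, voting="soft").fit(X, y)

print(hard_vote.predict(X[:2]), soft_vote.predict(X[:2]))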
Random Forest
Random Forest is an ensemble method that builds a number of decision trees on different subsets of the training data. Each tree makes a prediction, and then we take a vote of all the trees to make the final prediction.
Why use Random Forest?
o It takes less training time compared to other algorithms.
o It predicts output with high accuracy, and it runs efficiently even on large datasets.
How does Random Forest work?
Step-1: Select K random data points from the training set.
Step-2: Build the decision trees associated with the selected data points (subsets).
Step-3: Choose the number N of decision trees that you want to build.
Step-4: Repeat Step-1 and Step-2 until you have N trees.
Step-5: For new data points, find the prediction of each decision tree, and assign the new data point to the category that wins the majority of the votes.
EXAMPLE:
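A minimal scikit-learn sketch of the steps above; the iris dataset, the 70/30 split, and the choice of 100 trees are assumptions made only for illustration:

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# n_estimators is the number N of decision trees (Step-3).
forest = RandomForestClassifier(n_estimators=100, random_state=42)

# Steps 1, 2 and 4: grow each tree on a bootstrapped subset of the data.
forest.fit(X_train, y_train)

# Step-5: each tree votes and the majority class is returned.
print("Test accuracy:", forest.score(X_test, y_test))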
Applications of Random Forest
1. Banking: The banking sector mostly uses this algorithm for the identification of loan risk.
2. Medicine: With the help of this algorithm, disease trends and the risks of diseases can be identified.
3. Land Use: We can identify areas of similar land use with this algorithm.
Bagging
Bootstrap Aggregating, also known as bagging, is a machine learning ensemble meta-
algorithm designed to improve the stability and accuracy of machine learning algorithms used
in statistical classification and regression.
It creates multiple instances of the same model by training each instance on a randomly
drawn subset of the training data with replacement (bootstrap sampling). The final
prediction is made by aggregating the outputs of all models.
It decreases the variance and helps to avoid overfitting. It is usually applied to decision tree
methods.
How It Works
1. Bootstrap Sampling: Create multiple random subsets of the training data with replacement.
2. Model Training: Train a separate model independently on each subset.
3. Aggregation: Combine the predictions of all models, using majority voting for classification or averaging for regression.
Example of Bagging
The Random Forest model uses bagging with decision trees, which are high-variance models. It also selects a random subset of features when growing each tree. Several such random trees together make a Random Forest.
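A minimal sketch of bagging with scikit-learn's BaggingClassifier; the base estimator, the number of models, and the synthetic dataset are illustrative assumptions (in scikit-learn releases before 1.2 the estimator parameter is named base_estimator):

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

# Bagging: each tree is trained on a bootstrap sample drawn WITH
# replacement, and the predictions are aggregated by majority vote.
bagging = BaggingClassifier(
    estimator=DecisionTreeClassifier(),
    n_estimators=50,
    max_samples=1.0,   # each bootstrap sample is as large as the training set
    bootstrap=True,    # sampling with replacement
    random_state=0,
).fit(X, y)

print("Training accuracy:", bagging.score(X, y))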
Pasting
Pasting is similar to bagging but differs in one key aspect: it uses sampling without replacement, so each subset of data contains no repeated samples.
How It Works
1. Random Subset Selection: Create multiple random subsets of the training data without replacement.
2. Model Training: Train a separate model independently on each subset.
3. Aggregation: Combine the predictions of all models, using majority voting for classification or averaging for regression.
Advantages of Pasting
✅ Reduces variance, though it is less effective than bagging for high-variance models.
✅ Uses the full dataset more efficiently as there are no repeated samples.
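Scikit-learn has no separate pasting class; a common way to obtain pasting is to set bootstrap=False in BaggingClassifier, as in this illustrative sketch (the dataset and parameter values are assumptions):

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

# Pasting: bootstrap=False draws each subset WITHOUT replacement,
# so no sample is repeated within a subset.
pasting = BaggingClassifier(
    estimator=DecisionTreeClassifier(),
    n_estimators=50,
    max_samples=0.8,   # each model sees 80% of the data, with no repeats
    bootstrap=False,
    random_state=0,
).fit(X, y)

print("Training accuracy:", pasting.score(X, y))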
Boosting
Boosting is an ensemble modelling technique designed to create a strong classifier by
combining multiple weak classifiers. The process involves building models sequentially,
where each new model aims to correct the errors made by the previous ones.
Subsequent models are then trained to address the mistakes of their predecessors:
o Higher weights: Instances that were misclassified by the previous model receive higher weights.
o Lower weights: Instances that were correctly classified receive lower weights.
o Training on weighted data: The subsequent model learns from the weighted dataset, focusing its attention on the harder-to-learn examples (those with higher weights).
Boosting Algorithms
There are several boosting algorithms. The original ones, proposed by Robert Schapire and Yoav Freund, were not adaptive and could not take full advantage of the weak learners.
Schapire and Freund then developed AdaBoost, an adaptive boosting algorithm that won the
prestigious Gödel Prize. AdaBoost was the first really successful boosting algorithm
developed for the purpose of binary classification.
AdaBoost is short for Adaptive Boosting and is a very popular boosting technique that
combines multiple “weak classifiers” into a single “strong classifier”.
Algorithm:
1. Initialise the dataset and assign an equal weight to each data point.
2. Provide this as input to the model and identify the wrongly classified data points.
3. Increase the weights of the wrongly classified data points, decrease the weights of the correctly classified data points, and then normalize the weights of all data points.
4. If the required results have been achieved, go to step 5; otherwise, go back to step 2 and train the next weak classifier on the re-weighted data.
5. End
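A minimal AdaBoost sketch with scikit-learn's AdaBoostClassifier; decision stumps are used as the weak classifiers, and the dataset and hyperparameters are assumptions made for illustration (older scikit-learn releases call the estimator parameter base_estimator):

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

# Each weak classifier is a decision stump (a tree of depth 1). After every
# round, misclassified points receive higher weights, so the next stump
# focuses on the harder examples.
ada = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),
    n_estimators=50,
    learning_rate=1.0,
    random_state=0,
).fit(X, y)

print("Training accuracy:", ada.score(X, y))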
Stacking
Stacking is an ensemble technique in which several base models are trained and a meta-model learns to combine their predictions into the final output. However, a stacking ensemble faces an overfitting problem because we use the same training dataset to train the base models and then also use their predictions on that same training dataset to train the meta-model. To solve this problem, stacking comes with two methods:
1. Blending
2. k-fold
Blending
In this method, we divide the training dataset into two parts. The first part is used to train the base models; the base models then predict the outcomes on the second part, and those predictions are used to train the meta-model.
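A minimal sketch of blending, assuming scikit-learn and NumPy; the choice of base models, the 50/50 split, and the use of predicted probabilities as meta-features are illustrative assumptions:

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Part 1 trains the base models, part 2 generates the meta-model's features.
X_base, X_blend, y_base, y_blend = train_test_split(
    X, y, test_size=0.5, random_state=0
)

base_models = [DecisionTreeClassifier(random_state=0), KNeighborsClassifier()]
for model in base_models:
    model.fit(X_base, y_base)

# The base models' predictions on part 2 become the meta-model's inputs.
blend_features = np.column_stack([m.predict_proba(X_blend) for m in base_models])
meta_model = LogisticRegression(max_iter=1000).fit(blend_features, y_blend)

# At prediction time, stack the base-model probabilities in the same way.
new_features = np.column_stack([m.predict_proba(X_blend[:3]) for m in base_models])
print(meta_model.predict(new_features))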
K-Folds
In this method, we divide the training dataset into k parts (folds). The base models are trained on k-1 folds and predict the outcomes on the remaining fold; these out-of-fold predictions are used to train the meta-model. This is repeated k times so that an out-of-fold prediction is obtained for every training sample.
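A minimal sketch of the k-fold approach using scikit-learn's StackingClassifier; the base models, the logistic-regression meta-model, and cv=5 are assumptions made for illustration:

from sklearn.datasets import load_iris
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Base models whose predictions become the features for the meta-model.
base_models = [
    ("dt", DecisionTreeClassifier(random_state=0)),
    ("knn", KNeighborsClassifier()),
]

# cv=5 splits the training data into 5 folds: the base models are fit on
# 4 folds and predict on the held-out fold, so the meta-model is trained
# only on out-of-fold predictions.
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
).fit(X, y)

print("Training accuracy:", stack.score(X, y))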