
Unit III Classification

Supervised learning, as the name indicates, involves the presence of a supervisor acting as a teacher. Supervised learning is when we teach or train the machine using data that is well labelled, which means each example is already tagged with the correct answer. After that, the machine is provided with a new set of examples (data) so that the supervised learning algorithm analyses the training data (the set of training examples) and produces a correct outcome based on the labelled data.

For example, a labelled dataset of images of Elephants, Camels and Cows would have each image tagged with either “Elephant”, “Camel” or “Cow”.

The k-nearest neighbors (k-NN) algorithm is a simple yet powerful supervised learning method used for both classification and regression tasks. Here’s a brief overview:

How k-NN Works

Training Phase: The algorithm stores all the training data points and their
corresponding labels.

Classification Phase:
For a given test data point, the algorithm calculates the distance between this
point and all the training data points.

It then selects the k nearest neighbors (where k is a user-defined constant).

The test data point is assigned the class that is most common among its k nearest neighbors.

Key Features

Non-parametric: k-NN does not make any assumptions about the underlying
data distribution.

Distance Metrics: Commonly used metrics include Euclidean, Manhattan, and Minkowski distances.

Versatility: It can handle both numerical and categorical data.

Example

Imagine we have a dataset with two features: Brightness and Saturation, and two
classes: Red and Blue. Here’s a simplified version of the dataset:

Brightness   Saturation   Class
40           20           Red
50           50           Blue
60           90           Blue
10           25           Red
70           70           Blue
60           10           Red
25           80           Blue

Now, we want to classify a new data point with Brightness = 55 and Saturation = 45. We’ll use the KNN algorithm with k = 3.

Steps:

Calculate the Distance: Compute the distance between the new data point and all other points in the dataset. We’ll use the Euclidean distance formula:

d = sqrt((x1 - x2)^2 + (y1 - y2)^2)

Find the Nearest Neighbors: Identify the 3 nearest neighbors to the new data
point based on the calculated distances.
Majority Voting: Assign the class of the new data point based on the majority
class of its 3 nearest neighbors.

Calculation:

Let’s calculate the distances from the new point (55, 45):

Distance to (40, 20): sqrt(15^2 + 25^2) ≈ 29.15
Distance to (50, 50): sqrt(5^2 + 5^2) ≈ 7.07
Distance to (60, 90): sqrt(5^2 + 45^2) ≈ 45.28
Distance to (10, 25): sqrt(45^2 + 20^2) ≈ 49.24
Distance to (70, 70): sqrt(15^2 + 25^2) ≈ 29.15
Distance to (60, 10): sqrt(5^2 + 35^2) ≈ 35.36
Distance to (25, 80): sqrt(30^2 + 35^2) ≈ 46.10

Nearest Neighbors:

The 3 nearest neighbors are:

(50, 50) - Blue

(40, 20) - Red

(70, 70) - Blue

Majority Voting:

Out of the 3 nearest neighbors, 2 are Blue and 1 is Red. Therefore, the new
data point will be classified as Blue.
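The same classification can be reproduced with scikit-learn's KNeighborsClassifier. The following is a minimal sketch (the feature values and labels come from the table above; scikit-learn is an assumed, illustrative choice of library):

# k-NN sketch with scikit-learn; any k-NN implementation would do
from sklearn.neighbors import KNeighborsClassifier

# Training data from the table: [Brightness, Saturation] and class labels
X_train = [[40, 20], [50, 50], [60, 90], [10, 25], [70, 70], [60, 10], [25, 80]]
y_train = ["Red", "Blue", "Blue", "Red", "Blue", "Red", "Blue"]

# k = 3 neighbours, Euclidean distance (the default metric)
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# Classify the new point (Brightness = 55, Saturation = 45)
print(knn.predict([[55, 45]]))  # expected output: ['Blue']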

Applications

Pattern Recognition

Data Mining

Intrusion Detection

Decision tree
A decision tree classifier is a type of supervised learning algorithm used for classification
tasks. It works by splitting the data into subsets based on the values of the input features,
creating a tree-like model of decisions.
Structure of a Decision Tree

- Root Node: Represents the entire dataset and the initial decision to be made.
- Internal Nodes: Represent decisions or tests on attributes.
- Branches: Represent the outcome of a decision or test, leading to another node.
- Leaf Nodes: Represent the final decision or prediction.
How It Works

- Selecting the Best Attribute: Using metrics like Gini impurity, entropy, or information gain, the best attribute to split the data is selected.

Gini impurity and entropy are both metrics used in decision trees to measure the impurity or disorder of a dataset, helping to determine the best split at each node.

Gini Impurity

Formula: The Gini impurity for a node t is calculated as:

Gini(t) = 1 - Σ (p_i)^2, summed over the classes i = 1, …, C

where p_i is the probability of an element being classified as class i and C is the total number of classes.

Entropy

Definition: Entropy measures the amount of uncertainty or disorder in the dataset. It quantifies the impurity in a more information-theoretic sense.

Formula: The entropy for a node t is calculated as:

Entropy(t) = - Σ p_i log2(p_i), summed over the classes i = 1, …, C

where p_i is the probability of an element being classified as class i and C is the total number of classes. A short numeric example of both metrics is given after the list below.

- Splitting the Dataset: The dataset is split into subsets based on the selected attribute.
- Repeating the Process: This process is repeated recursively for each subset, creating new internal nodes or leaf nodes until a stopping criterion is met (e.g., all instances in a node belong to the same class or a predefined depth is reached).
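As a quick numeric check of the two formulas, the sketch below computes the Gini impurity and entropy of a hypothetical node containing 6 samples of one class and 4 of another (the counts are made up purely for illustration):

# Gini impurity and entropy for a hypothetical node with class counts 6 and 4
import math

counts = [6, 4]
total = sum(counts)
probs = [c / total for c in counts]                # p_i for each class

gini = 1 - sum(p ** 2 for p in probs)              # Gini(t) = 1 - sum(p_i^2)
entropy = -sum(p * math.log2(p) for p in probs)    # Entropy(t) = -sum(p_i * log2(p_i))

print(round(gini, 3))     # 0.48
print(round(entropy, 3))  # 0.971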

Pruning

To overcome overfitting, pruning techniques are used. Pruning reduces the size of the tree by removing nodes that provide little power in classifying instances. There are two main types of pruning:

Pre-pruning (Early Stopping): Stops the tree from growing once it meets certain criteria (e.g., maximum depth, minimum number of samples per leaf).

Post-pruning: Removes branches from a fully grown tree that do not provide significant power.
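The following minimal scikit-learn sketch ties these pieces together (the Iris dataset and the parameter values are arbitrary, illustrative choices): criterion selects Gini or entropy, max_depth and min_samples_leaf act as pre-pruning, and ccp_alpha enables cost-complexity post-pruning.

# Decision-tree sketch: split criterion plus pre- and post-pruning controls
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(
    criterion="entropy",     # split-quality metric: "gini" or "entropy"
    max_depth=3,             # pre-pruning: stop growing beyond this depth
    min_samples_leaf=5,      # pre-pruning: minimum samples per leaf
    ccp_alpha=0.01,          # post-pruning: cost-complexity pruning strength
    random_state=0,
)
tree.fit(X_train, y_train)
print(tree.score(X_test, y_test))  # accuracy on held-out data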

Support vector machine


A Support Vector Machine (SVM) is a supervised machine learning algorithm
used for classification and regression tasks. Here’s a brief overview:

The goal of the SVM algorithm is to create the best line or decision boundary
that can segregate n-dimensional space into classes so that we can easily put
the new data point in the correct category in the future. This best decision
boundary is called a hyperplane.
Since SVM is a supervised algorithm, it uses a labeled dataset to
train itself. A labeled dataset is one in which the output data is
already present for the input data.

When new data comes in, the algorithm makes a prediction based on what it has learned from the labeled dataset.

Let’s understand the basic terminology of SVM with the help of a classification problem.

Suppose we have a dataset in which the data points belong to two classes, i.e. a circle class and a square class. Using SVM, our goal is to identify whether a new data point that comes in belongs to the circle class or the square class.

Hyperplane (Decision Boundary)

As the first step in SVM, we want to separate the two classes (circles and squares). We don’t want the data points of the two classes to mix with each other.

So how do we separate the two classes?

We draw a separating line between the two classes such that the data points belonging to the circle class are on one side of the line and the data points belonging to the square class are on the other side.

This line that separates the two classes is known as the “Decision Boundary”. In its generalized form, it is known as the “Hyperplane”.

Why is it called a decision boundary?

It is called a decision boundary because it acts as a boundary between the two classes and it decides whether newly arrived data points belong to the circle class or the square class.

In the next step, we have to identify the data points in each class that are closest to the hyperplane.

After identifying the data points that are closest to the hyperplane, we draw a line that just touches the closest data point and is parallel to the hyperplane. This step is done on both sides of the hyperplane, i.e. for both classes.

Let the distance between the hyperplane and the parallel line of class A (circle class) be D1. Similarly, let the distance between the hyperplane and the parallel line of class B (square class) be D2.

When we sum up these two distances, we get the distance known as the “Margin” or the “Marginal Distance”. Hence, the Margin is nothing but:

Margin = D1 + D2

This margin has huge significance when deciding which hyperplane is the best one to use while making predictions.

Support Vectors
The data points touching the marginal lines are known as the Support Vectors. These are the data points closest to the hyperplane that were considered while drawing the marginal lines parallel to the hyperplane.

Note that there can be multiple data points touching a marginal line, and hence there can be multiple support vectors present at the same time.
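To make these terms concrete, here is a minimal sketch using scikit-learn's SVC with a linear kernel on a tiny made-up dataset (the points and labels are purely illustrative): support_vectors_ returns the points touching the marginal lines, and for a linear SVM the margin equals 2 / ||w||.

# Linear SVM sketch: hyperplane, support vectors and margin on made-up 2-D data
import numpy as np
from sklearn.svm import SVC

X = np.array([[1, 2], [2, 3], [2, 1], [6, 5], [7, 7], [8, 6]])  # two separable groups
y = np.array([0, 0, 0, 1, 1, 1])                                 # 0 = "circle", 1 = "square"

clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

print(clf.support_vectors_)          # the data points touching the marginal lines
w = clf.coef_[0]                     # normal vector of the separating hyperplane
print(2 / np.linalg.norm(w))         # Margin = D1 + D2 = 2 / ||w||
print(clf.predict([[3, 3]]))         # classify a new data point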
Non-Linear SVM: If the data is linearly separable, then we can separate it by using a straight line, but for non-linear data we cannot draw a single straight line.

So to separate these data points, we need to add one more dimension. For linear data we have used two dimensions, x and y, so for non-linear data we will add a third dimension z. It can be calculated as:

z = x^2 + y^2

By adding this third dimension, the sample space becomes three-dimensional, and SVM can now divide the two classes with a separating plane. Since we are in 3-D space, this boundary looks like a plane parallel to the x-y plane. If we convert it back to 2-D space with z = 1, it becomes a circular boundary (x^2 + y^2 = 1) around the origin.
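A minimal sketch of this trick (all data here is synthetic and purely illustrative): we append the extra feature z = x^2 + y^2 and fit a linear SVM in the augmented space, which amounts to learning a circular boundary in the original two dimensions. In practice the same effect is usually obtained with a kernel, e.g. SVC(kernel="rbf") or a polynomial kernel.

# Non-linear data: class 1 = points inside a circle, class 0 = points outside (synthetic)
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(200, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 < 1).astype(int)    # not linearly separable in (x, y)

# Add the third dimension z = x^2 + y^2, then fit a *linear* SVM in (x, y, z)
z = (X ** 2).sum(axis=1, keepdims=True)
clf = SVC(kernel="linear").fit(np.hstack([X, z]), y)

# Classify a new point (0.5, 0.5) by applying the same transformation
p = np.array([[0.5, 0.5]])
p3 = np.hstack([p, (p ** 2).sum(axis=1, keepdims=True)])
print(clf.predict(p3))   # expected: [1], the point lies inside the circle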

Key Concepts:

Classification and Regression: SVMs can be used to classify data into different
categories or predict continuous values.

Hyperplane: The main goal of an SVM is to find the optimal hyperplane that
best separates the data into different classes. In a 2D space, this hyperplane is
a line, while in higher dimensions, it becomes a plane or a hyperplane.

Margin: SVM aims to maximize the margin, which is the distance between the
hyperplane and the nearest data points from each class. This helps in achieving
better generalization on unseen data.

Support Vectors: These are the data points that are closest to the hyperplane
and influence its position and orientation. They are critical in defining the
optimal hyperplane.

Applications:
Text Classification: Categorizing emails as spam or not spam.

Image Classification: Identifying objects in images.

Handwriting Recognition: Recognizing handwritten characters.

Bioinformatics: Classifying genes or proteins.


Bagging
Bagging (Bootstrap Aggregating) is an ensemble learning technique designed to
improve the stability and accuracy of machine learning models. Here’s a
breakdown of how it works and its benefits:

How Bagging Works


Bootstrap Sampling: Multiple subsets of the training data are created by randomly sampling with replacement. This means some data points may appear multiple times in a subset, while others may not appear at all.

Training Multiple Models: Each subset is used to train a separate model independently and in parallel.

Aggregating Predictions: The predictions from all the models are combined. For classification tasks, this is typically done by majority voting, and for regression tasks, by averaging the predictions.

Benefits of Bagging

Reduces Variance: By training multiple models on different subsets of the data, bagging reduces the variance of the overall model, making it less sensitive to fluctuations in the training data.

Prevents Overfitting: Since each model is trained on a different subset of the data, the ensemble model is less likely to overfit compared to a single model trained on the entire dataset.

Improves Accuracy: The combined predictions of multiple models often result in better performance than any single model.
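A minimal bagging sketch with scikit-learn (the dataset, base learner and parameter values are illustrative choices; in older scikit-learn versions the keyword is base_estimator rather than estimator): BaggingClassifier draws bootstrap samples, trains one decision tree per sample and combines their votes.

# Bagging sketch: decision trees on bootstrap samples, combined by majority vote
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

bagging = BaggingClassifier(
    estimator=DecisionTreeClassifier(),  # weak learner trained on each bootstrap subset
    n_estimators=50,                     # number of models trained independently
    bootstrap=True,                      # sample the training data with replacement
    random_state=0,
)
bagging.fit(X_train, y_train)
print(bagging.score(X_test, y_test))     # accuracy of the aggregated (voted) predictions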

Boosting

Boosting is a powerful ensemble technique in machine learning designed to improve the accuracy of predictive models by combining multiple weak learners into a single strong learner. Here’s a breakdown of how it works and some common algorithms:

How Boosting Works

Initial Weights: Assign initial weights to all data points.

Sequential Training: Train the first weak learner on the data. Evaluate its
performance and increase the weights of misclassified instances.

Iterative Process: Repeat the process of adjusting weights and training subsequent learners. Each new model focuses on the weaknesses of the ensemble so far.

Combining Results: Aggregate the predictions of all weak learners to form the
final output, typically using weighted voting.
Types of Boosting Algorithms

AdaBoost (Adaptive Boosting):

One of the first boosting algorithms.

Focuses on reweighting the training examples each time a learner is added, putting more emphasis on incorrectly classified instances.

Particularly effective for binary classification problems.

Gradient Boosting:

Builds models sequentially and corrects errors along the way.

Each new model is trained to correct the errors of the previous models.

Variants include Gradient Boosting Machines (GBM) and XGBoost, which are
known for their high performance and efficiency.

XGBoost (Extreme Gradient Boosting):

An optimized version of gradient boosting.

Known for its speed and performance.

Incorporates regularization to prevent overfitting.
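A minimal sketch of the first two variants using scikit-learn (the synthetic dataset and parameters are illustrative; XGBoost lives in the separate xgboost package and exposes a similar fit/predict interface):

# Boosting sketch: AdaBoost and gradient boosting on the same synthetic data
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# AdaBoost: reweights misclassified samples before each new weak learner is trained
ada = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# Gradient boosting: each new tree corrects the errors of the ensemble built so far
gbm = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
                                 random_state=0).fit(X_train, y_train)

print(ada.score(X_test, y_test), gbm.score(X_test, y_test))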

Applications

Classification: Boosting is widely used in classification tasks, such as spam detection, sentiment analysis, and image recognition.

Regression: It is also used in regression tasks to predict continuous outcomes, like house prices or stock prices.

Boosting has significantly impacted machine learning by enhancing model accuracy and robustness.

Bagging and boosting are both ensemble techniques in machine learning that aim to improve the performance of models by combining multiple weak learners. However, they do so in different ways:

Bagging: Reduces variance and prevents overfitting.

Boosting: Reduces both bias and variance by focusing on difficult-to-predict instances.

Binary Classification

Definition: Involves classifying data into one of two possible classes.

In a binary classification task, the goal is to classify the input data into two mutually exclusive categories. The training data in such a situation is labeled in a binary format: true and false; positive and negative; 0 and 1; spam and not spam, etc., depending on the problem being tackled. For instance, we might want to detect whether a given image shows a truck or a boat.
Examples:

Spam detection (spam vs. not spam)

Disease diagnosis (disease vs. no disease)

Algorithms: Logistic Regression, Support Vector Machines (SVM), Decision Trees, etc.
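A minimal binary-classification sketch using one of these algorithms (logistic regression on scikit-learn's built-in breast-cancer dataset; both choices are illustrative):

# Binary classification: every sample belongs to exactly one of two classes (0 or 1)
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)     # labels: 0 = malignant, 1 = benign
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=5000).fit(X_train, y_train)
print(clf.score(X_test, y_test))               # accuracy on unseen data
print(clf.predict(X_test[:5]))                 # predicted class (0 or 1) for 5 samples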
Multiclass Classification

Definition: Involves classifying data into one of three or more possible classes.

One-versus-one: this strategy trains as many classifiers as there are pairs of labels. If we have a 3-class classification problem, we will have three pairs of labels and thus three classifiers.
In general, for N labels, we will have N × (N − 1) / 2 classifiers. Each classifier is trained on a single binary dataset, and the final class is predicted by a majority vote among all the classifiers. The one-vs-one approach works best for SVM and other kernel-based algorithms.

One-versus-rest: in this strategy, we treat each label as an independent label and combine all the remaining labels into a single “rest” label. With 3 classes, we will have three classifiers. In general, for N labels, we will have N binary classifiers.
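Both strategies are available as wrappers in scikit-learn; below is a minimal sketch on a 3-class problem (the Iris dataset and the linear-SVM base estimator are illustrative choices):

# One-vs-one vs. one-vs-rest on a 3-class problem, each wrapping a linear SVM
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import LinearSVC

X, y = load_iris(return_X_y=True)                 # 3 classes

ovo = OneVsOneClassifier(LinearSVC(max_iter=10000)).fit(X, y)   # N*(N-1)/2 = 3 classifiers
ovr = OneVsRestClassifier(LinearSVC(max_iter=10000)).fit(X, y)  # N = 3 classifiers

print(len(ovo.estimators_), len(ovr.estimators_))  # 3 3
print(ovo.predict(X[:3]), ovr.predict(X[:3]))      # predictions from each strategy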

Multi-Label Classification

In multi-label classification tasks, we try to predict 0 or more classes for each input example. In this case, there is no mutual exclusion, because an input example can have more than one label.

Such a scenario can be observed in different domains, such as auto-tagging in Natural Language Processing, where a given text can contain multiple topics. Similarly, in computer vision an image can contain multiple objects; for example, a model might predict that a single image contains a plane, a boat, a truck, and a dog.
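A minimal multi-label sketch with scikit-learn (the synthetic data stands in for image tags or text topics; the specific helpers are illustrative choices): each example is paired with a binary indicator vector, so several labels can be active at once.

# Multi-label classification: each sample may carry several (or zero) labels at once
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier

# Synthetic data with 4 possible labels, encoded as a binary indicator matrix
X, Y = make_multilabel_classification(n_samples=300, n_classes=4, random_state=0)

# One binary classifier per label; a sample receives every label whose classifier fires
clf = MultiOutputClassifier(LogisticRegression(max_iter=1000)).fit(X, Y)

print(Y[0])                 # e.g. [1 0 1 0] -> this sample carries labels 0 and 2
print(clf.predict(X[:1]))   # predicted label vector for the first sample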
