
MACHINE LEARNING

(20BT60501)

COURSE DESCRIPTION:
Concept learning, General to specific ordering, Decision tree
learning, Support vector machine, Artificial neural networks,
Multilayer neural networks, Bayesian learning, Instance based
learning, Reinforcement learning.
Subject: MACHINE LEARNING (20BT60501)

Topic: Unit II – DECISION TREE LEARNING AND KERNEL MACHINES

Prepared By:
Dr.J.Avanija
Professor
Dept. of CSE
Sree Vidyanikethan Engineering College
Tirupati.
Unit II – DECISION TREE LEARNING AND KERNEL MACHINES

Decision Tree Learning:


 Decision tree representation
 Problems for decision tree learning
 Decision tree learning algorithm
 Hypothesis space search
 Inductive bias in decision tree learning
 Issues in decision tree learning
Kernel Machines:
 Support vector machines
 SVMs for regression
 SVMs for classification
 Choosing C
 A probabilistic interpretation of SVMs
Kernel Methods in Machine Learning
 Kernels, or kernel methods (also called kernel functions), are a family of algorithms used for pattern analysis.
 They are used to solve a non-linear problem by using a linear classifier.
 The kernel trick lets a learning algorithm capture non-linear patterns without the high-dimensional mapping ever being explicitly computed.
 The input dataset can be placed into a higher-dimensional space with the help of a kernel method or trick, and then any of the available classification algorithms can be used in this higher-dimensional space.
 This derives a hyperplane that linearly separates the two categories.
 A kernel in machine learning is a measure of similarity between two points.
Kernel Methods in Machine Learning

 In the real world, almost all data is randomly distributed, which makes it hard to separate the different classes linearly.
 The remedy is to map the data from the 2-dimensional space into a 3-dimensional (or higher-dimensional) space, where a linear separator can be found.
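To make the idea concrete, here is a minimal sketch (using NumPy; the two 2-D points are a made-up toy example) showing that an explicit mapping to a higher-dimensional space and the corresponding kernel give the same inner products:

import numpy as np

# Two 2-D points from a made-up toy dataset
x = np.array([1.0, 2.0])
z = np.array([0.5, -1.0])

# Explicit feature map phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2)
# lifts each 2-D point into a 3-D space.
def phi(v):
    return np.array([v[0] ** 2, np.sqrt(2) * v[0] * v[1], v[1] ** 2])

# Inner product computed in the 3-D feature space ...
lifted = np.dot(phi(x), phi(z))

# ... equals the polynomial kernel k(x, z) = (x . z)^2 computed
# directly in the original 2-D space (the "kernel trick").
kernel = np.dot(x, z) ** 2

print(lifted, kernel)  # both print 2.25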
Kernels in SVM

 An interesting feature of SVM is that it can work even with a non-linear dataset; for this we use the "Kernel Trick", which makes it easier to classify the points.

 Different Kernel Functions:
  Polynomial Kernel
  Sigmoid Kernel
  RBF Kernel
  Bessel function Kernel
  ANOVA Kernel
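As a rough usage sketch (assuming scikit-learn is available; the half-moons dataset is just a convenient stand-in for non-linear data, and note that only the polynomial, sigmoid and RBF kernels are built in, not Bessel or ANOVA):

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Toy non-linear dataset: two interleaving half-moons
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Try a few of the kernels listed above
for kernel in ["poly", "sigmoid", "rbf"]:
    clf = SVC(kernel=kernel, degree=2, gamma="scale", C=1.0)
    clf.fit(X_train, y_train)
    print(kernel, clf.score(X_test, y_test))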
Kernels in SVM

Polynomial Kernel
Following is the formula for the polynomial kernel:

  K(x, y) = (xᵗy + c)ᵈ

 Here d is the degree of the polynomial, which we need to specify manually (c is a constant term, often set to 1).

 An SVM using a polynomial kernel (degree 2) can separate data that is not linearly separable in the original space.
Kernels in SVM

Sigmoid Kernel
 It is closely related to neural networks, since it uses the same activation as a sigmoid unit:

  K(x, y) = tanh(γ·xᵗy + r)

 Intuitively, it takes your inputs and maps them towards values of 0 and 1 so that they can be separated by a simple straight line.
Kernels in SVM

RBF Kernel (Radial Basis Function)
 It creates non-linear combinations of the features to lift the samples onto a higher-dimensional feature space, where a linear decision boundary can separate the classes.
 It is the most used kernel in SVM classification. The following formula describes it mathematically:

  K(x, y) = exp(-γ‖x - y‖²), where γ = 1/(2σ²)
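As a quick sanity check of the formula, a minimal NumPy sketch (the three points and the value of gamma are made up purely for illustration):

import numpy as np

def rbf_kernel(x, y, gamma=0.5):
    # K(x, y) = exp(-gamma * ||x - y||^2)
    return np.exp(-gamma * np.sum((x - y) ** 2))

# Three made-up 2-D points
points = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 0.0]])

# Gram (kernel) matrix of pairwise similarities:
# identical points get similarity 1; distant points approach 0.
K = np.array([[rbf_kernel(a, b) for b in points] for a in points])
print(np.round(K, 3))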
Support Vector Machines
 A support vector machine (SVM) is a machine learning algorithm that analyzes data for classification and regression analysis. SVM is a supervised learning method that looks at data and sorts it into one of two categories.
 It is trained with a series of data already classified into two categories, building the model as it is trained. The task of an SVM algorithm is to determine which category a new data point belongs to.
 SVM is fundamentally a binary linear classifier; non-linear problems are handled through kernels.
Support Vector Machines
Important Terminologies
 Hyperplane
 Support Vectors
 Marginal Distance
 Linearly Separable
 Non-linearly Separable
Support Vector Machines

[Figure: a separating hyperplane, the marginal distance (margin), and the support vectors]
Support Vector Machines

Applications of SVM
 Text and hypertext classification
 Image classification
 Recognizing handwritten characters
 Biological sciences, including protein classification

 The goal of the SVM algorithm is to create the best line or decision boundary that can segregate n-dimensional space into classes so that we can easily put the new data point in the correct category in the future. This best decision boundary is called a hyperplane.
Support Vector Machines
 SVM chooses the extreme points/vectors that help in creating the hyperplane. These extreme cases are called support vectors, and hence the algorithm is termed a Support Vector Machine.
 The data points or vectors that are closest to the hyperplane and which affect the position of the hyperplane are termed support vectors.
 The two different categories are separated using a decision boundary, or hyperplane:
Support Vector Machines

The red coloured dashed line is the optimal hyperplane. The green coloured dashed lines define the boundary for each class, and the data points with a thick green outline that lie on the boundary of their class are called support vectors; hence the name Support Vector Machine. The support vectors are used to determine the optimal hyperplane.
Support Vector Machines
SVM can be of two types:
 Linear SVM: Linear SVM is used for linearly separable data. If a dataset can be classified into two classes by using a single straight line, then such data is termed linearly separable data, and the classifier used is called a Linear SVM classifier.

 Non-linear SVM: Non-linear SVM is used for non-linearly separable data. If a dataset cannot be classified by using a straight line, then such data is termed non-linear data, and the classifier used is called a Non-linear SVM classifier.
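To illustrate the difference, a rough sketch (assuming scikit-learn; the concentric-circles dataset is a made-up example of non-linearly separable data):

from sklearn.datasets import make_circles
from sklearn.svm import LinearSVC, SVC

# Made-up non-linearly separable data: one class inside the other
X, y = make_circles(n_samples=300, factor=0.3, noise=0.05, random_state=0)

# A linear SVM cannot draw a single straight line between the circles...
linear_clf = LinearSVC(C=1.0, max_iter=10000).fit(X, y)
print("Linear SVM accuracy:", linear_clf.score(X, y))   # roughly chance level

# ...while a non-linear (RBF-kernel) SVM separates them easily.
rbf_clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)
print("Non-linear SVM accuracy:", rbf_clf.score(X, y))  # close to 1.0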
Support Vector Machines

[Figure: Linear SVM vs. Non-Linear SVM decision boundaries]
Support Vector Machines
Mathematical Modeling:
 Given a training set {(Xᵢ, Yᵢ), i = 1, 2, 3, …, n}, Xᵢ ∈ ℜᵐ, Yᵢ ∈ {+1, -1}. Here, Xᵢ is the feature vector for the iᵗʰ data point and Yᵢ is the label for the iᵗʰ data point. The label is '+1' for the positive class or '-1' for the negative class; the value '1' is chosen for mathematical convenience.
 Let W be a vector perpendicular to the decision boundary (the optimal hyperplane) and Xᵢ be an unknown vector. Then the projection of Xᵢ onto the unit vector of W determines whether that unknown point belongs to the positive class or the negative class.
Support Vector Machines
Mathematical Modeling:
Y = WᵗXᵢ + b – Equation of the hyperplane
The dot product between the two vectors W and X is the same as the matrix multiplication between Wᵗ and X.
Support Vector Machines
Mathematical Modeling:
Let X⁺ be a support vector in the positive class and X⁻ be a support vector in the negative class. Then WX⁺ + b = 1 ⇒ WX⁺ = 1 - b. Similarly, WX⁻ + b = -1 ⇒ WX⁻ = -1 - b. The projection of the vector (X⁺ - X⁻) onto the unit vector of W gives the width of the separation gap, i.e. the margin between the support vectors of the two classes. The width of the margin is 2/‖W‖.
Support Vector Machines
Mathematical Modeling:
The objective of SVM is to maximise the width of the separation gap, i.e. to maximise 2/‖W‖, which is the same as minimising ‖W‖, which is the same as minimising ‖W‖², which in turn is the same as minimising (1/2)‖W‖²; the latter can be written as (1/2)WᵗW.
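Collecting the last two slides into standard notation (the explicit constraint form below is the usual hard-margin statement; the slides do not write it out, so it is included here for reference):

W^{T}(X^{+} - X^{-}) = (1 - b) - (-1 - b) = 2
\quad\Rightarrow\quad \text{margin width} = \frac{W^{T}}{\lVert W \rVert}\,(X^{+} - X^{-}) = \frac{2}{\lVert W \rVert}

\min_{W,\,b}\ \tfrac{1}{2}\lVert W \rVert^{2}
\quad \text{subject to} \quad Y_{i}\,(W^{T}X_{i} + b) \ \ge\ 1, \qquad i = 1, \dots, n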
Support Vector Machines
Hard Margin and Soft Margin
 Hard-margin SVM is used for linearly separable data.
 Soft-margin SVM is used for non-linearly separable data.
Support Vector Machines

Soft Margin Constraints

Support Vector Machines
Mathematical Modeling:
 In real scenarios, the data is not strictly linearly separable. Thus, the problem is modified by introducing slack variables ξᵢ and a penalty term C. Here, C is a kind of regularisation parameter.
 The slack variable ξᵢ is the distance between a data point and the margin of its class, measured from the other side.
 If ξᵢ for a data point is small, then the mistake is less bad and C·ξᵢ is small.
 If ξᵢ for a data point is high, then the mistake is worse and C·ξᵢ is large.
Support Vector Machines
Mathematical Modeling:
With the slack variables, the optimisation problem becomes:

  minimise (1/2)‖W‖² + C·Σᵢ ξᵢ
  subject to Yᵢ(WᵗXᵢ + b) ≥ 1 - ξᵢ and ξᵢ ≥ 0 for all i.
SVM for Classification
 Hinge Loss
 The hinge loss is a specific type of cost function that incorporates a margin or
distance from the classification boundary into the cost calculation.
 The hinge loss increases linearly.
 Associated with soft-margin support vector machines.
 The distance from the hyperplane can be regarded as a measure of confidence.

SVM for Classification
 Hinge Loss
 Given input features "X" and target "y", the goal of the SVM algorithm is to predict a value ('predicted y') close to the target ('actual y') for each observation.
 The equation that calculates 'predicted y' depends on some weighted values of the input X. It can be written as:

  predicted y = f(weighted values of X)   (weights denoted as w)

 The job of the loss function is to quantify the error between the 'predicted y' and the 'actual y'.
 It defines the amount by which you want to penalize the mis-classified observations.

  Total cost = ‖w‖²/2 + C·(sum of all losses for each observation)

 where C is the hyper-parameter that controls the amount of regularization.
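A minimal NumPy sketch of this cost (the weights, labels and scores below are made up purely for illustration):

import numpy as np

# Made-up labels in {-1, +1} and raw SVM scores w.x + b for 4 observations
actual_y    = np.array([+1, +1, -1, -1])
predicted_y = np.array([2.3, 0.4, -1.7, 0.2])   # the last point is misclassified

w = np.array([0.5, -1.2])   # made-up weight vector
C = 1.0                     # regularization hyper-parameter

# Hinge loss per observation: max(0, 1 - actual_y * predicted_y)
losses = np.maximum(0.0, 1.0 - actual_y * predicted_y)

# Total cost = ||w||^2 / 2 + C * sum of losses
total_cost = 0.5 * np.dot(w, w) + C * np.sum(losses)
print(losses, total_cost)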
SVM for Classification
 Hinge Loss
 The hinge loss is a loss function used for training classifiers in SVM. The hinge loss is used for "maximum-margin" classification, most notably for support vector machines (SVMs).

[Figure: a decision boundary classifying positive and negative points; the points marked in red are misclassified]
SVM for Classification
 Hinge Loss
 Plot yf(x) against the loss function. Under the simple 0-1 loss, points with yf(x) > 0 are assigned a loss of '0', and points with yf(x) < 0 are assigned a loss of '1'.
SVM for Classification
Consider a positive sample X. The penalty is 1 - t·y, where t is the actual output and y is the predicted output of the SVM.
 Beyond the positive plane (t·y > 1): penalty = 0.
 Between the positive plane and the decision boundary (0 < t·y < 1): penalty is between 0 and 1.
 On the decision boundary (t·y = 0): penalty = 1.
 Beyond the negative plane (misclassified): penalty > 1.
SVMs for Classification

• The hinge loss function penalty increases linearly.
• Hinge loss is defined as max(0, 1 - t·y), where t is the actual outcome (either +1 or -1) and y is the predicted output of the SVM.
• The objective to minimise is:

  min (1/2)‖W‖² + C·Σᵢ₌₁ⁿ max(0, 1 - tᵢyᵢ)

  where n is the number of samples.
SVM for Classification
 Hinge Loss
 Hinge loss = max(0, 1 - yf(x)).
 For yf(x) ≥ 1, the hinge loss is '0'.
 For yf(x) < 1, the hinge loss increases linearly as yf(x) decreases; the further a misclassified point lies from the boundary, the larger the loss 1 - yf(x) becomes.

 The xᵢ for which αᵢ > 0 are called support vectors: the points which are either incorrectly classified, or correctly classified but lying on (or inside) the margin.
SVM for Classification
 Large Margin Principle
 The margin is the distance between the two boundaries. The support vectors are the instances at the boundaries (where WᵗX = 1 or -1), or within the boundaries if the data is not linearly separable.
 The goal of SVMs is to learn the boundaries that make the margin as large as possible (Large Margin Classification).
 The size of the margin is 2/‖w‖, where ‖w‖ is the L2 norm of the weight vector.
 Learning goal:
  Maximise 2/‖w‖, subject to the constraints that all instances are correctly classified.
  Turn it into a minimisation problem by taking the inverse: ½‖w‖.
  Can also square the L2 norm (it makes the calculus easier), just like with L2 regularization: ½‖w‖².
SVM for Classification
 Choosing C
 C is chosen by cross validation.
 A Support Vector Machine always looks for two things:
  setting a larger margin
  lowering the misclassification rate
 Increasing the margin tends to raise the misclassification rate; decreasing the margin tends to lower it.
 The priority should be getting a lower misclassification rate, which can be achieved through the parameter C.
SVM for Classification
 Cross-Validation (k-fold CV)

Cross-validation is a resampling method that uses different portions of the data to test and train a model on different iterations. It is a statistical method used to estimate the skill of machine learning models.
SVM for Classification
 Choosing C
 A large value of the parameter C ⇒ a small margin.
 A small value of the parameter C ⇒ a large margin.
 Choosing C depends on held-out data: try different C values and choose the value which gives the lowest misclassification rate on the validation/testing data.
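A rough sketch of this selection procedure with k-fold cross-validation (assuming scikit-learn; the candidate C values and the synthetic dataset are arbitrary choices):

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Arbitrary synthetic dataset standing in for real training data
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Try several values of C; 5-fold cross-validation estimates the
# misclassification rate (via accuracy) for each candidate.
param_grid = {"C": [0.01, 0.1, 1, 10, 100]}
search = GridSearchCV(SVC(kernel="rbf", gamma="scale"), param_grid, cv=5)
search.fit(X, y)

print("Best C:", search.best_params_["C"])
print("Cross-validated accuracy:", search.best_score_)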
SVM for Classification
 SVMs for Multiclass Classification

• Upgrading an SVM to the multi-class case is not so easy, since the outputs are not on a calibrated scale and hence are hard to compare to each other.
• Two common strategies, described on the following slides, are the one-versus-the-rest (OVR) approach (also called one-vs-all) and the one-versus-one (OVO) approach.
SVM for Classification
 SVMs for Multiclass Classification

• The obvious approach is to use a one-versus-the-rest approach (also called one-vs-all), in which we train C binary classifiers, f_c(x), where the data from class c is treated as positive and the data from all the other classes is treated as negative.
• However, this can result in regions of input space which are ambiguously labeled.
• [Figure: the green region is predicted to be both class 1 and class 2.]
SVM for Classification
 SVMs for Multiclass Classification

• Another approach is to use the one-versus-one (OVO) approach, also called all pairs, in which we train C(C-1)/2 classifiers to discriminate between all pairs f_{c,c'}.
• We then classify a point into the class which has the highest number of votes. However, this can also result in ambiguities.
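A short sketch of both strategies (assuming scikit-learn, whose SVC in fact uses OVO internally for multi-class problems; the iris data is just a convenient 3-class example):

from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)   # 3 classes, so C = 3

# One-versus-the-rest: trains C = 3 binary classifiers
ovr = OneVsRestClassifier(SVC(kernel="rbf", gamma="scale")).fit(X, y)

# One-versus-one: trains C(C-1)/2 = 3 pairwise classifiers
ovo = OneVsOneClassifier(SVC(kernel="rbf", gamma="scale")).fit(X, y)

print(len(ovr.estimators_), len(ovo.estimators_))   # 3 and 3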
SVM for Regression
 Regression analysis consists of a set of machine learning methods
that allow us to predict a continuous outcome variable (y) based
on the value of one or multiple predictor variables (x).
 Goal of regression model is to build a mathematical equation that
defines y as a function of the x variables.
 This equation can be used to predict the outcome (y) on the basis of
new values of the predictor variables (x).
 It can be utilized to assess the strength of the relationship between variables and for modeling the future relationship between them.
SVM for Regression

 ŷᵢ = β₀ + β₁Xᵢ, where ŷᵢ represents the estimated output for the input Xᵢ.
 In this equation, β₀ is the bias and β₁ is the weight of the model. If the model is charted so that the output is on the y-axis and the input is on the x-axis, β₀ refers to the y-intercept and β₁ represents the slope.
SVM for Regression
• The problem with kernelized ridge regression is that the solution vector 𝒘 depends on all the training inputs.
• We now seek a method to produce a sparse estimate.
• Consider the epsilon-insensitive loss function (often discussed alongside the Huber loss).
• This means that any point lying inside an 𝜖-tube around the prediction is not penalized.
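For reference, the standard form of the 𝜖-insensitive loss (written out here because the slide's formula image is not reproduced; the notation follows the usual textbook convention):

L_{\epsilon}(y, \hat{y}) =
\begin{cases}
0 & \text{if } |y - \hat{y}| < \epsilon \\
|y - \hat{y}| - \epsilon & \text{otherwise}
\end{cases}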
SVM for Regression
• Mean Square Error: MSE = (1/n)·Σᵢ₌₁ⁿ (tᵢ - yᵢ)², where n is the number of samples, t is the actual output, and y is the predicted output.
• Mean Absolute Error: MAE = (1/n)·Σᵢ₌₁ⁿ |tᵢ - yᵢ|.
• Huber Loss: quadratic, (1/2)(t - y)², for small errors |t - y| ≤ δ, and linear, δ|t - y| - δ²/2, for larger errors.
SVMs for Regression

[Figure: (a) Illustration of the ℓ2, Huber and 𝜖-insensitive loss functions, where 𝜖 = 1.5. (b) Illustration of the 𝜖-tube used in SVM regression.]
SVMs for Regression
• The corresponding objective function is:

  J = C·Σᵢ₌₁ⁿ L𝜖(yᵢ, ŷᵢ) + (1/2)‖w‖²

  where C is the regularisation parameter.
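A minimal usage sketch (assuming scikit-learn; the noisy sine-wave data is made up, and the epsilon and C values are arbitrary):

import numpy as np
from sklearn.svm import SVR

# Made-up 1-D regression data: a noisy sine wave
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, size=(80, 1)), axis=0)
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(80)

# epsilon sets the width of the epsilon-tube (errors inside it are not
# penalised); C trades off fitting errors against model flatness.
model = SVR(kernel="rbf", C=10.0, epsilon=0.1)
model.fit(X, y)

print("Number of support vectors:", len(model.support_))
print("Prediction at x=2.5:", model.predict([[2.5]]))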
SVMs Pros and Cons
– It works really well with a clear margin of separation.
– It is effective in high-dimensional spaces.
– It is effective in cases where the number of dimensions is greater than the number of samples.
– It uses a subset of training points in the decision function (called support vectors), so it is also memory efficient.
– This classifier is heavily reliant on the support vectors and changes as the support vectors change; as a result, it can tend to overfit, so kernel functions and regularization are important.
– It does not provide probability estimates directly.
– It doesn't perform well with large datasets because the required training time is high.
– It also doesn't perform very well when the data set has more noise, i.e. when the target classes are overlapping.
