
Support Vector Machine

Compiled by Koushika B, COE17B044


Guided by
Dr Umarani Jayaraman

Department of Computer Science and Engineering


Indian Institute of Information Technology Design and Manufacturing
Kancheepuram

May 5, 2022

SVM

For a separating hyperplane with weight vector $w$ and bias $w_0$, the discriminant is $g(X) = w^T X + w_0$, with

$w^T X_i + w_0 > 0$ if $X_i \in \omega_1$
$w^T X_i + w_0 < 0$ if $X_i \in \omega_2$
$g(X) = w^T X + w_0 \ge b$ for good generalization

The distance $r$ of a point $X$ from the hyperplane $H$, expressed through $g(X)$, is

$r = g(X)/\|w\| = (w^T X + w_0)/\|w\|$
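As a quick numeric sketch of this distance computation (the hyperplane and point below are made up for illustration):

    import numpy as np

    # Hypothetical hyperplane w.x + w0 = 0 with w = (3, 4), w0 = -5,
    # and a query point X = (4, 3).
    w = np.array([3.0, 4.0])
    w0 = -5.0
    X = np.array([4.0, 3.0])

    g = w.dot(X) + w0           # g(X) = w.X + w0 = 19
    r = g / np.linalg.norm(w)   # r = g(X)/||w|| = 19/5 = 3.8
    print(g, r)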

So we must ensure that

$(w^T X_i + w_0)/\|w\| \ge b$   ... (1)

i.e. $w^T X_i + w_0 \ge b\,\|w\|$. Scaling $w$ and $w_0$ so that $b\,\|w\| = 1$ gives

$w^T X_i + w_0 \ge +1$ if $X_i \in \omega_1$
$w^T X_i + w_0 \le -1$ if $X_i \in \omega_2$   ... (2)

From (2) we can establish a uniform criterion. Let $y_i$ be the class label, taking values $\pm 1$:

+1 for class $\omega_1$
-1 for class $\omega_2$

Then $y_i(w^T X_i + w_0) \ge 1$ for all training samples, where

$y_i(w^T X_i + w_0) = 1$ if $X_i$ is a support vector
$y_i(w^T X_i + w_0) > 1$ if $X_i$ is not a support vector
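A small sketch of this criterion on a hypothetical separable data set, assuming the separating hyperplane $w = (1, 0)$, $w_0 = -2$ (all values made up for illustration; this toy example is reused below):

    import numpy as np

    # Hypothetical training data with labels y in {+1, -1}.
    X = np.array([[3.0, 1.0], [4.0, 3.0], [1.0, 1.0], [0.0, 1.0]])
    y = np.array([1.0, 1.0, -1.0, -1.0])
    w, w0 = np.array([1.0, 0.0]), -2.0   # assumed separating hyperplane x1 = 2

    margins = y * (X.dot(w) + w0)        # y_i (w.X_i + w0) for each sample
    print(margins)                        # [1. 2. 1. 2.] -> all >= 1
    print(np.isclose(margins, 1.0))       # True where X_i is a support vector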

From (1), in order to maximize the margin $b$, $\|w\|$ has to be minimized. For the minimization of $\|w\|$, we must also respect the constraint

$y_i(w^T X_i + w_0) = 1$

Because this is a constrained optimization problem, it can be converted into an unconstrained problem using Lagrange multipliers.

Minimization of $\|w\|$ is the same as minimization of $\phi(w)$, a function of $w$:

$\phi(w) = \frac{1}{2}\, w^T w = \frac{1}{2}(w \cdot w)$ (a dot product)

The factor 1/2 is introduced for mathematical convenience. The minimization is subject to the constraint

$y_i(w^T X_i + w_0) = 1$ if $X_i$ are support vectors.

It can be written as an unconstrained optimization problem as follows:

$L(w, w_0) = \frac{1}{2}\|w\|^2 - \sum_{i=1}^{n} \lambda_i\,[\,y_i(w^T X_i + w_0) - 1\,]$

We minimize over $\|w\|$: here $\frac{1}{2}\|w\|^2$ is the objective function and $[\,y_i(w^T X_i + w_0) - 1\,]$ is a constraint.
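A minimal sketch of this Lagrangian as code, reusing the hypothetical toy data above (the multiplier values are assumed, not derived here):

    import numpy as np

    def lagrangian(w, w0, lam, X, y):
        # L = 1/2 ||w||^2 - sum_i lambda_i * [ y_i (w.X_i + w0) - 1 ]
        slack = y * (X.dot(w) + w0) - 1.0
        return 0.5 * w.dot(w) - np.sum(lam * slack)

    X = np.array([[3.0, 1.0], [4.0, 3.0], [1.0, 1.0], [0.0, 1.0]])
    y = np.array([1.0, 1.0, -1.0, -1.0])
    w, w0 = np.array([1.0, 0.0]), -2.0
    lam = np.array([0.5, 0.0, 0.5, 0.0])   # assumed multipliers
    print(lagrangian(w, w0, lam, X, y))    # 0.5 at this saddle point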

We can define the Lagrangian of the form

$L(w, w_0) = \frac{1}{2}(w \cdot w) - \sum_i \alpha_i\,[\,y_i(w^T X_i + w_0) - 1\,]$

where $\alpha_i$ are the Lagrange multipliers. $L$ is minimized with respect to $w$ and $w_0$ and maximized with respect to $\alpha_i$, by taking the derivatives with respect to $w$ and $w_0$.

$L(w, w_0) = \frac{1}{2}(w \cdot w) - \sum_i \alpha_i\,[\,y_i(w^T X_i + w_0) - 1\,]$
$L(w, w_0) = \frac{1}{2}(w \cdot w) - \sum_i \alpha_i y_i (w^T X_i) - \sum_i \alpha_i y_i w_0 + \sum_i \alpha_i$

$\dfrac{\partial L}{\partial w_0} = -\sum_i \alpha_i y_i = 0$

$\Rightarrow \sum_{i=1}^{n} \alpha_i y_i = 0 \quad\Leftarrow$ one of the constraints

where $n$ is the number of samples during training.

$L(w, w_0) = \frac{1}{2}(w \cdot w) - \sum_i \alpha_i y_i (w^T X_i) - \sum_i \alpha_i y_i w_0 + \sum_i \alpha_i$

$\dfrac{\partial L}{\partial w} = w - \sum_i \alpha_i y_i X_i = 0$

$w = \sum_{i=1}^{n} \alpha_i y_i X_i$

Substituting these values back into the Lagrangian

$L(w, w_0) = \frac{1}{2}(w \cdot w) - \sum_i \alpha_i y_i (w^T X_i) - \sum_i \alpha_i y_i w_0 + \sum_i \alpha_i$

where $w^T X_i = \sum_j \alpha_j y_j (X_j \cdot X_i)$ and $\sum_i \alpha_i y_i w_0 = 0$

$= \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j (X_i \cdot X_j) - \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j (X_i \cdot X_j) + \sum_i \alpha_i$

$L(w, w_0) = \sum_{i=1}^{n} \alpha_i - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j (X_i \cdot X_j)$

We have to maximize this over the values of $\alpha_i$.
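One way to sketch this maximization numerically, using SciPy's general-purpose SLSQP solver on the toy data above (a production SVM would use a dedicated QP/SMO solver instead):

    import numpy as np
    from scipy.optimize import minimize

    X = np.array([[3.0, 1.0], [4.0, 3.0], [1.0, 1.0], [0.0, 1.0]])
    y = np.array([1.0, 1.0, -1.0, -1.0])
    K = X.dot(X.T)                            # Gram matrix (X_i . X_j)

    def neg_dual(a):
        # negative of: sum_i a_i - 1/2 sum_ij a_i a_j y_i y_j (X_i . X_j)
        ay = a * y
        return 0.5 * ay.dot(K).dot(ay) - np.sum(a)

    res = minimize(neg_dual, np.zeros(len(y)), method="SLSQP",
                   bounds=[(0.0, None)] * len(y),
                   constraints=[{"type": "eq", "fun": lambda a: a.dot(y)}])
    print(np.round(res.x, 3))   # expected close to [0.5, 0, 0.5, 0]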

The Lagrange multipliers $\alpha_i$ are always non-negative:

$\alpha_i \ge 0$
$\sum_{i=1}^{n} \alpha_i y_i = 0$

When we maximize this Lagrangian, it is quite likely that some of the Lagrange multipliers are zero and a few are very large.

If $\alpha_i = 0$, the corresponding training vector $X_i$ is not a support vector.
If $\alpha_i \ne 0$, the corresponding training vector $X_i$ has a strong influence over the position of the hyperplane; these are the support vectors, and only the $\alpha_i \ne 0$ terms enter the decision rule.
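Continuing the toy example (with the $\alpha$ values assumed from the dual sketch above), only the $\alpha_i \ne 0$ samples matter:

    import numpy as np

    X = np.array([[3.0, 1.0], [4.0, 3.0], [1.0, 1.0], [0.0, 1.0]])
    y = np.array([1.0, 1.0, -1.0, -1.0])
    alpha = np.array([0.5, 0.0, 0.5, 0.0])   # assumed dual solution

    sv = alpha > 1e-8                         # alpha_i != 0 -> support vector
    print(np.nonzero(sv)[0])                  # indices of the support vectors
    # Dropping the non-support vectors leaves w unchanged:
    print((alpha * y) @ X, (alpha[sv] * y[sv]) @ X[sv])   # both [1. 0.]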

For an unknown feature vector $Z$,

$g(Z) = w^T Z + w_0$
$g(Z) = \mathrm{sign}\!\left(\sum_{i=1}^{n} \alpha_i y_i (X_i \cdot Z) + w_0\right)$

If the sign is positive, $Z \in \omega_1$; if the sign is negative, $Z \in \omega_2$.
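A sketch of this decision rule as code, again with the assumed toy solution:

    import numpy as np

    def classify(Z, alpha, y, X, w0):
        # g(Z) = sign( sum_i alpha_i y_i (X_i . Z) + w0 )
        return np.sign(np.sum(alpha * y * X.dot(Z)) + w0)

    X = np.array([[3.0, 1.0], [4.0, 3.0], [1.0, 1.0], [0.0, 1.0]])
    y = np.array([1.0, 1.0, -1.0, -1.0])
    alpha, w0 = np.array([0.5, 0.0, 0.5, 0.0]), -2.0
    print(classify(np.array([5.0, 0.0]), alpha, y, X, w0))   # +1 -> class w1
    print(classify(np.array([0.0, 5.0]), alpha, y, X, w0))   # -1 -> class w2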

The steps of SVM design are to estimate $w$ and $w_0$:

$w = \sum_{i=1}^{n} \alpha_i y_i X_i$

and

$w_0 = -\frac{1}{2}\left[\min_{\forall y_i = +1} \sum_j \alpha_j y_j (X_j \cdot X_i) \;+\; \max_{\forall y_i = -1} \sum_j \alpha_j y_j (X_j \cdot X_i)\right]$

$w_0 = -\frac{1}{2}\left[\min_{\forall y_i = +1} w \cdot X_i \;+\; \max_{\forall y_i = -1} w \cdot X_i\right]$

Then substitute these $w$ and $w_0$ into the classification rule to classify the unknown feature vector $Z$:

$g(Z) = w^T Z + w_0$
$g(Z) = \mathrm{sign}\!\left(\sum_{i=1}^{n} \alpha_i y_i (X_i \cdot Z) + w_0\right)$
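Putting the two estimation steps together on the toy data ($\alpha$ again assumed from the dual sketch):

    import numpy as np

    X = np.array([[3.0, 1.0], [4.0, 3.0], [1.0, 1.0], [0.0, 1.0]])
    y = np.array([1.0, 1.0, -1.0, -1.0])
    alpha = np.array([0.5, 0.0, 0.5, 0.0])

    w = (alpha * y) @ X                       # w = sum_i alpha_i y_i X_i
    proj = X.dot(w)                           # w . X_i for every sample
    w0 = -0.5 * (proj[y == 1].min() + proj[y == -1].max())
    print(w, w0)                              # [1. 0.] and -2.0

    Z = np.array([5.0, 0.0])
    print(np.sign(w.dot(Z) + w0))             # +1 -> Z in class w1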

But in implementation, $w_0$ is obtained from a support vector on the margin: for a positive support vector $X_i$ (with $y_i = +1$),

$w^T X_i + w_0 = 1$

$w_0 = 1 - (w \cdot X_i)$
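For instance, in the hypothetical toy solution used above, taking the positive support vector $X_i = (3, 1)$ with $w = (1, 0)$ gives $w_0 = 1 - w^T X_i = 1 - 3 = -2$, matching the min/max estimate.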

Additional slides

Consider the optimization problem: maximize $f(x, y)$ subject to

$g(x, y) = 0$ (or $g(x, y) = c$)

$L(x, y, \lambda) = f(x, y) - \lambda\, g(x, y)$

$f(x, y)$ is the objective function and $g(x, y)$ is the constraint; assume both $f$ and $g$ have continuous first partial derivatives.
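As a standard worked example of the method (not from the slides): maximize $f(x, y) = xy$ subject to $g(x, y) = x + y - 4 = 0$.

$L(x, y, \lambda) = xy - \lambda(x + y - 4)$
$\partial L/\partial x = y - \lambda = 0, \quad \partial L/\partial y = x - \lambda = 0 \;\Rightarrow\; x = y = \lambda$

The constraint then gives $x = y = 2$, so the constrained maximum is $f(2, 2) = 4$.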

Additional slides

Here

$f(x, y) = f(\|w\|, w_0) \Rightarrow$ objective function
$g(x, y):\; y_i(w^T X_i + w_0) = 1 \Rightarrow$ constraint

and we can define the Lagrangian of the form

$L(w, w_0) = \frac{1}{2}\|w\|^2 - \sum_{i=1}^{n} \lambda_i\,[\,y_i(w^T X_i + w_0) - 1\,]$

minimized over $w$ and $w_0$ and maximized over the multipliers $\lambda_i$; $\frac{1}{2}\|w\|^2$ is the objective function and $[\,y_i(w^T X_i + w_0) - 1\,]$ is the constraint.

THANK YOU

