
Lec-13-Perceptron Vs Bayes Classifier

The document compares perceptrons and Bayes classifiers. Perceptrons are linear classifiers, while Bayes classifiers can be linear classifiers under the assumption of Gaussian distributions. The perceptron is adaptive and simple to implement, while the Bayes classifier minimizes misclassification risk independently of the underlying distribution. Both can classify linearly separable patterns, but the Bayes classifier can also handle nonseparable patterns.

Uploaded by

sadaqatsaghri
Copyright
© Attribution Non-Commercial (BY-NC)

Perceptron vs. Bayes Classifier
(Sec 3.10~3.11)

Perceptron vs. Bayes Classifier
- What relationship does the perceptron bear to the Bayes classifier?
  - The perceptron is a linear classifier.
  - The Bayes classifier is also a linear classifier, under the assumption of Gaussianity.

[Figure: a source emits hypothesis H1 or H2; a probabilistic transition mechanism produces an observation x in the observation space H, governed by the conditional densities fX(x|C1) and fX(x|C2); the classifier partitions H into a region where it says C1 and a region where it says C2.]
Bayes Criterion
- Assumptions
  - The source outputs are governed by probability assignments.
  - A priori probabilities pi, i = 1, 2: the observer's information before the experiment.
  - Costs for the courses of action cij, i, j = 1, 2, where i is the hypothesis chosen and j is the hypothesis that was true.
- Objective
  - Design the decision rule so that, on average, the risk is as small as possible.
  - The risk:
    R = c11 p1 Pr(say C1 | C1 is true) + c21 p1 Pr(say C2 | C1 is true)
      + c22 p2 Pr(say C2 | C2 is true) + c12 p2 Pr(say C1 | C2 is true)
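As a numeric sanity check, the average risk above can be evaluated directly once the four conditional probabilities are known. Everything below (priors, costs, and the two error probabilities) is a made-up illustration, not data from the lecture.

```python
# Hypothetical sketch: evaluating the average risk
# R = c11*p1*Pr(say C1|C1) + c21*p1*Pr(say C2|C1)
#   + c22*p2*Pr(say C2|C2) + c12*p2*Pr(say C1|C2)
# with illustrative probabilities and costs.

p1, p2 = 0.5, 0.5                  # a priori probabilities
c11, c22 = 0.0, 0.0                # no cost for correct decisions
c21, c12 = 1.0, 1.0                # unit cost for each kind of error

pr_say2_given1 = 0.10              # say C2 when C1 is true
pr_say1_given2 = 0.20              # say C1 when C2 is true
pr_say1_given1 = 1 - pr_say2_given1
pr_say2_given2 = 1 - pr_say1_given2

R = (c11 * p1 * pr_say1_given1 + c21 * p1 * pr_say2_given1
     + c22 * p2 * pr_say2_given2 + c12 * p2 * pr_say1_given2)
print(R)  # with zero costs on correct decisions, R is the error probability (~0.15)
```

With these particular costs, R reduces to the overall probability of misclassification, which is the special case treated later in the Gaussian example.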
Bayes Classifier
Let the observation space be H = H1 ∪ H2, where H1 is the region in which we say C1 and H − H1 the region in which we say C2.

Average risk:
R = c11 p1 ∫_{H1} fX(x|C1) dx + c22 p2 ∫_{H−H1} fX(x|C2) dx
  + c21 p1 ∫_{H−H1} fX(x|C1) dx + c12 p2 ∫_{H1} fX(x|C2) dx

where c11 < c21 and c22 < c12.

Observing that
∫_H fX(x|C1) dx = ∫_H fX(x|C2) dx = 1,

R = c21 p1 + c22 p2
  + ∫_{H1} [ p2 (c12 − c22) fX(x|C2) − p1 (c21 − c11) fX(x|C1) ] dx

Call the integrand A. If A is positive, the observation belongs to Class 2; if A is negative, it belongs to Class 1.
Implementation of Bayes Classifier
Define the likelihood ratio
Λ(x) = fX(x|C1) / fX(x|C2)
and the threshold
ξ = p2 (c12 − c22) / (p1 (c21 − c11)).

Decision rule: compute Λ(x) and compare it with ξ. Assign x to class C1 if Λ(x) > ξ; otherwise, assign it to class C2.

- The likelihood-ratio computation is completely invariant to the a priori probabilities and costs involved in the decision-making process; those quantities affect only the threshold ξ.
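The comparator structure above can be sketched in a few lines of Python. The two 1-D Gaussian class densities and all parameter values (means, variance, priors, costs) are illustrative assumptions, not specifics from the lecture.

```python
import math

# Sketch of a likelihood-ratio test for two hypothetical 1-D Gaussian classes.

def gaussian_pdf(x, mu, var):
    """Univariate Gaussian density with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def classify(x, mu1=0.0, mu2=2.0, var=1.0,
             p1=0.5, p2=0.5, c11=0.0, c22=0.0, c21=1.0, c12=1.0):
    lam = gaussian_pdf(x, mu1, var) / gaussian_pdf(x, mu2, var)   # Λ(x)
    xi = (p2 * (c12 - c22)) / (p1 * (c21 - c11))                  # threshold ξ
    return "C1" if lam > xi else "C2"

print(classify(0.2))  # closer to mu1 → "C1"
print(classify(1.8))  # closer to mu2 → "C2"
```

Note that changing the priors or costs only moves ξ; the function computing Λ(x) is untouched, which is the invariance stated above.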
Another Implementation
- The natural logarithm is a monotonic function.
- The likelihood ratio Λ(x) and the threshold ξ are both positive.

Decision rule: compute the log-likelihood ratio log Λ(x) and compare it with log ξ. Assign x to class C1 if log Λ(x) > log ξ; otherwise, assign it to class C2.
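A quick check of why the log form is legitimate: because log is strictly increasing on the positive reals, comparing log Λ(x) with log ξ gives exactly the same decision as comparing Λ(x) with ξ. The values below are arbitrary illustrations.

```python
import math

# Illustrative values: any positive likelihood ratio and threshold.
lam = 3.5          # Λ(x) > 0
xi = 1.0           # ξ > 0

direct = lam > xi                              # Λ(x) > ξ ?
via_log = math.log(lam) > math.log(xi)         # log Λ(x) > log ξ ?
print(direct, via_log)  # identical decisions: True True
```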
Bayes Classifier for a Gaussian Distribution (1)
- Two-class problem with Gaussian-distributed observations

Class C1: E[X] = μ1,  E[(X − μ1)(X − μ1)^T] = C
Class C2: E[X] = μ2,  E[(X − μ2)(X − μ2)^T] = C

with a nondiagonal, nonsingular covariance matrix C shared by both classes:

fX(x|Ci) = 1 / ((2π)^(m/2) (det C)^(1/2)) · exp(−(1/2) (x − μi)^T C^(−1) (x − μi)),  i = 1, 2
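The class-conditional density above can be evaluated directly with NumPy; this is a minimal sketch, and the mean vector, covariance matrix, and test point are illustrative assumptions.

```python
import numpy as np

# Sketch: evaluating the m-dimensional Gaussian class-conditional density
# fX(x|Ci) = (2π)^(-m/2) det(C)^(-1/2) exp(-(1/2)(x-μi)^T C^{-1} (x-μi)).

def gaussian_density(x, mu, C):
    m = len(mu)
    diff = x - mu
    quad = diff @ np.linalg.inv(C) @ diff            # (x-μ)^T C^{-1} (x-μ)
    norm = (2 * np.pi) ** (m / 2) * np.sqrt(np.linalg.det(C))
    return np.exp(-0.5 * quad) / norm

mu1 = np.array([0.0, 0.0])
C = np.array([[1.0, 0.5],                            # nondiagonal,
              [0.5, 2.0]])                           # nonsingular covariance
x = np.array([0.2, -0.1])
print(gaussian_density(x, mu1, C))
```

At x = μi the exponent vanishes, so the density peaks at 1 / ((2π)^(m/2) (det C)^(1/2)), a handy check on the normalization.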
Bayes Classifier for a Gaussian Distribution (2)
Recall Λ(x) = fX(x|C1) / fX(x|C2) and ξ = p2 (c12 − c22) / (p1 (c21 − c11)).

- Assumptions
  - The two classes are equiprobable: p1 = p2 = 0.5.
  - There is no cost for correct classification, and both errors cost the same: c11 = c22 = 0 and c21 = c12.
  - Hence log ξ = 0.

log Λ(x) = log fX(x|C1) − log fX(x|C2)
         = −(1/2) (x − μ1)^T C^(−1) (x − μ1) + (1/2) (x − μ2)^T C^(−1) (x − μ2)
         = (μ1 − μ2)^T C^(−1) x + (1/2) (μ2^T C^(−1) μ2 − μ1^T C^(−1) μ1)
         = w^T x + b, a linear classifier
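The weight vector and bias read off the last line can be computed and tested numerically. This is a sketch under the slide's assumptions (equal covariance, log ξ = 0); the particular means and covariance are made up.

```python
import numpy as np

# Linear classifier from the Gaussian derivation:
# w = C^{-1} (μ1 - μ2),  b = (1/2)(μ2^T C^{-1} μ2 - μ1^T C^{-1} μ1).

mu1 = np.array([0.0, 0.0])
mu2 = np.array([2.0, 1.0])
C = np.array([[1.0, 0.3],
              [0.3, 1.0]])

Cinv = np.linalg.inv(C)
w = Cinv @ (mu1 - mu2)
b = 0.5 * (mu2 @ Cinv @ mu2 - mu1 @ Cinv @ mu1)

def log_likelihood_ratio(x):
    return w @ x + b        # log Λ(x); > 0 → say C1 (since log ξ = 0)

print(log_likelihood_ratio(mu1) > 0)   # mean of C1 falls on the C1 side: True
print(log_likelihood_ratio(mu2) > 0)   # mean of C2 falls on the C2 side: False
```

Each class mean lands on its own side of the hyperplane w^T x + b = 0, as expected for symmetric costs and equal priors.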
1-D Gaussian Distribution
[Figure: two overlapping 1-D Gaussian densities fX(x|C1) and fX(x|C2), centered at μ1 and μ2; the decision boundary at 0 between them yields the minimum average risk.]
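For the symmetric 1-D case just pictured (equal variances, equal priors, symmetric costs), the minimum-risk boundary sits at the midpoint of the two means, where the densities cross. The particular values below are illustrative.

```python
# Sketch: decision boundary for two 1-D Gaussians with equal variance,
# equal priors, and symmetric costs (illustrative mu1, mu2, sigma).

mu1, mu2, sigma = -1.0, 3.0, 1.0
boundary = (mu1 + mu2) / 2      # point where fX(x|C1) = fX(x|C2)
print(boundary)  # 1.0
```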


Comparison between Perceptron and Bayes Classifier
- The perceptron operates only on linearly separable patterns, while the Bayes classifier also works on nonseparable patterns.
- The Bayes classifier minimizes the probability of misclassification, and this minimization is independent of the underlying distribution.
- The perceptron convergence algorithm is nonparametric, whereas the Bayes classifier is parametric.
- The perceptron convergence algorithm is adaptive and simple to implement.
Summary
- The perceptron and the LMS algorithm
  - The LMS algorithm uses a linear neuron and performs continuous learning.
  - The perceptron uses the McCulloch-Pitts formal model of a neuron.
- Critique of Rosenblatt's perceptron by Minsky (1961)
  - generalization
  - computational limitations
- More advanced forms of neural networks
  - multilayer perceptrons trained with the back-propagation algorithm
  - radial-basis-function networks
  - support vector machines
