
Data Mining

Classification: Alternative Techniques

Lecture Notes for Naïve Bayes Classifiers

Introduction to Data Mining


by
Tan, Steinbach, Kumar

© Tan,Steinbach, Kumar Introduction to Data Mining 4/18/2004 1


Bayes Classifier
• A probabilistic framework for solving classification problems
• Conditional probability:

P(C | A) = P(A, C) / P(A)
P(A | C) = P(A, C) / P(C)

• Bayes theorem:

P(C | A) = P(A | C) P(C) / P(A)
Example of Bayes Theorem
• Given:
• A doctor knows that meningitis causes stiff neck 50% of the time
• Prior probability of any patient having meningitis is 1/50,000
• Prior probability of any patient having stiff neck is 1/20

• If a patient has a stiff neck, what is the probability he/she has meningitis?

P(M | S) = P(S | M) P(M) / P(S) = (0.5 × 1/50000) / (1/20) = 0.0002
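The calculation above can be checked with a few lines of Python, using the three probabilities given on the slide:

```python
# Bayes' theorem for the meningitis example (values from the slide).
p_s_given_m = 0.5      # P(stiff neck | meningitis)
p_m = 1 / 50000        # prior P(meningitis)
p_s = 1 / 20           # prior P(stiff neck)

# P(M | S) = P(S | M) * P(M) / P(S)
p_m_given_s = p_s_given_m * p_m / p_s
print(p_m_given_s)  # 0.0002
```

Even though meningitis explains a stiff neck half the time, its tiny prior keeps the posterior very small.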
Bayesian Classifiers
• Consider each attribute and class label as random
variables

• Given a record with attributes (A1, A2,…,An)


• Goal is to predict class C
• Specifically, we want to find the value of C that maximizes P(C|
A1, A2,…,An )

• Can we estimate P(C| A1, A2,…,An ) directly from data?


Bayesian Classifiers
• Approach:
• Compute the posterior probability P(C | A1, A2, …, An) for all values of C using Bayes theorem:

P(C | A1, A2, …, An) = P(A1, A2, …, An | C) P(C) / P(A1, A2, …, An)

• Choose the value of C that maximizes P(C | A1, A2, …, An)
• Equivalent to choosing the value of C that maximizes P(A1, A2, …, An | C) P(C), since the denominator is the same for all classes

• How to estimate P(A1, A2, …, An | C)?


Naïve Bayes Classifier
• Assume independence among attributes Ai when the class is given:
• P(A1, A2, …, An | Cj) = P(A1 | Cj) P(A2 | Cj) … P(An | Cj)

• Can estimate P(Ai | Cj) for all Ai and Cj.

• A new point is classified as Cj if P(Cj) Π P(Ai | Cj) is maximal.
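The decision rule can be sketched in a few lines. The probability tables below are hypothetical placeholders, not from the slides; the point is only the form of the rule, pick the class maximizing the prior times the product of per-attribute conditionals:

```python
import math

# Hypothetical priors and conditional probabilities P(Ai | Cj)
# for a two-class, two-attribute problem (illustrative values only).
priors = {"yes": 0.3, "no": 0.7}
cond = {
    "yes": [0.2, 0.5],   # P(A1|yes), P(A2|yes)
    "no":  [0.6, 0.4],   # P(A1|no),  P(A2|no)
}

def score(c):
    # Naive Bayes score: P(Cj) * prod_i P(Ai | Cj)
    return priors[c] * math.prod(cond[c])

best = max(priors, key=score)
print(best)  # 'no', since 0.7*0.6*0.4 = 0.168 > 0.3*0.2*0.5 = 0.03
```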


How to Estimate Probabilities from Data?

Tid  Refund  Marital Status  Taxable Income  Evade
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes

• Class: P(C) = Nc/N
• e.g., P(No) = 7/10, P(Yes) = 3/10

• For discrete attributes:
P(Ai | Ck) = |Aik| / Nc
• where |Aik| is the number of instances having attribute value Ai and belonging to class Ck
• Examples:
P(Status=Married | No) = 4/7
P(Refund=Yes | Yes) = 0
How to Estimate Probabilities from Data?

• For continuous attributes:
• Discretize the range into bins
• one ordinal attribute per bin
• violates the independence assumption
• Two-way split: (A < v) or (A > v)
• choose only one of the two splits as the new attribute
• Probability density estimation:
• Assume the attribute follows a normal distribution
• Use data to estimate the parameters of the distribution (e.g., mean and standard deviation)
• Once the probability distribution is known, it can be used to estimate the conditional probability P(Ai | c)
How to Estimate Probabilities from Data?

• Normal distribution:

P(Ai | cj) = (1 / √(2π σij²)) · e^(−(Ai − μij)² / (2σij²))

• One distribution for each (Ai, cj) pair

• For (Income, Class=No), using the training data above:
• sample mean = 110
• sample variance = 2975

P(Income = 120 | No) = (1 / (√(2π) · 54.54)) · e^(−(120 − 110)² / (2 · 2975)) = 0.0072
Example of Naïve Bayes Classifier

Given a test record:
X = (Refund = No, Married, Income = 120K)

Naïve Bayes classifier probabilities (from the training data):
P(Refund=Yes | No) = 3/7
P(Refund=No | No) = 4/7
P(Refund=Yes | Yes) = 0
P(Refund=No | Yes) = 1
P(Marital Status=Single | No) = 2/7
P(Marital Status=Divorced | No) = 1/7
P(Marital Status=Married | No) = 4/7
P(Marital Status=Single | Yes) = 2/7
P(Marital Status=Divorced | Yes) = 1/7
P(Marital Status=Married | Yes) = 0

For taxable income:
If class = No: sample mean = 110, sample variance = 2975
If class = Yes: sample mean = 90, sample variance = 25

• P(X | Class=No) = P(Refund=No | Class=No) × P(Married | Class=No) × P(Income=120K | Class=No)
  = 4/7 × 4/7 × 0.0072 = 0.0024

• P(X | Class=Yes) = P(Refund=No | Class=Yes) × P(Married | Class=Yes) × P(Income=120K | Class=Yes)
  = 1 × 0 × 1.2 × 10⁻⁹ = 0

Since P(X|No)P(No) > P(X|Yes)P(Yes), it follows that P(No|X) > P(Yes|X)
=> Class = No
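The full computation for this test record can be reproduced with the slide's counts and Gaussian parameters; a minimal sketch:

```python
import math

def normal_pdf(x, mean, variance):
    """Gaussian density used for the continuous Income attribute."""
    return math.exp(-(x - mean) ** 2 / (2 * variance)) / math.sqrt(2 * math.pi * variance)

# Priors and conditional probabilities from the training table on the slide.
p_no, p_yes = 7 / 10, 3 / 10

# Test record X = (Refund=No, Married, Income=120K)
score_no = p_no * (4 / 7) * (4 / 7) * normal_pdf(120, 110, 2975)
score_yes = p_yes * 1.0 * 0.0 * normal_pdf(120, 90, 25)

print(score_no > score_yes)  # True -> predict Class = No
```

Because P(Married | Yes) = 0, the Yes score collapses to zero regardless of the other factors, which motivates the smoothing discussed next.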
Naïve Bayes Classifier
• If one of the conditional probabilities is zero, then the entire expression becomes zero
• Probability estimation:

Original:    P(Ai | C) = Nic / Nc
Laplace:     P(Ai | C) = (Nic + 1) / (Nc + c)
m-estimate:  P(Ai | C) = (Nic + mp) / (Nc + m)

c: number of classes
p: prior probability
m: parameter
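The two smoothed estimators translate directly into code. The example call reuses the zero count from the tax data, P(Refund=Yes | Yes) = 0/3; the choices m = 3 and p = 1/2 are illustrative, not from the slide:

```python
def laplace(n_ic, n_c, c):
    """Laplace-smoothed estimate of P(Ai | C), with c as on the slide."""
    return (n_ic + 1) / (n_c + c)

def m_estimate(n_ic, n_c, m, p):
    """m-estimate of P(Ai | C) with prior p and equivalent sample size m."""
    return (n_ic + m * p) / (n_c + m)

# Zero count from the tax example: P(Refund=Yes | Yes) = 0/3 originally.
print(laplace(0, 3, 2))               # 0.2 instead of 0
print(m_estimate(0, 3, m=3, p=1/2))   # 0.25
```

Both estimators shrink counts toward a prior, so no conditional probability is ever exactly zero.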
Example of Naïve Bayes Classifier

Name           Give Birth  Can Fly  Live in Water  Have Legs  Class
human          yes         no       no             yes        mammals
python         no          no       no             no         non-mammals
salmon         no          no       yes            no         non-mammals
whale          yes         no       yes            no         mammals
frog           no          no       sometimes      yes        non-mammals
komodo         no          no       no             yes        non-mammals
bat            yes         yes      no             yes        mammals
pigeon         no          yes      no             yes        non-mammals
cat            yes         no       no             yes        mammals
leopard shark  yes         no       yes            no         non-mammals
turtle         no          no       sometimes      yes        non-mammals
penguin        no          no       sometimes      yes        non-mammals
porcupine      yes         no       no             yes        mammals
eel            no          no       yes            no         non-mammals
salamander     no          no       sometimes      yes        non-mammals
gila monster   no          no       no             yes        non-mammals
platypus       no          no       no             yes        mammals
owl            no          yes      no             yes        non-mammals
dolphin        yes         no       yes            no         mammals
eagle          no          yes      no             yes        non-mammals

A: attributes, M: mammals, N: non-mammals

Test record:
Give Birth  Can Fly  Live in Water  Have Legs  Class
yes         no       yes            no         ?

P(A | M) = 6/7 × 6/7 × 2/7 × 2/7 = 0.06
P(A | N) = 1/13 × 10/13 × 3/13 × 4/13 = 0.0042

P(A | M) P(M) = 0.06 × 7/20 = 0.021
P(A | N) P(N) = 0.0042 × 13/20 = 0.0027

P(A|M)P(M) > P(A|N)P(N)
=> Mammals
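The arithmetic above can be verified directly, using the attribute counts read off the table (7 mammals, 13 non-mammals out of 20):

```python
# Test record: Give Birth=yes, Can Fly=no, Live in Water=yes, Have Legs=no.
# Counts taken directly from the animal table.
p_a_given_m = (6/7) * (6/7) * (2/7) * (2/7)
p_a_given_n = (1/13) * (10/13) * (3/13) * (4/13)

score_m = p_a_given_m * 7 / 20   # P(A|M) P(M)
score_n = p_a_given_n * 13 / 20  # P(A|N) P(N)

print(score_m > score_n)  # True -> classified as a mammal
```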
Naïve Bayes (Summary)
• Robust to isolated noise points

• Robust to irrelevant attributes

• Independence assumption may not hold for some attributes


• Use other techniques such as Bayesian Belief Networks (BBN)
