
Data Mining

Classification: Alternative Techniques

Lecture Notes for Naïve Bayes Classifiers

Introduction to Data Mining


by
Tan, Steinbach, Kumar

© Tan,Steinbach, Kumar Introduction to Data Mining 4/18/2004 1


Bayes Classifier
• A probabilistic framework for solving classification problems
• Conditional probability:

P(C | A) = P(A, C) / P(A)
P(A | C) = P(A, C) / P(C)

• Bayes theorem:

P(C | A) = P(A | C) P(C) / P(A)
Example of Bayes Theorem
• Given:
• A doctor knows that meningitis causes stiff neck 50% of the time
• Prior probability of any patient having meningitis is 1/50,000
• Prior probability of any patient having stiff neck is 1/20

• If a patient has a stiff neck, what is the probability he/she has meningitis?

P(M | S) = P(S | M) P(M) / P(S) = (0.5 × 1/50000) / (1/20) = 0.0002
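The calculation above can be checked with a few lines of Python, using the three probabilities given on the slide:

```python
# Bayes' theorem for the meningitis example (values from the slide).
p_s_given_m = 0.5      # P(stiff neck | meningitis)
p_m = 1 / 50000        # prior P(meningitis)
p_s = 1 / 20           # prior P(stiff neck)

# P(M | S) = P(S | M) * P(M) / P(S)
p_m_given_s = p_s_given_m * p_m / p_s
print(p_m_given_s)  # 0.0002
```

Even though meningitis explains a stiff neck half the time, its tiny prior keeps the posterior very small.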
Bayesian Classifiers
• Consider each attribute and class label as random
variables

• Given a record with attributes (A1, A2,…,An)


• Goal is to predict class C
• Specifically, we want to find the value of C that maximizes P(C|
A1, A2,…,An )

• Can we estimate P(C| A1, A2,…,An ) directly from data?


Bayesian Classifiers
• Approach:
• Compute the posterior probability P(C | A1, A2, …, An) for all values of C using Bayes theorem:

P(C | A1, A2, …, An) = P(A1, A2, …, An | C) P(C) / P(A1, A2, …, An)

• Choose the value of C that maximizes P(C | A1, A2, …, An)
• Equivalent to choosing the value of C that maximizes P(A1, A2, …, An | C) P(C), since the denominator is the same for all classes

• How to estimate P(A1, A2, …, An | C)?


Naïve Bayes Classifier
• Assume independence among attributes Ai when the class is given:
• P(A1, A2, …, An | Cj) = P(A1 | Cj) P(A2 | Cj) … P(An | Cj)

• Can estimate P(Ai | Cj) for all Ai and Cj.

• A new point is classified as Cj if P(Cj) Π P(Ai | Cj) is maximal.
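The decision rule can be sketched in a few lines. The probability tables below are hypothetical placeholders, not from the slides; the point is only the form of the rule, pick the class maximizing the prior times the product of per-attribute conditionals:

```python
import math

# Hypothetical priors and conditional probabilities P(Ai | Cj)
# for a two-class, two-attribute problem (illustrative values only).
priors = {"yes": 0.3, "no": 0.7}
cond = {
    "yes": [0.2, 0.5],   # P(A1|yes), P(A2|yes)
    "no":  [0.6, 0.4],   # P(A1|no),  P(A2|no)
}

def score(c):
    # Naive Bayes score: P(Cj) * prod_i P(Ai | Cj)
    return priors[c] * math.prod(cond[c])

best = max(priors, key=score)
print(best)  # 'no', since 0.7*0.6*0.4 = 0.168 > 0.3*0.2*0.5 = 0.03
```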


How to Estimate Probabilities from Data?

Tid  Refund  Marital Status  Taxable Income  Evade
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes

• Class: P(C) = Nc/N
• e.g., P(No) = 7/10, P(Yes) = 3/10

• For discrete attributes:
P(Ai | Ck) = |Aik| / Nc
• where |Aik| is the number of instances having attribute value Ai and belonging to class Ck
• Examples:
P(Status=Married | No) = 4/7
P(Refund=Yes | Yes) = 0
How to Estimate Probabilities from Data?

• For continuous attributes:
• Discretize the range into bins
• one ordinal attribute per bin
• violates the independence assumption
• Two-way split: (A < v) or (A > v)
• choose only one of the two splits as the new attribute
• Probability density estimation:
• Assume the attribute follows a normal distribution
• Use data to estimate the parameters of the distribution (e.g., mean and standard deviation)
• Once the probability distribution is known, it can be used to estimate the conditional probability P(Ai | c)
How to Estimate Probabilities from Data?

• Normal distribution:

P(Ai | cj) = (1 / √(2π σij²)) · e^(−(Ai − μij)² / (2σij²))

• One distribution for each (Ai, cj) pair

• For (Income, Class=No), using the training data above:
• sample mean = 110
• sample variance = 2975

P(Income = 120 | No) = (1 / (√(2π) · 54.54)) · e^(−(120 − 110)² / (2 · 2975)) = 0.0072
Example of Naïve Bayes Classifier

Given a test record:
X = (Refund = No, Married, Income = 120K)

Naïve Bayes classifier probabilities (from the training data):
P(Refund=Yes | No) = 3/7
P(Refund=No | No) = 4/7
P(Refund=Yes | Yes) = 0
P(Refund=No | Yes) = 1
P(Marital Status=Single | No) = 2/7
P(Marital Status=Divorced | No) = 1/7
P(Marital Status=Married | No) = 4/7
P(Marital Status=Single | Yes) = 2/7
P(Marital Status=Divorced | Yes) = 1/7
P(Marital Status=Married | Yes) = 0

For taxable income:
If class = No: sample mean = 110, sample variance = 2975
If class = Yes: sample mean = 90, sample variance = 25

• P(X | Class=No) = P(Refund=No | Class=No) × P(Married | Class=No) × P(Income=120K | Class=No)
  = 4/7 × 4/7 × 0.0072 = 0.0024

• P(X | Class=Yes) = P(Refund=No | Class=Yes) × P(Married | Class=Yes) × P(Income=120K | Class=Yes)
  = 1 × 0 × 1.2 × 10⁻⁹ = 0

Since P(X|No)P(No) > P(X|Yes)P(Yes), it follows that P(No|X) > P(Yes|X)
=> Class = No
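The full computation for this test record can be reproduced with the slide's counts and Gaussian parameters; a minimal sketch:

```python
import math

def normal_pdf(x, mean, variance):
    """Gaussian density used for the continuous Income attribute."""
    return math.exp(-(x - mean) ** 2 / (2 * variance)) / math.sqrt(2 * math.pi * variance)

# Priors and conditional probabilities from the training table on the slide.
p_no, p_yes = 7 / 10, 3 / 10

# Test record X = (Refund=No, Married, Income=120K)
score_no = p_no * (4 / 7) * (4 / 7) * normal_pdf(120, 110, 2975)
score_yes = p_yes * 1.0 * 0.0 * normal_pdf(120, 90, 25)

print(score_no > score_yes)  # True -> predict Class = No
```

Because P(Married | Yes) = 0, the Yes score collapses to zero regardless of the other factors, which motivates the smoothing discussed next.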
Naïve Bayes Classifier
• If one of the conditional probabilities is zero, then the entire expression becomes zero
• Probability estimation:

Original:    P(Ai | C) = Nic / Nc
Laplace:     P(Ai | C) = (Nic + 1) / (Nc + c)
m-estimate:  P(Ai | C) = (Nic + mp) / (Nc + m)

c: number of classes
p: prior probability
m: parameter
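The two smoothed estimators translate directly into code. The example call reuses the zero count from the tax data, P(Refund=Yes | Yes) = 0/3; the choices m = 3 and p = 1/2 are illustrative, not from the slide:

```python
def laplace(n_ic, n_c, c):
    """Laplace-smoothed estimate of P(Ai | C), with c as on the slide."""
    return (n_ic + 1) / (n_c + c)

def m_estimate(n_ic, n_c, m, p):
    """m-estimate of P(Ai | C) with prior p and equivalent sample size m."""
    return (n_ic + m * p) / (n_c + m)

# Zero count from the tax example: P(Refund=Yes | Yes) = 0/3 originally.
print(laplace(0, 3, 2))               # 0.2 instead of 0
print(m_estimate(0, 3, m=3, p=1/2))   # 0.25
```

Both estimators shrink counts toward a prior, so no conditional probability is ever exactly zero.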
Example of Naïve Bayes Classifier

Name           Give Birth  Can Fly  Live in Water  Have Legs  Class
human          yes         no       no             yes        mammals
python         no          no       no             no         non-mammals
salmon         no          no       yes            no         non-mammals
whale          yes         no       yes            no         mammals
frog           no          no       sometimes      yes        non-mammals
komodo         no          no       no             yes        non-mammals
bat            yes         yes      no             yes        mammals
pigeon         no          yes      no             yes        non-mammals
cat            yes         no       no             yes        mammals
leopard shark  yes         no       yes            no         non-mammals
turtle         no          no       sometimes      yes        non-mammals
penguin        no          no       sometimes      yes        non-mammals
porcupine      yes         no       no             yes        mammals
eel            no          no       yes            no         non-mammals
salamander     no          no       sometimes      yes        non-mammals
gila monster   no          no       no             yes        non-mammals
platypus       no          no       no             yes        mammals
owl            no          yes      no             yes        non-mammals
dolphin        yes         no       yes            no         mammals
eagle          no          yes      no             yes        non-mammals

A: attributes, M: mammals, N: non-mammals

Test record:
Give Birth  Can Fly  Live in Water  Have Legs  Class
yes         no       yes            no         ?

P(A | M) = 6/7 × 6/7 × 2/7 × 2/7 = 0.06
P(A | N) = 1/13 × 10/13 × 3/13 × 4/13 = 0.0042

P(A | M) P(M) = 0.06 × 7/20 = 0.021
P(A | N) P(N) = 0.0042 × 13/20 = 0.0027

P(A|M)P(M) > P(A|N)P(N)
=> Mammals
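The arithmetic above can be verified directly, using the attribute counts read off the table (7 mammals, 13 non-mammals out of 20):

```python
# Test record: Give Birth=yes, Can Fly=no, Live in Water=yes, Have Legs=no.
# Counts taken directly from the animal table.
p_a_given_m = (6/7) * (6/7) * (2/7) * (2/7)
p_a_given_n = (1/13) * (10/13) * (3/13) * (4/13)

score_m = p_a_given_m * 7 / 20   # P(A|M) P(M)
score_n = p_a_given_n * 13 / 20  # P(A|N) P(N)

print(score_m > score_n)  # True -> classified as a mammal
```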
Naïve Bayes (Summary)
• Robust to isolated noise points

• Robust to irrelevant attributes

• Independence assumption may not hold for some attributes


• Use other techniques such as Bayesian Belief Networks (BBN)
