Lecture 3

CS 343: Artificial Intelligence
Probabilistic Reasoning and Naïve Bayes
Raymond J. Mooney
University of Texas at Austin

Need for Probabilistic Reasoning

• Most everyday reasoning is based on uncertain evidence and inferences.
• Classical logic, which only allows conclusions to be strictly true or strictly false, does not account for this uncertainty or the need to weigh and combine conflicting evidence.
• Straightforward application of probability theory is impractical since the large number of probability parameters required are rarely, if ever, available.
• Therefore, early expert systems employed fairly ad hoc methods for reasoning under uncertainty and for combining evidence.
• Recently, methods more rigorously founded in probability theory that attempt to decrease the number of conditional probabilities required have flourished.

Axioms of Probability Theory

• All probabilities between 0 and 1:
  0 ≤ P(A) ≤ 1
• True proposition has probability 1, false has probability 0:
  P(true) = 1    P(false) = 0
• The probability of disjunction is:
  P(A ∨ B) = P(A) + P(B) − P(A ∧ B)
  (Venn diagram: overlapping regions A, A ∧ B, B)

Conditional Probability

• P(A | B) is the probability of A given B.
• Assumes that B is all and only information known.
• Defined by:
  P(A | B) = P(A ∧ B) / P(B)

Independence

• A and B are independent iff:
  P(A | B) = P(A)
  P(B | A) = P(B)
  (These two constraints are logically equivalent.)
• Therefore, if A and B are independent:
  P(A | B) = P(A ∧ B) / P(B) = P(A)
  P(A ∧ B) = P(A) P(B)

Classification (Categorization)

• Given:
  – A description of an instance, x ∈ X, where X is the instance language or instance space.
  – A fixed set of categories: C = {c1, c2, …, cn}
• Determine:
  – The category of x: c(x) ∈ C, where c(x) is a categorization function whose domain is X and whose range is C.
  – If c(x) is a binary function C = {0, 1} ({true, false}, {positive, negative}) then it is called a concept.

Learning for Categorization

• A training example is an instance x ∈ X, paired with its correct category c(x): <x, c(x)>, for an unknown categorization function, c.
• Given a set of training examples, D.
• Find a hypothesized categorization function, h(x), such that:
  ∀ <x, c(x)> ∈ D : h(x) = c(x)   (consistency)

Sample Category Learning Problem

• Instance language: <size, color, shape>
  – size ∈ {small, medium, large}
  – color ∈ {red, blue, green}
  – shape ∈ {square, circle, triangle}
• C = {positive, negative}
• D:

  Example  Size   Color  Shape     Category
  1        small  red    circle    positive
  2        large  red    circle    positive
  3        small  red    triangle  negative
  4        large  blue   circle    negative
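As a concrete illustration, the training set D and the consistency condition above can be written down directly. A minimal Python sketch (the hypothesis h is a hypothetical stand-in, not something defined on the slides):

  # Training set D from the sample problem: (<size, color, shape>, category) pairs.
  D = [
      (("small", "red", "circle"), "positive"),
      (("large", "red", "circle"), "positive"),
      (("small", "red", "triangle"), "negative"),
      (("large", "blue", "circle"), "negative"),
  ]

  def consistent(h, D):
      """Consistency: h(x) = c(x) for every <x, c(x)> in D."""
      return all(h(x) == c for x, c in D)

  # A hypothetical hypothesis: red circles are positive, everything else negative.
  h = lambda x: "positive" if (x[1] == "red" and x[2] == "circle") else "negative"
  print(consistent(h, D))  # True: h agrees with c(x) on all four training examples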

Joint Distribution

• The joint probability distribution for a set of random variables X1, …, Xn gives the probability of every combination of values: P(X1, …, Xn). This is an n-dimensional array with v^n values if all variables are discrete with v values, and all v^n values must sum to 1.

  positive:
         circle  square
  red    0.20    0.02
  blue   0.02    0.01

  negative:
         circle  square
  red    0.05    0.30
  blue   0.20    0.20

• The probability of all possible conjunctions (assignments of values to some subset of variables) can be calculated by summing the appropriate subset of values from the joint distribution.
  P(red ∧ circle) = 0.20 + 0.05 = 0.25
  P(red) = 0.20 + 0.02 + 0.05 + 0.30 = 0.57
• Therefore, all conditional probabilities can also be calculated.
  P(positive | red ∧ circle) = P(positive ∧ red ∧ circle) / P(red ∧ circle) = 0.20 / 0.25 = 0.80

Probabilistic Classification

• Let Y be the random variable for the class, which takes values {y1, y2, …, ym}.
• Let X be the random variable describing an instance consisting of a vector of values for n features <X1, X2, …, Xn>; let xk be a possible value for X and xij a possible value for Xi.
• For classification, we need to compute P(Y=yi | X=xk) for i = 1…m.
• However, given no other assumptions, this requires a table giving the probability of each category for each possible instance in the instance space, which is impossible to accurately estimate from a reasonably-sized training set.
  – Assuming Y and all Xi are binary, we need 2^n entries to specify P(Y=pos | X=xk) for each of the 2^n possible xk's, since P(Y=neg | X=xk) = 1 − P(Y=pos | X=xk).
  – Compared to 2^(n+1) − 1 entries for the joint distribution P(Y, X1, X2, …, Xn).
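A small Python sketch (illustrative variable names) that stores the joint distribution from the Joint Distribution slide above and recovers the marginal and conditional probabilities computed there:

  # Joint distribution P(Category, Color, Shape) from the positive/negative tables.
  joint = {
      ("positive", "red", "circle"): 0.20, ("positive", "red", "square"): 0.02,
      ("positive", "blue", "circle"): 0.02, ("positive", "blue", "square"): 0.01,
      ("negative", "red", "circle"): 0.05, ("negative", "red", "square"): 0.30,
      ("negative", "blue", "circle"): 0.20, ("negative", "blue", "square"): 0.20,
  }

  def prob(**given):
      """Sum the joint entries consistent with a partial assignment of values."""
      names = ("category", "color", "shape")
      return sum(p for values, p in joint.items()
                 if all(dict(zip(names, values))[k] == v for k, v in given.items()))

  print(round(prob(color="red", shape="circle"), 2))   # 0.25
  print(round(prob(color="red"), 2))                   # 0.57
  print(round(prob(category="positive", color="red", shape="circle")
              / prob(color="red", shape="circle"), 2)) # 0.8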

Bayes Theorem

  P(H | E) = P(E | H) P(H) / P(E)

Simple proof from the definition of conditional probability:

  P(H | E) = P(H ∧ E) / P(E)    (def. cond. prob.)
  P(E | H) = P(H ∧ E) / P(H)    (def. cond. prob.)
  P(H ∧ E) = P(E | H) P(H)
  QED: P(H | E) = P(E | H) P(H) / P(E)

Bayesian Categorization

• Determine the category of xk by determining for each yi:
  P(Y=yi | X=xk) = P(Y=yi) P(X=xk | Y=yi) / P(X=xk)
• P(X=xk) can be determined since categories are complete and disjoint:
  Σ_{i=1..m} P(Y=yi | X=xk) = Σ_{i=1..m} P(Y=yi) P(X=xk | Y=yi) / P(X=xk) = 1
  P(X=xk) = Σ_{i=1..m} P(Y=yi) P(X=xk | Y=yi)
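As a quick numeric check in Python: the conditional P(positive | red ∧ circle) = 0.80 computed directly on the Joint Distribution slide can also be obtained via Bayes' rule with H = positive and E = red ∧ circle:

  # Numbers taken from the joint-distribution tables above.
  p_pos = 0.20 + 0.02 + 0.02 + 0.01          # P(H) = P(positive) = 0.25
  p_red_circle = 0.20 + 0.05                 # P(E) = P(red ∧ circle) = 0.25
  p_e_given_h = 0.20 / p_pos                 # P(E | H) = P(red ∧ circle | positive) = 0.80

  print(round(p_e_given_h * p_pos / p_red_circle, 2))  # 0.8, matching the direct calculation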

Bayesian Categorization (cont.)

• Need to know:
  – Priors: P(Y=yi)
  – Conditionals: P(X=xk | Y=yi)
• P(Y=yi) are easily estimated from data.
  – If ni of the examples in D are in yi, then P(Y=yi) = ni / |D|.
• Too many possible instances (e.g. 2^n for binary features) to estimate all P(X=xk | Y=yi).
• Still need to make some sort of independence assumptions about the features to make learning tractable.

Generative Probabilistic Models

• Assume a simple (usually unrealistic) probabilistic method by which the data was generated.
• For categorization, each category has a different parameterized generative model that characterizes that category.
• Training: Use the data for each category to estimate the parameters of the generative model for that category.
  – Maximum Likelihood Estimation (MLE): Set parameters to maximize the probability that the model produced the given training data.
  – If Mλ denotes a model with parameter values λ and Dk is the training data for the kth class, find the model parameters for class k (λk) that maximize the likelihood of Dk:
    λk = argmax_λ P(Dk | Mλ)
• Testing: Use Bayesian analysis to determine the category model that most likely generated a specific test instance.
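A tiny illustration of the MLE idea in Python, using a hypothetical Bernoulli (coin-flip) generative model for one class; the data Dk here is made up for the example:

  # Hypothetical model M_lambda: each observation in Dk is Bernoulli(lambda).
  Dk = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]   # assumed training data for the k-th class

  def likelihood(lam, data):
      """P(data | M_lambda) for independent Bernoulli(lambda) observations."""
      p = 1.0
      for x in data:
          p *= lam if x == 1 else (1.0 - lam)
      return p

  # Grid search for lambda_k = argmax_lambda P(Dk | M_lambda).
  lam_k = max((i / 100 for i in range(1, 100)), key=lambda lam: likelihood(lam, Dk))
  print(lam_k, sum(Dk) / len(Dk))       # 0.7 0.7 -- the MLE equals the observed frequency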

Naïve Bayes Generative Model

(Figure: a Category node is drawn from {pos, neg}; each category has its own bags of Size, Color, and Shape values (lg/med/sm, red/blue/grn, circ/sqr/tri), and an instance is generated by drawing one value from each bag for the chosen category.)

Naïve Bayes Inference Problem

(Figure: the same generative model with the category unknown (??); given an observed instance such as <lg, red, circ>, infer which category most likely generated it.)

Naïve Bayesian Categorization

• If we assume the features of an instance are independent given the category (conditionally independent):
  P(X | Y) = P(X1, X2, …, Xn | Y) = Π_{i=1..n} P(Xi | Y)
• Therefore, we then only need to know P(Xi | Y) for each possible pair of a feature value and a category.
• If Y and all Xi are binary, this requires specifying only 2n parameters:
  – P(Xi=true | Y=true) and P(Xi=true | Y=false) for each Xi
  – P(Xi=false | Y) = 1 − P(Xi=true | Y)
• Compared to specifying 2^n parameters without any independence assumptions.

Naïve Bayes Categorization Example

  Probability       positive  negative
  P(Y)              0.5       0.5
  P(small | Y)      0.4       0.4
  P(medium | Y)     0.1       0.2
  P(large | Y)      0.5       0.4
  P(red | Y)        0.9       0.3
  P(blue | Y)       0.05      0.3
  P(green | Y)      0.05      0.4
  P(square | Y)     0.05      0.4
  P(triangle | Y)   0.05      0.3
  P(circle | Y)     0.9       0.3

  Test Instance: <medium, red, circle>

Naïve Bayes Categorization Example (cont.)

  Probability      positive  negative
  P(Y)             0.5       0.5
  P(medium | Y)    0.1       0.2
  P(red | Y)       0.9       0.3
  P(circle | Y)    0.9       0.3

  Test Instance: <medium, red, circle>

  P(positive | X) = P(positive) * P(medium | positive) * P(red | positive) * P(circle | positive) / P(X)
                  = 0.5 * 0.1 * 0.9 * 0.9 / P(X) = 0.0405 / P(X) = 0.0405 / 0.0495 = 0.8181
  P(negative | X) = P(negative) * P(medium | negative) * P(red | negative) * P(circle | negative) / P(X)
                  = 0.5 * 0.2 * 0.3 * 0.3 / P(X) = 0.009 / P(X) = 0.009 / 0.0495 = 0.1818

  P(positive | X) + P(negative | X) = 0.0405 / P(X) + 0.009 / P(X) = 1
  P(X) = (0.0405 + 0.009) = 0.0495

Naïve Bayes Diagnosis Example

• C = {allergy, cold, well}
• e1 = sneeze; e2 = cough; e3 = fever
• E = {sneeze, cough, ¬fever}

  Prob          Well  Cold  Allergy
  P(ci)         0.9   0.05  0.05
  P(sneeze|ci)  0.1   0.9   0.9
  P(cough|ci)   0.1   0.8   0.7
  P(fever|ci)   0.01  0.7   0.4
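The worked categorization example above (P(positive | X) ≈ 0.818) can be checked with a few lines of Python, using the parameter values from the table:

  import math

  prior = {"positive": 0.5, "negative": 0.5}
  cond = {                                   # P(value | Y) for the test instance's values
      "positive": {"medium": 0.1, "red": 0.9, "circle": 0.9},
      "negative": {"medium": 0.2, "red": 0.3, "circle": 0.3},
  }
  x = ["medium", "red", "circle"]            # test instance

  # Unnormalized scores P(Y) * prod_i P(x_i | Y); their sum is P(X).
  score = {y: prior[y] * math.prod(cond[y][v] for v in x) for y in prior}
  p_x = sum(score.values())                  # 0.0405 + 0.009 = 0.0495
  for y in score:
      print(y, round(score[y] / p_x, 4))     # positive 0.8182, negative 0.1818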

Naïve Bayes Diagnosis Example (cont.)

  Probability      Well  Cold  Allergy
  P(ci)            0.9   0.05  0.05
  P(sneeze | ci)   0.1   0.9   0.9
  P(cough | ci)    0.1   0.8   0.7
  P(fever | ci)    0.01  0.7   0.4

  E = {sneeze, cough, ¬fever}

  P(well | E)    = (0.9)(0.1)(0.1)(0.99) / P(E) = 0.0089 / P(E)
  P(cold | E)    = (0.05)(0.9)(0.8)(0.3) / P(E) = 0.01 / P(E)
  P(allergy | E) = (0.05)(0.9)(0.7)(0.6) / P(E) = 0.019 / P(E)

  Most probable category: allergy
  P(E) = 0.0089 + 0.01 + 0.019 = 0.0379
  P(well | E) = 0.23
  P(cold | E) = 0.26
  P(allergy | E) = 0.50

Estimating Probabilities

• Normally, probabilities are estimated based on observed frequencies in the training data.
• If D contains nk examples in category yk, and nijk of these nk examples have the jth value for feature Xi, xij, then:
  P(Xi=xij | Y=yk) = nijk / nk
• However, estimating such probabilities from small training sets is error-prone.
• If, due only to chance, a rare feature Xi is always false in the training data, then ∀yk: P(Xi=true | Y=yk) = 0.
• If Xi=true then occurs in a test example X, the result is that ∀yk: P(X | Y=yk) = 0 and hence ∀yk: P(Y=yk | X) = 0.
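A minimal Python sketch of the frequency estimate P(Xi=xij | Y=yk) = nijk / nk, applied to the four-example training set from earlier (the data layout is assumed for illustration):

  from collections import Counter, defaultdict

  D = [("small", "red", "circle", "positive"),
       ("large", "red", "circle", "positive"),
       ("small", "red", "triangle", "negative"),
       ("large", "blue", "circle", "negative")]

  n_k = Counter(cat for *_, cat in D)            # n_k: examples per category
  n_ijk = defaultdict(Counter)                   # n_ijk: value counts per (feature, category)
  for *features, cat in D:
      for i, value in enumerate(features):
          n_ijk[(i, cat)][value] += 1

  def p(value, i, category):
      """Unsmoothed estimate P(X_i = value | Y = category) = n_ijk / n_k."""
      return n_ijk[(i, category)][value] / n_k[category]

  print(p("red", 1, "positive"))     # 1.0
  print(p("medium", 0, "positive"))  # 0.0 -- the zero-count problem described above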

Probability Estimation Example

  Ex  Size   Color  Shape     Category
  1   small  red    circle    positive
  2   large  red    circle    positive
  3   small  red    triangle  negative
  4   large  blue   circle    negative

  Probability       positive  negative
  P(Y)              0.5       0.5
  P(small | Y)      0.5       0.5
  P(medium | Y)     0.0       0.0
  P(large | Y)      0.5       0.5
  P(red | Y)        1.0       0.5
  P(blue | Y)       0.0       0.5
  P(green | Y)      0.0       0.0
  P(square | Y)     0.0       0.0
  P(triangle | Y)   0.0       0.5
  P(circle | Y)     1.0       0.5

  Test Instance X: <medium, red, circle>
  P(positive | X) = 0.5 * 0.0 * 1.0 * 1.0 / P(X) = 0
  P(negative | X) = 0.5 * 0.0 * 0.5 * 0.5 / P(X) = 0

Smoothing

• To account for estimation from small samples, probability estimates are adjusted or smoothed.
• Laplace smoothing using an m-estimate assumes that each feature is given a prior probability, p, that is assumed to have been previously observed in a "virtual" sample of size m:
  P(Xi=xij | Y=yk) = (nijk + m p) / (nk + m)
• For binary features, p is simply assumed to be 0.5.

Laplace Smoothing Example

• Assume the training set contains 10 positive examples:
  – 4: small
  – 0: medium
  – 6: large
• Estimate parameters as follows (if m = 1, p = 1/3):
  – P(small | positive) = (4 + 1/3) / (10 + 1) = 0.394
  – P(medium | positive) = (0 + 1/3) / (10 + 1) = 0.03
  – P(large | positive) = (6 + 1/3) / (10 + 1) = 0.576
  – P(small or medium or large | positive) = 1.0

Text Categorization Applications

• Web pages
  – Recommending
  – Yahoo-like classification
• Newsgroup/blog messages
  – Recommending
  – Spam filtering
  – Sentiment analysis for marketing
• News articles
  – Personalized newspaper
• Email messages
  – Routing
  – Prioritizing
  – Folderizing
  – Spam filtering
  – Advertising on Gmail
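Returning to the Laplace smoothing example: a quick Python sketch of the m-estimate from the Smoothing slide reproduces the numbers above:

  def m_estimate(n_ijk, n_k, p, m):
      """Smoothed estimate (n_ijk + m*p) / (n_k + m)."""
      return (n_ijk + m * p) / (n_k + m)

  counts = {"small": 4, "medium": 0, "large": 6}     # 10 positive examples
  smoothed = {v: m_estimate(c, 10, p=1/3, m=1) for v, c in counts.items()}
  print({v: round(x, 3) for v, x in smoothed.items()})
  # {'small': 0.394, 'medium': 0.03, 'large': 0.576}
  print(round(sum(smoothed.values()), 3))            # 1.0 -- the estimates still sum to one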

Text Categorization Methods

• The most common representation of a document is a "bag of words," i.e. the set of words with their frequencies; word order is ignored.
• Gives a high-dimensional vector representation (one feature for each word).
• Vectors are sparse since most words are rare.
  – Zipf's law and heavy-tailed distributions

Naïve Bayes for Text

• Modeled as generating a bag of words for a document in a given category by repeatedly sampling with replacement from a vocabulary V = {w1, w2, …, wm} based on the probabilities P(wj | ci).
• Smooth probability estimates with Laplace m-estimates assuming a uniform distribution over all words (p = 1/|V|) and m = |V|.
  – Equivalent to a virtual sample of seeing each word in each category exactly once.
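A minimal bag-of-words sketch in Python (the sample sentence is illustrative):

  from collections import Counter

  # Keep word frequencies, ignore word order.
  doc = "the spam filter flagged the spam message"
  bag = Counter(doc.lower().split())
  print(bag)   # Counter({'the': 2, 'spam': 2, 'filter': 1, 'flagged': 1, 'message': 1})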

Naïve Bayes Generative Model for Text

(Figure: a Category node chooses spam or legit; each category has its own bag of words, e.g. the spam bag is dominated by words like "Viagra", "lottery", "win", "Nigeria", "hot", "deal", "$", "!", and the legit bag by words like "science", "PM", "computer", "Friday", "test", "homework", "March", "score", "exam"; a document is generated by repeatedly drawing words from the chosen category's bag.)

Naïve Bayes Text Classification

(Figure: the same model run in reverse; given the document "Win lottery $ !", infer whether the spam or the legit model (??) most likely generated it.)

Text Naïve Bayes Algorithm (Train)

  Let V be the vocabulary of all words in the documents in D
  For each category ci ∈ C
      Let Di be the subset of documents in D in category ci
      P(ci) = |Di| / |D|
      Let Ti be the concatenation of all the documents in Di
      Let ni be the total number of word occurrences in Ti
      For each word wj ∈ V
          Let nij be the number of occurrences of wj in Ti
          Let P(wj | ci) = (nij + 1) / (ni + |V|)

Text Naïve Bayes Algorithm (Test)

  Given a test document X
  Let n be the number of word occurrences in X
  Return the category:
      argmax_{ci ∈ C} P(ci) Π_{i=1..n} P(ai | ci)
  where ai is the word occurring in the ith position in X
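A direct transcription of the Train and Test pseudocode into Python (the data structures and the tiny corpus are illustrative, not from the slides; words absent from the vocabulary are simply skipped at test time):

  import math
  from collections import Counter

  def train(D):
      """D: list of (list_of_words, category). Returns P(ci) and P(wj | ci)."""
      V = {w for words, _ in D for w in words}                    # vocabulary
      prior, cond = {}, {}
      for ci in {c for _, c in D}:
          Di = [words for words, c in D if c == ci]               # documents in ci
          prior[ci] = len(Di) / len(D)                            # P(ci) = |Di| / |D|
          Ti = [w for words in Di for w in words]                 # concatenation of Di
          counts, ni = Counter(Ti), len(Ti)
          cond[ci] = {wj: (counts[wj] + 1) / (ni + len(V)) for wj in V}  # Laplace smoothing
      return prior, cond, V

  def classify(words, prior, cond, V):
      """argmax_ci P(ci) * prod_i P(ai | ci) over the words ai of X that appear in V."""
      return max(prior, key=lambda ci:
                 prior[ci] * math.prod(cond[ci][a] for a in words if a in V))

  D = [("win lottery now".split(), "spam"),
       ("exam score posted".split(), "legit")]
  prior, cond, V = train(D)
  print(classify("win lottery".split(), prior, cond, V))          # spam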

Underflow Prevention

• Multiplying lots of probabilities, which are between 0 and 1 by definition, can result in floating-point underflow.
• Since log(xy) = log(x) + log(y), it is better to perform all computations by summing logs of probabilities rather than multiplying probabilities.
• Class with highest final un-normalized log probability score is still the most probable.

Comments on Naïve Bayes

• Makes probabilistic inference tractable by making a strong assumption of conditional independence.
• Tends to work fairly well despite this strong assumption.
• Experiments show it to be quite competitive with other classification methods on standard datasets.
• Particularly popular for text categorization, e.g. spam filtering.
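A small variant of the scoring step using summed log probabilities, as the Underflow Prevention slide suggests (reusing the prior and cond tables from the training sketch above):

  import math

  def log_score(ci, words, prior, cond):
      """log P(ci) + sum_i log P(ai | ci); the argmax over classes is unchanged."""
      return math.log(prior[ci]) + sum(math.log(cond[ci][a]) for a in words if a in cond[ci])

  # With many word factors the raw product underflows to 0.0, but the log score does not.
  p = 1e-5
  print(p ** 100)              # 0.0 (floating-point underflow)
  print(100 * math.log(p))     # -1151.29..., still perfectly usable for comparison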
