Lecture 7: Classification Using Naive Bayes
Machine Learning
Lecture 07
Discussion
Establishing the prayer means fixing it where it needs repair and maintaining it once you do. The prayer is our reminder that we'll all stand before Allah on Judgment Day.
Agenda:
•A Quick Recap (Important Concepts)
•Naïve Bayes Classifier
•Principle of Naïve Bayes
•Bayes theorem
•Why Bayes Classification
•Example
•Advantages and Disadvantages
•Conclusion
What is a Classifier?
A classifier is a model that assigns a discrete class label to an input based on its features.
Principle of Naive Bayes Classifier:
A Naive Bayes classifier is a probabilistic machine learning model that is used for classification tasks. The crux of the classifier is based on Bayes' theorem.
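As a quick sketch of that principle: the classifier picks the class whose posterior (prior times likelihood) is largest. The numbers below are invented purely for illustration, not taken from the lecture:

```python
# A minimal sketch of the idea: pick the class with the highest
# posterior probability P(class | evidence) via Bayes' theorem.
# The probabilities below are made up purely for illustration.

priors = {"spam": 0.4, "ham": 0.6}      # P(class)
likelihood = {                          # P(word="offer" | class), assumed values
    "spam": 0.30,
    "ham": 0.05,
}

# Unnormalized posteriors: P(class) * P(evidence | class).
# The shared denominator P(evidence) does not change the arg max.
scores = {c: priors[c] * likelihood[c] for c in priors}
prediction = max(scores, key=scores.get)
print(prediction)  # spam: 0.4 * 0.30 = 0.12 beats ham: 0.6 * 0.05 = 0.03
```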
Why Naïve?
Bayes’ Theorem
We will start with the fact that the joint probability is commutative for any two events. That is:

p(A and B) = p(B and A) ……… (3)

From equation 2 (the multiplication rule), we know that:

p(A and B) = p(A) · p(B|A)
p(B and A) = p(B) · p(A|B)

We can therefore rewrite equation 3 as:

p(A) · p(B|A) = p(B) · p(A|B)

Dividing both sides by p(B) gives us Bayes' Theorem:

p(A|B) = p(A) · p(B|A) / p(B)
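The derivation above can be checked numerically. The joint distribution below is made up for illustration; exact fractions avoid floating-point noise:

```python
from fractions import Fraction as F

# A made-up joint distribution over two binary events A and B.
joint = {(True, True): F(1, 8), (True, False): F(3, 8),
         (False, True): F(1, 4), (False, False): F(1, 4)}

p_A = joint[(True, True)] + joint[(True, False)]   # marginal p(A) = 1/2
p_B = joint[(True, True)] + joint[(False, True)]   # marginal p(B) = 3/8
p_B_given_A = joint[(True, True)] / p_A            # p(B|A) = (1/8)/(1/2) = 1/4
p_A_given_B = joint[(True, True)] / p_B            # p(A|B) = (1/8)/(3/8) = 1/3

# Commutativity of the joint: p(A)p(B|A) == p(B)p(A|B) == p(A and B)
assert p_A * p_B_given_A == p_B * p_A_given_B == joint[(True, True)]

# Bayes' theorem recovers p(A|B) from the other three quantities.
assert p_A * p_B_given_A / p_B == p_A_given_B
print(p_A_given_B)  # 1/3
```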
Example:
• Let us take an example to get some better
intuition.
• Consider the problem of playing golf.
Example:
• We classify whether the day is suitable for playing golf, given the features of the day.
• If we take the first row of the dataset, we can observe that the day is not suitable for playing golf when outlook: rainy, temperature: hot, humidity: high, and windy: false.
• Assumption I: we consider that these predictors are independent
  - if the temperature is hot, it does not necessarily mean that the humidity is high
• Assumption II: all the predictors have an equal effect on the outcome
  - the day being windy does not have more importance in deciding whether to play golf or not
Example:
• According to this example, Bayes' theorem can be rewritten as:

P(y | x1, …, xn) = P(x1 | y) · P(x2 | y) ⋯ P(xn | y) · P(y) / (P(x1) · P(x2) ⋯ P(xn))

where y is the class variable (play golf) and x1, …, xn are the features (outlook, temperature, humidity, windy).
Example:
• Now, you can obtain the values for each term by looking at the dataset and substituting them into the equation.
• For all entries in the dataset, the denominator does not change; it remains constant. Therefore, the denominator can be removed and a proportionality introduced:

P(y | x1, …, xn) ∝ P(y) · ∏ P(xi | y)

• In our case, the class variable (y) has only two outcomes, yes or no. There could be cases where the classification is multivariate. Therefore, we need to find the class y with maximum probability:

y = argmax over y of P(y) · ∏ P(xi | y)

• Using the above function, we can obtain the class, given the predictors.
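The two-step recipe above (drop the shared denominator, then take the arg max over classes) can be sketched from scratch. The toy rows below are illustrative stand-ins, not the lecture's exact table, apart from the first row, which the slide describes:

```python
# A from-scratch sketch of the naive Bayes computation on a toy golf
# dataset. Only the first row is taken from the slide; the rest are
# made-up illustrative rows.
rows = [  # (outlook, temperature, humidity, windy) -> play
    (("rainy", "hot", "high", False), "no"),
    (("rainy", "hot", "high", True), "no"),
    (("overcast", "hot", "high", False), "yes"),
    (("sunny", "mild", "high", False), "yes"),
    (("sunny", "cool", "normal", False), "yes"),
    (("sunny", "cool", "normal", True), "no"),
    (("overcast", "cool", "normal", True), "yes"),
    (("rainy", "mild", "high", False), "no"),
]

def posterior_score(x, y):
    """Unnormalized P(y) * prod_i P(x_i | y); the shared denominator
    P(x_1)...P(x_n) is dropped, as the slide notes."""
    in_class = [fx for fx, fy in rows if fy == y]
    score = len(in_class) / len(rows)          # prior P(y)
    for i, value in enumerate(x):
        # conditional P(x_i = value | y), estimated by counting
        score *= sum(1 for fx in in_class if fx[i] == value) / len(in_class)
    return score

x = ("sunny", "cool", "high", False)
scores = {y: posterior_score(x, y) for y in ("yes", "no")}
prediction = max(scores, key=scores.get)
print(prediction)  # yes: 0.046875 beats no: 0.01171875
```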
Bayesian Classification: Why?
• A statistical classifier: performs probabilistic prediction, i.e.,
predicts class membership probabilities
• Foundation: Based on Bayes’ Theorem.
• Performance: A simple Bayesian classifier, naïve Bayesian classifier,
has comparable performance with decision tree and selected neural
network classifiers
• Incremental: Each training example can incrementally
increase/decrease the probability that a hypothesis is correct — prior
knowledge can be combined with observed data
• Standard: Even when Bayesian methods are computationally
intractable, they can provide a standard of optimal decision making
against which other methods can be measured
Naïve Bayes Classifier: Training Dataset
Naïve Bayes Classifier: An Example

age          income   student  credit_rating  buys_computer
youth        high     no       fair           no
youth        high     no       excellent      no
middle_aged  high     no       fair           yes
senior       medium   no       fair           yes
senior       low      yes      fair           yes
senior       low      yes      excellent      no
middle_aged  low      yes      excellent      yes
youth        medium   no       fair           no
youth        low      yes      fair           yes
senior       medium   yes      fair           yes
youth        medium   yes      excellent      yes
middle_aged  medium   no       excellent      yes
middle_aged  high     yes      fair           yes
senior       medium   no       excellent      no

• P(Ci): P(buys_computer = "yes") = 9/14 = 0.643
         P(buys_computer = "no")  = 5/14 = 0.357
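To make the P(Ci) computation concrete, the sketch below encodes the standard 14-row buys_computer table this slide draws on (the slide itself confirms P(yes) = 9/14 = 0.643) and then classifies an assumed query X = (youth, medium, yes, fair):

```python
# Each row is (age, income, student, credit_rating, buys_computer);
# the rows follow the standard 14-row buys_computer training table.
data = [
    ("youth", "high", "no", "fair", "no"),
    ("youth", "high", "no", "excellent", "no"),
    ("middle_aged", "high", "no", "fair", "yes"),
    ("senior", "medium", "no", "fair", "yes"),
    ("senior", "low", "yes", "fair", "yes"),
    ("senior", "low", "yes", "excellent", "no"),
    ("middle_aged", "low", "yes", "excellent", "yes"),
    ("youth", "medium", "no", "fair", "no"),
    ("youth", "low", "yes", "fair", "yes"),
    ("senior", "medium", "yes", "fair", "yes"),
    ("youth", "medium", "yes", "excellent", "yes"),
    ("middle_aged", "medium", "no", "excellent", "yes"),
    ("middle_aged", "high", "yes", "fair", "yes"),
    ("senior", "medium", "no", "excellent", "no"),
]

n_yes = sum(1 for row in data if row[-1] == "yes")
p_yes = n_yes / len(data)                # P(buys_computer = yes) = 9/14
p_no = 1 - p_yes                         # P(buys_computer = no)  = 5/14
print(round(p_yes, 3), round(p_no, 3))   # 0.643 0.357

# Classify an assumed query X = (age=youth, income=medium,
# student=yes, credit_rating=fair) by comparing P(Ci) * P(X|Ci).
x = ("youth", "medium", "yes", "fair")
scores = {}
for cls, prior in (("yes", p_yes), ("no", p_no)):
    in_cls = [r for r in data if r[-1] == cls]
    score = prior
    for i, v in enumerate(x):
        score *= sum(1 for r in in_cls if r[i] == v) / len(in_cls)
    scores[cls] = score
prediction = max(scores, key=scores.get)
print(prediction)  # yes
```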
Avoiding the Zero-Probability Problem
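One common fix, which this heading refers to, is the Laplace (add-one) correction: if a feature value never co-occurs with a class, its raw conditional probability is zero and wipes out the entire product of likelihoods. A minimal sketch with made-up counts:

```python
# If income=low never co-occurs with buys=yes, the raw estimate
# P(low | yes) = 0/1000 zeroes out the whole product. Adding 1 to
# every count avoids this. The counts below are made up.
counts = {"low": 0, "medium": 990, "high": 10}  # incomes among 1000 "yes" rows
total = sum(counts.values())
k = len(counts)                                 # number of distinct values

raw = {v: c / total for v, c in counts.items()}
smoothed = {v: (c + 1) / (total + k) for v, c in counts.items()}

print(raw["low"])        # 0.0 -- kills the whole product
print(smoothed["low"])   # small but non-zero (1/1003)
```

Note that the smoothed estimates still sum to 1 over the k values, so they remain a valid probability distribution.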
Advantages and Disadvantages:
Conclusion:
• Naive Bayes algorithms are mostly used in sentiment analysis, spam filtering, recommendation systems, etc.
• They are fast and easy to implement, but their biggest disadvantage is the requirement that the predictors be independent.
• In most real-life cases the predictors are dependent, and this hinders the performance of the classifier.
Thank you
Any Question?