
AI NOTES Unit 2

The document discusses probabilistic reasoning, focusing on Bayesian inference and Naïve Bayes models, which are used to act under uncertainty. It explains the sources of uncertainty, the workings of the Naïve Bayes classifier, and its advantages and disadvantages, as well as applications in various fields. Additionally, it covers Bayesian belief networks, their components, and how they represent probabilistic relationships among variables.

Uploaded by

hasiny

UNIT II

PROBABILISTIC REASONING

Acting under uncertainty – Bayesian inference – naïve bayes models. Probabilistic reasoning –
Bayesian networks – exact inference in BN – approximate inference in BN – causal networks.

Acting under uncertainty


The agent can never be completely certain about the state of the external world, since its data and
knowledge are ambiguous and uncertain.

Sources of Uncertainty

o Uncertain data
   - missing data, unreliable, ambiguous, imprecise representation, inconsistent, subjective,
     derived from defaults, noisy…
o Uncertain knowledge
   - Multiple causes lead to multiple effects
   - Incomplete knowledge of causality in the domain
   - Probabilistic/stochastic effects
o Uncertain knowledge representation
   - restricted model of the real system
   - limited expressiveness of the representation mechanism

Naïve Bayes Classifier Algorithm


o The Naïve Bayes algorithm is a supervised learning algorithm that is based on Bayes' theorem and
used for solving classification problems.
o It is mainly used in text classification, where the training dataset is high-dimensional.
o The Naïve Bayes classifier is one of the simplest and most effective classification algorithms; it
helps in building fast machine learning models that can make quick predictions.
o It is a probabilistic classifier, which means it predicts the class of an object on the basis of probability.
o Popular applications of the Naïve Bayes algorithm include spam filtering, sentiment analysis, and
article classification.
Why is it called Naïve Bayes?
The name Naïve Bayes combines two words, Naïve and Bayes, which can be described as:

Naïve: It is called naïve because it assumes that the occurrence of a certain feature is independent of the
occurrence of the other features, given the class. For example, if a fruit is identified on the basis of color,
shape, and taste, then a red, spherical, and sweet fruit is recognized as an apple. Each feature contributes
individually to identifying it as an apple, without depending on the others.
Bayes: It is called Bayes because it depends on the principle of Bayes' theorem.
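
The independence assumption can be sketched in a few lines of Python. This is only an illustration: the per-feature probabilities, the second class "banana", and the helper name naive_likelihood are hypothetical and do not come from these notes. The point is that the likelihood of a whole set of features is taken as a plain product of per-feature terms.

```python
from math import prod

# Hypothetical per-feature likelihoods P(feature | class); illustrative values only.
likelihoods = {
    "apple":  {"red": 0.8, "spherical": 0.9, "sweet": 0.7},
    "banana": {"red": 0.1, "spherical": 0.05, "sweet": 0.8},
}

def naive_likelihood(fruit, features):
    """P(features | fruit) under the naive assumption: a product of per-feature terms."""
    return prod(likelihoods[fruit][f] for f in features)

observed = ["red", "spherical", "sweet"]
scores = {fruit: naive_likelihood(fruit, observed) for fruit in likelihoods}
best = max(scores, key=scores.get)  # class with the largest likelihood
print(scores, best)
```

A full classifier would also multiply each score by the class prior; that factor is left out here to keep the focus on the independence assumption.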
Bayes' Theorem:
Bayes' theorem, also known as Bayes' rule or Bayes' law, is used to determine the probability of
a hypothesis given prior knowledge. It depends on conditional probability.
The formula for Bayes' theorem is given as:

P(A|B) = P(B|A) * P(A) / P(B)

Where,

P(A|B) is Posterior probability: Probability of hypothesis A given the observed event B.

P(B|A) is Likelihood probability: Probability of the evidence B given that hypothesis A is
true.

P(A) is Prior Probability: Probability of hypothesis before observing the evidence.

P(B) is Marginal Probability: Probability of Evidence.
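
As a minimal sketch, the four quantities defined above combine as posterior = likelihood * prior / marginal. The function name posterior and the input values below are assumptions made for illustration; they are not taken from these notes.

```python
def posterior(likelihood, prior, marginal):
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    if marginal <= 0:
        raise ValueError("P(B) must be positive")
    return likelihood * prior / marginal

# Hypothetical inputs, chosen only to exercise the formula.
p_b_given_a = 0.9   # likelihood P(B|A)
p_a = 0.2           # prior P(A)
p_b = 0.3           # marginal P(B)

p_a_given_b = posterior(p_b_given_a, p_a, p_b)
print(round(p_a_given_b, 4))
```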

Working of Naïve Bayes' Classifier:


The working of the Naïve Bayes classifier can be understood with the help of the example below.

Suppose we have a dataset of weather conditions and a corresponding target variable "Play". Using this
dataset, we need to decide whether or not to play on a particular day according to the weather
conditions. To solve this problem, we follow the steps below:

1. Convert the given dataset into frequency tables.
2. Generate a likelihood table by finding the probabilities of the given features.
3. Use Bayes' theorem to calculate the posterior probability.

Problem: If the weather is sunny, should the player play or not?

Solution: To solve this, first consider the dataset below:

      Outlook     Play
0     Rainy       Yes
1     Sunny       Yes
2     Overcast    Yes
3     Overcast    Yes
4     Sunny       No
5     Rainy       Yes
6     Sunny       Yes
7     Overcast    Yes
8     Rainy       No
9     Sunny       No
10    Sunny       Yes
11    Rainy       No
12    Overcast    Yes
13    Overcast    Yes

Frequency table for the weather conditions:

Weather     Yes    No
Overcast    5      0
Rainy       2      2
Sunny       3      2
Total       10     4
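
Step 1 (building the frequency table) can be sketched with the standard library's Counter. The 14 (Outlook, Play) pairs are copied from the dataset in this example; the variable names are illustrative. Tallying the rows gives 10 "Yes" and 4 "No" in total.

```python
from collections import Counter

# (Outlook, Play) pairs copied from the 14-row dataset in this example.
data = [
    ("Rainy", "Yes"), ("Sunny", "Yes"), ("Overcast", "Yes"), ("Overcast", "Yes"),
    ("Sunny", "No"), ("Rainy", "Yes"), ("Sunny", "Yes"), ("Overcast", "Yes"),
    ("Rainy", "No"), ("Sunny", "No"), ("Sunny", "Yes"), ("Rainy", "No"),
    ("Overcast", "Yes"), ("Overcast", "Yes"),
]

counts = Counter(data)                      # joint counts of (outlook, play)
totals = Counter(play for _, play in data)  # column totals per class

for outlook in ("Overcast", "Rainy", "Sunny"):
    print(outlook, counts[(outlook, "Yes")], counts[(outlook, "No")])
print("Total", totals["Yes"], totals["No"])
```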

Likelihood table for the weather conditions:

Weather     No             Yes             P(Weather)
Overcast    0              5               5/14 ≈ 0.36
Rainy       2              2               4/14 ≈ 0.29
Sunny       2              3               5/14 ≈ 0.36
All         4/14 ≈ 0.29    10/14 ≈ 0.71

Applying Bayes' theorem:

P(Yes|Sunny) = P(Sunny|Yes)*P(Yes)/P(Sunny)

P(Sunny|Yes) = 3/10 = 0.3

P(Sunny) = 5/14 ≈ 0.36

P(Yes) = 10/14 ≈ 0.71

So P(Yes|Sunny) = (3/10)*(10/14)/(5/14) = 3/5 = 0.60

P(No|Sunny) = P(Sunny|No)*P(No)/P(Sunny)

P(Sunny|No) = 2/4 = 0.5

P(No) = 4/14 ≈ 0.29

P(Sunny) = 5/14 ≈ 0.36

So P(No|Sunny) = (2/4)*(4/14)/(5/14) = 2/5 = 0.40

As the calculation above shows, P(Yes|Sunny) > P(No|Sunny).

Hence, on a sunny day, the player can play the game.
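
The same posterior can be checked with exact fractions, which avoids the small rounding drift that appears when two-decimal values are multiplied by hand. This is a sketch: the counts come from the frequency table of this example, while the function name posterior_play is an assumption.

```python
from fractions import Fraction

# Counts taken from the frequency table: {outlook: (yes, no)}.
freq = {"Overcast": (5, 0), "Rainy": (2, 2), "Sunny": (3, 2)}

total_yes = sum(y for y, _ in freq.values())   # rows where Play = Yes
total_no = sum(n for _, n in freq.values())    # rows where Play = No
total = total_yes + total_no                   # all rows

def posterior_play(outlook):
    """Return (P(Yes|outlook), P(No|outlook)) via Bayes' theorem."""
    yes, no = freq[outlook]
    p_outlook = Fraction(yes + no, total)      # marginal P(outlook)
    p_yes = Fraction(yes, total_yes) * Fraction(total_yes, total) / p_outlook
    p_no = Fraction(no, total_no) * Fraction(total_no, total) / p_outlook
    return p_yes, p_no

p_yes, p_no = posterior_play("Sunny")
decision = "Yes" if p_yes > p_no else "No"
print(p_yes, p_no, decision)
```

With exact arithmetic the two posteriors sum to 1, as they must for a two-class problem.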

Advantages of Naïve Bayes Classifier:


o Naïve Bayes is a fast and easy ML algorithm for predicting the class of a data point.
o It can be used for binary as well as multi-class classification.
o It often performs well in multi-class prediction compared with other algorithms.
o It is a popular choice for text classification problems.

Disadvantages of Naïve Bayes Classifier:


o Naive Bayes assumes that all features are independent or unrelated, so it cannot learn the
relationship between features.

Applications of Naïve Bayes Classifier:


o It is used for Credit Scoring.
o It is used in medical data classification.
o It can be used in real-time predictions because Naïve Bayes Classifier is an eager learner.
o It is used in Text classification such as Spam filtering and Sentiment analysis.

Types of Naïve Bayes Model:

There are three types of Naive Bayes Model, which are given below:

o Gaussian: The Gaussian model assumes that features follow a normal distribution. If the
predictors take continuous values instead of discrete ones, the model assumes that these values are
sampled from a Gaussian distribution.
o Multinomial: The Multinomial Naïve Bayes classifier is used when the data follows a multinomial
distribution. It is primarily used for document classification, that is, deciding which category
(such as sports, politics, or education) a particular document belongs to.
The classifier uses word frequencies as the predictors.
o Bernoulli: The Bernoulli classifier works similarly to the Multinomial classifier, but the predictor
variables are independent Boolean variables, such as whether or not a particular word is present in a
document. This model is also well known for document classification tasks.
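
The difference between the Multinomial and Bernoulli predictors can be sketched as two ways of turning the same document into a feature vector: word counts versus word presence. The vocabulary, the sample sentence, and the function names below are hypothetical and serve only as an illustration.

```python
from collections import Counter

def multinomial_features(text, vocabulary):
    """Word counts per vocabulary term (the predictors for a multinomial model)."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

def bernoulli_features(text, vocabulary):
    """Word presence per vocabulary term (the predictors for a Bernoulli model)."""
    present = set(text.lower().split())
    return [word in present for word in vocabulary]

# Hypothetical vocabulary and document, for illustration only.
vocabulary = ["goal", "match", "election"]
document = "Late goal wins the match after an early goal"

count_vector = multinomial_features(document, vocabulary)
presence_vector = bernoulli_features(document, vocabulary)
print(count_vector, presence_vector)
```

The multinomial view keeps the fact that "goal" appears twice, while the Bernoulli view only records that it appears at all.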

Bayesian Belief Network in artificial intelligence

A Bayesian belief network is a key technique for dealing with probabilistic events and for solving
problems that involve uncertainty. We can define a Bayesian network as:

"A Bayesian network is a probabilistic graphical model which represents a set of variables and their
conditional dependencies using a directed acyclic graph."

It is also called a Bayes network, belief network, or Bayesian model.

Bayesian networks are probabilistic because they are built from a probability distribution
and use probability theory for prediction and anomaly detection.

Real-world applications are probabilistic in nature, and a Bayesian network lets us represent the
relationships between multiple events. It can be used in various tasks including prediction, anomaly
detection, diagnostics, automated insight, reasoning, time series prediction, and decision making
under uncertainty.

A Bayesian network can be used for building models from data and expert opinion, and it consists of two
parts:

o Directed Acyclic Graph


o Table of conditional probabilities.

The generalized form of a Bayesian network that represents and solves decision problems under uncertain
knowledge is known as an influence diagram.

A Bayesian network graph is made up of nodes and arcs (directed links), where:

o Each node corresponds to a random variable, which can be continuous or discrete.
o Arcs, or directed arrows, represent the causal relationships or conditional dependencies between
random variables. Each directed link connects a pair of nodes in the graph.
A link indicates that one node directly influences the other; if there is no directed
link between two nodes, neither node directly influences the other.
o In the above diagram, A, B, C, and D are random variables represented by the nodes
of the network graph.
o If we are considering node B, which is connected with node A by a directed arrow,
then node A is called the parent of node B.
o Node C is independent of node A.

The Bayesian network has mainly two components:

o Causal Component
o Actual numbers

Each node in the Bayesian network has a conditional probability distribution P(Xi | Parents(Xi)), which
determines the effect of the parents on that node.

A Bayesian network is based on the joint probability distribution and conditional probability. So let's first
understand the joint probability distribution:

Joint probability distribution:


If we have variables x1, x2, x3, ..., xn, then the probabilities of the different combinations of
x1, x2, x3, ..., xn are known as the joint probability distribution.

P[x1, x2, x3, ..., xn] can be expanded using conditional probabilities in the following way:

= P[x1 | x2, x3, ..., xn] P[x2, x3, ..., xn]

= P[x1 | x2, x3, ..., xn] P[x2 | x3, ..., xn] ... P[xn-1 | xn] P[xn].
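
Written compactly, the expansion above is the chain rule, and a Bayesian network shortens each factor to depend only on the node's parents. The second line below is a sketch under two stated assumptions: the variables are ordered so that every parent of X_i appears among x_{i+1}, ..., x_n, and each variable is conditionally independent of its other predecessors in that ordering given its parents.

```latex
\begin{align}
P(x_1, x_2, \ldots, x_n)
  &= \prod_{i=1}^{n} P(x_i \mid x_{i+1}, \ldots, x_n)
  && \text{(chain rule, always valid)} \\
  &= \prod_{i=1}^{n} P\bigl(x_i \mid \mathrm{Parents}(X_i)\bigr)
  && \text{(Bayesian network, under the assumptions above)}
\end{align}
```

For i = n the conditioning set is empty, so the last factor is simply P(x_n).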

Example: Harry installed a new burglar alarm at his home to detect burglary. The alarm reliably
detects a burglary but also responds to minor earthquakes. Harry has two neighbors, David and
Sophia, who have taken responsibility for informing Harry at work when they hear the alarm. David always
calls Harry when he hears the alarm, but sometimes he confuses the phone ringing with the alarm and calls
then too. Sophia, on the other hand, likes to listen to loud music, so she sometimes fails to hear the
alarm. Here we would like to compute the probability of the burglar alarm sounding.

Problem:

Calculate the probability that the alarm has sounded, but neither a burglary nor an
earthquake has occurred, and both David and Sophia have called Harry.

Solution:

o The Bayesian network for the above problem is given below. The network structure shows
that Burglary and Earthquake are the parent nodes of Alarm and directly affect the probability
of the alarm going off, whereas David's and Sophia's calls depend only on the alarm.
o The network encodes our assumptions that the neighbors do not directly perceive the burglary, do
not notice the minor earthquake, and do not confer before calling.
o The conditional distribution for each node is given as a conditional probability table, or CPT.
o Each row in a CPT must sum to 1 because the entries in a row represent an exhaustive
set of cases for the variable.
o In a CPT, a Boolean variable with k Boolean parents contains 2^k probabilities. Hence, if there are
two parents, the CPT will contain 4 probability values.
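
The 2^k rule can be checked by enumerating the parent combinations for each node. This is a sketch: the parent sets follow the structure described above, while the node names and the helper cpt_rows are illustrative choices.

```python
from itertools import product

# Parent sets as described in the text: Burglary and Earthquake are parents of
# Alarm; David's and Sophia's calls each depend only on Alarm.
parents = {
    "Burglary": [],
    "Earthquake": [],
    "Alarm": ["Burglary", "Earthquake"],
    "DavidCalls": ["Alarm"],
    "SophiaCalls": ["Alarm"],
}

def cpt_rows(node):
    """All parent value combinations for a node: 2**k rows for k Boolean parents."""
    return list(product([True, False], repeat=len(parents[node])))

rows_per_node = {node: len(cpt_rows(node)) for node in parents}
total_rows = sum(rows_per_node.values())
print(rows_per_node, total_rows)
```

A node without parents gets a single row (its prior). The whole network needs far fewer numbers than a full joint table over five Boolean variables, which has 2^5 = 32 entries.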

List of all events occurring in this network:

o Burglary (B)
o Earthquake(E)
o Alarm(A)
o David Calls(D)
o Sophia calls(S)

We can write the events of the problem statement in the form of a probability, P[D, S, A, B, E], and
rewrite it using the joint probability distribution:

P[D, S, A, B, E] = P[D | S, A, B, E]. P[S, A, B, E]

= P[D | S, A, B, E]. P[S | A, B, E]. P[A, B, E]

= P[D | A]. P[S | A, B, E]. P[A, B, E]

= P[D | A]. P[S | A]. P[A | B, E]. P[B, E]

= P[D | A]. P[S | A]. P[A | B, E]. P[B | E]. P[E]

= P[D | A]. P[S | A]. P[A | B, E]. P[B]. P[E]

The last step holds because Burglary and Earthquake have no parents in the network and are independent
of each other, so P[B | E] = P[B].

Let's take the observed probabilities for the Burglary and Earthquake components:

P(B = True) = 0.002, which is the probability of a burglary.

P(B = False) = 0.998, which is the probability of no burglary.

P(E = True) = 0.001, which is the probability of a minor earthquake.

P(E = False) = 0.999, which is the probability that no earthquake occurred.

We can provide the conditional probabilities as per the below tables:

Conditional probability table for Alarm (A):

The conditional probability of Alarm A depends on Burglary and Earthquake:

B        E        P(A = True)    P(A = False)
True     True     0.94           0.06
True     False    0.95           0.05
False    True     0.31           0.69
False    False    0.001          0.999

Conditional probability table for David Calls (D):

The conditional probability that David calls depends on its parent node, Alarm.

A        P(D = True)    P(D = False)
True     0.91           0.09
False    0.05           0.95

Conditional probability table for Sophia Calls (S):

The conditional probability that Sophia calls depends on its parent node, Alarm.

A        P(S = True)    P(S = False)
True     0.75           0.25
False    0.02           0.98


From the formula for the joint distribution, we can write the problem statement in the form of a probability
distribution:
P(S, D, A, ¬B, ¬E) = P(S|A) * P(D|A) * P(A|¬B ∧ ¬E) * P(¬B) * P(¬E)
= 0.75 * 0.91 * 0.001 * 0.998 * 0.999
≈ 0.00068045.
Hence, a Bayesian network can answer any query about the domain by using the joint distribution.
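
The calculation above, and the claim that other queries can be answered from the joint distribution, can be sketched in Python. The priors and P(True) entries are copied from the tables in this example; each P(False) is computed as 1 - P(True). The helper names pr and joint, and the extra query P(Burglary | both neighbors call), are illustrative additions rather than part of the original notes.

```python
from itertools import product

# Priors and CPT entries copied from the tables in this example.
P_B = 0.002                      # P(Burglary = True)
P_E = 0.001                      # P(Earthquake = True)
P_A = {                          # P(Alarm = True | B, E)
    (True, True): 0.94, (True, False): 0.95,
    (False, True): 0.31, (False, False): 0.001,
}
P_D = {True: 0.91, False: 0.05}  # P(David calls = True | A)
P_S = {True: 0.75, False: 0.02}  # P(Sophia calls = True | A)

def pr(p_true, value):
    """Probability that a Boolean variable takes `value`, given P(True)."""
    return p_true if value else 1.0 - p_true

def joint(d, s, a, b, e):
    """P(D=d, S=s, A=a, B=b, E=e) as a product of the network's local terms."""
    return (pr(P_D[a], d) * pr(P_S[a], s) * pr(P_A[(b, e)], a)
            * pr(P_B, b) * pr(P_E, e))

# The query worked in the text: alarm on, both call, no burglary, no earthquake.
p_query = joint(d=True, s=True, a=True, b=False, e=False)

# Sanity check: the joint summed over every assignment should be 1.
p_total = sum(joint(*values) for values in product([True, False], repeat=5))

# Another query by enumeration: P(Burglary | both neighbors call).
num = sum(joint(True, True, a, True, e) for a, e in product([True, False], repeat=2))
den = sum(joint(True, True, a, b, e) for a, b, e in product([True, False], repeat=3))
p_burglary_given_calls = num / den

print(p_query, p_total, p_burglary_given_calls)
```

Enumeration like this sums over every hidden variable, so its cost grows exponentially with the number of variables; it is practical only for small networks such as this one.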
The semantics of Bayesian Network:
There are two ways to understand the semantics of a Bayesian network:
1. To understand the network as a representation of the joint probability distribution.
This view is helpful for understanding how to construct the network.
2. To understand the network as an encoding of a collection of conditional independence
statements.
This view is helpful for designing inference procedures.

Causal networks

Causal AI aims to reason and make choices in a way closer to how humans do. It uses causality to go beyond narrow
machine learning predictions and can be directly integrated into human decision-making.
Refer to the handwritten notes for approximate and exact inference.
