
Unit III

CONJUNCTIVE NORMAL FORM (CNF) or CLAUSAL NORMAL FORM

 A formula is in conjunctive normal form (CNF) or clausal normal form if it is a conjunction of one
or more clauses, where a clause is a disjunction of literals; otherwise put, it is an AND of ORs.
 Every propositional formula can be converted into an equivalent formula that is in CNF. This
transformation is based on rules about logical equivalences: the double negative law, De
Morgan's laws, and the distributive law.
 As a normal form, it is useful in automated theorem proving
 Conjunctive Normal Form (CNF): A WFF is in CNF format when it is a conjunction of disjunctions
of literals.
 Example: (P  Q  R) (S  P  T R) (Q  S)
 A clause is the disjunction of many things. The units that make up a clause are called literals.

Steps to convert PL into CNF

1. Remove ↔ using rule:


a) α ↔ β ≡ (α → β) ∧ (β → α)
2. Remove → using rule:
a) α → β ≡ ¬α ∨ β
3. Move ¬ (negation) inwards
a) ¬(α ∨ β) ≡ ¬α ∧ ¬β
b) ¬(α ∧ β) ≡ ¬α ∨ ¬β
c) ¬(¬α) ≡ α
4. Apply the distributive and/or commutative laws
a) α ∧ (β ∨ γ) ≡ (α ∧ β) ∨ (α ∧ γ)
b) α ∨ (β ∧ γ) ≡ (α ∨ β) ∧ (α ∨ γ)
c) α ∧ β ≡ β ∧ α
d) α ∨ β ≡ β ∨ α

EXAMPLE:

P  (Q  R)
1. Eliminate, replacing α  β with (α  β)(β  α).
(P  (Q  R))  ((Q R)  P)
2. Eliminate , replacing α  β with α β.
(P  Q  R)  ((Q R)  P)
3. Move  inwards using de Morgan's rules and double-negation:
(P Q R)  ((Q R)  P)
4. Apply distributive law ( over ) and flatten:
(P  Q  R)  (Q P)  (R  P)

RESOLUTION

In propositional logic, the procedure for producing a proof by resolution of proposition P with respect to
a set of axioms F is called RESOLUTION.

Resolution for CNF – applied to a special type of wffs: conjunction of clauses.


Literal – either an atom (e.g., P) or its negation (¬P).
Clause – a disjunction of literals (e.g., P ∨ Q ∨ ¬R).
ALGORITHM: PROPOSITIONAL RESOLUTION

1. Convert all the propositions of F to clause form


2. Negate P and convert the result to clause form. Add it to the set of clauses obtained in step 1.
3. Repeat until either a contradiction is found or no progress can be made:
(a) Select two clauses. Call these the parent clauses.
(b) Resolve them together. The resulting clause, called the resolvent, will be the disjunction of
all of the literals of both of the parent clauses with the following exception: If there are any
pairs of literals L and ¬L such that one of the parent clauses contains L and the other
contains ¬L, then select one such pair and eliminate both L and ¬L from the resolvent.
(c) If the resolvent is the empty clause, then a contradiction has been found. If it is not then add
it to the set of clauses available to the procedure.
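Steps 1-3 of the algorithm can be sketched in a few lines of Python. Clauses are modeled as frozensets of string literals; the "~" prefix marking negation is a convention of this sketch, not of the text:

```python
def negate(lit):
    return lit[1:] if lit.startswith('~') else '~' + lit

def resolve(c1, c2):
    """All resolvents of two parent clauses (each a frozenset of literals)."""
    out = []
    for lit in c1:
        if negate(lit) in c2:
            # Drop the complementary pair, keep the remaining literals.
            out.append((c1 - {lit}) | (c2 - {negate(lit)}))
    return out

def resolution_refutes(kb_clauses, negated_goal_clauses):
    """Returns True iff the empty clause (a contradiction) is derived."""
    clauses = set(kb_clauses) | set(negated_goal_clauses)
    while True:
        new = set()
        for a in clauses:
            for b in clauses:
                if a == b:
                    continue
                for res in resolve(a, b):
                    if not res:          # empty clause: contradiction found
                        return True
                    new.add(res)
        if new <= clauses:               # no progress can be made
            return False
        clauses |= new

# KB: P, and P -> Q (clause ~P v Q); prove Q by refuting ~Q.
kb = [frozenset({'P'}), frozenset({'~P', 'Q'})]
print(resolution_refutes(kb, [frozenset({'~Q'})]))  # True
```

Resolving {P} with {~P, Q} yields {Q}, which resolves with {~Q} to the empty clause, so the procedure reports a contradiction and Q is proved.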

Limitations of Propositional Logic

Consider the following arguments:

 All dogs are faithful.


 Tommy is a dog.
 Therefore, Tommy is faithful.

How to represent and infer this in Propositional Logic?


Suppose P represents "all dogs are faithful"
and Q represents "Tommy is a dog".
Can we then infer P ∧ Q → "Tommy is faithful"?
No: propositional symbols are atomic, so PL gives us no way to connect the dogs mentioned in P with the individual Tommy in Q, and the inference cannot be made.

Another example,
Tom is a hardworking student.
Hardworking (Tom)
Tom is an intelligent student
Intelligent (Tom)
If Tom is hardworking and Tom is intelligent, then Tom scores high marks.
In PL,
Hardworking(Tom) ∧ Intelligent(Tom) → score_high_marks(Tom).

But what about John and Jill? We would like to write instead "All students who are hardworking and
intelligent score high marks."
Unfortunately, this statement cannot be written in Propositional Logic: PL has no variables or
quantifiers, so it cannot generalize over individuals.

FIRST ORDER LOGIC or PREDICATE LOGIC

 FOL is a generalization of PL that allows us to express and infer arguments over infinite domains, like
o All men are mortal
o Some birds cannot fly
o At least one planet has life on it
 Each sentence, or statement, is broken down into a subject and a predicate.
 In each sentence, we actually talk about something. So in the sentence “Pinky is a cat” where
“Pinky” is a subject. And we are giving property to the subject which is called predicate. So in
the same sentence “cat” is the predicate. The predicate modifies or defines the properties of
the subject. In first-order logic, a predicate can only refer to a single subject.
 A sentence in first-order logic is written in the form P(x), where P is the predicate and x is the
subject, represented as a variable. The sentence "Pinky is a cat" can thus be written in FOL as
cat(Pinky); in the more general form, cat(x) means "x is a cat".
 Sometimes the subject is not a single element but a group of objects. Example: "Every Lion
drinks coffee". Here the Universe of Discourse (UoD) is the set of lions, so whenever we use the
variable "x", it refers to one of the lions; this is the connection between "x" and the UoD.

Suppose LION = {X1, X2, X3}, where each of X1, X2, X3 is a particular element of the LION set.
Then "every Lion drinks coffee" is the same as:
drinks(X1, coffee) ∧ drinks(X2, coffee) ∧ drinks(X3, coffee)
But if there are infinitely many lions, we cannot mention each element separately. How many
subjects would be in the UoD?
Again, consider another example: "Some cats are intelligent".

Suppose CAT = {C1, C2, C3}. The statement says that at least one of C1, C2, C3 is intelligent:

Intelligent(C1) ∨ Intelligent(C2) ∨ Intelligent(C3)

If there is no cat who is intelligent, then the whole statement is FALSE. And if the UoD of
cats is infinite, it is impossible to OR the statements one by one.

 In first-order logic, a sentence can be structured using the universal quantifier (symbolized ∀) or
the existential quantifier (∃).
So for the above example,
∀x {drinks(x, coffee)}
∃x {intelligent(x)}

 Every complete “sentence” contains two parts: a “subject” and a “predicate”.


 The subject is what (or whom) the sentence is about.
 The predicate tells something about the subject.

Example: in "the sky is blue", the predicate is "is blue" and the subject is "sky"; it can be represented as Blue(sky)

Syntax of FOL:

1. Connectives
2. Quantifiers
3. Constants
4. Variables
5. Functions
6. Predicates
User defines these primitives:
Constant symbols: representing individuals in the world. E.g., Mary, 3
Function symbols map individuals to individuals. E.g., father-of(Mary) = John, color-of(Sky) = Blue
Predicate symbols map from individuals to truth values. E.g., greater(5,3), green(Grass), color(Grass,
Green)

FOL supplies these primitives:


Variable symbols. E.g., x, y
Connectives
 Same as in propositional logic: not (¬), and (∧), or (∨), implies (→), iff (↔)
Quantifiers: Universal (∀) and Existential (∃)

Sentences are built up from terms and atoms:

 A term (denoting a real-world individual) is a constant symbol, a variable symbol, or an n-place


function of n terms. For example, x and f(x1, ..., xn) are terms, where each xi is a term.
 An atom (which has value true or false) is either an n-place predicate of n terms, or, if P and Q
are atoms, then ¬P, P ∨ Q, P ∧ Q, P → Q, P ↔ Q are atoms
 A sentence is an atom, or, if P is a sentence and x is a variable, then (∀x)P and (∃x)P are
sentences
 A well-formed formula (wff) is a sentence containing no "free" variables, i.e., all variables are
"bound" by universal or existential quantifiers. E.g., (∀x) P(x,y) has x bound as a universally
quantified variable, but y is free.

INFERENCE RULES FOR FOL

Inference rules for PL apply to FOL as well.

New (sound) inference rules for use with quantifiers:

Universal Elimination

If (∀x)P(x) is true, then P(c) is true, where c is a constant in the domain of x. For example, from
(∀x)eats(Ziggy, x) we can infer eats(Ziggy, IceCream). The variable symbol can be replaced by any ground
term, i.e., any constant symbol or function symbol applied to ground terms only.
Existential Introduction

If P(c) is true, then (∃x)P(x) is inferred. For example, from eats(Ziggy, IceCream) we can infer
(∃x)eats(Ziggy, x). All instances of the given constant symbol are replaced by the new variable symbol.
Note that the variable symbol cannot already exist anywhere in the expression.

Existential Elimination

From (∃x)P(x) infer P(c). For example, from (∃x)eats(Ziggy, x) infer eats(Ziggy, Cheese). Note that the
variable is replaced by a brand new constant that does not occur in this or any other sentence in the
Knowledge Base. In other words, we don't want to accidentally draw other inferences about it by
introducing the constant. All we know is there must be some constant that makes this true, so we can
introduce a brand new one to stand in for that (unknown) constant.

Entailment

 Entailment means that one thing follows from another: KB ╞ α


 Knowledge base KB entails sentence α if and only if α is true in all worlds where KB is true E.g.,
the KB containing “the Giants won” and “the Reds won” entails “Either the Giants won or the
Reds won” E.g., x+y = 4 entails 4 = x+y
 Entailment is a relationship between sentences (i.e., syntax) that is based on semantics

Proof System

A proof is a sequence of sentences, where each sentence is either a premise or a sentence derived from
earlier sentences in the proof by one of the rules of inference.

The last sentence is the theorem (also called goal or query) that we want to prove.

Unification

Unification is the process of finding a substitution that makes two logical expressions identical. For example, we can find a substitution θ such that King(x) and Greedy(x) match King(John) and Greedy(y).

p                  q                        θ
Knows(John, x)     Knows(John, Jane)        {x/Jane}
Knows(John, x)     Knows(y, OJ)             {x/OJ, y/John}
Knows(John, x)     Knows(y, Mother(y))      {x/Mother(John), y/John}
Knows(John, x)     Knows(x, OJ)             {fail}
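The table rows can be reproduced with a standard unification sketch. Terms are modeled here as nested tuples; by this sketch's own convention (not the text's), single lowercase strings are variables and everything else is a constant or functor:

```python
def is_var(t):
    # Convention of this sketch: variables are lowercase strings (x, y, ...)
    return isinstance(t, str) and t[:1].islower()

def substitute(t, theta):
    """Apply substitution theta to term t, following binding chains."""
    if is_var(t):
        return substitute(theta[t], theta) if t in theta else t
    if isinstance(t, tuple):
        return tuple(substitute(a, theta) for a in t)
    return t

def occurs(v, t, theta):
    t = substitute(t, theta)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, theta) for a in t)

def unify(a, b, theta=None):
    """Most general unifier of terms a and b, or None on failure."""
    if theta is None:
        theta = {}
    a, b = substitute(a, theta), substitute(b, theta)
    if a == b:
        return theta
    if is_var(a):
        return None if occurs(a, b, theta) else {**theta, a: b}
    if is_var(b):
        return unify(b, a, theta)
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            theta = unify(x, y, theta)
            if theta is None:
                return None
        return theta
    return None

p = ('Knows', 'John', 'x')
print(unify(p, ('Knows', 'John', 'Jane')))        # {'x': 'Jane'}
print(unify(p, ('Knows', 'y', 'OJ')))             # y -> John, x -> OJ
print(unify(p, ('Knows', 'y', ('Mother', 'y'))))  # y -> John, x -> Mother(John)
print(unify(p, ('Knows', 'x', 'OJ')))             # None: x cannot be both John and OJ
```

The last row fails for the same reason as in the table: x would have to be bound to John and to OJ at the same time.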

Convert the following sentences into FOL:

Every gardener likes the sun.

∀x gardener(x) → likes(x, Sun)
You can fool some of the people all of the time.
∃x ∀t (person(x) ∧ time(t) → can-fool(x, t))
You can fool all of the people some of the time. (two ways)
∀x ∃t (person(x) ∧ time(t) ∧ can-fool(x, t))
∀x (person(x) → ∃t (time(t) ∧ can-fool(x, t)))
All purple mushrooms are poisonous.
∀x (mushroom(x) ∧ purple(x)) → poisonous(x)
No purple mushroom is poisonous. (two ways)
¬∃x (purple(x) ∧ mushroom(x) ∧ poisonous(x))
∀x (mushroom(x) ∧ purple(x)) → ¬poisonous(x)
There are exactly two purple mushrooms.
∃x ∃y mushroom(x) ∧ purple(x) ∧ mushroom(y) ∧ purple(y) ∧ ¬(x=y) ∧ ∀z (mushroom(z) ∧
purple(z)) → ((x=z) ∨ (y=z))
Bush is not tall.
¬tall(Bush)
X is above Y iff X is directly on top of Y or there is a pile of one or more other objects directly on top
of one another starting with X and ending with Y.
∀x ∀y above(x, y) ↔ (on(x, y) ∨ ∃z (on(x, z) ∧ above(z, y)))

CONJUNCTIVE NORMAL FORM

FOL to CNF: First Order Logic Conversion to CNF

1. Eliminate bi-conditionals and implications:


 Eliminate ⇔, replacing α ⇔ β with (α ⇒ β) ∧ (β ⇒ α).
 Eliminate ⇒, replacing α ⇒ β with ¬α ∨ β.
2. Move ¬ inwards:
 ¬(∀ x p) ≡ ∃ x ¬p,
 ¬(∃ x p) ≡ ∀ x ¬p,
 ¬(α ∨ β) ≡ ¬α ∧ ¬β,
 ¬(α ∧ β) ≡ ¬α ∨ ¬β,
 ¬¬α ≡ α.
3. Standardize variables apart by renaming them: each quantifier should use a different variable.
4. Skolemize: each existential variable is replaced by a Skolem constant or Skolem function of the
enclosing universally quantified variables.
 For instance, ∃x Rich(x) becomes Rich(G1) where G1 is a new Skolem constant.
 “Everyone has a heart” ∀ x Person(x) ⇒ ∃ y Heart(y) ∧ Has(x, y) becomes ∀ x Person(x) ⇒
Heart(H(x)) ∧ Has(x, H(x)), where H is a new symbol (Skolem function).
5. Drop universal quantifiers
 For instance, ∀ x Person(x) becomes Person(x).
6. Distribute ∨ over ∧:
 (α ∧ β) ∨ γ ≡ (α ∨ γ) ∧ (β ∨ γ).

Exercise:

1. Convert “Everybody who loves all animals is loved by someone” to CNF

(∀x)[(∀y){Animal(y) → Loves(x,y)} → (∃y) Loves(y,x)]

Solution:

1. Eliminate implications: ∀x [¬∀y (¬Animal(y) ∨ Loves(x, y))] ∨ [∃y Loves(y, x)]


2. Move ¬ inwards
a) ∀x [∃y ¬(¬Animal(y) ∨ Loves(x, y))] ∨ [∃y Loves(y, x)]
b) ∀x [∃y (¬¬Animal(y) ∧ ¬Loves(x, y))] ∨ [∃y Loves(y, x)] (De Morgan)
c) ∀x [∃y (Animal(y) ∧ ¬Loves(x, y))] ∨ [∃y Loves(y, x)] (double negation)
3. Standardize variables: ∀x [∃y (Animal(y) ∧ ¬Loves(x, y))] ∨ [∃z Loves(z, x)]
4. Skolemization: ∀x [Animal(F(x)) ∧ ¬Loves(x, F(x))] ∨ [Loves(G(x), x)]
5. Drop universal quantifiers: [Animal(F(x)) ∧ ¬Loves(x, F(x))] ∨ [Loves(G(x), x)]
6. Distribute ∨ over ∧: [Animal(F(x)) ∨ Loves(G(x), x)] ∧ [¬Loves(x, F(x)) ∨ Loves(G(x), x)]

RESOLUTION IN PREDICATE LOGIC

Two literals are contradictory if one can be unified with the negation of the other. For example, man(x)
and ¬man(Himalayas) are contradictory, since man(x) and man(Himalayas) can be unified. In predicate
logic, the unification algorithm is used to locate pairs of literals that cancel out. It is important that if two
instances of the same variable occur, they must be given identical substitutions. The resolution
algorithm for predicate logic is as follows.

Let F be the set of given statements and S the statement to be proved.

1. Convert all the statements of F to clause form.


2. Negate S and convert the result to clause form. Add it to the set of clauses obtained in 1.
3. Repeat until either a contradiction is found or no progress can be made or a predetermined
amount of effort has been expended.
a. Select two clauses. Call them parent clauses.
b. Resolve them together. The resolvent will be the disjunction of all of the literals of
both clauses. If there is a pair of literals T1 and ¬T2 such that one parent clause contains
T1 and the other contains ¬T2, and T1 and T2 are unifiable, then neither T1 nor ¬T2 should
appear in the resolvent. Here T1 and ¬T2 are called complementary literals.
c. If the resolvent is the empty clause, then a contradiction has been found. If it is not,
then add it to the set of clauses available to the procedure.

Resolution Example: Anyone passing his history exams and winning the lottery is happy. But anyone who
studies or is lucky can pass all his exams. John did not study but John is lucky. Anyone who is lucky wins
the lottery. Is John happy?

Example 1:

Convert the following sentence into predicate logic and then prove "Is someone smiling?” using
resolution:
1. All people who are graduating are happy
2. All happy people smile
3. Someone is graduating

FOL:
 ∀x {graduating(x) → happy(x)}
 ∀x {happy(x) → smile(x)}
 ∃x {graduating(x)}
 Prove: ∃x {smile(x)}

Negate the conclusion: ¬∃w {smile(w)}

FOL to CNF:
Step 1:
 ∀x {¬graduating(x) ∨ happy(x)}
 ∀x {¬happy(x) ∨ smile(x)}
 ∃x {graduating(x)}
 ¬∃w {smile(w)}
Step 2: standardize variables
 ∀x {¬graduating(x) ∨ happy(x)}
 ∀y {¬happy(y) ∨ smile(y)}
 ∃z {graduating(z)}
 ∀w {¬smile(w)}
Step 3: Skolemization
 ∀x {¬graduating(x) ∨ happy(x)}
 ∀y {¬happy(y) ∨ smile(y)}
 graduating(A)
 ∀w {¬smile(w)}
Step 4: drop universal quantifiers
 ¬graduating(x) ∨ happy(x)
 ¬happy(y) ∨ smile(y)
 graduating(A)
 ¬smile(w)

Resolution tree
 If a fact ‘F’ is to be proved, start with ‘¬F’
 Resolve it against the other rules in the KB
 The process stops when it returns the NULL clause.

Example 2:

Convert the following sentence into predicate logic and then prove "Was Marcus loyal to Caesar?” using
resolution:
1. Marcus was a man.
2. Marcus was a Pompeian.
3. All Pompeians were Romans.
4. Caesar was a ruler.
5. All Romans were either loyal to Caesar or hated him.
6. Everyone is loyal to someone.
7. People only try to assassinate rulers they are not loyal to.
8. Marcus tried to assassinate Caesar.
Solution:

The facts described by these sentences can be represented as a set of wff's in predicate logic as follows:
1. Marcus was a man.
man(Marcus)
2. Marcus was a Pompeian.
Pompeian(Marcus)
3. All Pompeians were Romans.
∀x: Pompeian(x) → Roman(x)
4. Caesar was a ruler.
ruler(Caesar)
5. All Romans were either loyal to Caesar or hated him.
∀x: Roman(x) → loyalto(x, Caesar) ∨ hate(x, Caesar)
6. Everyone is loyal to someone.
∀x: ∃y: loyalto(x, y)
7. People only try to assassinate rulers they are not loyal to.
∀x: ∀y: person(x) ∧ ruler(y) ∧ tryassassinate(x, y) → ¬loyalto(x, y)
8. Marcus tried to assassinate Caesar.
tryassassinate (Marcus, Caesar)

Additional: ∀x: man(x) → person(x)


∀x: ¬man(x) ∨ person(x)
9. ¬man(x) ∨ person(x)

CNF:
o man(Marcus)
o Pompeian(Marcus)
o ¬Pompeian(x1) ∨ Roman(x1)
o ruler(Caesar)
o ¬Roman(x2) ∨ loyalto(x2, Caesar) ∨ hate(x2, Caesar)
o loyalto(x3, S1(x3))
o ¬person(x4) ∨ ¬ruler(y1) ∨ ¬tryassassinate(x4, y1) ∨ ¬loyalto(x4, y1)
o tryassassinate(Marcus, Caesar)

Resolution tree:
1. Convert the following sentences to FOPL
 Jack owns a dog
 Every dog owner is an animal lover.
 No animal lover kills an animal.
 Either Jack or Curiosity killed the cat, who is named Tuna
Also prove by resolution: Did Curiosity kill the cat?
a. (x) Dog(x)  Owns(Jack,x)
b. (x) ((y) Dog(y)  Owns(x, y))  AnimalLover(x)
c. (x) AnimalLover(x)  ((y) Animal(y)  Kills(x,y))
d. Kills(Jack,Tuna)  Kills(Curiosity,Tuna)
e. Cat(Tuna)
f. (x) Cat(x)  Animal(x)
g. Kills(Curiosity, Tuna)  GOAL

2. Translate the following sentences into first order logic.


a. All dogs are mammals
b. Fido is a dog
c. Fido is a mammal
d. All mammals produce milk

3. Using propositional linear resolution, show that the following propositional sentence is
unsatisfiable. Convert the sentence to clause form and derive the empty clause using resolution.
4. Represent the following sentence in predicate form: "All the children like sweets".
5. Represent the following sentences in first-order logic, using a consistent vocabulary (which you
must define):
a. Not all students take both History and Biology.
b. Only one student failed History.
c. Only one student failed both History and Biology.
d. The best score in History was better than the best score in Biology.
e. Every person who dislikes all vegetarians is smart.
Practice Exercise -------

f. No person likes a smart vegetarian.


g. There is a woman who likes all men who are not vegetarians.
h. There is a barber who shaves all men in town who do not shave themselves.
i. No person likes a professor unless the professor is smart.
j. Politicians can fool some of the people all of the time, and they can fool all of the people
some of the time, but they can't fool all of the people all of the time.
Give a predicate calculus sentence such that every world in which it is true contains exactly one
object.
6. Represent the sentence "All Germans speak the same languages" in predicate calculus. Use
speaks(x, l), meaning that person ‘x’ speaks language ‘l’.
7.

Solution:
FOL:
a) ∀x food(x) → likes(John, x)
b) food(Apple)
c) food(Chicken)
d) ∀x ∀y [ eats(y, x) ∧ ¬killed(y, x) → food(x) ]
e) eats(Bill, peanuts) ∧ alive(Bill, peanuts)
f) ∀x eats(Bill, x) → eats(Sue, x)
Clausal Form:
a. ¬food(x) ∨ likes(John, x)
b. food(Apple)
c. food(Chicken)
d. ¬eats(y, x) ∨ killed(y, x) ∨ food(x)
e. eats(Bill, peanuts)
f. ¬killed(Bill, peanuts)
g. ¬eats(Bill, x) ∨ eats(Sue, x)

Resolve with Sub-goal


¬likes(John, peanuts)
1 ¬food(peanuts)
4 ¬eats(y, peanuts) ∨ killed(y, peanuts)
5b ¬eats(Bill, peanuts)
5a Null clause

Using backward chaining:


8.

OR

Solution:

9. Consider the facts:


a) Anyone whom Mary loves is a football star.
b) Any student who does not pass does not play.
c) John is a student.
d) Any student who does not study does not pass.
e) Anyone who does not play is not a football star.
Prove using resolution “If John does not study, then, Mary does not love John”.
Predicate logic:
a) ∀ x (LOVES(Mary,x) → STAR(x))
b) ∀ x (STUDENT(x) ∧ ¬ PASS(x) → ¬ PLAY(x))
c) STUDENT(John)
d) ∀ x (STUDENT(x) ∧ ¬ STUDY(x) → ¬ PASS(x))
e) ∀ x (¬ PLAY(x) → ¬ STAR(x))
Clause form:
a) ¬LOVES(Mary,x) ∨ STAR(x)
b) ¬STUDENT(x) ∨ PASS(x) ∨ ¬PLAY(x)
c) STUDENT(John)
d) ¬STUDENT(x) ∨ STUDY(x) ∨ ¬ PASS(x)
e) PLAY(x) ∨ ¬STAR(x)

Resolve with the negated sub-goal

The goal "If John does not study, then Mary does not love John" is ¬STUDY(John) → ¬LOVES(Mary, John).
Its negation is ¬STUDY(John) ∧ LOVES(Mary, John), which yields the two clauses ¬STUDY(John) and
LOVES(Mary, John).
LOVES(Mary, John) with (a): STAR(John)
with (e): PLAY(John)
with (b): ¬STUDENT(John) ∨ PASS(John)
with (c): PASS(John)
with (d): ¬STUDENT(John) ∨ STUDY(John)
with (c): STUDY(John)
with ¬STUDY(John): Null clause
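Because every literal in this proof grounds to John, the clause set can be checked propositionally. A small pure-Python sketch that confirms the clauses (a)-(e) plus the negated goal are unsatisfiable by enumerating all truth assignments, which is exactly what deriving the null clause establishes:

```python
from itertools import product

atoms = ['LOVES', 'STAR', 'STUDENT', 'PASS', 'PLAY', 'STUDY']

# Clauses (a)-(e) instantiated with x = John, plus the negated goal as two
# unit clauses: ~STUDY(John) and LOVES(Mary, John).
# Each literal is (atom, sign): sign True for positive, False for negated.
clauses = [
    [('LOVES', False), ('STAR', True)],                      # a
    [('STUDENT', False), ('PASS', True), ('PLAY', False)],   # b
    [('STUDENT', True)],                                     # c
    [('STUDENT', False), ('STUDY', True), ('PASS', False)],  # d
    [('PLAY', True), ('STAR', False)],                       # e
    [('STUDY', False)],                                      # negated goal, part 1
    [('LOVES', True)],                                       # negated goal, part 2
]

def satisfied(assignment):
    return all(any(assignment[a] == sign for a, sign in c) for c in clauses)

# No assignment satisfies every clause, so the set is unsatisfiable and
# resolution must be able to derive the null clause.
unsat = not any(satisfied(dict(zip(atoms, vals)))
                for vals in product([False, True], repeat=len(atoms)))
print(unsat)  # True
```

Dropping either half of the negated goal makes the set satisfiable again, which is why both clauses are needed in the refutation.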

INFERENCE RULES

Deductive inference rule:

Forward Chaining: Conclude from "A" and "A implies B" to "B".
A
A → B
--------
B
Example:
It is raining.
If it is raining, the street is wet.
--------
The street is wet.

Forward chaining starts with the available data and uses inference rules to extract more data until the
goal is reached. Also called data-driven knowledge extraction.
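The rain/street example can be sketched as data-driven forward chaining; the rule set below is illustrative, not from the text:

```python
def forward_chain(facts, rules):
    """Repeatedly fire rules whose premises all hold until no new fact appears."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if set(premises) <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

# Each rule is (premises, conclusion): "if all premises hold, conclude."
rules = [(['raining'], 'street_wet'),
         (['street_wet'], 'slippery')]

print(sorted(forward_chain(['raining'], rules)))
# ['raining', 'slippery', 'street_wet']
```

Starting from the single datum "raining", the loop keeps extracting new facts until the fact base is saturated, which is the data-driven behavior described above.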

Abductive inference rule:

Backward Chaining: Conclude from "B" and "A implies B" to "A".
B
A → B
--------
A
Example:
The street is wet.
If it is raining, the street is wet.
--------
It is raining.

Backward chaining is done in the backward direction. The system selects a goal state and rules whose
then portion has the goal state as conclusion. It establishes sub-goals to be satisfied for the goal state to
be true. Also called goal-driven knowledge extraction.
PROBABILITY THEORY

Conditional probability, P(A|B), indicates the probability of event A given that we know event B has
occurred.

Syntax and semantics of graphical models for representing probability distributions:


– Bayesian networks (built on top of directed graphs)
– Markov networks (built on top of undirected graphs)

Independent events

Two events are independent when knowing that one occurred does not change the probability of the
other.
 Events A and B are independent if
P(A ∩ B) = P(A) · P(B).

Conditional probability

Let A and B be events, with P(B) > 0.


 The conditional probability P(A|B) of A given B is given by
P(A|B) = P(A ∩ B) / P(B)

 Notice: If A and B are independent, then

P(A|B) = P(A ∩ B) / P(B) = (P(A) · P(B)) / P(B) = P(A)
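Both definitions can be verified exactly by enumeration. A small sketch using two fair dice (the particular events A and B below are an illustration, not from the text):

```python
from fractions import Fraction
from itertools import product

omega = list(product(range(1, 7), repeat=2))   # all 36 rolls of two dice

def prob(event):
    # Exact probability as a fraction: favorable outcomes / total outcomes.
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

A = lambda w: w[0] % 2 == 0        # the first die is even
B = lambda w: w[0] + w[1] == 7     # the sum is 7

p_a, p_b = prob(A), prob(B)
p_ab = prob(lambda w: A(w) and B(w))
p_a_given_b = p_ab / p_b           # definition of conditional probability

print(p_a, p_b, p_ab)              # 1/2 1/6 1/12
print(p_ab == p_a * p_b)           # True: A and B are independent
print(p_a_given_b == p_a)          # True: conditioning on B leaves P(A) unchanged
```

Here P(A ∩ B) = 3/36 = 1/12 equals P(A) · P(B) = 1/2 · 1/6, so the independence identity and the conditional-probability identity both hold exactly.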

Probabilistic Reasoning based on Bayes’ Theorem:

 In probability theory, Bayes’ theorem (alternatively Bayes’ law or Bayes' rule) describes the
probability of an event, based on prior knowledge of conditions that might be related to the
event.
 Bayes’ theorem is named after Reverend Thomas Bayes.
 Bayes’ theorem is stated mathematically as the following equation:
P(A|B) = P(B|A) · P(A) / P(B)

EXAMPLE:

1. At a certain university, 4% of men are over 6 feet tall and 1% of women are over 6 feet tall. The
total student population is divided in the ratio 3:2 in favour of women. If a student is selected
at random from among all those over six feet tall, what is the probability that the student is a
woman?
2. A factory production line is manufacturing bolts using three machines, A, B and C. Of the total
output, machine A is responsible for 25%, machine B for 35% and machine C for the rest. It is
known from previous experience with the machines that 5% of the output from machine A is
defective, 4% from machine B and 2% from machine C. A bolt is chosen at random from the
production line and found to be defective. What is the probability that it came from (a) machine
A (b) machine B (c) machine C?

3. Machines A and B produce 10% and 90% respectively of the production of a component
intended for the motor industry. From experience, it is known that the probability that machine
A produces a defective component is 0.01 while the probability that machine B produces a
defective component is 0.05. If a component is selected at random from a day’s production and
is found to be defective, find the probability that it was made by (a) machine A; (b) machine B.
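All three exercises follow the same pattern: posterior = prior × likelihood, normalized by the total probability of the evidence. A sketch that computes the answers; the numeric results are derived from the figures stated in the problems:

```python
def bayes(priors, likelihoods):
    """Posteriors P(H_i | E) from priors P(H_i) and likelihoods P(E | H_i)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)                  # P(E) by the law of total probability
    return [j / total for j in joint]

# Problem 1: hypotheses {woman, man}; evidence = taller than 6 feet.
post = bayes([0.6, 0.4], [0.01, 0.04])
print(round(post[0], 3))                # P(woman | tall) = 0.006/0.022 -> 0.273

# Problem 2: machines {A, B, C}; evidence = bolt is defective.
post = bayes([0.25, 0.35, 0.40], [0.05, 0.04, 0.02])
print([round(p, 3) for p in post])      # [0.362, 0.406, 0.232]

# Problem 3: machines {A, B}; evidence = component is defective.
post = bayes([0.1, 0.9], [0.01, 0.05])
print([round(p, 3) for p in post])      # [0.022, 0.978]
```

Note how the denominator in each case is the total probability of the evidence (e.g., 0.022 in Problem 1), so the posteriors over the hypotheses always sum to 1.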
UTILITY THEORY

 The main idea of Utility Theory is: an agent's preferences over possible outcomes can be
captured by a function that maps these outcomes to a real number; the higher the number the
more that agent likes that outcome. The function is called a utility function.
 Utility Theory uses the notion of Expected Utility (EU) as a value that represents the average
utility of all possible outcomes of a state, weighted by the probability that the outcome occurs.
 The agent can use probability theory to reason about uncertainty, and utility theory for the
rational selection of actions based on preferences. Decision theory is a general theory that
combines the two:
Decision theory = Probability theory + Utility theory
 The other key concept of Utility Theory is the principle of Maximum Expected Utility (MEU),
which states that a rational agent should choose the action that maximizes the agent’s expected
utility

HIDDEN MARKOV MODEL (HMM)

 In probability theory, a Markov model is a stochastic model used to model randomly changing
systems. It is assumed that future states depend only on the current state, not on the events
that occurred before it.
 A Markov chain is a sequence of states such that the (n+1)th state is independent of all previous
states given the nth state. That is, the distribution of the next state can be predicted from the
current state alone, without knowing how the chain reached the current state.
 A hidden Markov model is a Markov chain for which the state is only partially observable. In
other words, observations are related to the state of the system, but they are typically
insufficient to precisely determine the state.
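A minimal sketch of the Markov property; the two-state weather chain and its transition probabilities below are illustrative assumptions, not from the text:

```python
# Transition matrix T[current][next] = P(next | current); numbers illustrative.
T = {
    'sunny': {'sunny': 0.8, 'rainy': 0.2},
    'rainy': {'sunny': 0.4, 'rainy': 0.6},
}

def step(dist):
    """One step of the chain: push a distribution over states through T."""
    out = {s: 0.0 for s in T}
    for s, p in dist.items():
        for t, q in T[s].items():
            out[t] += p * q
    return out

# Markov property: the prediction uses only the current state, not the history.
d = {'sunny': 1.0, 'rainy': 0.0}
d = step(d)
print(d)  # {'sunny': 0.8, 'rainy': 0.2}
print({s: round(p, 2) for s, p in step(d).items()})  # two steps: {'sunny': 0.72, 'rainy': 0.28}
```

In a hidden Markov model the state itself would not be printed; instead, each hidden state would emit an observation according to a second probability table, and only the observations would be visible.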

Note: The main weakness of Markov networks is their inability to represent induced and non-transitive
dependencies; two independent variables will be directly connected by an edge, merely because some
other variable depends on both. As a result, many useful independencies go unrepresented in the
network. To overcome this deficiency, Bayesian networks use the richer language of directed graphs,
where the directions of the arrows permit us to distinguish genuine dependencies from spurious
dependencies induced by hypothetical observations.
BAYESIAN NETWORK

A probabilistic graphical model is called a Bayesian network when the underlying graph is directed and a
Markov network/Markov random field when the underlying graph is undirected.

A Bayesian network is a DAG:


 Each node corresponds to a random variable
 Directed edges link nodes
 Edge goes from parent to child
 Each node has a conditional probability dist.
 P(Xi | Parents(Xi) )
 Topology of network represents causality

Example: The Alarm Problem by Pearl 1990

You have a new burglar alarm installed. It is reliable at detecting burglary, but it also responds to minor
earthquakes. Two neighbors (John, Mary) promise to call you at work when they hear the alarm. John
always calls when he hears the alarm, but sometimes confuses the alarm with the phone ringing (and
calls then also). Mary likes loud music and sometimes misses the alarm! Given evidence about who has
and hasn’t called, estimate the probability of a burglary.

Represent problem using 5 binary variables:


 B = a burglary occurs at your house
 E = an earthquake occurs at your house
 A = the alarm goes off
 J = John calls to report the alarm
 M = Mary calls to report the alarm

Bayesian networks (or belief networks) are a graphical and compact way to represent uncertain
knowledge, based on this idea.

Constructing the Bayesian Network

1. Order the variables in terms of causality (may be a partial order)


e.g., {E, B} -> {A} -> {J, M}
2. Use these assumptions to create the graph structure of the Bayesian network

3. Fill in Conditional Probability Table (CPT)


a. One for each node
b. 2^p entries, where p is the number of parents
Note: Where do these probabilities come from: expert knowledge or data (relative frequency
estimates)
A Bayesian network is a directed acyclic graph (DAG) with the following components:

1. A set of nodes, one for each random variable of the “world” represented
2. A set of directed arcs connecting nodes
a. If there is an arc from X to Y, we say that X is a parent of Y (parents(X) denotes
the set of parent variables of X)
b. From the concept of parent variable, we define also the concepts of ancestors and
descendants of a random variable.
3. Each node Xi has an associated conditional probability table (CPT) P(Xi | parents(Xi))
a. If Xi is Boolean, we usually omit the probability of the false value.

General form of Bayesian Network:

An example: the probability that the alarm has gone off and both John and Mary call the police, while
neither a burglary nor an earthquake has happened, is:

P(J, M, A, ¬B, ¬E) = P(J|A)P(M|A)P(A|¬B, ¬E)P(¬B)P(¬E) = 0.9 × 0.7 × 0.001 × 0.999 × 0.998 = 0.00062
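This factored computation can be reproduced directly from the CPTs. The five probabilities used in the product match the text; the remaining CPT entries below follow the standard alarm example and are assumptions included only for completeness:

```python
# CPT entries for the alarm network. Only the five values used in the product
# below appear in the text; the rest are standard-alarm-example assumptions.
P_B = {True: 0.001, False: 0.999}
P_E = {True: 0.002, False: 0.998}
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(A=true | B, E)
P_J = {True: 0.90, False: 0.05}                      # P(J=true | A)
P_M = {True: 0.70, False: 0.01}                      # P(M=true | A)

def joint(j, m, a, b, e):
    """P(J, M, A, B, E) = P(J|A) P(M|A) P(A|B,E) P(B) P(E)."""
    pa = P_A[(b, e)] if a else 1 - P_A[(b, e)]
    pj = P_J[a] if j else 1 - P_J[a]
    pm = P_M[a] if m else 1 - P_M[a]
    return pj * pm * pa * P_B[b] * P_E[e]

# P(J, M, A, ~B, ~E) = 0.9 * 0.7 * 0.001 * 0.999 * 0.998, approx 0.00062.
print(round(joint(True, True, True, False, False), 5))
```

The same `joint` function, summed over the unobserved variables, is what an exact-inference query such as P(burglary | john, mary) would be built on.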

For example, the probability of a Burglary, knowing that both John and Mary called the police i.e.,
compute P(burglary|john, mary)

Canonical models of a Bayes Network


A Bayesian network lets you state conditional independencies such as: if you know the state of the
sprinkler (on or off) and the state of the rain (rainy or not), then knowing whether it is cloudy does not
affect the probability that the grass is wet.

