MCA AI
MIT-13
ARTIFICIAL INTELLIGENCE
Advisory Committee
1. Dr. Jayant Sonwalkar, Hon'ble Vice Chancellor, Madhya Pradesh Bhoj (Open) University, Bhopal (M.P.)
2. Dr. L.S. Solanki, Registrar, Madhya Pradesh Bhoj (Open) University, Bhopal (M.P.)
3. Dr. Kishor John, Director, Madhya Pradesh Bhoj (Open) University, Bhopal (M.P.)
4. Dr. Sharad Gangele, Professor, R.K.D.F. University, Bhopal (M.P.)
5. Dr. Romsha Sharma, Professor, Sri Sathya Sai College for Women, Bhopal (M.P.)
6. Dr. K. Mani Kandan Nair, Department of Computer Science, Makhanlal Chaturvedi National University of Journalism and Communication, Bhopal (M.P.)
COURSE WRITERS
Dr. Angajala Srinivasa Rao, Professor and Principal, Computer Science and Engineering, Nova College of Engineering and Technology, Ibrahimpatnam, Andhra Pradesh
Units (1.0-1.3, 2.4-2.4.1, 2.9-2.14)
Dr. Preety Khatri, Assistant Professor, Computer Science, S.O.I.T., I.M.S., Noida
Units (1.3.1-1.9, 2.4.2-2.8, 3, 4, 5)
All rights reserved. No part of this publication which is material protected by this copyright notice
may be reproduced or transmitted or utilized or stored in any form or by any means now known or
hereinafter invented, electronic, digital or mechanical, including photocopying, scanning, recording
or by any information storage or retrieval system, without prior written permission from the Registrar,
Madhya Pradesh Bhoj (Open) University, Bhopal.
Information contained in this book has been published by VIKAS® Publishing House Pvt. Ltd. and has been obtained by its Authors from sources believed to be reliable and correct to the best of their knowledge. However, Madhya Pradesh Bhoj (Open) University, Bhopal, the Publisher and its Authors shall in no event be liable for any errors, omissions or damages arising out of use of this information and specifically disclaim any implied warranties of merchantability or fitness for any particular use.
Unit – I
What is Artificial Intelligence, Artificial Intelligence: An Introduction, AI Problems, The Underlying Assumption, AI Techniques, Games, Theorem Proving, Natural Language Processing, Vision Processing, Speech Processing, Robotics, Expert System, Search Knowledge, Abstraction.
Unit-1: Basics of Artificial Intelligence (Pages 3-47)
Unit – II
Problem, Problem Space and Search: Defining the Problem as a State Space, Production Systems, Heuristic Search, Heuristic Search Techniques, Best-First Search, Branch-and-Bound, Problem Reduction, Constraint Satisfaction, Means-End Analysis. Knowledge Representation, Representation and Mapping, Approaches to Knowledge Representation, Issues in Knowledge Representation, The Frame Problem.
Unit-2: Problem Space, Search and Knowledge Representation (Pages 49-130)
Unit – III
Predicate Logic, Representing Simple Facts in Logic, Representing Instance and Isa Relationships, Modus Ponens, Resolution, Natural Deduction, Dependency-Directed Backtracking, Rule Based Systems, Procedural versus Declarative Knowledge, Forward versus Backward Reasoning, Matching, Conflict Resolution, Use of Non Back Track.
Unit-3: Predicate Logic and Rule Based System (Pages 131-163)
Unit – IV
Structured Knowledge Representation: Semantic Nets, Frames, Slots, Exceptions, Slot-Values as Objects, Handling Uncertainties, Probabilistic Reasoning, Use of Certainty Factor, Fuzzy Logic.
Unit-4: Structured Knowledge Representation and Semantic Net (Pages 165-186)
Unit – V
Learning, Concept of Learning, Rote Learning, Learning by Taking Advice, Learning in Problem Solving, Learning by Induction, Explanation-Based Learning, Learning Automation, Learning in Neural Networks, Expert Systems, Need and Justification of Expert Systems, MYCIN, Representing and Using Domain Knowledge, R1.
Unit-5: Learning and Expert Systems (Pages 187-219)
INTRODUCTION
Artificial Intelligence (AI) is the realm of computer science that emphasizes the creation of machines that can engage in behaviour humans consider intelligent. Researchers can now create systems that can mimic human thought, understand speech and perform various feats that were considered impossible earlier.
impossible earlier. The term ‘Artificial Intelligence’ was coined by John McCarthy
in 1956 and he defined it as ‘the science and engineering of making intelligent
machines’. AI has become an essential part of the technology industry now, and
scholars are trying to delve deeper into the field in order to provide solutions for
various problems that require the intervention of machines. The central problems
of AI include reasoning, knowledge, planning, learning, communication, perception
and the ability to move and manipulate objects.
Technically, Artificial Intelligence (AI) is a technique that helps to create
software programs to make computers perform operations that require human
intelligence. Currently, AI is being used in various application areas, such as intelligent
game playing, natural language processing, vision processing and speech processing.
In AI, a large amount of knowledge is required to solve problems, such as natural
language understanding and generation. To represent knowledge in AI, predicate
logic is used, which helps represent simple facts. Artificial intelligence is also referred
to as computational intelligence. Although the progress made in the field of AI is
just a fraction of the computer revolution, AI has certainly helped to enhance the
quality of life. The best way to explain AI to a layman is to say that it helps make computers 'behave' intelligently like human beings. As a result, we now
have systems that can monitor work in various production plants; or machines that
can understand instructions and can be easily controlled by human beings. AI has
also helped create computer programs that can not only play chess but also defeat
world champions at the game. However, it remains to be seen whether AI systems
can become fast, efficient and intelligent enough to completely take the place of
the human mind in any situation.
This book, Artificial Intelligence, follows the SIM format wherein each
Unit begins with an Introduction to the topic followed by an outline of the
‘Objectives’. The detailed content is then presented in a simple and an organized
manner, interspersed with Answers to ‘Check Your Progress’ questions to test
the understanding of the students. A ‘Summary’ along with a list of ‘Key Terms’
and a set of ‘Self-Assessment Questions and Exercises’ is also provided at the
end of each unit for effective recapitulation.
Self - Learning
Material 1
UNIT 1 BASICS OF ARTIFICIAL INTELLIGENCE
Structure
1.0 Introduction
1.1 Objectives
1.2 Basic Concepts of Artificial Intelligence
1.2.1 Underlying Assumption of AI
1.2.2 AI Techniques
1.3 AI Problems
1.3.1 Theorem Proving through AI
1.4 Application Areas of AI
1.4.1 Games
1.4.2 Natural Language Processing
1.4.3 Vision Processing
1.4.4 Speech Processing
1.4.5 Robotics
1.4.6 Expert Systems
1.4.7 Search Knowledge
1.4.8 Abstraction
1.5 Answers to ‘Check Your Progress’
1.6 Summary
1.7 Key Terms
1.8 Self-Assessment Questions and Exercises
1.9 Further Reading
1.0 INTRODUCTION
The term ‘Artificial Intelligence’ was coined by John McCarthy in 1956. This term
is used to describe the ‘Intelligence’ demonstrated by a system. It plays a key role
in problem solving. Production systems help in searching for a solution to a problem.
Certain heuristic algorithms have been developed to solve a problem within the
scheduled time and space. Artificial Intelligence (AI) involves the task of creating
intelligent computers that can perform activities similar to those performed by a
human being, but more efficiently. The main objective of AI is to create an
information processing theory, which can help develop intelligent computers.
Currently, AI is used in various areas, such as games, natural language processing,
vision processing, speech processing, robotics and expert system. Banks use
software systems that are created using AI to organize operations, invest in stocks
and manage property.
A natural language is a language which is written and spoken by human
beings for communication. Natural languages are different from computer
programming languages because they have evolved naturally while computer
programming languages have been developed by human beings. There are basically two systems: the natural language generation system and the natural language understanding system. Natural language generation involves conversion of information from
computer databases into normal human language. The rule-based system is the most used form of artificial intelligence in industry; also known as the expert system or production system, it has immense importance in the building of knowledge systems. In these systems, the domain expertise is encoded in the form of 'if–then' rules. This enables a modular portrayal of the knowledge, which facilitates its updating and maintenance.
In this unit, you will learn about the basic concept of artificial intelligence,
underlying assumptions of AI, AI techniques, AI problems, games, natural language
processing, vision processing, speech processing, robotics, expert systems, search
knowledge and abstraction.
1.1 OBJECTIVES
After going through this unit, you will be able to:
Understand the basic concept of Artificial Intelligence (AI)
Learn about underlying assumption of AI
Analyse AI techniques
Discuss the AI problems
Explain the application areas of AI
The Second Approach
The structure of the data is as before, but we use 2 for a blank, 3 for an X and 5 for an O. A variable called TURN indicates the move number: 1 for the first move and 9 for the last. The algorithm consists of three procedures:
(i) MAKE2 which returns 5 if the centre square is blank; otherwise it returns
any blank non-corner square, i.e., 2, 4, 6 or 8.
(ii) POSSWIN (p) returns 0 if player p cannot win on the next move; otherwise it returns the number of the square that gives a winning move. It checks each line by computing the product of its three squares: a product of 3 × 3 × 2 = 18 means a win is available for X, a product of 5 × 5 × 2 = 50 means a win is available for O, and the winning move is the square holding the blank.
(iii) GO (n) makes a move to square n setting BOARD[n] to 3 or 5.
This algorithm is more involved and takes longer, but it is more efficient in storage, which compensates for its longer running time. Its playing strength still depends on the programmer's skill.
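The three procedures above can be sketched in Python. This is a hedged illustration: the 1..9 board numbering, the 2/3/5 coding and the names MAKE2, POSSWIN and GO follow the text, while the list of winning lines is the standard tic-tac-toe set.

```python
# Board squares are numbered 1..9; 2 = blank, 3 = X, 5 = O, so a line's
# product identifies its contents (18 means X can win, 50 means O can win).

LINES = [(1, 2, 3), (4, 5, 6), (7, 8, 9),   # rows
         (1, 4, 7), (2, 5, 8), (3, 6, 9),   # columns
         (1, 5, 9), (3, 5, 7)]              # diagonals

def make2(board):
    """Return the centre square (5) if blank, else any blank non-corner square."""
    if board[5] == 2:
        return 5
    for n in (2, 4, 6, 8):
        if board[n] == 2:
            return n
    return 0

def posswin(board, p):
    """Return the square where player p (3 for X, 5 for O) wins next move, else 0."""
    target = p * p * 2                        # 18 for X, 50 for O
    for a, b, c in LINES:
        if board[a] * board[b] * board[c] == target:
            for n in (a, b, c):
                if board[n] == 2:
                    return n                  # the blank square completes the line
    return 0

def go(board, n, p):
    """Make a move for player p on square n (GO in the text)."""
    board[n] = p
```

Since squares are numbered from 1, the board is conveniently held as a 10-element list with index 0 unused.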
The Final Approach
The structure of the data consists of BOARD which contains a nine element vector,
a list of board positions that could result from the next move and a number
representing an estimation of how the board position leads to an ultimate win for
the player to move.
This algorithm looks ahead to make a decision on the next move by deciding which move would be the most promising or suitable at any stage, and selects it.
Consider all possible moves and replies that the program can make. Continue
this process for as long as time permits until a winner emerges, and then choose
the move that leads to the computer program winning, if possible in the shortest
time.
Actually, this is the most difficult of the three to program well, but it is the approach that can be extended furthest to other games. This method places relatively fewer demands on the programmer in terms of game technique, although the overall game strategy must still be supplied in advance.
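The look-ahead described above is essentially a minimax search. The sketch below is an illustration, not the text's own program: it scores a position +1 for a program win, -1 for a loss and 0 for a draw, and picks the move with the best guaranteed score.

```python
# Minimax sketch of the "final approach": consider all moves and replies,
# then choose the move leading to the best achievable outcome.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if a line is complete, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board, player, me):
    """Score the position for `me`, assuming both sides play perfectly."""
    w = winner(board)
    if w is not None:
        return 1 if w == me else -1
    moves = [i for i, s in enumerate(board) if s == ' ']
    if not moves:
        return 0                              # board full: draw
    other = 'O' if player == 'X' else 'X'
    scores = []
    for i in moves:
        board[i] = player
        scores.append(minimax(board, other, me))
        board[i] = ' '
    return max(scores) if player == me else min(scores)

def best_move(board, me):
    """Return the square giving `me` the best minimax score."""
    other = 'O' if me == 'X' else 'X'
    best, best_score = None, -2
    for i in [i for i, s in enumerate(board) if s == ' ']:
        board[i] = me
        score = minimax(board, other, me)
        board[i] = ' '
        if score > best_score:
            best, best_score = i, score
    return best
```

For tic-tac-toe the full game tree is small enough to search exhaustively; for larger games the same idea is cut off at a fixed depth with a heuristic evaluation, exactly the trade-off the text describes.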
Question Answering
Let us consider question answering systems that accept input in English and provide
answers also in English. This problem is harder than the previous one as it is more
difficult to specify the problem properly. Another area of difficulty concerns deciding
whether the answer obtained is correct, or not, and further what is meant by
‘correct’. For example, consider the following situation:
Rani went shopping for a new coat. She found a red one she really liked.
When she got home, she found that it went perfectly with her favourite dress.
Questions
1. What did Rani go shopping for?
2. What did Rani find that she liked?
3. Did Rani buy anything?
Method 1
This method can be analysed as follows.
Data Structures
A set of templates that match common questions and produce patterns used to
match against inputs. Templates and patterns are used so that a template that
matches a given question is associated with the corresponding pattern to find the
answer in the input text. For example, the template WHO DID X Y generates the pattern X Y z; if a match occurs, z is the answer to the question. The given text and the question are both stored as strings.
Algorithm
Answering a question requires the following steps to be followed:
Compare the templates against the questions and store all successful matches
to produce a set of text patterns.
Pass these text patterns through a substitution process to change the person
or voice and produce an expanded set of text patterns.
Apply each of these patterns to the text; collect all the answers and then
print the answers.
Example
In question 1 we use the template WHAT DID X Y, which generates:
Rani go shopping for z
After substitution we get:
Rani goes shopping for z and Rani went shopping for z
giving z ≡ a new coat.
In question 2 we need a very large number of templates, and also a scheme to allow the insertion of 'find' before 'that she liked' and of 'really' in the text, and the substitution of 'she' for 'Rani'; this gives the answer 'a red one'.
Question 3 cannot be answered.
Comments
This is a very primitive approach, basically not matching the criteria we set for intelligence, and worse than the approach used in the game.
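A much-simplified sketch of Method 1 in Python. The regular-expression template and the crude person/tense substitution step are illustrative assumptions, far weaker than a real template library, but they show the match-substitute-extract pipeline the text describes.

```python
import re

# The stored text (from the shopping story above).
TEXT = ("Rani went shopping for a new coat. She found a red one she "
        "really liked. When she got home, she found that it went "
        "perfectly with her favourite dress.")

def answer_what_did(question, text):
    """Template 'WHAT DID X Y' -> pattern 'X Y z'; return z or None."""
    m = re.match(r"what did (\w+) (.+?)\??$", question, re.IGNORECASE)
    if not m:
        return None                       # template does not match the question
    x, y = m.group(1), m.group(2)
    # Naive substitution step: try the verb phrase in a few tense forms.
    for phrase in (y, y.replace("go", "went"), y.replace("go", "goes")):
        m2 = re.search(re.escape(x) + " " + re.escape(phrase) + r" (.+?)[.?!]",
                       text)
        if m2:
            return m2.group(1)            # z, the text matched after 'X Y'
    return None
```

As in the text, a question that fits no template (such as question 3) simply produces no answer.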
Method 2
This method can be analysed as follows.
Data Structures
A structure called English consists of a dictionary, grammar and some semantics
about the vocabulary we are likely to come across. This data structure provides
the knowledge to convert English text into a storable internal form and also to
convert the response back into English. The structured representation of the text
is a processed form and defines the context of the input text by making explicit all references such as pronouns. There are three types of such knowledge
representation systems: production rules of the form ‘if x then y’, slot and filler
systems and statements in mathematical logic. The system used here will be the
slot and filler system. Take, for example, the sentence:
'She found a red one she really liked.'
Event 1: instance: finding; tense: past; agent: Rani; object: Thing1
Event 2: instance: liking; tense: past; modifier: much; object: Thing1
Thing1: instance: coat; colour: red
The question is stored in two forms: as input and in the above form.
Algorithm
Convert the question to a structured form using the English knowledge, then use a marker to indicate the substring (like 'who' or 'what') of the structure that should be returned as an answer. If a slot and filler system is used, a special marker can be placed in more than one slot.
The answer appears by matching this structured form against the structured
text.
The structured form is matched against the text and the requested segments
of the question are returned.
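The slot-and-filler matching can be sketched with Python dictionaries. The frame contents follow the structures above; the marker convention and the matching routine are illustrative assumptions, not the text's own system.

```python
# A marker in a question frame names the slot whose filler is the answer.
MARKER = "?ANSWER"

# Frames parsed from the story text (mirroring Event 1, Event 2 and Thing1).
TEXT_FRAMES = [
    {"instance": "finding", "tense": "past", "agent": "Rani", "object": "Thing1"},
    {"instance": "liking", "tense": "past", "modifier": "much", "object": "Thing1"},
    {"instance": "coat", "colour": "red", "id": "Thing1"},
]

def match(question_frame, frames):
    """Match the structured question against each text frame.

    Every non-marker slot must agree with the frame; the filler that lines
    up with the marker slot is returned as the answer, or None if no frame
    matches (the 'question cannot be answered' case).
    """
    for frame in frames:
        bindings = {}
        ok = True
        for slot, filler in question_frame.items():
            if filler == MARKER:
                bindings[slot] = frame.get(slot)
            elif frame.get(slot) != filler:
                ok = False
                break
        if ok and bindings and all(v is not None for v in bindings.values()):
            return next(iter(bindings.values()))
    return None
```

A question about an event the text never mentions (such as a 'buying' event) matches no frame, which mirrors why question 3 cannot be answered by this method.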
Examples
Both questions 1 and 2 generate answers: 'a new coat' and 'a red coat' respectively.
Question 3 cannot be answered, because there is no direct response.
Comments
This approach is more meaningful than the previous one and so is more effective.
The extra power gained must be paid for by additional search time in the knowledge bases. A warning must be given here: generating an unambiguous English knowledge base is a complex task and must be left until later in the course. The problems of handling pronouns are difficult. For example:
Rani walked up to the salesperson: she asked where the toy department
was.
Rani walked up to the salesperson: she asked her if she needed any help.
Whereas in the original text the linkage of 'she' to 'Rani' is easy, linking 'she' in each of the above sentences to Rani or to the salesperson requires additional knowledge about the context, via the people in a shop.
Method 3
This method can be analysed as follows.
Data Structures
The world model contains knowledge about objects, actions and situations that are described in the input text. This structure is used to create integrated text from input text. Figure 1.1 shows how the system's knowledge of shopping might be represented and stored. This information is known as a script; in this case it is a shopping script.
Fig. 1.1 Shopping Script (C: Customer, S: Salesperson; Props: M: Merchandize, D: Money (dollars); Location: L: a store)
[The figure shows the customer C contacting the salesperson S to learn the product's details and payment scheme, C selecting the products to purchase, S delivering the materials after the payment is cleared, and an admin department holding M, D and L to control the whole shopping transaction.]
Algorithm
Convert the question to a structured form using both the knowledge contained in Method 2 and the world model, generating even more possible structures, since even more knowledge is being used. Sometimes filters are introduced to prune the possible answers. To answer a question, the scheme followed is: convert the question to a structured form as before, but use the world model to resolve any ambiguities that may occur; the structured form is then matched against the text, and the requested segments of the question are returned.
Example
Both questions 1 and 2 generate answers, as in the previous program.
Question 3 can now be answered. The shopping script is instantiated and, from the last sentence, the path through step 14 is the one used to form the representation. 'M' is bound to the red coat that Rani took home; 'Rani buys a red coat' comes from step 10, and the integrated text generates the answer that she bought a red coat.
Comments
This program is more powerful than both the previous programs because it has
more knowledge. Thus, like the last game program it is exploiting AI techniques.
However, we are not yet in a position to handle any English question. The major
omission is that of a general reasoning mechanism known as inference to be used
when the required answer is not explicitly given in the input text. But this approach can handle, with some modifications, questions of the following form. Consider the text:
Saturday morning Rani went shopping.
Her brother tried to call her but she did not answer.
Question
Why could Rani's brother not reach her?
Answer
Because she was not in.
This answer is derived because we have supplied an additional fact: a person cannot be in two places at once.
This patch is not sufficiently general to work in all cases and does not provide the type of solution we are really looking for.
1.3 AI PROBLEMS
Intelligence does not imply perfect understanding; every intelligent being has limited perception, memory and computation. Many points on the spectrum of intelligence versus cost are viable, from insects to humans. AI seeks to understand the computations required for intelligent behaviour and to produce computer systems that exhibit intelligence. Aspects of intelligence studied by AI include perception, communication using human languages, reasoning, planning, learning and memory.
Let us consider some of the problems that artificial intelligence is used to solve. Early examples are game playing and theorem proving, which involves resolution. Common sense reasoning formed the basis of GPS, a general problem solver. Natural language processing met with early success; then the limited power of computers hindered progress, but currently the topic is experiencing a revival. The question of expert systems is interesting because it represents one of the best examples of an application of AI that appears useful to non-AI people. An expert system solves particular subsets of problems using knowledge and rules about a particular topic.
The following questions are to be considered before we can step forward:
1. What are the underlying assumptions about intelligence?
2. What kinds of techniques will be useful for solving AI problems?
3. At what level can human intelligence be modelled?
4. How will we know when an intelligent program has been built?
To solve the problem of building a system you should take the following steps:
1. Define the problem accurately, including detailed specifications of what constitutes a suitable solution.
2. Scrutinize the problem carefully, for some features may have a central effect on the chosen method of solution.
3. Segregate and represent the background knowledge needed in the solution of the problem.
4. Choose the best problem-solving technique and apply it to the problem.
Problem solving is a process of generating solutions from observed data.
A ‘problem’ is characterized by a set of goals,
A set of objects, and
A set of operations.
These could be ill-defined and may evolve during problem solving.
A ‘problem space’ is an abstract space.
A problem space encompasses all valid states that can be generated
by the application of any combination of operators on any combination
of objects.
The problem space may contain one or more solutions. A solution is a
combination of operations and objects that achieve the goals.
A ‘search’ refers to the search for a solution in a problem space.
Search proceeds with different types of ‘search control strategies’.
The depth first search and breadth first search are the two common
search strategies.
AI - General Problem Solving
Problem solving has been the key area of concern for Artificial Intelligence. It is a process of generating solutions from observed or given data. It is, however, not always possible to use direct methods (i.e., to go directly from data to solution). Instead, problem solving often needs to use indirect or model-based methods.
General Problem Solver (GPS) was a computer program created in 1957 by Simon and Newell to build a universal problem solver machine. GPS was based on Simon and Newell's theoretical work on logic machines. In principle, GPS can solve any formalized symbolic problem, such as proving theorems, solving geometric problems and playing chess.
GPS solved many simple problems, such as the Towers of Hanoi, that could be sufficiently formalized, but it could not solve any real-world problems.
To build a system to solve a particular problem, you need to take the following
steps:
1. Define the problem precisely: find the input situations as well as the final situations that constitute an acceptable solution to the problem.
2. Analyse the problem: find the few important features that may have an impact on the appropriateness of various possible techniques for solving the problem.
3. Isolate and represent the task knowledge necessary to solve the problem.
4. Choose the best problem-solving technique(s) and apply them to the particular problem.
Problem Definitions
A problem is defined by its ‘elements’ and their ‘relations’. To provide a formal
description of a problem, you need to take the following steps:
1. Define a state space that contains all the possible configurations of the
relevant objects, including some impossible ones.
2. Specify one or more states that describe possible situations, from which the
problem-solving process may start. These states are called initial states.
3. Specify one or more states that would be acceptable solution to the problem.
These states are called goal states.
4. Specify a set of rules that describe the actions (operators) available.
The problem can then be solved by using the rules, in combination with an
appropriate control strategy, to move through the problem space until a path
from an initial state to a goal state is found. This process is known as ‘search’.
Thus:
Search is fundamental to the problem-solving process.
Search is a general mechanism that can be used when a more direct method
is not known.
Search provides the framework into which more direct methods for solving subparts of a problem can be embedded. A very large number of AI problems are formulated as search problems.
Problem Space
A problem space is represented by a directed graph, where nodes represent
search state and paths represent the operators applied to change the state.
To simplify search algorithms, it is often convenient to logically and
programmatically represent a problem space as a tree. A tree usually
decreases the complexity of a search at a cost. Here, the cost is due to
duplicating some nodes on the tree that were linked numerous times in the
graph, e.g., node B and node D.
A tree is a graph in which any two vertices are connected by exactly one
path. Alternatively, any connected graph with no cycles is a tree.
Figure 1.2(a) shows a graph and Figure 1.2(b) shows a tree.
[Fig. 1.2(a) Graph: node A is linked to B and C; C is linked to B and D; B is linked to D. Fig. 1.2(b) Tree: the same states unrolled from A, with nodes B and D duplicated.]
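Reading the edges of Fig. 1.2(a) as an adjacency map (an assumption based on the figure), the duplication introduced by unrolling a graph into a tree can be sketched as follows:

```python
# Unrolling the graph of Fig. 1.2(a) into the tree of Fig. 1.2(b).
# Assumes the graph is acyclic, as in the figure; nodes reachable along
# several paths (B and D) are duplicated in the resulting tree.

GRAPH = {"A": ["B", "C"], "B": ["D"], "C": ["B", "D"], "D": []}

def unroll(graph, node):
    """Return the tree as (label, children) obtained by expanding every path."""
    return (node, [unroll(graph, child) for child in graph[node]])

def count(tree, name):
    """Count how many times a label appears in the unrolled tree."""
    label, children = tree
    return (label == name) + sum(count(c, name) for c in children)
```

Counting labels in the unrolled tree makes the cost of the tree representation concrete: B appears twice and D three times, even though each exists only once in the graph.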
The term 'problem solving' relates to analysis in AI. Problem solving may be characterized as a systematic search through a range of possible actions to reach some predefined goal or solution. Problem-solving methods are categorized as special purpose and general purpose.
A special-purpose method is tailor-made for a particular problem and often exploits very specific features of the situation in which the problem is embedded.
A general-purpose method is applicable to a wide variety of problems.
One general-purpose technique used in AI is 'means-end analysis', a step-by-step or incremental reduction of the difference between the current state and the final goal.
Problem Characteristics
Heuristics cannot be generalized, as they are domain specific. Production systems
provide ideal techniques for representing such heuristics in the form of IF-THEN
rules. Most problems requiring simulation of intelligence use heuristic search
extensively. Some heuristics are used to define the control structure that guides the
search process, as seen in the example described above. But heuristics can also
be encoded in the rules to represent the domain knowledge. Since most AI
problems make use of knowledge and guided search through the knowledge, AI
can be described as the study of techniques for solving exponentially hard
problems in polynomial time by exploiting knowledge about problem domain.
To use heuristic search for problem solving, the problem should be analysed for the following considerations:
Decomposability of the problem into a set of independent smaller
subproblems.
Possibility of undoing solution steps, if they are found to be unwise.
Predictability of the problem universe.
Possibility of obtaining an obvious solution to a problem without comparison
of all other possible solutions.
Type of the solution: Whether it is a state or a path to the goal state.
Role of knowledge in problem solving.
Nature of solution process: With or without interacting with the user.
The general classes of engineering problems such as planning, classification,
diagnosis, monitoring and design are generally knowledge intensive and use a large
amount of heuristics. Depending on the type of problem, the knowledge representation
schemes and control strategies for search are to be adopted. Combining heuristics
with the two basic search strategies have been discussed above. There are a number
of other general-purpose search techniques which are essentially heuristics based. Their efficiency primarily depends on how they exploit the domain-specific knowledge to prune undesirable paths. Such search methods are called 'weak methods', since the progress of the search depends heavily on the way the domain knowledge is exploited. A few such search techniques, which form the core of many AI systems, are briefly presented in the following sections.
Problem Decomposition
Suppose you have to solve the integral:
∫ (x³ + x² + 2x + 3 sin x) dx
It can be decomposed into ∫ x³ dx + ∫ x² dx + ∫ 2x dx + ∫ 3 sin x dx, each of which one rule solves, giving x⁴/4 + x³/3 + x² − 3 cos x + C.
This problem can be solved by breaking it into smaller problems, each of which we can solve using a small collection of specific rules. Using this technique of problem decomposition, we can solve very large problems easily. This can be considered intelligent behaviour.
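The decomposition can be sketched as a tiny rule-based integrator. This is an illustration only: terms are represented as tuples, and just the two rules needed for the integral above (the power rule and the sine rule) are included.

```python
from fractions import Fraction

# Each term of the integrand is a tuple: ("poly", coeff, power) for
# coeff * x**power, or ("sin", coeff) for coeff * sin(x).

def integrate_term(term):
    """Apply the single rule that solves one small subproblem."""
    if term[0] == "poly":               # power rule: c*x^n -> c/(n+1) * x^(n+1)
        _, coeff, power = term
        return ("poly", Fraction(coeff, power + 1), power + 1)
    if term[0] == "sin":                # c*sin(x) -> -c*cos(x)
        return ("cos", -term[1])
    raise ValueError(f"no rule for {term[0]}")

def integrate_sum(terms):
    """Decompose the integral of a sum into independently solved subproblems."""
    return [integrate_term(t) for t in terms]
```

Applied to the integrand above, the four subproblems are solved independently and their results summed, which is exactly the decomposition argument in the text.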
Can Solution Steps Be Ignored?
Suppose we are trying to prove a mathematical theorem. First we proceed on the assumption that proving a certain lemma will be useful. Later we realize that it is not useful at all. We then start with another method to prove the theorem and simply ignore the first attempt.
Consider the 8-puzzle problem to solve. Here if we make a wrong move
and realize the mistake we can go back as the control strategy keeps track of all
the moves. So we can backtrack to the initial state and start with some new move.
Consider the problem of playing chess. Here, once we make a move, we can never recover from that step. These situations illustrate the three important classes of problems mentioned below:
1. Ignorable, in which solution steps can be ignored.
E.g.: Theorem Proving
2. Recoverable, in which solution steps can be undone.
E.g.: 8-Puzzle
3. Irrecoverable, in which solution steps cannot be undone.
E.g.: Chess
Is the Problem Universe Predictable?
Consider the 8-Puzzle problem. Every time we make a move, we know exactly
what will happen. This means that it is possible to plan an entire sequence of
moves and be confident what the resulting state will be. We can backtrack to
earlier moves if they prove unwise.
Suppose we want to play Bridge. We need to plan before the first play, but
we cannot play with certainty. So, the outcome of this game is very uncertain. In
case of the 8-Puzzle, the outcome is very certain. To solve uncertain-outcome problems, we follow the process of plan revision as the plan is carried out and the necessary feedback is provided. The disadvantage is that planning in this case is often very expensive.
Is a Good Solution Absolute or Relative?
Consider the problem of answering questions based on a database of simple facts such as the following:
Siva was a man.
Siva was a worker in a company.
Siva was born in 1905.
All men are mortal.
All workers in the company died when there was an accident in 1952.
No mortal lives longer than 100 years.
Suppose we ask the question: 'Is Siva alive?'
By representing these facts in a formal language, such as predicate logic, and then using formal inference methods, we can derive an answer to this question easily. There are two ways to answer the question, as shown below:
Method I
Siva was a man.
Siva was born in 1905.
All men are mortal.
Now it is 2008, so Siva's age is 103 years.
No mortal lives longer than 100 years.
Method II
Siva was a worker in the company.
All workers in the company died in 1952.
Answer: So Siva is not alive. This is the answer from both of the above methods.
We are interested in answering the question; it does not matter which path we follow. If we follow one path successfully to the correct answer, there is no reason to go back and check another path to the solution.
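Method II's chain of reasoning can be sketched as a naive forward-chaining loop. The facts and rules below are illustrative encodings of the statements above, not a general inference engine; predicates are represented as plain tuples.

```python
# Known facts, encoded as tuples.
FACTS = {("worker", "Siva"), ("accident_year", 1952)}

# Each rule: (set of required premise facts, fact to conclude).
RULES = [
    ({("worker", "Siva"), ("accident_year", 1952)}, ("died_in", "Siva", 1952)),
    ({("died_in", "Siva", 1952)}, ("not_alive", "Siva")),
]

def forward_chain(facts, rules):
    """Repeatedly fire any rule whose premises all hold until nothing new is added."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts
```

Running the loop derives ("died_in", "Siva", 1952) and then ("not_alive", "Siva"), mirroring how Method II reaches the answer in two inference steps while Method I takes a different path to the same conclusion.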
1.3.1 Theorem Proving through AI
AI is being used to solve problems such as intelligent game playing and theorem
proving using a computer system. In intelligent game playing, a computer is
programmed to play a game such as chess or tic-tac-toe in the same way as human beings play. The checkers program developed by Arthur Samuel was the first game in which AI was used for intelligent game playing. Mathematical
theorems were proved using AI. The Theorem Prover system developed by
Gelernter uses AI to prove geometrical theorems. Computer researchers and
software developers consider that computers can be easily used with AI for
intelligent game playing and theorem proving because computers are fast and
can explore a large number of solution paths. After exploring the solution paths,
the computers can also efficiently select the most suitable solution path for solving
a problem.
In the area of decision-making, AI has been used for common sense
reasoning, in which reasoning is done about physical objects and their relationships
with each other. Common sense reasoning also includes reasoning about
actions and their consequences. AI is also used to develop software for vision
processing and speech recognition. In addition, it helps to solve the problem of
natural language understanding and for problem solving in specialized areas such
as medical diagnosis and chemical analysis. There are also various specialized
areas, such as engineering design, scientific discovery and financial planning, in
which it is necessary to obtain expertise. AI can be used to create complex
programs for solving problems in these specialized areas. It is easier for humans
to learn perceptual, linguistic and common sense skills than expert skills. As a
result, AI is currently being used to solve problems in areas that require expert
skills rather than common sense skills. Table 1.1 shows various
tasks for which AI is being used.
Table 1.1 Functions of AI
Text Simplification
Text simplification is an operation that modifies human-readable text in such a
manner that the grammar and structure of the text are simplified, while the meaning
and the information remain the same. Text simplification is an important area of
research, since natural languages contain complex structures that cannot be easily
processed by automated systems.
Automatic Summarization
Automatic summarization is the process of creating a shortened version of the text
with the help of a computer program. The most common type of automatic
summarization is multidocument summarization. Multidocument summarization helps
to extract information from multiple texts written on a single topic. It helps to
create reports that are both concise and comprehensive.
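One of the simplest automatic-summarization schemes is frequency-based extractive summarization, which can be sketched as follows (the scoring scheme is illustrative, not a production method):

```python
import re
from collections import Counter

def summarize(text, n_sentences=1):
    """Score each sentence by the corpus frequency of the words it
    contains and return the top-scoring sentences in original order."""
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    freq = Counter(re.findall(r'[a-z]+', text.lower()))

    def score(s):
        return sum(freq[w] for w in re.findall(r'[a-z]+', s.lower()))

    top = sorted(sentences, key=score, reverse=True)[:n_sentences]
    return ' '.join(s for s in sentences if s in top)

text = ("AI studies intelligent agents. Agents perceive and act. "
        "The weather is nice.")
print(summarize(text))
```

Sentences whose words recur across the text score highest, so off-topic sentences (here, the weather remark) are dropped from the summary.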
Information Extraction
Information Extraction (IE) is used to extract structured information from
unstructured, machine-readable documents. The main application of IE is to scan
a set of documents written in natural language and store the extracted
information in a database. The different subtasks performed in IE are as follows:
Named Entity Recognition: This task is performed to recognize numerical
expressions, place names, entity names and temporal expressions.
Co-Reference: This task is performed to identify noun phrases that refer
to the same object.
Terminology Extraction: This task is performed to find the relevant terms
for large and structured sets of text used to do statistical analysis.
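The named-entity-recognition subtask can be illustrated with a crude regular-expression sketch (real IE systems use trained models; the patterns below are simplistic and purely illustrative):

```python
import re

def extract_entities(text):
    """Pull out two simple entity types: four-digit years (temporal
    expressions) and capitalized words (candidate entity names)."""
    years = re.findall(r'\b(1[0-9]{3}|20[0-9]{2})\b', text)
    names = re.findall(r'\b[A-Z][a-z]+\b', text)
    return {"years": years, "names": names}

doc = "Siva was born in 1905 and worked in Bhopal."
print(extract_entities(doc))
```

The extracted fields could then be stored in database columns, which is exactly the "scan documents, fill a database" application described above.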
Question Answering
Question Answering (QA) is a type of information retrieval in which a system
called QA system is used to retrieve answers for questions, which are written in a
natural language. QA is among the most complex NLP techniques. QA systems
use text documents as their knowledge source. A QA system combines different
natural language techniques into a single processing pipeline and then uses it to
search for answers to questions written in natural language. The QA system contains
a question classifier module that is used to determine the types of questions and
answers.
Optical Character Recognition
Optical Character Recognition (OCR) is computer software that helps convert
images of printed or handwritten text into machine-editable text. Text generated
by OCR is provided as input to text search databases. OCR is mainly used by
libraries, businesses and government agencies to create text-searchable files for
digital collections. OCR can also be used in processing cheques and credit card
slips.
1.4.3 Vision Processing
In vision processing, AI is used to create software programs which allow computers
to perform tasks such as mobile robot navigation, complex manufacturing tasks,
analysis of satellite images and medical image processing. For vision processing, a
video camera is used, which provides a computer with a visual image. The visual
image is represented as a two-dimensional grid of intensity levels. Each pixel in the
visual image contains either a single bit or multiple bits of information. A visual
image captured through a camera consists of thousands of pixels. In vision
processing, there are various operations that can be performed through a software
program based on AI for the processing of visual images. These operations are as
follows:
Signal processing: It allows you to enhance images and provide them as
input to a vision processing software program based on AI.
Measurement analysis: It allows you to determine the two-dimensional
aspects of single-object images.
Pattern recognition: It allows you to classify a single object image into a
category.
Image understanding: It allows you to locate an object, classify it and
develop a three-dimensional mode for images containing many objects.
In vision processing, understanding an image is the most difficult task, and it
remains an active area of AI research. Although the main operations performed for
understanding an image include pattern recognition and measurement analysis,
various difficulties are still encountered in understanding images. These difficulties
are as follows:
Information is lost when a three-dimensional scene is projected onto a two-
dimensional image. Due to this loss of information, understanding of the
image becomes difficult.
An image can contain multiple objects, and some objects may partially hide
others. This makes understanding the image difficult, as the AI software
program may not know how many objects are hidden in the image.
The value of a pixel may be affected by various image aspects such as the
color of the object, source of light and distance of the camera. It is a difficult
task to determine these effects on the value of a pixel.
A large amount of knowledge, such as the shadows and textures of objects
in an image, is required for low-level image understanding. Knowledge related to the
motion of objects in the image is also required for understanding an image. AI
software programs for understanding images may also require knowledge about
how multiple views of an object in an image are obtained. Multiple views of an
object can be obtained through the use of two or more cameras and this process
of getting multiple views is called stereovision. Another method of obtaining multiple
views of an object is by moving objects or cameras. Information related to an
image can also be obtained through the laser rangefinder, which is a device that
returns an array of distance measures. However, the laser rangefinder is an
expensive device so a method of integrating visual and range data can be used to
acquire information related to an image. AI software programs for image
understanding also require high-level knowledge about an image for interpreting
visual data.
1.4.4 Speech Processing
Another important application area of AI is speech processing, which involves the
processing of the spoken language. In various AI software programs such as the
programs for natural language understanding, providing input through typing is not
sufficient. These programs may require the users to provide data verbally. As a
result, AI has been used to create software programs that make the computer
intelligent enough to recognize the voice of a human being. Various software
programs have been developed using AI to recognize the voice of a human being.
These software programs have the following limitations:
Speaker dependence versus speaker independence: Many speech
recognition AI software programs are developed to recognize the voice of
only a specific speaker. These programs can be modified to recognize the
voice of other speakers also, but it takes a long time as these programs are
complex. By using the speaker independent AI software program, the
computer can be made intelligent to recognize the voice of any speaker and
translate the voice command into a written text. However, it is easier to
develop speaker dependent AI software programs for speech recognition
instead of speaker independent programs because speaker independence
is difficult to achieve due to variations in pitch and accent.
Continuous versus isolated word speech: The various speech recognition
AI software systems are developed to interpret an isolated word speech
instead of a continuous word speech. In an isolated word speech, a speaker,
who is a human being, has to pause between words while speaking. In a
continuous word speech, the speaker can speak words continuously without
pausing. It is easier for a human being to speak in continuous word speech
instead of isolated word speech.
Real-time versus offline processing: In various speech recognition AI
software programs, the processing of the speech is not done when the input
is being provided, but the processing is performed after some time. This is
called offline processing. However, when the processing of the data is done
at the same time when the input is being provided, then it is called real-time
processing. In some cases, it may be required that AI software programs
for speech recognition perform real-time processing. However, real-time
processing is difficult to achieve because it requires a large amount of knowledge
to be built into the AI software programs.
1.4.5 Robotics
Robotics is defined as the study of robots that helps in designing automated
mechanisms, which are capable of replacing humans in certain jobs such as bolting
and fitting automobile parts. The robotics system can be categorized into six types,
which are similar to the six-way division of the human body functions. Table 1.2
shows the relationship between the robotics system and the human system.
Table 1.2 Relationship between the Robotics System and the Human System
The task of sensing the environment and deciding on further action by robots has
been made possible with the help of AI. Robots can be programmed to carry out
heavy mechanical work, thus reducing the effort of human beings.
[Figure: architecture of an expert system, showing knowledge engineers, experts
and users interacting with the inference engine, working memory and knowledge
base, along with development stages such as task analysis, knowledge acquisition
and prototype development.]
and answer session. The two main methods of reasoning used in this architecture
are as follows:
1. Forward Chaining: This method involves checking the condition part of a
rule to determine whether it is true or false. If the condition is true, the action
part of the rule is carried out and its conclusion is added to working memory.
This procedure continues until a solution is found or a dead-end is reached.
Forward chaining is commonly referred to as data-driven reasoning.
2. Backward Chaining: This is the reverse of forward chaining. It is used to
backtrack from a goal to the paths that lead to the goal. It is very useful
when all outcomes are known and the number of possible outcomes is not
large. In this case, a goal is specified and the expert system tries to determine
what conditions are needed to arrive at the specified goal. Backward chaining
is thus also called goal-driven reasoning.
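The two chaining methods can be sketched with a minimal rule engine (the medical-style facts and rules below are purely illustrative):

```python
# Each rule: (list of condition facts, conclusion fact).
RULES = [
    (["has_fever", "has_rash"], "measles_suspected"),
    (["measles_suspected"], "see_doctor"),
]

def forward_chain(facts, rules):
    """Data-driven: fire any rule whose conditions all hold,
    until no new conclusion appears."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conds, concl in rules:
            if concl not in known and all(c in known for c in conds):
                known.add(concl)
                changed = True
    return known

def backward_chain(goal, facts, rules):
    """Goal-driven: a goal holds if it is a known fact, or if some rule
    concludes it and all of that rule's conditions can be established."""
    if goal in facts:
        return True
    return any(concl == goal and
               all(backward_chain(c, facts, rules) for c in conds)
               for conds, concl in rules)

facts = {"has_fever", "has_rash"}
print(forward_chain(facts, RULES))                  # derives both conclusions
print(backward_chain("see_doctor", facts, RULES))   # True
```

Forward chaining works outward from the data; backward chaining starts from the goal `see_doctor` and recursively checks what conditions would establish it.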
Non-Production System Architecture
Certain expert systems have a non-production system architecture, that is, they
do not use a rule-based representation scheme. These systems employ more
structured representation schemes such as frames, decision trees or specialized
networks like neural networks. Some of these architectures are discussed below.
Frame Architecture
Frames are structured sets of closely related knowledge, which may include an
object’s or concept’s name, the main attributes of the object, their corresponding
values and possibly some attached procedures. These values are stored in specified
slots of the frame, and individual frames are usually linked together.
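A frame with slots, a link to a more general frame, and a slot-lookup procedure can be sketched with Python dictionaries (the animal frames below are illustrative):

```python
# A frame is a dictionary of slots; "is_a" links to a parent frame,
# so unfilled slots can be inherited from more general frames.
bird_frame = {
    "is_a": None,
    "covering": "feathers",
    "can_fly": True,
}

penguin_frame = {
    "is_a": bird_frame,    # link to the more general frame
    "can_fly": False,      # overrides the inherited slot value
    "habitat": "antarctica",
}

def get_slot(frame, slot):
    """Look up a slot value, following is_a links (inheritance)."""
    while frame is not None:
        if slot in frame:
            return frame[slot]
        frame = frame.get("is_a")
    return None

print(get_slot(penguin_frame, "covering"))  # inherited: feathers
print(get_slot(penguin_frame, "can_fly"))   # overridden: False
```

The lookup procedure plays the role of an attached procedure: a value missing from a specific frame is filled in from the frames it is linked to.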
Decision Tree Architecture
An expert system may also store information in the form of a decision tree, that is,
in a top-to-bottom manner. The values of the attributes of an object determine a
path to a leaf node in the tree, which contains the object’s identification. Each
object attribute corresponds to a non-terminal node in the tree, and each branch
of the decision tree corresponds to a set of attribute values. New nodes and
branches can be added to the tree when additional attributes are needed to further
discriminate among new objects.
Blackboard System Architecture
Blackboard architecture is a special type of knowledge-based system which uses
a form of opportunistic reasoning. H. Penny Nii (1986) aptly described the
blackboard problem-solving strategy through the following analogy:
‘Imagine a room with a large blackboard on which a group of experts are
piecing together a jigsaw puzzle. Each of the experts has some special knowledge
about solving puzzles, such as a border expert, a shapes expert or a colour expert.
Each member examines his or her pieces and decides if they will fit into the
partially completed puzzle. Those members having appropriate pieces go up to
the blackboard and update the evolving solution. The whole puzzle can be
solved in complete silence, with no direct communication among members of the
group. Each person is self-activating, knowing when to contribute to the solution.
The solution evolves in this incremental way, with experts contributing dynamically
on an opportunistic basis, that is, as the opportunity to contribute to the solution
arises.
The objects on the blackboard are hierarchically organized into levels which
facilitate analysis and solution. Information from one level serves as input to a set
of knowledge sources. The sources modify the knowledge and place it on the
same or different levels.’
The blackboard approach was applied in the HEARSAY family of projects, which
are speech understanding systems; it has also been used to analyse complex scenes
and to model human cognitive processes.
Analogical Reasoning Architecture
Expert systems based on analogical architectures solve problems by finding similar
problems with known solutions and applying a known solution to the new problem,
possibly with some kind of modification.
These architectures require a large knowledge base having numerous
problem solutions. Previously encountered situations are stored as units in memory
and are content-indexed for rapid retrieval.
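Content-indexed retrieval of a similar case can be sketched by scoring stored cases on shared features (the car-diagnosis case base below is illustrative):

```python
# Each case: (feature dictionary, known solution).
case_base = [
    ({"engine_starts": False, "lights_work": False}, "charge the battery"),
    ({"engine_starts": False, "lights_work": True}, "check the starter"),
]

def similarity(case_features, problem):
    """Count features on which the stored case and the new problem agree."""
    return sum(1 for k, v in case_features.items() if problem.get(k) == v)

def solve_by_analogy(problem):
    """Retrieve the most similar stored case and reuse its solution."""
    features, solution = max(case_base,
                             key=lambda case: similarity(case[0], problem))
    return solution

print(solve_by_analogy({"engine_starts": False, "lights_work": True}))
```

A real analogical system would also adapt the retrieved solution to the new situation; this sketch only performs the retrieval step.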
1.4.7 Search Knowledge
A large amount of knowledge is required to solve the problems related to AI. If
the knowledge which is available for solving these AI problems is not enough, then
a search has to be made for obtaining more knowledge in a knowledge base. The
knowledge base must be systematically represented in order to efficiently search
for knowledge in it. Knowledge can be represented in the form of facts in a
knowledge base. A mechanism called search control knowledge can be used to
control the knowledge search. In search control knowledge, knowledge about
the different paths that can lead to a goal is obtained and reasoned about. After
this, the best possible path is selected to reach the goal state.
1.4.8 Abstraction
In abstraction, some details related to AI problems are eliminated to find a
solution for a problem. This process of eliminating details is continued until a
solution is found. Abstraction is mainly used to solve hard AI problems.
Abstraction basically means hiding the unnecessary details of a problem.
For example, the predefined sqrt function of the C standard library is used to
calculate the square root of a number. A programmer need not know the
implementation details of the sqrt function in order to use it in a C program.
As a result, the implementation details of the sqrt function are hidden, which is
the concept of abstraction.
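The same idea can be sketched in Python: the caller uses a library square-root routine through its interface alone, without knowing how the root is computed:

```python
import math

# The caller relies only on the function's interface (its name and
# contract), not on how the square root is actually computed.
def hypotenuse(a, b):
    """Length of the hypotenuse; math.sqrt hides the numeric details."""
    return math.sqrt(a * a + b * b)

print(hypotenuse(3, 4))  # 5.0
```

Whether the library implements the square root with Newton's method or a hardware instruction is irrelevant to the caller; those details are abstracted away.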
Check Your Progress
6. Name some application areas of AI.
7. Define an expert system.
8. Name some of the fields in which expert systems are used.
9. What is abstraction in relation to AI?
1.6 SUMMARY
Artificial Intelligence (AI) is the branch of computer science that deals with
the creation of computers with human skills. An AI system perceives its
surroundings and initiates actions that maximize its chance of success.
The term AI is used to describe the ‘intelligence’ that the system
demonstrates. Tools and insights are drawn from fields including linguistics,
psychology, computer science, cognitive science, neuroscience, probability,
optimization and logic.
The telephone is one of the most marvellous inventions of the
communications’ era. It helps in conquering the physical distance instantly.
The development of communication systems began two centuries ago with
wire-based electrical systems called telegraph and telephone. Before that
human messengers on foot or horseback were used. Egypt and China built
messenger relay stations.
The concept of AI as a true scientific pursuit is very new. It remained a plot
for popular science fiction stories over centuries. Most researchers associate
the beginning of AI with Alan Turing.
Perception is defined as ‘the formation, from a sensory signal, of an internal
representation suitable for intelligent processing’.
AI has applications in all fields of human study, such as finance and
economics, environmental engineering, chemistry, computer science and so
on.
Newell and Simon presented the Physical Symbol System Hypothesis, which
lies at the heart of research in artificial intelligence.
Artificial intelligence research during the last three decades has concluded
that intelligence requires knowledge.
To solve the problem of playing a game, we require the rules of the game
and targets for winning as well as representing positions in the game. The
opening position can be defined as the initial state and a winning position as
a goal state.
The problem is solved by using production rules in combination with an
appropriate control strategy, moving through the problem space until a path
from an initial state to a goal state is found.
Solutions can be good in different ways. They can be good in terms of
time or storage or in the difficulty of the algorithm. In the case of the travelling
salesman problem, finding the best path can lead to a significant amount
of computation.
A state-space representation is a mathematical model of a physical system
as a set of input, output and state variables related by first-order
differential equations. To abstract from the number of inputs, outputs and
states, the variables are expressed as vectors, and the differential and algebraic
equations are written in matrix form.
Iterative deepening carries out repeated depth-limited searches, beginning
with a depth limit of zero and increasing the limit by one each time.
Consequently, it has the space-saving advantages of depth-first search.
Bidirectional search is an algorithm that makes use of two searches that
occur at the same time to arrive at a target goal. Bidirectional search usually
seems to be an effective graph search as rather than carrying out a search
through a large tree, one search is performed backwards from the goal
while one search is performed forward from the beginning.
A heuristic is a method that improves the efficiency of the search process.
Heuristics are like tour guides: they are good to the extent that they point
in generally interesting directions, and bad to the extent that they neglect
points of interest to particular individuals.
In the game application area, a software program is created using AI, which
makes the computer intelligent for playing a game. To create software
programs for intelligent game playing, the developers first analyse various
options and then use computers to select the best option.
Natural Language Processing (NLP) provides a method for interaction
between computers and human beings.
In vision processing, AI is used to create software programs which allow
computers to perform tasks such as mobile robot navigation, complex
manufacturing tasks, analysis of satellite images and medical image
processing.
The learning module uses learning algorithms to learn from usage and
experience, saved in case history files. These algorithms themselves determine
to a large extent how successful a learning system will be.
Expert systems are capable of solving problems even where complete or
exact data do not exist. This is an important feature because complete and
accurate information on a problem is rarely available in the real world.
A programmer need not know the implementation of details of the sort
function in order to use it in the C program.
Short-Answer Questions
1. What are intelligent communication systems? Give some examples.
2. How do you think AI can help in authorizing financial transactions?
3. What problems are faced by AI, in general?
4. How can AI solve the problem of intelligent game playing?
5. What is natural language processing?
6. Define the term OCR.
7. What are the operations used in vision processing?
8. What are the features that a machine needs to possess in order to qualify as
a robot?
9. List the fields in which expert systems can be used.
10. What is forward and backward chaining?
11. Give the concept of abstraction.
Long-Answer Questions
1. Discuss the events which have led to the development of AI with the help of
examples.
2. Briefly discuss game playing with the help of examples.
3. Explain briefly about the problems of natural language processing. Give
appropriate examples.
4. Explain the vision processing application area of AI with the help of examples.
5. Briefly explain speech processing with the help of examples and its
limitations.
6. Describe the robotics application area of AI.
7. Discuss the components of an expert system with the help of examples.
8. Briefly explain abstraction. Give appropriate examples.
1.9 FURTHER READING
UNIT 2 PROBLEM SPACE, SEARCH AND KNOWLEDGE REPRESENTATION
Structure
2.0 Introduction
2.1 Objectives
2.2 Search Space Control
2.2.1 Defining Problem as a State Space Search
2.2.2 State Space Search
2.2.3 Design of Search Programs and Solutions
2.3 Production Systems
2.4 Heuristic Search
2.4.1 Heuristic Search Techniques
2.4.2 Best First Search
2.5 Branch and Bound
2.6 Problem Reduction
2.7 Constraint Satisfaction
2.8 Means End Analysis
2.9 Basic Concept of Knowledge Representation
2.9.1 Representation and Mappings
2.9.2 Approaches to Knowledge Representation
2.9.3 Issues in Knowledge Representation
2.9.4 The Frame Problem
2.10 Answers to ‘Check Your Progress’
2.11 Summary
2.12 Key Terms
2.13 Self-Assessment Questions and Exercises
2.14 Further Reading
2.0 INTRODUCTION
State space search is a process used in the field of computer science, including
Artificial Intelligence (AI), in which successive configurations or states of an instance
are considered, with the intention of finding a goal state with the desired property.
Problems are often modelled as a state space, a set of states that a problem can
be in. The set of states forms a graph where two states are connected if there is an
operation that can be performed to transform the first state into the second. State
space search often differs from traditional computer science search methods
because the state space is implicit: the typical state space graph is much too large
to generate and store in memory. Instead, nodes are generated as they are explored,
and typically discarded thereafter. A solution to a combinatorial search instance
may consist of the goal state itself, or of a path from some initial state to the goal
state. Production systems provide appropriate structures for performing and
describing search processes.
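Implicit state generation can be sketched with the classic water-jug problem: successors of a state are generated only when that state is explored, and a breadth-first search stops on reaching the goal (2 litres in the 4-litre jug). The state encoding below is one conventional choice:

```python
from collections import deque

CAP = (4, 3)  # capacities of the two jugs

def successors(state):
    """Generate all states reachable by one fill, empty, or pour move."""
    x, y = state
    yield (CAP[0], y)                  # fill the 4-litre jug
    yield (x, CAP[1])                  # fill the 3-litre jug
    yield (0, y)                       # empty the 4-litre jug
    yield (x, 0)                       # empty the 3-litre jug
    t = min(x, CAP[1] - y)
    yield (x - t, y + t)               # pour from the 4-litre into the 3-litre
    t = min(y, CAP[0] - x)
    yield (x + t, y - t)               # pour from the 3-litre into the 4-litre

def bfs(start, goal_test):
    """Breadth-first search; states are created only when explored,
    so the full state-space graph is never stored."""
    frontier = deque([[start]])
    seen = {start}
    while frontier:
        path = frontier.popleft()
        if goal_test(path[-1]):
            return path
        for s in successors(path[-1]):
            if s not in seen:
                seen.add(s)
                frontier.append(path + [s])
    return None

path = bfs((0, 0), lambda s: s[0] == 2)
print(path)  # a sequence of (x, y) states from (0, 0) to a goal state
```

The returned path is exactly the "solution as a path from an initial state to a goal state" described above.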
A heuristic is a method that improves the efficiency of the search process.
Heuristics are like tour guides: they are good to the extent that they point in
generally interesting directions, and bad to the extent that they neglect points of
interest to particular individuals.
Branch and Bound (BB) is a general algorithm that is used to find optimal
solutions of different optimization problems, particularly in discrete and
combinatorial optimization. It consists of a systematic enumeration of candidate
solutions, in which large subsets of fruitless candidates are discarded in groups
by making use of upper and lower estimated bounds of the quantity being
optimized. A.H. Land and A.G. Doig first proposed the method, for linear
programming, in 1960.
Constraint satisfaction is a general class of problems whose goal is to find values
for a set of variables that satisfy a given set of constraints. It is at the centre
of several applications in AI and has been implemented in several domains,
including planning and scheduling. Due to its generality, most AI researchers
can benefit from a sound knowledge of methods in this field. Means End Analysis
(MEA) is a strategy used in Artificial Intelligence to control search in
problem-solving computer programs. It has been in use since the 1950s as a
creativity tool.
Knowledge representation and reasoning is the field of Artificial Intelligence
(AI) dedicated to representing information about the world in a form that a computer
system can use to solve complex tasks such as diagnosing a medical condition or
having a dialog in a natural language. Knowledge representation incorporates findings
from psychology about how humans solve problems and represent knowledge in
order to design formalisms that will make complex systems easier to design and
build.
In this unit, you will learn about search space control, production systems,
heuristic search, heuristic search techniques, best first search, branch and bound,
problem reduction, constraint satisfaction, means end analysis, the basics of
knowledge representation, representation and mapping, approaches to knowledge
representation, issues in knowledge representation and the frame problem.
2.1 OBJECTIVES
After going through this unit, you will be able to:
Interpret the search space control
Elaborate on the production system
Define the heuristic search
Understand the techniques of heuristic search and best first search
Discuss about the branch and bound algorithm
Analyse the problem reduction technique
Explain about the constraint satisfaction problems
Illustrate the means end analysis strategy
Discuss the basic concept of knowledge representation
Elaborate on the frame problem
2.2 SEARCH SPACE CONTROL
The word ‘search’ refers to the search for a solution in a problem space.
Search proceeds with different types of ‘search control strategies’.
A strategy is defined by picking the order in which the nodes expand.
The search strategies are evaluated along the following dimensions:
completeness, time complexity, space complexity and optimality.
Algorithm’s Performance and Complexity
Performance of an algorithm depends on the following internal and external factors:
Time required to run
Size of input to the algorithm
Space (memory) required to run
Speed of the computer
Quality of the compiler
Complexity is a measure of the performance of an algorithm. Complexity
measures the internal factors, usually time rather than space.
Computational Complexity
It is the measure of resources in terms of time and space.
If A is an algorithm that solves a decision problem f, then run-time of A is the
number of steps taken on the input of length n.
Time Complexity T(n) of a decision problem f is the run-time of the ‘best’
algorithm A for f.
Space Complexity S(n) of a decision problem f is the amount of memory
used by the ‘best’ algorithm A for f.
‘Big - O’ Notation
Big-O is a theoretical measure of the execution of an algorithm; it usually indicates
the time or the memory needed as a function of the problem size n, which is usually
the number of items. It is used to give an approximation of the run-time
efficiency of an algorithm; the letter ‘O’ stands for the order of magnitude of
operations or space at run-time.
The Big-O of an Algorithm A
If an algorithm A requires time proportional to f(n), then algorithm A is
said to be of order f(n), and it is denoted as O(f(n)).
If algorithm A requires time proportional to n2, then the order of the
algorithm is said to be O(n2).
If algorithm A requires time proportional to n, then the order of the
algorithm is said to be O(n).
The function f(n) is called the algorithm’s growth-rate function. In other words,
if an algorithm has performance complexity O(n), this means that the run-time t
should be directly proportional to n, i.e., t ∝ n, or t = k·n, where k is a constant of
proportionality. Similar interpretations hold for algorithms having performance
complexity O(log2(n)), O(log N), O(N log N), O(2^N) and so on.
Example
Determine the Big-O of an algorithm that calculates the sum of the n elements in
an integer array a[0..n-1].
Line no.   Instruction                        No. of execution steps
line 1     sum = 0                            1
line 2     for (i = 0; i < n; i++)            n + 1
line 3     sum += a[i]                        n
line 4     print sum                          1
           Total                              2n + 3
Thus, the polynomial (2n + 3) is dominated by its first term, 2n, as the number
of elements in the array becomes very large.
In determining the Big-O, ignore constants such as 2 and 3. So the algorithm
is of order n.
So the Big-O of the algorithm is O(n).
In other words, the run-time of this algorithm increases roughly in proportion to
the size of the input data n, e.g., an array of size n.
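The step count in the table can be checked empirically by instrumenting the loop (the counter below is an illustrative stand-in for real profiling and follows the table's accounting):

```python
def sum_with_count(a):
    """Sum an array while counting executed steps, as in the table:
    1 (init) + (n + 1) (loop tests) + n (additions) + 1 (print)."""
    n = len(a)
    steps = 0
    total = 0
    steps += 1          # line 1: sum = 0
    for x in a:
        total += x
        steps += 1      # line 3: one addition per element (n in total)
    steps += n + 1      # line 2: n + 1 loop tests
    steps += 1          # line 4: print sum
    return total, steps

for n in (10, 100):
    _, steps = sum_with_count(list(range(n)))
    print(n, steps)     # steps == 2*n + 3 for every n
```

Doubling n roughly doubles the step count, which is exactly what O(n) growth predicts.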
Example
Determine the Big-O of an algorithm for finding the largest element in a square
2-D array a[0..n-1][0..n-1].
Line no.   Instruction                                    No. of execution steps
line 1     max = a[0][0]                                  1
line 2     for (row = 0; row < n; row++)                  n + 1
line 3     for (col = 0; col < n; col++)                  n * (n + 1)
line 4     if (a[row][col] > max) max = a[row][col]       n * n
line 5     print max                                      1
           Total                                          2n^2 + 2n + 3
Thus, the polynomial (2n^2 + 2n + 3) is dominated by its first term, 2n^2, as the
number of elements in the array becomes very large.
In determining the Big-O, ignore the constants. So the algorithm is of order n^2.
The Big-O of the algorithm is O(n^2).
In other words, the run-time of this algorithm increases roughly as the square
of the size of the input data, n^2, e.g., for an array of size n × n.
Example: Polynomial in n with degree k
The number of steps needed to carry out an algorithm is expressed as
f(n) = a_k n^k + a_(k-1) n^(k-1) + ... + a_1 n + a_0
Then f(n) is a polynomial in n of degree k, and f(n) = O(n^k).
To obtain the order of a polynomial function, use the term of the highest degree
and disregard the constants and the terms of lower degrees.
The Big-O of the algorithm is O(n^k). In other words, the run-time of this algorithm
grows polynomially with n.
Example: Growth rate variation
Problem: If an algorithm requires 1 second of run-time for a problem of size 8,
find the run-time of that algorithm for a problem of size 16.
Solutions: If the Order of the algorithm is O(f(n)) then the calculated execution
time T (n) of the algorithm as problem size increases are as below.
O(f(n))        Run-time T(n) required as problem size increases
O(1)           T(n) = 1 second;
               the algorithm is constant time, independent of the
               size of the problem.
O(log2 n)      T(n) = (1 * log2 16) / log2 8 = 4/3 seconds;
               the algorithm is logarithmic time, increases slowly
               with the size of the problem.
O(n)           T(n) = (1 * 16) / 8 = 2 seconds;
               the algorithm is linear time, increases directly with
               the size of the problem.
O(n * log2 n)  T(n) = (1 * 16 * log2 16) / (8 * log2 8) = 8/3 seconds;
               the algorithm is log-linear time, increases more
               rapidly than a linear algorithm.
O(n^2)         T(n) = (1 * 16^2) / 8^2 = 4 seconds;
               the algorithm is quadratic time, increases rapidly
               with the size of the problem.
O(n^3)         T(n) = (1 * 16^3) / 8^3 = 8 seconds;
               the algorithm is cubic time, increases more rapidly
               than a quadratic algorithm.
O(2^n)         T(n) = (1 * 2^16) / 2^8 = 2^8 seconds = 256 seconds;
               the algorithm is exponential time, increases too
               rapidly to be practical.
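The scaling rule behind this table is T_new = T_old * f(n_new) / f(n_old). A small Python sketch (names are illustrative) reproduces every row:

```python
import math

def projected_time(f, t_old, n_old, n_new):
    """Scale a measured run-time t_old at size n_old to size n_new,
    assuming the algorithm's step count grows as f(n)."""
    return t_old * f(n_new) / f(n_old)

growth = {
    "O(1)":       lambda n: 1,
    "O(log n)":   lambda n: math.log2(n),
    "O(n)":       lambda n: n,
    "O(n log n)": lambda n: n * math.log2(n),
    "O(n^2)":     lambda n: n ** 2,
    "O(n^3)":     lambda n: n ** 3,
    "O(2^n)":     lambda n: 2 ** n,
}

# 1 second at size 8, projected to size 16, as in the table above.
for name, f in growth.items():
    print(name, projected_time(f, 1.0, 8, 16))
```

Running this prints 1, 4/3, 2, 8/3, 4, 8 and 256 seconds, matching the table.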
Tree Structures used in Searching Algorithms
A tree is a way of organizing objects related in a hierarchical fashion.
A tree is a type of data structure in which each element is attached to one or
more elements directly beneath it.
The connections between elements are called branches.
Trees are often called inverted trees because they are drawn with the root at the
top.
The elements that have no elements below them are called leaves.
A binary tree is a special type of tree in which each element has at most two
branches below it.
Properties
The various properties of trees are as follows:
Tree is a special case of a graph.
The topmost node in a tree is called the root node.
At root node all operations on the tree begin.
A node has at most one parent.
The topmost node (root node) has no parents.
Each node has zero or more child nodes, which are below it.
The nodes at the bottommost level of the tree are called leaf nodes.
Since leaf nodes are at the bottommost level, they do not have children.
A node that has a child is called the child's parent node.
The depth of a node n is the length of the path from the root to the node.
The root node is at depth zero.
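These properties can be illustrated with a minimal Python node class (a sketch; the class and method names are illustrative):

```python
class Node:
    def __init__(self, value, parent=None):
        self.value = value
        self.parent = parent          # a node has at most one parent
        self.children = []            # zero or more child nodes
        if parent is not None:
            parent.children.append(self)

    def depth(self):
        # Length of the path from the root to this node; the root is at depth 0.
        d, node = 0, self
        while node.parent is not None:
            d += 1
            node = node.parent
        return d

    def is_leaf(self):
        # Leaf nodes have no children.
        return not self.children

root = Node("root")                   # topmost node: no parent
a = Node("a", root)
b = Node("b", a)
# root.depth() == 0, b.depth() == 2, b is a leaf, root is not
```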
Stacks and Queues
Stacks and queues are data structures that maintain last-in, first-out and
first-in, first-out order respectively. Both stacks and queues are often
implemented as linked lists, but that is not the only possible implementation.
Stack - Last In First Out (LIFO) lists
An ordered list; a sequence of items, piled one on top of the other.
The insertions and deletions are made at one end only, called Top.
If Stack S = (a[1], a[2], ..., a[n]), then a[1] is the bottommost element.
Any intermediate element a[i] is on top of element a[i-1], 1 < i <= n.
In a stack, all operations take place at the Top.
The Pop operation removes the item at the top of the stack.
The Push operation adds an item at the top of the stack.
Queue - First In First Out (FIFO) lists
An ordered list; a sequence of items; there are restrictions on how items
can be added to and removed from the list. A queue has two ends.
All insertions (enqueue) take place at one end, called the Rear or Back.
All deletions (dequeue) take place at the other end, called the Front.
If the queue has a[n] as its rear element, then a[i+1] is behind a[i], 1 <= i < n.
All operations take place at one end of the queue or the other.
The Dequeue operation removes the item at the Front of the queue.
The Enqueue operation adds an item to the Rear of the queue.
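In Python, a list and a collections.deque give minimal sketches of the two disciplines:

```python
from collections import deque

stack = []                 # LIFO: push and pop at the same end, the Top
stack.append("x")          # Push "x"
stack.append("y")          # Push "y"
top = stack.pop()          # Pop removes the most recently pushed item
# top == "y"

queue = deque()            # FIFO: enqueue at the Rear, dequeue at the Front
queue.append("x")          # Enqueue "x" at the Rear
queue.append("y")          # Enqueue "y" at the Rear
front = queue.popleft()    # Dequeue removes the oldest item, at the Front
# front == "x"
```

The same input sequence "x", "y" comes back out as "y" first from the stack but "x" first from the queue, which is exactly the LIFO/FIFO distinction above.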
Search
Search is the systematic examination of states to find a path from the start/root
state to the goal state.
Search usually results from a lack of knowledge.
Search explores knowledge alternatives to arrive at the best answer.
The output of a search algorithm is a solution, that is, a path from the initial
state to a state that satisfies the goal test.
For general-purpose problem-solving – ‘Search’ is an approach.
Search deals with finding nodes having certain properties in a graph that
represents search space.
Search methods explore the search space ‘intelligently’, evaluating
possibilities without investigating every single possibility.
Examples
For a Robot this might consist of PICKUP, PUTDOWN,
MOVEFORWARD, MOVEBACK, MOVELEFT, and MOVERIGHT—
until the goal is reached.
Puzzles and Games have explicit rules: e.g., the ‘Tower of Hanoi’ puzzle.
This puzzle involves a set of rings of different sizes that can be placed on
three different pegs.
The puzzle starts with the rings arranged as shown in Figure 2.1(a).
The goal of this puzzle is to move them all to the arrangement shown in Figure 2.1(b).
Condition: Only the top ring on a peg can be moved, and it may only be
placed on a smaller ring, or on an empty peg.
(a) Start (b) Final
In this Tower of Hanoi puzzle, the situations encountered while solving the problem
are described as states, and the set of all possible configurations of rings on the
pegs is called the 'problem space'.
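One common recursive solution, shown here as a Python sketch (names are illustrative), moves the top n-1 rings aside, moves the largest ring, and then restacks the n-1 rings on top of it, respecting the condition that a ring may only rest on a larger one:

```python
def hanoi(n, source, spare, target, moves):
    """Move n rings from the source peg to the target peg, using the
    spare peg; only the top ring of a peg ever moves, and a ring is
    only ever placed on a larger ring or an empty peg."""
    if n == 0:
        return
    hanoi(n - 1, source, target, spare, moves)   # clear the top n-1 rings
    moves.append((source, target))               # move the largest ring
    hanoi(n - 1, spare, source, target, moves)   # restack the n-1 rings

moves = []
hanoi(3, "A", "B", "C", moves)
# Three rings are moved from peg A to peg C in 2^3 - 1 = 7 moves.
```

Each move is a state change in the problem space, and the sequence of moves is a path from the start state to the goal state.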
States
A state is a representation of elements in a given moment.
A problem is defined by its elements and their relations.
At each instant of a problem, the elements have specific descriptors and relations;
the descriptors indicate how to select elements.
Among all possible states, there are two special states:
Initial state – the start point
Final state – the goal state
State Change: Successor Function
A ‘successor function’ is needed for state change. The successor function moves
one state to another state.
Successor Function:
It is a description of possible actions; a set of operators.
It is a transformation function on a state representation, which converts
that state into another state.
It defines a relation of accessibility among states.
It represents the conditions of applicability of a state and corresponding
transformation function.
State Space
A state space is the set of all states reachable from the initial state.
A state space forms a graph (or map) in which the nodes are states and the
arcs between nodes are actions.
In a state space, a path is a sequence of states connected by a sequence of
actions.
The solution of a problem is part of the map formed by the state space.
Structure of a State Space
The structures of a state space are trees and graphs.
A tree is a hierarchical structure in a graphical form.
A graph is a non-hierarchical structure.
A tree has only one path to a given node;
i.e., a tree has one and only one path from any point to any other point.
A graph consists of a set of nodes (vertices) and a set of edges (arcs). Arcs
establish relationships (connections) between the nodes; i.e., a graph has
several paths to a given node.
The operators are directed arcs between nodes.
A search process explores the state space. In the worst case, the search
explores all possible paths between the initial state and the goal state.
Problem Solution
In the state space, a solution is a path from the initial state to a goal state or,
sometimes, just a goal state.
A solution cost function assigns a numeric cost to each path; it also gives the
cost of applying the operators to the states.
Solution quality is measured by the path cost function; an optimal
solution has the lowest path cost among all solutions.
The solution required may be any solution, an optimal solution, or all solutions.
The importance of cost depends on the problem and the type of solution
asked for.
Problem Description
A problem consists of the description of the following:
The current state of the world.
The actions that can transform one state of the world into another.
The desired state of the world.
The following actions are taken to describe the problem:
State space is defined explicitly or implicitly
A state space should describe everything that is needed to solve a problem
and nothing that is not needed to solve the problem.
Initial state is start state
Goal state is the conditions it has to fulfill.
The description by a desired state may be complete or partial.
Operators are to change state.
Operators perform actions that can transform one state into another.
Operators consist of preconditions and instructions:
Preconditions provide a partial description of the state of the world that
must be true in order to perform the action, and
Instructions tell the user how to create the next state.
Operators should be as general as possible, so as to reduce their number.
Elements of the domain have relevance to the problem.
Knowledge of the starting point.
Problem solving is finding a solution
Find an ordered sequence of operators that transform the current (start)
state into a goal state.
Restrictions on solution quality: any, optimal, or all
Finding the shortest sequence, or
finding the least expensive sequence (defining cost), or
finding any sequence as quickly as possible.
This can also be explained with the help of algebraic function as given below.
Algebraic Function
A function may take the form of a set of ordered pairs, a graph or an equation.
Regardless of the form it takes, a function must obey the condition that no two of
its ordered pairs have the same first member with different second members.
Relation: A set of ordered pairs of the form (x, y) is called a relation.
Function: A relation in which no two ordered pairs have the same x-value but
different y-values is called a function. Functions are usually named by lower-case
letters such as f, g, and h.
For example, let f = {(-3, 9), (0, 0), (3, 9)} and g = {(4, -2), (4, 2)}.
Here f is a function; g is not a function, since the x-value 4 is paired with two
different y-values.
Domain and Range: The domain of a function is the set of all the first members
of its ordered pairs, and the range of a function is the set of all second members
of its ordered pairs.
If function f = {(a, A), (b, B), (c, C)}, then its domain is {a, b, c} and its range is
{A, B, C}.
Function and Mapping: A function may be viewed as a mapping or a pairing of
one set with elements of a second set such that each element of the first set (called
the domain) is paired with exactly one element of the second set (called the codomain).
For example, if a function f maps {a, b, c} into {A, B, C, D} such that a → A (read
'a is mapped into A'), b → B, c → C, then the domain is {a, b, c} and the codomain is
{A, B, C, D}. Since a is paired with A in the codomain, A is called the image of a.
Each element of the codomain that corresponds to an element of the domain is
called the image of that element.
The set of image points, {A, B, C}, is called the range. Thus, the range is a subset
of the codomain.
Onto Mappings: Set A is mapped onto set B if each element of set B is the image of
an element of set A. Thus, every function maps its domain onto its range.
Describing a Function by an Equation: The rule by which each x-value gets
paired with the corresponding y-value may be specified by an equation. For
example, the function described by the equation y = x + 1 requires that for any
choice of x in the domain, the corresponding range value is x + 1. Thus, 2 → 3,
3 → 4, and 4 → 5.
Restricting Domains of Functions: Unless otherwise indicated, the domain of
a function is assumed to be the largest possible set of real numbers. Thus: The
domain of y = x / (x^2 - 4) is the set of all real numbers except ±2, since for these
values of x the denominator is 0.
The domain of y = (x - 1)^(1/2) is the set of real numbers greater than or equal to 1,
since for any value of x less than 1 the radicand is negative, so the radical does
not represent a real number.
Example: Find which of the relations describe functions.
(a) y = x^(1/2), (b) y = x^3, (c) y > x, (d) x = y^2
Equations (a) and (b) produce exactly one value of y for each value of x. Hence,
equations (a) and (b) describe functions.
Relation (c), y > x, does not represent a function since it contains ordered
pairs such as (1, 2) and (1, 3), where the same value of x is paired with different
values of y.
Equation (d), x = y^2, is not a function since ordered pairs such as (4, 2) and
(4, -2) satisfy the equation but have the same value of x paired with different
values of y.
Function Notation: For any function f, the value of y that corresponds to a
given value of x is denoted by f(x).
If y = 5x - 1, then f(2), read as 'f of 2', represents the value of y when x = 2:
f(2) = 5 · 2 - 1 = 9;
when x = 3, f(3) = 5 · 3 - 1 = 14.
In an equation that describes function f, f(x) may be used in place of y; for
example, f(x) = 5x - 1. If y = f(x), then y is said to be a function of x.
Since the value of y depends on the value of x, y is called the dependent variable
and x is called the independent variable.
2.2.1 Defining Problem as a State Space Search
To solve the problem of playing a game, we require the rules of the game and
targets for winning as well as representing positions in the game. The opening
position can be defined as the initial state and a winning position as a goal state.
Moves from initial state to other states leading to the goal state follow legally.
However, the rules are far too abundant in most games, especially chess,
where the possibilities exceed the number of particles in the universe. Thus the rules
cannot be supplied accurately, and computer programs cannot handle them easily.
Storage also presents a problem, but searching can be achieved by hashing.
The number of rules used must be minimized, and the set can be
created by expressing each rule in as general a form as possible. The representation
of games leads to a state space representation, and it is common for well-organized
games with some structure. This representation allows for the formal definition of a
problem that requires movement from a set of initial positions to one of a set of
target positions. It means that the solution involves using known techniques and a
systematic search. This is quite a common method in Artificial Intelligence.
2.2.2 State Space Search
A state space represents a problem in terms of states and operators that change
states.
A state space consists of the following:
A representation of the states the system can be in. For example, in a board
game, the board represents the current state of the game.
A set of operators that can change one state into another state. In a board
game, the operators are the legal moves from any given state. Often the
operators are represented as programs that change a state representation
to represent the new state.
An initial state.
A set of final states; some of these may be desirable, others undesirable.
This set is often represented implicitly by a program that detects terminal
states.
The Water Jug Problem
In this problem, we use two jugs called four and three; four holds a maximum of
four gallons of water and three a maximum of three gallons of water. How can we
get exactly two gallons of water into the four jug?
The state space is a set of ordered pairs giving the number of gallons of
water in the pair of jugs at any time, i.e., (four, three) where four = 0, 1, 2, 3 or 4
and three = 0, 1, 2 or 3.
The start state is (0, 0) and the goal state is (2, n), where n may be any value,
since the three jug may hold from 0 to 3 gallons of water or be empty. 'Three' and
'four' are the jug names, and the numbers show the amount of water in each jug.
Table 2.1 lists the major production rules for solving this problem.
Table 2.1 Production Rules for the Water Jug Problem
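Since the production rules simply fill, empty, or pour between the jugs, the state space can be searched mechanically. The following Python sketch (names are illustrative) applies breadth-first search over (four, three) states to find a shortest solution:

```python
from collections import deque

def water_jug(goal=2, cap4=4, cap3=3):
    """Breadth-first search over states (four, three), the gallons in
    the four- and three-gallon jugs, until the four jug holds `goal`."""
    start = (0, 0)
    parents = {start: None}
    frontier = deque([start])
    while frontier:
        four, three = frontier.popleft()
        if four == goal:
            # Reconstruct the path of states back to (0, 0).
            path, s = [], (four, three)
            while s is not None:
                path.append(s)
                s = parents[s]
            return path[::-1]
        pour43 = min(four, cap3 - three)     # amount pourable 4 -> 3
        pour34 = min(three, cap4 - four)     # amount pourable 3 -> 4
        successors = [
            (cap4, three), (four, cap3),               # fill a jug
            (0, three), (four, 0),                     # empty a jug
            (four - pour43, three + pour43),           # pour 4 -> 3
            (four + pour34, three - pour34),           # pour 3 -> 4
        ]
        for s in successors:
            if s not in parents:
                parents[s] = (four, three)
                frontier.append(s)
    return None

path = water_jug()
# A shortest solution takes 6 rule applications (7 states),
# ending with 2 gallons in the four jug.
```

Because breadth-first search expands states level by level, the first goal state found is reached by a minimum-length sequence of production rules.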
2.3 PRODUCTION SYSTEMS
Example: The 8-Puzzle

Start state:        Goal state:
1  3  4             1  2  3
8  6  2             8     4
7     5             7  6  5

This example can be solved by the operator sequence UP, RIGHT, UP, LEFT,
DOWN.
Example: Missionaries and Cannibals
The Missionaries and Cannibals problem illustrates the use of state space search
for planning under constraints:
Three missionaries and three cannibals wish to cross a river using a two-
person boat. If at any time the cannibals outnumber the missionaries on either side
of the river, they will eat the missionaries. How can a sequence of boat trips be
performed that will get everyone to the other side of the river without any missionaries
being eaten?
State Representation:
1. BOAT position: original (T) or final (NIL) side of the river.
2. Number of Missionaries and Cannibals on the original side of the river.
3. Start is (T 3 3); Goal is (NIL 0 0).
Operators:
Table 2.3 lists the operators and their descriptions
Table 2.3 Operators and Descriptions
Operators Descriptions
(MM 2 0) Two Missionaries cross the river.
(MC 1 1) One Missionary and one Cannibal cross.
(CC 0 2) Two Cannibals cross.
(M 1 0) One Missionary crosses.
(C 0 1) One Cannibal crosses.
(A figure here showed part of the search tree for this problem, with states
expanded by the operators CC, C, M and MC.)
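The state representation and operators above can be turned into a short breadth-first search. The following Python sketch (names are illustrative) finds a safe sequence of crossings:

```python
from collections import deque

def safe(m, c):
    # Cannibals may never outnumber missionaries on either bank
    # (a bank with no missionaries is safe).
    return (m == 0 or m >= c) and (3 - m == 0 or 3 - m >= 3 - c)

def missionaries_and_cannibals():
    """BFS over states (boat, m, c): boat True means the boat is on the
    original side; m and c count the missionaries and cannibals there.
    Start is (True, 3, 3); goal is (False, 0, 0)."""
    moves = [(2, 0), (1, 1), (0, 2), (1, 0), (0, 1)]   # (M, C) in the boat
    start, goal = (True, 3, 3), (False, 0, 0)
    parents = {start: None}
    frontier = deque([start])
    while frontier:
        state = frontier.popleft()
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parents[state]
            return path[::-1]
        boat, m, c = state
        for dm, dc in moves:
            # A crossing moves people away from the boat's current side.
            nm, nc = (m - dm, c - dc) if boat else (m + dm, c + dc)
            if 0 <= nm <= 3 and 0 <= nc <= 3 and safe(nm, nc):
                nxt = (not boat, nm, nc)
                if nxt not in parents:
                    parents[nxt] = (boat, m, c)
                    frontier.append(nxt)
    return None

solution = missionaries_and_cannibals()
# The classic problem is solvable in 11 crossings (12 states).
```

The constraint check in `safe` is exactly the "cannibals must not outnumber missionaries" condition stated above, applied to both banks after every crossing.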
Heuristics are knowledge about the domain which help search and reasoning
in that domain.
Heuristic search incorporates domain knowledge to improve efficiency over
blind search.
A heuristic is a function that, when applied to a state, returns a value estimating
the merit of the state with respect to the goal.
Heuristics may, for various reasons, underestimate or overestimate the merit
of a state with respect to the goal.
Heuristics that underestimate are desirable and are called admissible.
A heuristic evaluation function estimates the likelihood of a given state leading
to the goal state.
A heuristic search function estimates the cost from the current state to the goal,
presuming the function is efficient.
Heuristic Search Compared with Other Search
Heuristic search is compared with brute force or blind search techniques
in Table 2.4.
Table 2.4 Comparison of Algorithms
* (Distance in Kilometres)
One situation is that the salesman starts from Hyderabad. In that case, one path
might be followed, as shown in Figure 2.4:

Hyderabad → Secunderabad (15) → Chennai (780) → Mumbai (420)
→ Bangalore (340) → Hyderabad (780)
TOTAL: 2335
Here the total distance is 2335 km. But this may not be a solution to the
problem; other paths may give a shorter route.
It is also possible to create a bound on the error in the answer, but in
general it is not possible to make such an error bound. In real problems, the value
of a particular solution is trickier to establish; this problem is easier if it is
measured in miles, while other problems have vague measures.
Although heuristics can be created for unstructured knowledge, producing a
coherent analysis is another issue, and this means that the solution lacks reliability.
Rarely is this an optimal solution, since the required approximations are usually
insufficient.
Although heuristic solutions are bad in the worst case, the worst case occurs
very infrequently.
Formal Statement
Problem solving is a set of statements describing the desired states expressed in a
suitable language; e.g., first-order logic.
The solution of many problems (like chess or noughts and crosses) can be
described by finding a sequence of actions that lead to a desired goal.
Each action changes the state, and
The aim is to find the sequence of actions that lead from the initial (start)
state to a final (goal) state.
A well-defined problem can be described by the example given below:
Example
Initial State: (S)
Operator or successor function: for any state x, returns s(x), the set of
states reachable from x with one action.
State space: all states reachable from the initial one by any sequence of
actions.
Path: sequence through state space.
Path cost: function that assigns a cost to a path; cost of a path is the sum of
costs of individual actions along the path.
Goal state: (G)
Goal test: test to determine if at goal state.
Search Notations
Search is the systematic examination of states to find path from the start / root
state to the goal state.
The notations used for this purpose are given as follows:
Evaluation function f(n): estimates the least cost solution through node n.
Heuristic function h(n): estimates the least cost path from node n to the
goal node.
Cost function g(n): estimates the least cost path from the start node to
node n (Refer Figure 2.5).
f(n) = g(n) + h(n)
where g(n) is the actual cost from the start node to node n, and h(n) is the
estimated cost from node n to the goal (Figure 2.5).
The notations ^f, ^g, ^h are sometimes used to indicate that these values
are estimates of f, g, h:
^f(n) = ^g(n) + ^h(n)
If h(n) ≤ the actual cost of the shortest path from node n to the goal, then h(n) is
an underestimate.
AI Search and Control of Strategies
Estimated cost function g*
The estimated least cost path from the start node to node n is written g*(n).
g* is calculated as the actual cost, so far, of the explored path.
g* is known exactly by summing all path costs from the start to the current state.
If the search space is a tree, then g* = g, because there is only one path from
the start node to the current node.
In general, the search space is a graph.
If the search space is a graph, then g* ≥ g;
g* can never be less than the cost of the optimal path; it can only overestimate
the cost.
g* can be equal to g in a graph if chosen properly.
Estimated heuristic function h*
The estimated least cost path from node n to the goal node is written h*(n).
h* is heuristic information; it represents a guess at 'how hard is it to reach
the goal state from the current node?'.
h* may be estimated using an evaluation function f(n) that measures the
'goodness' of a node.
h* may have different values; the values lie in the range 0 ≤ h*(n) ≤ h(n),
and different values correspond to different search algorithms.
If h* = h, it is a perfect heuristic; it means no unnecessary nodes are ever
expanded.
2.4.2 Best First Search
Best First Search or BFS is a search algorithm which searches a graph by
expanding the most favorable node selected according to a specified rule. Judea
Pearl described best first search as “Estimating the promise of node n by a
heuristic evaluation function f(n) which, in general, may depend on the
description of n, the description of the goal, the information gathered by the
search up to that point, and most important, on any extra knowledge about
the problem domain”.
Some authors have specifically used the term best first search for searching
using a heuristic search that attempts to predict how close the end of a path is to a
solution, so that paths which are judged to be closer to a solution are extended
first. This specific type of search is called greedy best first search. Efficient selection
of the current best candidate for extension is typically implemented using a priority
queue. The A* search algorithm is an example of best first search, as is B*. Best
first algorithms are often used for path finding in combinatorial search.
Algorithm
The algorithm for best first search must be correct in order to work efficiently.
Consider the following algorithm which is not correct, i.e., it does not always find
a possible path between two nodes, even if there is one. For example, it gets stuck
in a loop if it arrives at a dead end, i.e., a node with the only successor being its
parent. It would then go back to its parent, add the dead end successor to the
OPEN list again, and so on.
OPEN = [initial state]
while OPEN is not empty or until a goal is found
do
1. Remove the best node from OPEN, call it n.
2. If n is the goal state, backtrace path to n (through
recorded parents) and return path.
3. Create n’s successors.
4. Evaluate each successor, add it to OPEN, and record
its parent.
done
The following description extends the algorithm to use an additional CLOSED list,
containing all nodes that have been evaluated and will not be looked at again. As
this will avoid any node being evaluated twice, it is not subject to infinite loops.
OPEN = [initial state]
CLOSED = []
while OPEN is not empty
do
1. Remove the best node from OPEN, call it n, add it to
CLOSED.
2. If n is the goal state, backtrace path to n (through
recorded parents) and return path.
3. Create n’s successors.
4. For each successor do:
   a. If it is not in CLOSED and it is not in OPEN: evaluate
      it, add it to OPEN, and record its parent.
   b. Otherwise, if this new path is better than the previous
      one, change its recorded parent.
      i. If it is not in OPEN, add it to OPEN.
      ii. Otherwise, adjust its priority in OPEN using this
      new evaluation.
done
Best first search in its most universal form is a simple heuristic search algorithm.
‘Heuristic’ here refers to a general problem solving rule or set of rules that never
assure the best solution or even any solution, but functions as a suitable controller
for problem solving. Best first search is a graph-based search algorithm (Dechter
and Pearl, 1985), meaning that the search space can be represented as a set
of nodes connected by paths.
Typically, the term ‘Best First’ refers to the method of exploring the node
with the best ‘Score’ first. An evaluation function is used for assigning a score to
each candidate node. The algorithm maintains two lists, one containing a list of
candidates yet to explore (OPEN) and the other containing a list of visited nodes
(CLOSED). Since all unvisited successor nodes of every visited node are included
in the OPEN list, the algorithm is not restricted to only exploring successor nodes
of the most recently visited node. Alternatively, the algorithm always selects the
best of all unvisited nodes that have been graphed, rather than being restricted to
only a small subset, such as immediate neighbours.
The best first search algorithm proceeds in the following manner:
Step 1: Start with OPEN holding the initial state.
Step 2: Repeat.
Step 3: Pick the best node on OPEN.
Step 4: Generate its successors.
The first step is to define the OPEN list with a single node, the starting node. The
second step is to check whether or not OPEN is empty. If it is empty, then the
algorithm returns failure and exits. The third step is to remove the node with the
best score, n, from OPEN and place it in CLOSED. The fourth step “expands”
the node n, where expansion is the identification of successor nodes of n. The fifth
step then checks each of the successor nodes to see whether or not one of them
is the goal node. If any successor is the goal node, the algorithm returns success
and the solution, which consists of a path traced backwards from the goal to the
start node. Otherwise, the algorithm proceeds to the sixth step. For every successor
node, the algorithm applies the evaluation function, f, to it and then checks to see
if the node has been in either OPEN or CLOSED. If the node has not been in
either, it gets added to OPEN. Finally, the seventh step establishes a looping
structure by sending the algorithm back to the second step. This loop will only be
broken if the algorithm returns success in step five or failure in step two.
The algorithm is represented as follows in the form of pseudo-code:
1. Define a list, OPEN, consisting solely of a single node, the start node, s.
2. IF the list is empty, return failure.
3. Remove from the list the node n with the best score (the node where f is the
minimum), and move it to a list, CLOSED.
4. Expand node n.
5. IF any successor to n is the goal node, return success and the solution (by
tracing the path from the goal node to s).
6. FOR each successor node:
· Apply the evaluation function, f, to the node.
· IF the node has not been in either list, add it to OPEN.
7. Loop the structure by sending the algorithm back to the second step.
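The pseudo-code above can be realized concretely. The following Python sketch (names are illustrative) keeps OPEN as a priority queue ordered by the score f and CLOSED as a set, and follows the simplified step 6 that only adds nodes not yet seen in either list:

```python
import heapq

def best_first_search(start, goal, successors, f):
    """Greedy best first search: repeatedly expand the node on OPEN
    with the best (lowest) score f(n); CLOSED holds expanded nodes."""
    open_heap = [(f(start), start)]          # OPEN as a priority queue
    parents = {start: None}                  # every node seen so far
    closed = set()
    while open_heap:
        _, n = heapq.heappop(open_heap)      # best node on OPEN
        if n == goal:                        # success: trace path back to start
            path = []
            while n is not None:
                path.append(n)
                n = parents[n]
            return path[::-1]
        closed.add(n)
        for s in successors(n):              # expand n
            if s not in closed and s not in parents:
                parents[s] = n               # record its parent
                heapq.heappush(open_heap, (f(s), s))
    return None                              # OPEN exhausted: failure

# Toy example: walk a grid from (0, 0) to (3, 3), scoring each cell
# by its Manhattan distance to the goal.
goal = (3, 3)
succ = lambda p: [(p[0] + 1, p[1]), (p[0], p[1] + 1)] if p != goal else []
h = lambda p: abs(goal[0] - p[0]) + abs(goal[1] - p[1])
path = best_first_search((0, 0), goal, succ, h)
# path starts at (0, 0) and ends at (3, 3)
```

Swapping the score function changes the algorithm: using f(n) = g(n) + h(n) instead of h(n) alone turns this greedy search into an A*-style search.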
In addition, best first search is an algorithm that traverses a graph in search of one
or more goal nodes. For example, a maze is a special illustration of the mathematical
object known as a ‘Graph’. The defining characteristic of this search is that, best
first search uses an evaluation function called a ‘heuristic’ search to determine
which object is the most promising and then examines this object. This ‘best first’
behaviour is implemented with a PriorityQueue. The algorithm for best first search
is as follows:
Best-First-Search( Maze m )
Insert( m.StartNode )
Until PriorityQueue is empty
c <- PriorityQueue.DeleteMin
If c is the goal
Exit
Else
For each neighbor n of c
If n “Unvisited”
Mark n “Visited”
Insert( n )
Mark c “Examined”
End procedure
The objects which will be stored in the PriorityQueue are maze cells, and the
heuristic search will be the cell's 'Manhattan distance' from the exit. The Manhattan
distance is a fast-to-compute and surprisingly accurate measurement of how likely
a MazeCell will be on the path to the exit. Geometrically, the Manhattan distance
is the distance between two points when one may only walk along paths at
90-degree angles to each other.
Branch and Bound (BB) is a general algorithm that is used to find optimal solutions
of various optimization problems, particularly in discrete and combinatorial
optimization. It systematically enumerates candidate solutions, in which
large subsets of fruitless candidates are discarded in groups by using
upper and lower estimated bounds of the quantity being optimized.
A.H. Land and A.G. Doig in 1960 were the first ones to propose the method
for linear programming.
General Description
For concreteness, presume that the aim is to find the least value of a function f(x), in
which x ranges over a certain set S of permissible or candidate solutions
(the search space or feasible region). Note that finding the highest value of
f(x) is possible by finding the minimum of g(x) = -f(x). (For instance, one could
take S to be the set of all possible trip schedules for a bus fleet, and f(x) could
be the anticipated revenue for schedule x.)
A branch and bound process needs two tools. The first is a splitting
process which, given a set S of candidates, returns two or more smaller
sets S1, S2, ..., whose union covers S. Note that the minimum of
f(x) over S is min{v1, v2, ...}, where each vi is the minimum of f(x) within Si. This
step is known as branching, as its recursive application defines a tree structure
(the search tree) whose nodes are the subsets of S.
The second tool is a process that calculates upper and lower limits for
the minimum value of f(x) within a given subset of S. This step is known as bounding.
The main idea of the BB algorithm is that if the lower limit for some
tree node (set of candidates) A is higher than the upper limit for another
node B, then A may safely be discarded from the search. This step is known as
pruning, and is generally implemented with the help of a global variable m (shared
among all nodes of the tree) that records the minimum upper limit seen among
all subregions investigated so far. Any node whose lower limit is greater than m
can be discarded.
The recursive process stops once the current candidate set S is reduced
to a single element, or when the upper limit for set S matches the lower limit.
Either way, any element of S will be a minimum of the function within S.
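As a sketch of these ideas, the following Python function (names are illustrative) applies branching, bounding and pruning to the 0/1 knapsack problem mentioned later in this section, using a fractional-fill upper bound:

```python
def knapsack_bb(values, weights, capacity):
    """Branch and bound for the 0/1 knapsack problem (a maximization,
    so a branch is pruned when an optimistic upper bound on its value
    cannot beat the best complete solution found so far)."""
    # Sort items by value density so the fractional bound is tight.
    items = sorted(zip(values, weights),
                   key=lambda vw: vw[0] / vw[1], reverse=True)
    best = 0   # global best value seen so far (the variable m above)

    def bound(i, value, room):
        # Optimistic upper bound: fill the remaining room fractionally.
        for v, w in items[i:]:
            if w <= room:
                room -= w
                value += v
            else:
                return value + v * room / w
        return value

    def branch(i, value, room):
        nonlocal best
        best = max(best, value)
        # Prune: this subtree cannot improve on the best known value.
        if i == len(items) or bound(i, value, room) <= best:
            return
        v, w = items[i]
        if w <= room:
            branch(i + 1, value + v, room - w)   # branch 1: take item i
        branch(i + 1, value, room)               # branch 2: skip item i

    branch(0, 0, capacity)
    return best

best_value = knapsack_bb([60, 100, 120], [10, 20, 30], 50)
# best_value == 220 (take the 100 and 120 items: weight 20 + 30 = 50)
```

Here "take item i" and "skip item i" are the two smaller candidate sets produced by branching, and the fractional estimate is the optimistic bound that lets whole subtrees be discarded without enumeration.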
Effective Subdivision
The effectiveness of the technique is strongly dependent on the node-splitting
process and on the upper and lower limit estimators. Everything else being equal,
it is most advisable to select a splitting method that gives non-overlapping subsets.
Typically the process terminates when all search nodes have been either
pruned or solved. At that stage, all non-pruned subregions will have their
upper and lower bounds equal to the global minimum of the function.
Self - Learning
74 Material
Practically the procedure is frequently stopped after a prescribed time. At that
point, the minimum lower and upper bound, amongst all non-pruned sections,
determine a range of values that contains the global minimum. Optionally, inside an
overriding time constraint, it is possible to terminate the algorithm whenever any
error criterion, like (max – min)/(min + max), falls below a prescribed value.
The effectiveness of the technique relies crucially on the efficiency of the
branching and bounding algorithms used; bad choices can result in repeated
branching, with no pruning, until the sub-regions become extremely small. In that
case, the method is reduced to an exhaustive enumeration of the domain,
which is generally impractically large. No global bounding algorithm that would provide
a solution for all problems exists, and there is little prospect of
ever finding one. Therefore, the general approach has to be applied distinctly for
every application, with branching and bounding algorithms that are particularly
designed for it.
It is possible to categorize branch and bound techniques as per the bounding
methods and as per the methods of creating/inspecting the search tree nodes.
The branch and bound design technique is much like backtracking, in that
a state space tree is utilized for problem solving. The differences are that this
method (1) does not restrict us to any specific method of traversing the tree and
(2) is utilized solely for optimization problems.
This method by nature lends itself to being applied both in a parallel and
distributed manner; see, e.g., the travelling salesman problem article.
Applications
The branch and bound algorithm is used for the resolution of the following problems:
Knapsack Problem
Integer Programming
Nonlinear Programming
Traveling Salesman Problem (TSP)
Quadratic Assignment Problem (QAP)
Maximum Satisfiability Problem (MAX-SAT)
Nearest Neighbor Search (NNS)
Cutting Stock Problem
False Noise Analysis (FNA)
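To make the first of these concrete, here is a minimal branch and bound sketch for the knapsack problem; it maximizes value, pruning with an optimistic fractional-relaxation bound, and the item data is purely illustrative:

```python
# Branch and bound for the 0/1 knapsack problem (illustrative sketch).
# Items are (value, weight) pairs; the fractional-knapsack relaxation
# gives an optimistic upper bound used for pruning.

def knapsack_bb(items, capacity):
    # Sort by value density so the fractional bound is easy to compute.
    items = sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True)
    best = 0

    def bound(i, value, room):
        # Optimistic bound: fill remaining room with fractional items.
        for v, w in items[i:]:
            if w <= room:
                value += v
                room -= w
            else:
                return value + v * room / w
        return value

    def branch(i, value, room):
        nonlocal best
        if value > best:
            best = value
        if i == len(items) or bound(i, value, room) <= best:
            return  # prune: this subtree cannot beat the incumbent
        v, w = items[i]
        if w <= room:                      # branch 1: take item i
            branch(i + 1, value + v, room - w)
        branch(i + 1, value, room)         # branch 2: skip item i

    branch(0, 0, capacity)
    return best

print(knapsack_bb([(60, 10), (100, 20), (120, 30)], 50))  # prints 220
```

Note that knapsack is a maximization problem, so here the roles of the bounds are mirrored relative to the minimization description above: the relaxation supplies an optimistic upper bound, and the incumbent best value plays the part of m.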
Branch and bound can even be a base for several heuristics. For instance,
one can choose to terminate the branching once the gap between the upper and
lower bounds becomes smaller than a particular threshold. This is made use
of when the solution is ‘good enough for realistic objectives’ and can greatly
reduce the computations required. This kind of solution is mainly applicable
when the cost function in use is noisy or is the outcome of statistical
estimates, so that its value is not known precisely but is only known to lie
inside a range of values with a particular probability. An instance of its
implementation is in biology, when cladistic analysis is performed for
evaluating evolutionary relationships between organisms, where the data sets are
frequently impractically huge without heuristics.
This causes branch and bound techniques to be frequently used in game
tree search algorithms, most significantly through the utilization of alpha-beta pruning.
Branch and Bound Algorithm Technique
Branch and bound is yet another algorithm technique presented in this
multi-part series on algorithm design patterns and techniques. Branch
and bound is one of the most complicated techniques and is certainly difficult to
discuss as a whole in one article. Thus, the focus will be on the A* algorithm,
which is the most distinct branch and bound graph search algorithm.
By now, you should be familiar with the most vital techniques,
like backtracking, the greedy strategy, divide and conquer, dynamic programming,
and even genetic programming, which have all been covered. This background is extremely helpful in
understanding how branch and bound algorithms differ from them.
Branch and bound is an algorithm technique that is frequently applied to find
the optimal solutions when optimization problems arise. It is chiefly brought into
use for combinatorial and discrete global optimizations of problems. In short, this
technique is the best option when the domain of probable candidates is extremely
big and all the other algorithms prove unsuccessful. This technique is grounded on
the group removal of the candidates.
The tree structure of algorithms must already be known to you. Among the
techniques learned, both backtracking and divide-and-conquer travel through
the tree in its depth, though they adopt opposite routes. The greedy strategy takes
up a single route and does not bother about the others. Dynamic programming is
known to approach this in a kind of variation of Breadth First Search (BFS).
Now, in case the decision tree of the problem that you plan to solve has
a truly unlimited depth, then, by definition, the backtracking and divide
and conquer algorithms are eliminated. The greedy strategy cannot be relied upon
since it is problem-dependent and does not ensure delivery of a global optimum,
unless mathematically proved otherwise.
As a final resort, you can also consider dynamic programming. The fact is
that possibly the problem can actually be solved with the help of dynamic
programming. However, the implementation will not be an effective approach;
moreover, its implementation will be difficult. Therefore, you can understand that
when a complicated problem requires many parameters to describe
the solutions of its sub-problems, dynamic programming will prove to be
ineffective.
If a real-world example is still required, consider the fifteen puzzle. One of
the most direct implementations of dynamic programming will need sixteen different
parameters for representing the optimum values of the solutions of every sub-
problem. This means a 16-dimensional array. Thus, dynamic
programming is ruled out.
This can be summarized in a list form as follows:
Suppose you are attempting to find the minimum of a certain function
f(x) in a certain range x ∈ [x1; x2]. You do not want to simply iterate over all the
values of x—you need optimizations.
The BB algorithm first divides the [x1; x2] range into a number of sub-ranges.
After that, the BB algorithm makes an estimate of the lower and upper
bounds for the lowest value of f(x) for every sub-range; this is the stage
where optimization takes place—rather than evaluating all the values inside a sub-
range, just two evaluations of f(x) per sub-range can be performed.
A comparison of all the sub-ranges is done, and the one with the lowest
upper bound for the minimal f(x) value is examined further to find the universal
minimum of f(x). Rather than iterating over the final sub-range, it can even be divided
into more sub-ranges, applying the branch and bound algorithm once again.
The branch and bound algorithm appears to be a typical tradeoff case. You may
obtain speed (when sub-ranges are coarse, i.e., contain many data points), or you
may obtain accuracy (e.g., when each sub-range has only three data points,
so that it is difficult to miss a local minimum).
However, there is no proof to show that applying the branch and
bound algorithm to one set of data points with differing criteria for sub-range sizes
can achieve greater accuracy (by reducing the chances of missing a
local minimum) while retaining the disproportionately greater speed of coarse sub-
ranges.
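The steps above can be sketched in code for a one-dimensional function. The bounding scheme assumed here is that f has a known Lipschitz constant L, so a single midpoint evaluation per sub-range yields both an upper bound (the value itself) and a lower bound (the value minus L times the half-width):

```python
# Interval branch and bound for minimizing f on [lo, hi], assuming f is
# Lipschitz-continuous with constant L (a simple, illustrative bounding
# scheme; f, L and the interval below are hypothetical).
import heapq

def minimize_bb(f, lo, hi, L, tol=1e-4):
    best = f((lo + hi) / 2)                     # incumbent upper bound m
    # Priority queue of (lower_bound, lo, hi): explore best-first.
    queue = [(best - L * (hi - lo) / 2, lo, hi)]
    while queue:
        lower, a, b = heapq.heappop(queue)
        if lower > best - tol:                  # prune: cannot improve on m
            continue
        mid = (a + b) / 2
        best = min(best, f(mid))                # update incumbent
        for s, t in ((a, mid), (mid, b)):       # branch: split the range
            m = (s + t) / 2
            heapq.heappush(queue, (f(m) - L * (t - s) / 2, s, t))
    return best

# f(x) = (x - 1)^2 on [-3, 4]; max |f'| = 8 there, so L = 8 is valid.
print(minimize_bb(lambda x: (x - 1) ** 2, -3, 4, L=8))  # a value close to 0
```

The two evaluations per sub-range mentioned in the list become one evaluation here because a Lipschitz bound extracts both limits from the same midpoint value; other bounding schemes (e.g., interval arithmetic) would evaluate the endpoints instead.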
Recollect that in state space search the generators relate to shifts in the state
space. Therefore, the two states below the top state in the triangle of states
correspond to the shifts of the smallest disk either to the rightmost peg or to the
middle peg. The simplest solution to this problem seems to correspond to the path
down the right side of the state space. This solution is shown in Figure 2.6(b).
Notice that four states in the state space satisfy the goal of the first
subproblem and four satisfy the goal of the second subproblem. This
observation forces one to understand that what is indispensable when using non-terminal
rules for problem reduction is the presumption that there exists a path in the state
space that will be found once the partial state descriptions are bound to a
suitable state space.
Order of Problem Solving and Order of Problem Execution
One more vital aspect of problem reduction rules is that using them enables the
order of problem solving to be different from the order of problem execution.
In the state space mentioned, these two are evidently the same. In the given case,
the problem solver can select any of the subproblems to work on initially.
In the Tower of Hanoi problem no evident benefit exists of solving the
problem in an order that is different from the order of problem execution. However,
in certain problems one may note that some subproblems are not as difficult as
others and their solution can make the solution of the remaining subproblems simpler.
A Cryptarithmetic Problem has been devised to illustrate this point. In this case,
the subproblems correspond to the various equations that are formed in the algebraic
depiction of the problem. Notice that, in this case, the state space has 10! states,
and finding a solution for it via state space search is not possible for any sensible
human mind.
2.7 CONSTRAINT SATISFACTION
Constraint satisfaction is a common problem whose goal is finding values for a
set of variables which satisfy a given set of constraints. It is at the centre of
several applications in AI, and has witnessed its implementation in several domains,
including planning and scheduling. Due to its generality, most AI
researchers stand to gain from possessing sound knowledge of methods in
this field.
Constraint Satisfaction Problems
Constraint satisfaction problems or CSPs are mathematical problems defined as a
set of objects whose state must satisfy many constraints or limitations. CSPs help
in representing the entities in a problem as a uniform collection of finite limitations
over variables, that can be solved by constraint satisfaction techniques. CSPs are
the topic of intensive research in both AI and operations research, as their
customary formulation offers a general base for analysing and solving problems of
a number of unrelated families. CSPs frequently show great complexity,
requiring a combination of heuristics and combinatorial search techniques to be solved
within a reasonable time.
Examples of problems that can be modelled as a CSP are as follows:
Eight-queens puzzle
Map coloring problem
Sudoku
Boolean satisfiability
Formal Definition
Formally, a CSP can be defined as a triple (X, D, C), in which X is a set of
variables, D is a domain of values, and C is a set of constraints. Each constraint is
in turn a pair (t, R), where t is a tuple of variables and R is a set of tuples of values.
All these tuples have the same number of elements; the result is that R is a relation.
An assessment of the variables is a function from variables to values, v : X → D.
This kind of an assessment is known to satisfy a constraint ((x1, ..., xn), R) if (v(x1),
..., v(xn)) ∈ R. A solution is an assessment that satisfies all constraints.
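The definition can be made concrete in a short sketch; the variables, domains and constraint below are a hypothetical toy instance:

```python
# A CSP as a triple (X, D, C): variables, domains, and constraints.
# Each constraint pairs a tuple of variables t with a relation R, the
# set of value tuples those variables may take simultaneously.

X = ["x", "y"]
D = {"x": {1, 2, 3}, "y": {2, 3}}
C = [
    (("x", "y"), {(1, 2), (1, 3), (2, 3)}),   # encodes x < y
]

def satisfies(assignment, constraints):
    # v satisfies ((x1, ..., xn), R) if (v(x1), ..., v(xn)) is in R.
    return all(tuple(assignment[v] for v in t) in R for t, R in constraints)

print(satisfies({"x": 1, "y": 3}, C))  # True: (1, 3) is in the relation
print(satisfies({"x": 3, "y": 2}, C))  # False: (3, 2) is not
```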
Resolution of CSPs
CSPs on fixed domains are classically solved with the help of a form of search.
The most used methods are types of backtracking, constraint propagation and
local search.
Backtracking is a recursive algorithm. It maintains an incomplete
assignment of the variables. In the beginning, none of the variables is assigned.
At every step, a variable is selected, with all possible values being assigned to it in
turn. For every value, the consistency of the incomplete assignment
with the constraints is checked. In case of consistency, a recursive call is carried
out. When every value has been tried, the algorithm backtracks. In this basic
backtracking algorithm, consistency is defined as the satisfaction of all constraints
whose variables are all assigned. Different types of backtracking exist.
Backmarking enhances the effectiveness of consistency checking. Backjumping
enables a portion of the search to be saved by backtracking ‘more than one
variable’ in certain instances. Constraint learning deduces and saves new constraints
which can be made use of later to avoid performing a part of the
search. Look-ahead is also frequently made use of in backtracking to try to predict
the effects of selecting a variable or a value. Thus, it sometimes determines in
advance whether a subproblem can be solved or not.
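The basic backtracking procedure described above can be sketched as follows; the toy instance at the bottom is hypothetical:

```python
# Minimal recursive backtracking for a CSP: assign variables one at a
# time, check each partial assignment against all constraints whose
# variables are fully assigned, and backtrack when every value fails.

def consistent(assign, constraints):
    return all(tuple(assign[v] for v in t) in R
               for t, R in constraints
               if all(v in assign for v in t))

def backtrack(variables, domains, constraints, assign=None):
    assign = assign if assign is not None else {}
    if len(assign) == len(variables):
        return assign                          # every variable assigned
    var = next(v for v in variables if v not in assign)
    for value in domains[var]:
        assign[var] = value
        if consistent(assign, constraints):
            result = backtrack(variables, domains, constraints, assign)
            if result is not None:
                return result
        del assign[var]                        # undo and try the next value
    return None                                # exhausted: backtrack

# Hypothetical toy instance: x < y over the domains below.
solution = backtrack(["x", "y"],
                     {"x": [1, 2, 3], "y": [2, 3]},
                     [(("x", "y"), {(1, 2), (1, 3), (2, 3)})])
print(solution)  # {'x': 1, 'y': 2}
```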
Constraint propagation techniques are made use of for modifying a
CSP. More accurately, they are techniques that impose a kind of local
consistency, which are conditions associated with the consistency of a group
of variables and/or constraints. Constraint propagation has several uses.
First, it turns a problem into one that is equivalent but is generally
simpler to solve. Second, it can prove whether a problem can be
solved or not. This is not guaranteed to happen in general;
however, it always occurs for certain kinds of constraint propagation and/
or for certain types of problems. The most popular types of local consistency
that are widely used are arc consistency, hyper-arc consistency, and path
consistency. The most widely known constraint propagation technique is
the AC-3 algorithm, which imposes arc consistency.
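A minimal sketch of the AC-3 algorithm, assuming binary constraints stored as sets of allowed ordered pairs (the instance below is hypothetical):

```python
# AC-3: repeatedly revise arcs (xi, xj) until every arc is consistent.
# C maps an ordered arc (xi, xj) to its set of allowed value pairs.
from collections import deque

def revise(domains, C, xi, xj):
    # Remove values of xi with no supporting value in xj's domain.
    removed = {a for a in domains[xi]
               if not any((a, b) in C[(xi, xj)] for b in domains[xj])}
    domains[xi] -= removed
    return bool(removed)

def ac3(domains, C):
    queue = deque(C)                       # start with every arc
    while queue:
        xi, xj = queue.popleft()
        if revise(domains, C, xi, xj):
            if not domains[xi]:
                return False               # a domain emptied: no solution
            # Re-examine arcs pointing at xi, since its domain shrank.
            queue.extend((xk, xl) for (xk, xl) in C if xl == xi)
    return True

# x < y with x in {1,2,3}, y in {2,3}; both arc directions are listed.
domains = {"x": {1, 2, 3}, "y": {2, 3}}
C = {("x", "y"): {(1, 2), (1, 3), (2, 3)},
     ("y", "x"): {(2, 1), (3, 1), (3, 2)}}
print(ac3(domains, C), domains)  # True {'x': {1, 2}, 'y': {2, 3}}
```

Here x = 3 is removed because no value of y supports it; the remaining domains are arc-consistent.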
Local search methods are incomplete satisfiability algorithms; they are not
guaranteed to find a solution to a problem. They function by repeatedly improving
a complete assignment over the variables. At every stage, a few variables change
value, with the overall objective of increasing the number of constraints
satisfied by this assignment. The min-conflicts algorithm is a local search algorithm
specific to CSPs and grounded on that principle. In practice, local search
appears to function effectively when these changes are also affected by random
choices. Combinations of search with local search have been developed, resulting in
hybrid algorithms.
Theoretical Aspects of CSPs
CSPs are also a subject of study in computational complexity theory and finite
model theory. A vital question is whether for each set of relations, the set of all
CSPs that can be represented using only relations chosen from that set is
either in PTIME or otherwise NP-complete (assuming P ≠ NP). If this kind of
a dichotomy is true, then CSPs offer one of the largest known subsets of NP
that evades problems that are neither polynomial time solvable nor NP-
complete, whose existence was shown by Ladner. Dichotomy results are
known for CSPs where the domain of values is of size 2 or 3, but the general
case remains open.
The largest classes of CSPs known to be tractable are
those where the hypergraph of constraints has bounded treewidth (with no
limitations on the set of constraint relations), or where the constraints have an
arbitrary form but there exist essentially non-unary polymorphisms of the set of
constraint relations.
Each CSP may even be considered as a conjunctive query containment problem.
Types of CSPs
The typical model of a CSP outlines a model of static, rigid constraints. This
inflexible model is a shortcoming that makes the representation of some problems difficult.
Many proposals for changing the primary CSP definition have been
put forth, so that the model adapts to a broader range of problems.
Dynamic CSPs
Dynamic CSPs (DCSPs) are helpful when the original formulation of a problem is
modified in some way, typically because the set of constraints being considered
evolves with the environment. DCSPs are seen as a series of
static CSPs, with each one being a modification of the earlier one wherein it is
possible to add (restriction) or remove (relaxation) variables and constraints.
Information found in the original formulations of the problem may be made use of
for refining the subsequent ones. It is possible to classify the solving method as per
the method in which transfer of information takes place. The solving methods are
as follows:
Oracles: The solutions found to earlier CSPs in the series are made use of
as heuristics to direct the resolution of the present CSP from scratch.
Local repair: Every CSP is computed beginning from the incomplete solution
of the earlier one and making repairs to the varying constraints with local
search.
Constraint recording: New constraints are defined at every step of the
search for representing the learning of varying group of decisions. Those
constraints are taken over to the new CSP problems.
Flexible CSPs
Typical CSPs handle constraints as rigid. This means that they are mandatory
(every solution must satisfy all of them) and inflexible (meaning that
they must be fully satisfied or else they are totally violated). Flexible CSPs
relax those presumptions, partially relaxing the constraints and
permitting solutions that do not comply with all of them. Some kinds of flexible
CSPs are as follows:
MAX-CSP, in which violation of several constraints is permitted, with
the quality of a solution being measured by the number of satisfied
constraints.
Weighted CSP, a MAX-CSP in which every constraint being violated is
weighed as per a pre-defined preference. Therefore, satisfying constraint
with greater weight takes preference.
Fuzzy CSPs model constraints as fuzzy relations in which the satisfaction of
a constraint is a continuous function of its variables’ values, ranging from
completely satisfied to completely violated.
Specifying Constraint Problems
Just like many successful AI techniques, constraint solving is all about finding solutions
to problems. By some means define the intelligent task as a problem, then translate it into
a CSP, place it inside a constraint solver and check if you find a solution. The
parts that CSPs contain are as follows:
A set of variables X = {x1, x2, ..., xn}
A finite set of values that each variable can take. This is known as the
domain of the variable. The domain of variable xi is written as Di.
A set of constraints that denotes the values that variables can assume
simultaneously
Depending on the solver that you are using, constraints are frequently
expressed as relationships between variables, e.g., x1 + x2 < x3. However, to
discuss constraints in a more formal manner, the following notation is used:
A constraint Cijk specifies the tuples of values that variables xi, xj and xk are
permitted to take simultaneously. In simple terms, a constraint usually speaks about
things which cannot happen; formally, it lists the tuples (vi, vj, vk)
that xi, xj and xk can take simultaneously. As a simple example, suppose you have
a CSP with two variables x and y, and that x can take values {1,2,3}, while y can
take values {2,3}. Then the constraint that x=y will be expressed as:
Cxy = {(2,2), (3,3)}
and the constraint that x<y will be expressed as
Cxy = {(1,2),(1,3),(2,3)}
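Such constraint sets can be generated mechanically by filtering the Cartesian product of the two domains; the sketch below reproduces the two sets above:

```python
# Enumerate the allowed pairs for a binary constraint by filtering the
# Cartesian product of the two domains through a predicate.
from itertools import product

def make_constraint(dom_x, dom_y, pred):
    return {(a, b) for a, b in product(dom_x, dom_y) if pred(a, b)}

print(sorted(make_constraint({1, 2, 3}, {2, 3}, lambda a, b: a == b)))
# [(2, 2), (3, 3)]
print(sorted(make_constraint({1, 2, 3}, {2, 3}, lambda a, b: a < b)))
# [(1, 2), (1, 3), (2, 3)]
```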
A solution to a CSP is an assignment of values, one to every variable, in a manner
that no constraint is broken. Depending on the problem at hand, the user
may only want to know that a solution exists, i.e., the user will take the
first answer given. Alternatively, they may need all the solutions to the problem, or
they may want to know that the problem has no solution. Sometimes the
aim of the exercise is finding the optimum solution on the basis of a certain measure
of worth. Sometimes this can be done without all the solutions being enumerated;
at other times, it is essential to find all solutions and then determine
which one is the optimum. In the high-IQ problem, a solution is merely a
set of lengths, one per square.
Binary Constraints
Unary constraints denote that a specific variable can adopt certain values, which
primarily limits the domain for that variable, and hence needs to be taken care of
while the CSP is being specified. Binary constraints associate two variables, and
binary constraint problems are particular CSPs involving only binary constraints.
Binary CSPs have a distinct place in the theory since all CSPs can be denoted as
binary CSPs. Moreover, binary CSPs can be represented using graphs and
matrices, which can make them easier to comprehend.
Binary constraint graphs, such as the one shown in Figure 2.10, represent
constraint problems clearly. Here, the nodes are the variables and the edges
denote the constraints between the two variables that each edge
joins (note that the constraints indicate the values that can be assumed at the same
time).
Fig. 2.10 A Binary Constraint Graph: Nodes (X1, X2, X4, ...) are Variables and
Each Edge Carries its Set of Allowed Value Pairs, e.g., {(5, 7), (2, 2)}
Matrices can also be used to represent binary constraints, with one matrix
for every constraint. For instance, in the above constraint graph, the constraint
between variables x4 and x5 is {(1,3),(2,4),(7,6)}. Table 2.6 represents this.
Table 2.6 Matrices
C 1 2 3 4 5 6 7
1 *
2 *
7 *
In this table, the asterisks signify an entry (i, j) such that variable x4 can
take value i at the same time that variable x5 assumes value j. Since it is possible to
write all CSPs as binary CSPs, the artificial generation of random binary CSPs as
a set of matrices is frequently used to evaluate the relative capabilities of constraint
solvers. However, it is important to note that real-world constraint problems have
much more structure than what you obtain from such random constructions.
A common example of a CSP is the ‘n-queens’ problem, which is the problem
of positioning n queens on a chess board in a manner that no queen threatens
another along the vertical, horizontal or diagonal. There are several possibilities
for representing this as a CSP (in fact, finding the best specification of a problem, such
that a solver is able to get the answer as quickly as possible, is an extremely skilled
art). One possibility is to have the variables represent the rows, and the values they
can assume represent the column on that row where a queen is located.
Take a look at the following solution to the 4-queens problem shown in Figure
2.11.
Then, if you count rows from the top downwards and columns from the left,
the solution can be denoted as: X1=2, X2=4, X3=1, X4=3. This is due to the fact
that the queen on row 1 is in column 2, the queen in row 2 is in column 4, the
queen in row 3 is in column 1 and the queen in row 4 is in column 3. The constraint
between variable X1 and X2 will be:
C1,2 = {(1,3),(1,4),(2,4),(3,1),(4,1),(4,2)}
As an exercise, try to work out precisely what the above constraint
says.
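The constraint can also be reproduced programmatically: queens in rows i and j may not share a column or a diagonal, so the allowed pairs are exactly those violating neither condition:

```python
# Allowed (column_i, column_j) pairs for queens placed on rows i and j
# of an n x n board: columns must differ, and the column difference
# must not equal the row difference (which would share a diagonal).
def queens_constraint(i, j, n=4):
    return {(a, b) for a in range(1, n + 1) for b in range(1, n + 1)
            if a != b and abs(a - b) != abs(i - j)}

print(sorted(queens_constraint(1, 2)))
# [(1, 3), (1, 4), (2, 4), (3, 1), (4, 1), (4, 2)]
```

This matches the set C1,2 given above.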
Arc Consistency
There have been several advances in how constraint solvers search
for solutions (recall that a solution is an assignment of a value to every variable such
that no constraint is violated). You first look at a pre-processing
stage, namely arc-consistency, which can greatly improve effectiveness by pruning
the search space.
The pre-processing routine for binary constraints called arc-consistency
involves calling a pair (xi, xj) an arc; note that this is an ordered pair, i.e.,
it is not the same as (xj, xi). Every arc is associated with one constraint Cij, which constrains
variables xi and xj. The arc (xi, xj) is said to be consistent if, for all values a
in Di, there is a value b in Dj such that the assignment xi = a and xj = b satisfies
constraint Cij. Remember that (xi, xj) being consistent does not necessarily signify
that (xj, xi) is also consistent. To utilize this as a pre-processing step, each pair of
variables needs to be taken and made arc-consistent. That is, each pair (xi, xj)
is taken and the values which make it inconsistent are removed from Di,
till it becomes consistent. This effectively removes values from the
domain of variables. Hence, it prunes the search space, making it possible for the
solver to succeed (or to fail to find a solution) faster.
To show the value of carrying out an arc-consistency check before beginning
a search for a solution, consider an example from Barbara Smith’s tutorial. Imagine that
you have four tasks to complete, namely, A, B, C and D, and you are trying to
schedule them. They are subject to the following constraints:
Task A is known to last for 3 hours and precedes tasks B and C.
Task B is known to last for 2 hours and precedes task D.
Task C is known to last for 4 hours and precedes task D.
Task D is known to last for 2 hours.
This problem is modelled with a variable for each task’s start time,
namely startA, startB, startC and startD. You also have a variable for the
overall start time, namely start, and a variable for the overall finishing time, namely
finish. You may take {0} as the domain for the variable start; however,
the domains for all the other variables are {0, 1, ..., 11}, since the sum of the durations
of the tasks is 3 + 2 + 4 + 2 = 11. The English specification of the constraints can
be translated into our formal model, beginning with the following
inequalities:
start ≤ startA
startA + 3 ≤ startB
startA + 3 ≤ startC
startB + 2 ≤ startD
startC + 4 ≤ startD
startD + 2 ≤ finish
Then, by considering the values that each pair of variables can take in a
simultaneous manner, the constraints can be written as follows:
Cstart,startA = {(0,0), (0,1), (0,2), ..., (0,11)}.
CstartA,start = {(0,0), (1,0), (2,0), ..., (11,0)}.
CstartA,startB = {(0,3), (0,4), ..., (0,11), (1,4), (1,5), ..., (8,11)}, etc.
Now, every arc will be verified for arc-consistency, and if it is not, values
will be removed from the domains of variables until consistency is attained. First
consider the arc (start, startA), which is associated with the constraint
{(0,0), (0,1), (0,2), ..., (0,11)} above. You need to verify whether there is any value P
in Dstart without a corresponding value Q such that (P,Q) satisfies
the constraint, i.e., appears in the set of assignable pairs. As Dstart is only {0},
there is no cause for worry. You then look at the arc (startA, start), and
verify whether there is any value P in DstartA which does not have a corresponding Q
such that (P,Q) is in CstartA,start. Again, there is no cause for worry, since all the
values in DstartA appear in CstartA,start.
If you now look at the arc (startA, startB), then the constraint in question is
{(0,3), (0,4), ..., (0,11), (1,4), (1,5), ..., (8,11)}. It can be seen that there is no
pair of the form (9,Q) in the constraint; likewise, no pair of the form (10,Q) or
(11,Q). Hence, this arc is not arc-consistent, and the values 9, 10 and 11 must be
eliminated from the domain of startA to bring consistency to the arc. This
is correct, since you know that if task B starts after task A, which has a duration of 3
hours, and all tasks must start by the eleventh hour, then it is not possible for task A
to begin after the eighth hour. Therefore, it is possible to remove the values 9, 10
and 11 from the domain of startA.
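The domain reduction just performed can be reproduced with a single revision of the arc (startA, startB), assuming the encoding of the constraint given above:

```python
# One arc-consistency revision of the arc (startA, startB): drop every
# start time for A that leaves no legal start time for B. Task A lasts
# 3 hours and precedes B, and all start times lie in 0..11.
d_startA = set(range(12))
d_startB = set(range(12))
c_ab = {(a, b) for a in range(12) for b in range(12) if a + 3 <= b}

# Keep only values of startA supported by at least one value of startB.
d_startA = {a for a in d_startA if any((a, b) in c_ab for b in d_startB)}
print(sorted(d_startA))  # [0, 1, 2, 3, 4, 5, 6, 7, 8]
```

As in the text, 9, 10 and 11 are removed, since a + 3 ≤ b ≤ 11 forces a ≤ 8.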
This technique of eliminating values from domains is extremely effective.
The domains become very small, as shown in the scheduling network in
Figure 2.12.
Fig. 2.12 The Scheduling Network After Arc-Consistency: start {0}, startA {0, ..., 2},
startB {3, ..., 7}, startC {3, ..., 5}, startD {7, ..., 9}, finish {9, ..., 11}
You can see that the largest domain contains only 5 values. This denotes
that a majority of the search space has been pruned. In practice, to eliminate as
many values as possible in a CSP that depends on precedence constraints,
you need to work backwards: take the start time of the task T which
must take place last, and make every arc of the form (startT, Y)
consistent for every variable Y. Subsequently, go on to the task that needs to
take place second to last, and so on. In CSPs only involving precedence constraints,
arc-consistency is certain to eliminate all values that cannot appear in a solution to
the CSP. In general, however, such a guarantee cannot be made, but arc-
consistency usually improves the initial specification of a problem in some
way or the other.
Search Methods and Heuristics
The question now arises of how constraint solvers search for solutions—
assignments of values to variables that preserve the constraints—to the CSPs
they are given. The most evident approach is using a depth first search: assign
a value to the first variable and check that this assignment does not violate
any constraints. Then, go on to the next variable, assigning it a value and checking
that this does not violate any constraints, then go on to the next variable, and so
on. In case an assignment violates a constraint, select another value for
the assignment till one is found that satisfies the constraints. If none can be found,
this is when the search should backtrack. In this condition, the
earlier variable is looked at again, with the next value for it being tried. In
this manner, all probable sets of assignments will be tried, and a solution
found. The search diagram shown in Figure 2.13—taken from Smith’s tutorial
paper—denotes how the search for a solution to the 4-queens problem
progresses until it comes across a solution.
Fig. 2.13 The Progress of Backtracking Search on the 4-Queens Problem
You can see that the first time it backtracks is after it has failed to place
a queen in row three given queens in positions (1,1) and (2,3). In this case, it
backtracked and moved the queen at (2,3) to (2,4). Ultimately, this did not work
out either, so it was forced to backtrack further and move the queen at (1,1) to
(1,2). This helped in reaching the solution much faster.
A technique called forward checking is used by constraint solvers to add
some sophistication to the search method. The general notion is to work like
a backtracking search. However, while conformance with constraints is checked
after a value has been assigned to a variable, the agent will also check whether this
assignment will break constraints with future variable assignments. That is, if Vc
has been assigned to the present variable c, then for every unassigned variable xi,
(temporarily) eliminate all values from Di which, combined with Vc, break a
constraint. It is quite possible that while doing this Di becomes empty. This denotes
that the selection of Vc for the present variable is not a good one: it cannot find its
way into a solution to the problem, since there is no way of assigning a value to xi
without a constraint being broken. In this kind of situation, although assigning Vc
may not break any constraints with previously assigned variables, a new value is
selected (or backtracking takes place in case there are no values left), since it is
now known that Vc is not a good assignment.
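A minimal sketch of forward checking for the same 4-queens formulation: after each assignment, the prune step removes clashing values from every future row's domain, and an emptied domain rejects the assignment immediately. The names `prune` and `solve_fc` are hypothetical.

```python
def prune(domains, row, col):
    """Return domains for later rows with values clashing with (row, col)
    removed, or None if some future domain is wiped out."""
    reduced = {}
    for r, cols in domains.items():
        if r <= row:
            reduced[r] = cols
            continue
        allowed = {c for c in cols if c != col and abs(r - row) != abs(c - col)}
        if not allowed:               # wipe-out: this value cannot lead anywhere
            return None
        reduced[r] = allowed
    return reduced

def solve_fc(n, row=0, placement=(), domains=None):
    """Backtracking search with forward checking over per-row domains."""
    if domains is None:
        domains = {r: set(range(n)) for r in range(n)}
    if row == n:
        return list(placement)
    for col in sorted(domains[row]):
        reduced = prune(domains, row, col)
        if reduced is not None:       # only recurse on consistent futures
            result = solve_fc(n, row + 1, placement + (col,), reduced)
            if result is not None:
                return result
    return None

print(solve_fc(4))   # [1, 3, 0, 2]
```

The same solution is found, but inconsistent branches are cut off one level earlier than in plain backtracking.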
The diagram shown in Figure 2.14 (again taken from Smith’s tutorial) depicts
how forward checking improves the search for a solution to the
4-queens problem.
[Fig. 2.14: 4-queens board diagrams showing forward checking pruning the domains of future rows during the search]
The problem solver is a search engine that looks over a problem space
defined by the present domain and operators. As it performs its search, a
problem-solving trace is made by it. This incorporates every step of the search
(including paths that were later dropped) and its own logic at the time that is
about the present state of the search. This trace can be made use of by the
learning modules.
Prodigy’s problem solver makes use of MEA for solving problems.
Differences are determined by comparing the current state with the goal state.
Unlike most MEA systems, Prodigy can have many goals existing simultaneously
that need to be considered. The system first decides the goal that needs to be
achieved and then produces differences for that single goal once this determination
has been made. These differences can be calculated easily because all knowledge
is represented in PDL, which is both uniform and transparent.
Once the differences are calculated for the current goal, operators are proposed
on the basis of their capability to reduce the differences. As Prodigy makes
use of STRIPS-like operators, this step amounts to scanning the current operators’
add-lists to determine whether an operator asserts any predicates which, after
binding, would eliminate a difference (the assertion of a goal-state conjunct).
Choice among the operators is mediated by control rules. When control rules are
absent, in keeping with Prodigy’s casual commitment technique, arbitration
between operators defaults to a depth-first consideration of each.
When the control rules are absent, the search defaults to depth-first MEA.
Nodes in the problem space are defined as the set of goals and the state of the
world, both represented in first-order predicate logic. The search continues through
the problem space until a node is found that achieves the top-level goal, using
the following algorithm:
Decision Phase
o Determining the node for expanding next (by either control rules or
DFS).
o Determining new goal from this node.
o Selecting operator for achieving the goal.
o Binding operator parameters.
Expansion Phase
o If it is possible to apply the operator, apply it. Or else, create subgoal
on non-matching preconditions.
Control Rules: Control rules are made use of for the following three purposes:
1. To improve the search efficiency
2. To improve the solution quality
3. To direct the problem solver along normally-unexplored paths
In order to search, Prodigy presumes that the search will be directed by
exclusive control knowledge to make crucial decisions. This presumption is known
as the casual commitment technique. It denotes that the problem solver will not
attempt sophisticated conduct when that kind of control knowledge is absent.
Control rules consist of a left-hand side, which is matched by temporarily
binding variables against the present state axioms, and a right-hand side action
that indicates whether to SELECT, REJECT or PREFER a specific candidate.
A control decision is made on the basis of these indications, determining a new
candidate node.
Problem Space Architecture (Soar)
Soar was used to demonstrate the existence of a universal weak method, an approach
which holds that the search technique should emerge from the interaction
between the structure of the agent and the posed task. The search strategy
selected is presumed to be weak; i.e., the agent does not possess much
knowledge regarding the task environment. Therefore, any of the weak methods
might emerge in Soar when this universal weak method and the task interact with
each other. The benefit of this kind of approach is that it avoids program
synthesis: the behaviour results from the interaction of knowledge and task, instead
of being programmed in a precise and detailed manner.
A summary of certain weak methods that have been demonstrated in Soar
with the help of the idea of a universal weak technique is as follows:
Heuristic Search
Operator Subgoaling
Waltz Constraint Propagation
Means-Ends Analysis
Generate and Test
Breadth-First Search
Depth-First Search
Look-Ahead Search
Simple and Steepest-Ascent Hill Climbing
Progressive Deepening
Mini-Max
Alpha-Beta Pruning
Iterative Deepening
Branch and Bound
Best-First Search
Macro-Operators
Modular Integrated Architecture (ICARUS): Daedalus
Daedalus makes use of a different type of MEA for generating plans. This component
calls on Labyrinth to retrieve suitable operators or stored plans on the basis of
the problem’s preconditions, postconditions or the differences that it reduces. If
Daedalus detects a loop or a dead end, it backtracks and retrieves a different
operator, which produces a heuristic depth-first search through means-ends space.
If Labyrinth returns an entire plan, Daedalus performs a kind of derivational
analogy to check its validity for the problem at hand.
This algorithm of priority-first execution would appear to open up the possibility
of starvation of lower-priority tasks. The available explanation is not clear on this
issue, so no conclusions can be drawn either way.
Problem-Solving as Search
A vital feature of intelligent behaviour, as studied in AI, is goal-based
problem solving. It is a framework in which the solution of a problem can be
described as a search for a series of actions leading to a desired goal. A
goal-seeking system is connected to its external environment by sensory channels,
through which it receives information about the environment, and motor channels,
through which it acts on the environment (the word ‘afferent’ describes ‘inward’
sensory flows, and ‘efferent’ describes ‘outward’ motor commands). Moreover,
the system has some means of storing in memory information about the state of
the environment (afferent information) and information about actions (efferent
information). The ability to achieve goals depends on building up associations,
simple or complex, between particular changes in states and particular actions
that will bring these changes about. Search is the process of discovering and
assembling a series of actions that will lead from a given state to a desired
state. While this technique can suit machine learning and problem solving, it is not
always recommended for humans (e.g., cognitive load theory and its implications).
Working Methodology of MEA
The MEA technique is an approach that is used for controlling search in problem
solving. Given a current state and a goal state, an action is chosen that will
reduce the difference between the two. The action is applied to the current
state to produce a new state, and the process is applied recursively to this
new state and the goal state.
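The recursion just described can be sketched on a toy domain. The kettle-and-teabag actions and every name below are invented for illustration; a real MEA system would supply its own operators and difference tables.

```python
# A toy means-ends analysis planner. States are sets of facts; an action is
# chosen because it adds a missing goal fact, and its unmet preconditions are
# pursued as subgoals first.

ACTIONS = [
    {"name": "boil water", "needs": {"kettle"},
     "adds": {"hot water"}, "dels": set()},
    {"name": "brew tea", "needs": {"hot water", "teabag"},
     "adds": {"tea"}, "dels": {"hot water"}},
]

def mea(state, goal, depth=6):
    """Return (plan, resulting state), or (None, state) on failure."""
    if depth == 0:
        return None, state
    plan = []
    for fact in goal - state:               # the differences to reduce
        for a in ACTIONS:
            if fact in a["adds"]:           # a relevant operator
                subplan, state2 = mea(state, a["needs"], depth - 1)
                if subplan is None or not a["needs"] <= state2:
                    continue                # preconditions unreachable
                state = (state2 - a["dels"]) | a["adds"]
                plan.extend(subplan + [a["name"]])
                break
        else:
            return None, state              # nothing reduces this difference
    return plan, state

plan, _ = mea({"kettle", "teabag"}, {"tea"})
print(plan)   # ['boil water', 'brew tea']
```

Note how the difference {tea} selects 'brew tea', whose unmet precondition 'hot water' becomes a subgoal achieved by 'boil water' first.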
Notice that, for MEA to be effective, the goal-seeking system must have a
means of associating with any kind of detectable difference those actions that
are relevant for reducing that difference. It must also have the means to detect the
progress being made (the changes in the differences between the actual and
the desired state), since certain attempted sequences of actions may prove
unsuccessful and, hence, certain alternative sequences may have to be tried.
When knowledge is available concerning the importance of differences, the
most important difference is selected first, which further improves the average
performance of MEA over other brute-force search techniques. However, even
when differences are not ordered by importance, MEA improves over other
search heuristics (again in the average case) by focussing the problem solving
on the actual differences between the present state and the goal.
Fig. 2.15 Knowledge Progression
Data is viewed as a collection of disconnected facts.
E.g.: It is raining.
Information emerges when relationships among facts are established and
understood. Providing answers to ‘who’, ‘what’, ‘where’ and ‘when’ gives
the relationships among facts.
E.g.: The temperature dropped 15 degrees and it started raining.
Knowledge emerges when relationships among patterns are identified and
understood. Answering ‘how’ gives the relationships among patterns.
E.g.: If the humidity is very high and the temperature drops substantially, the
atmosphere cannot hold the moisture, so it rains.
Wisdom is the understanding of the principles of relationships that describe patterns.
Providing the answer to ‘why’ gives understanding of the interaction between
patterns.
E.g.: Understanding of all the interactions that may happen between raining,
evaporation, air currents, temperature gradients and changes.
Let’s look at the various kinds of knowledge that AI systems might need to represent:
Objects
Objects are defined through facts.
E.g.: Guitars have strings; trumpets are brass instruments.
Events
Events are actions.
E.g.: Vinay played the guitar at the farewell party.
Performance
Playing the guitar involves behavioural knowledge about how to do things.
Meta-knowledge
Knowledge about what we know. To solve problems in AI, we must
represent knowledge and must deal with the entities.
Facts
Facts are truths about the real world on what we represent. This can be
considered as knowledge level.
Knowledge Model
Knowledge model defines that as the level of ‘connectedness’ and ‘understanding’
increases, our progress moves from data through information and knowledge to
wisdom (Refer Figure 2.16).
The model represents transitions and understanding.
The transitions are from data to information, information to knowledge, and
finally knowledge to wisdom.
Understanding supports the transition from one stage to the next.
The distinctions between data, information, knowledge and wisdom are not
very discrete; they are more like shades of grey than black and white.
Data and information deal with the past and are based on gathering facts
and adding context.
Knowledge deals with the present and enables us to perform.
Wisdom deals with the future: a vision of what will be, rather than what is
or was.
[Fig. 2.16: the knowledge model, plotting degrees of understanding against degrees of connectedness; data, information, knowledge and wisdom are linked by understanding of relations, patterns and principles. An accompanying diagram shows knowledge conversion between data, information, concepts, insight and wisdom through socialization, externalization, combination and internalization.]
Principles are the basic building blocks of theoretical models and allow for making
predictions and drawing implications. These artifacts support the knowledge
creation process, which produces two types of knowledge: declarative and
procedural, explained below.
Knowledge Type
Cognitive psychologists sort knowledge into declarative and procedural
categories and some researchers add strategic as a third category.
Procedural knowledge
Examples: procedures, rules, agendas and models
Focuses on tasks that must be performed to reach a particular objective or goal
Knowledge about ‘how to do something’; e.g., to determine whether Peter or
Robert is older, first find their ages

Declarative knowledge
Examples: concepts, objects, strategies, facts, propositions, assertions,
semantic nets, logic and descriptive models
Refers to representations of objects and events; knowledge about facts and
relationships
Knowledge about ‘whether something is true or false’; e.g., a car has four
tyres; Peter is older than Robert
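The contrast can be made concrete in a few lines of Python. The names and ages below are invented purely for illustration.

```python
# Declarative: a static collection of facts (hypothetical ages).
facts = {("age", "Peter", 34), ("age", "Robert", 31)}

# Procedural: executable code describing HOW to answer the question --
# to determine who is older, first find the ages.
def older(facts, a, b):
    ages = {name: n for (_, name, n) in facts}
    return a if ages[a] > ages[b] else b

print(older(facts, "Peter", "Robert"))   # Peter
```

The facts say nothing about how they will be used; the procedure encodes a method but no facts of its own.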
Note : About procedural knowledge, there is some disparity in views. One view is that it is
close to tacit knowledge; it manifests itself in the doing of something, yet cannot be expressed
in words; e.g., we read faces and moods.
Another view is that it is close to declarative knowledge; the difference is
that a task or method is described instead of facts or things. All declarative
knowledge is explicit knowledge; it is knowledge that can be and has been
articulated.
Strategic knowledge is considered to be a subset of declarative knowledge.
Knowledge Representation and Reasoning
Knowledge Representation (KR) is basically a replacement for an actual thing. It
enables an entity to decide the end result by thinking rather than acting. KR is a set
of answers to questions. It is a medium for pragmatically efficient computation.
According to John Sowa, in Knowledge Representation: Logical,
Philosophical, and Computational Foundations, ‘knowledge representation is
a multidisciplinary subject that applies theories and techniques from three other
fields: 1. Logic provides the formal structure and rules of inference. 2. Ontology
defines the kinds of things that exist in the application domain. 3. Computation
supports the applications that distinguish knowledge representation from pure
philosophy.’
According to David Poole, Alan Mackworth and Randy Goebel, in
Computational Intelligence: A Logical Approach, ‘in order to use knowledge
and reason with it, you need what we call a Representation and Reasoning System
(RRS). A representation and reasoning system is composed of a language to
communicate with a computer, a way to assign meaning to the language, and
procedures to compute answers given input in the language. Intuitively, an RRS
lets you tell the computer something in a language where you have some meaning
associated with the sentences in the language, you can ask the computer questions,
and the computer will produce answers that you can interpret according to the
meaning associated with the language. . . . One simple example of a representation
and reasoning system . . . . is a database system. In a database system, you can tell
the computer facts about a domain and then ask queries to retrieve these facts.
What makes a database system into a representation and reasoning system is the
notion of semantics. Semantics allows us to debate the truth of information in a
knowledge base and makes such information knowledge rather than just data.’
Knowledge representation and reasoning is a domain in Artificial Intelligence
(AI). Its scope involves how to ‘think’ formally: using a symbol system to
represent the domain of discourse, together with functions that permit formal
reasoning about the objects within that domain to take place. Usually, some form
of logic is employed to provide formal semantics for how the reasoning functions
apply to the symbols in the domain of discourse, and to supply quantifiers and
modal operators that give meaning to the sentences in the logic.
Overview
Rules, frames, semantic networks and tagging are techniques of representation
that have come up from human information processing theories. Knowledge
representation aims to represent knowledge such that it can facilitate the drawing
of conclusions from knowledge. The issues that arise in knowledge representation
from the viewpoint of AI are as follows:
Representation of knowledge by people
Nature of knowledge and its representation
Representation schemes vis-à-vis a particular or a general-purpose domain
Nature of expression of a representation scheme or formal language
Declarative or procedural scheme
Scant discussion and research exists on knowledge representation–related issues.
Some of the well-known problems that exist are as follows:
Spreading Activation: Deals with issues in navigating a network of nodes
Subsumption: Deals with selective inheritance
Classification: Deals with classification of a product under both genre and sub-genre
Knowledge representation refers to representations aimed at information
processing by modern computers, and specifically, for representations that consist
of explicit objects (the class of all chimpanzees or Johnny as a specific entity), and
of claims regarding them (Johnny is a chimpanzee, or all chimpanzees are cute and
know tricks). In this case, representing knowledge so explicitly enables computers
to arrive at conclusions from already-stored knowledge (Johnny is cute and knows
tricks).
Glimpse into the History of Knowledge Representation and Reasoning
Knowledge representation methods, such as heuristic knowledge, neural
networks and theorem proving, were tried in the 1970s and early 1980s,
with varying degrees of success. Also, various medical diagnoses and games
such as chess were major application areas.
In 1972, Prolog was developed. It represented propositions and rudimentary
logic that could derive conclusions from established premises.
The 1980s witnessed the birth of formal computer knowledge, languages
and systems. Major projects, such as the ‘Cyc’ project, which is ongoing,
tried to encode large bodies of general knowledge. This project went through
a large encyclopedia, encoding the information needed by readers to
understand basic physics, notions of time, causality, motivation,
commonplace objects and classes of objects.
Around the same time, much larger databases of language information were
being built in computational linguistics, and these coupled with faster
processing speed and capacity made intense knowledge representation
more feasible.
In the 1980s, KL-ONE targeted knowledge representation itself.
In 1995, the Dublin Core standard of metadata was conceived. Languages
were developed to represent document structure, such as SGML (from
which HTML descended) and later XML. These enabled the retrieval of
information and efforts in data mining.
Web development has included the development of XML-based knowledge
representation languages and standards, including RDF, RDF Schema,
DARPA Agent Markup Language (DAML) and Web Ontology Language
(OWL).
Various languages and notations are being developed to represent knowledge,
which have their foundation in logic and mathematics.
Properties of Knowledge Representation Systems
A knowledge representation system must possess the following properties:
Representational Adequacy: It denotes the ability to represent the requisite
knowledge.
Inferential Adequacy: It refers to the simplicity with which inferences can
be drawn using represented knowledge.
[Figure: the mapping between facts and internal representations. English understanding maps English sentences into internal representations; formal computation operates on the internal representations; English generation maps internal representations back into English output.]
Natural language (or English) is the way of representing and handling the facts.
Logic enables us to represent the following fact:
Spot is a dog: dog(Spot)
We can also state that all dogs have tails:
∀x: dog(x) → has_a_tail(x)
By logical deduction we conclude:
has_a_tail(Spot)
Using a backward mapping function, the sentence ‘Spot has a tail’ can be generated.
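The forward and backward mappings can be sketched as follows; the tuple encoding of facts and the `to_english` helper are illustrative choices, not a standard API.

```python
# English facts are mapped to an internal representation, a rule derives a
# new internal fact, and a backward mapping renders it as English again.

facts = {("dog", "Spot")}

# Rule: for all x, dog(x) -> has_a_tail(x)
derived = {("has_a_tail", x) for (pred, x) in facts if pred == "dog"}

def to_english(fact):
    """Backward mapping from the internal representation to a sentence."""
    pred, arg = fact
    return f"{arg} has a tail" if pred == "has_a_tail" else f"{arg} is a {pred}"

print(to_english(next(iter(derived))))   # Spot has a tail
```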
The mapping functions are not always one-to-one; they are often many-to-many,
which is characteristic of English representations. The sentences ‘All dogs have
tails’ and ‘Every dog has a tail’ both say that each dog has a tail, but from the first
sentence one could conclude that each dog has more than one tail (try substituting
‘teeth’ for ‘tails’). When an AI program manipulates the internal representation of
facts, the new representations should also be interpretable as representations of
new facts.
Consider the classic problem of the mutilated chessboard. In a normal
chessboard, the opposite corner squares have been eliminated. The given task is
to cover all the squares on the remaining board with dominoes so that each
domino covers two squares. Overlapping of dominoes is not allowed. Consider
three data structures: the first two are shown in Figure 2.20, and the third is the
number of black squares and the number of white squares. The first diagram
loses the colour of the squares, and a solution is not easy. The second preserves
the colours but produces no easier search path. Counting the number of squares
of each colour, giving 32 black and 30 white, yields the negative answer directly:
a domino must cover one white square and one black square, so the numbers of
squares of each colour must be equal for a positive solution to exist.
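The colour-counting argument takes only a few lines to verify. The coordinate convention below (corners at (0,0) and (7,7), counted as white) is an assumption chosen to match the counts in the text.

```python
removed = {(0, 0), (7, 7)}   # two opposite corners; both are the same colour
squares = [(r, c) for r in range(8) for c in range(8) if (r, c) not in removed]

white = sum((r + c) % 2 == 0 for r, c in squares)   # the removed colour
black = len(squares) - white

# Every domino covers one black and one white square, so a full cover
# exists only if the two counts are equal.
print(black, white, "coverable" if black == white else "impossible")   # 32 30 impossible
```

Choosing the right data structure (the two counts) makes the impossibility immediate, whereas searching over domino placements does not.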
Using Knowledge
We have briefly discussed above where we can use knowledge. Let us consider
how knowledge can be used in various applications.
Learning
Acquiring knowledge is learning. It simply means adding new facts to a knowledge
base. New data may have to be classified prior to storage for easy retrieval, and it
must interact and be integrated with the existing facts while avoiding redundancy
and replication in the knowledge base. These facts should also be kept up to date.
Retrieval
Using the representation scheme shows a critical effect on the efficiency of the
method. Humans are very good at it. Many AI methods have tried to model humans.
Reasoning
Reasoning is getting or inferring new facts from existing data.
If a system only knows that:
Ravi is a jazz musician.
All jazz musicians can play their instruments well.
If questions are like this
Is Ravi a jazz musician? OR
Can jazz musicians play their instruments well?
then the answer is easy to get from the data structures and procedures.
However, questions like
Can Ravi play his instrument well?
require reasoning. The above capabilities are all related; for example, it is fairly
obvious that learning and reasoning involve retrieval.
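The retrieval/reasoning distinction can be sketched as follows; the `ask` function and the rule encoding are illustrative, not a standard library.

```python
# Stored facts answer the first two questions by simple retrieval,
# while the third requires one inference step.

facts = {("jazz_musician", "Ravi")}
rules = [("jazz_musician", "plays_well")]   # jazz_musician(x) -> plays_well(x)

def ask(pred, who):
    if (pred, who) in facts:                # retrieval
        return True
    for premise, conclusion in rules:       # one step of reasoning
        if conclusion == pred and (premise, who) in facts:
            return True
    return False

print(ask("jazz_musician", "Ravi"))   # True  (answered by retrieval)
print(ask("plays_well", "Ravi"))      # True  (requires reasoning)
```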
2.9.2 Approaches to Knowledge Representation
We discuss the various concepts relating to approaches to knowledge
representation in the following sections.
Knowledge Representation Using Natural Languages
The processing of natural languages gives machines the ability to read and
understand the languages spoken by humans. Researchers believe that an effective
natural language processing system would be able to gain knowledge on its own,
by reading the existing material available on the Internet. A few applications of
natural language processing include information retrieval and machine translation.
Natural languages are very expressive, and almost everything that can be
expressed symbolically can also be expressed in a natural language. It is the most
expressive knowledge representation formalism that humans use. However, natural
languages also have various limitations. Reasoning in them is very complex and
hard to mechanize. They are also highly ambiguous, and most people do not
understand the concepts of syntax and semantics. Also, there is very little uniformity
in sentence structures.
The association between natural language and knowledge representation
focuses on the two following points:
Study of hybrid logics (propositional, first-order and higher-order) and
implementation of efficient proof methods.
Investigation of other logics that are of relevance to natural language and
knowledge representation, such as memory logic, dedicated planning methods
and Discourse Representation Theory (DRT).
Properties for Knowledge Representation Systems
Knowledge representation systems possess the following properties:
Representational adequacy
Inferential adequacy
Inferential efficiency
Acquisitional efficiency
Well-defined syntax and semantics
Naturalness
Frame problem
Representational Adequacy
A knowledge representation scheme must be able to actually represent the
knowledge appropriate to our problem.
Inferential Adequacy
A knowledge representation scheme must allow us to make new inference from
old knowledge. It means the ability to manipulate the knowledge represented to
produce new knowledge corresponding to that inferred from the original. It must
make inferences that are as follows:
Sound – The new knowledge actually does follow from the old knowledge.
Complete – It should make all the right inferences.
Here soundness is usually easy, but completeness is very hard.
Example: Given knowledge
Tom is a man.
[Figure: a semantic network with ‘Musician’ at the top; ‘Jazz’ and ‘Avant-Garde/Jazz’ linked to it by is-a links, ‘Ravi’ and ‘John’ linked to those by instance links, each with a ‘bands’ link]
13. The following architectures enable the utilization of MEA:
Planning and Learning Architecture (Prodigy)
Problem Space Architecture (Soar)
Modular Integrated Architecture (ICARUS)
14. The problem solver is a search engine that looks over a problem space
defined by the present domain and operators. As it performs its search, a
problem-solving trace is made by it.
15. First-Order Predicate Calculus (FOPC) is the best-understood scheme
for knowledge representation and reasoning.
16. The core concepts of research in Artificial Intelligence (AI) are knowledge
representation and knowledge engineering.
17. The five basic types of artifacts of knowledge are as follows:
(i) Facts
(ii) Concepts
(iii) Processes
(iv) Procedures
(v) Principles
18. Prolog was developed in 1972.
19. Acquisitional efficiency refers to the ability of the knowledge representation
system to acquire knowledge with the help of automatic methods rather
than depend on human intervention.
20. Acquiring knowledge is learning. It simply means adding new facts to a
knowledge base.
21. A frame problem is a problem of representing the facts that change as well
as those that do not change.
22. There are two main attributes of knowledge representation: Instance and
Isa.
23. A semantic network is a tool used in knowledge representation that consists
of a structure of semantic terms.
24. A conceptual graph is a graphical notation for logic based on semantic
networks of artificial intelligence.
25. A mind map is a diagram that represents words, ideas, tasks or other items
linked to and arranged around a main keyword or idea.
26. According to the common sense law of inertia an action can be assumed
not to change a given property of a situation unless there is evidence to the
contrary.
2.11 SUMMARY
The word ‘search’ refers to the search for a solution in a problem space.
Search is the systematic examination of states to find a path from the
start/root state to the goal state.
A successor function is needed for state change. The successor function
moves one state to another state.
A set of ordered pair of the form (x, y) is called a relation.
The domain of a function is the set of all the first members of its ordered
pairs, and the range of a function is the set of all second members of its
ordered pairs.
To solve the problem of playing a game, we require the rules of the game
and targets for winning as well as representing positions in the game. The
opening position can be defined as the initial state and a winning position as
a goal state.
The problem is solved by using production rules in combination with an
appropriate control strategy, moving through the problem space until a path
from an initial state to a goal state is found.
Production systems provide appropriate structures for performing and
describing search processes.
Production systems provide us with good ways of describing the operations
that can be performed in a search for a solution to a problem.
A heuristic is a method that improves the efficiency of the search process.
Heuristics are like tour guides: they are good to the extent that they point
in generally interesting directions, and bad to the extent that they may miss
points of interest to particular individuals.
A salesman has to visit a list of cities and he must visit each city only once.
There are different routes between the cities. The problem is to find the
shortest route on which the salesman visits each city exactly once.
Depth-first search is attractive because it may find a solution without
examining all nodes, while breadth-first search is attractive because it cannot
be trapped in dead ends. Best-first search permits switching between paths,
thus gaining the benefit of both approaches.
Branch and Bound (BB) is a general algorithm that is used to find optimal
solutions of different optimization problems, particularly in discrete and
combinatorial optimization. It contains a systematic detail of each candidate
solution, in which big subsets of candidates giving no results are rejected in
groups, by making use of the higher and lower approximated limits of the
quantity that are undergoing optimization.
Constraint satisfaction is a general problem whose goal is to find values
for a set of variables that satisfy a given set of constraints.
One of the core concepts of research in AI is knowledge representation.
While declarative knowledge is represented as a static collection of facts
with a set of procedures for manipulating the facts, procedural knowledge
is described by an executable code that performs some action.
Declarative knowledge is knowledge of facts.
Procedural knowledge involves knowledge of formal language, symbolic
representations and knowledge of rules, algorithms and procedures.
There are four approaches to the goals of AI: computer systems that act
like humans, programs that simulate the human mind, knowledge
representation and mechanistic reasoning, and intelligent or rational agent
design.
Data is viewed as a collection of disconnected facts.
Information emerges when relationships among facts are established and
understood.
Knowledge emerges when relationships among patterns are identified and
understood.
Wisdom is the understanding of the principles of relationships that describe
patterns.
The Knowledge Model defines that as the level of ‘connectedness’ and
‘understanding’ increases, our progress moves from data through information
and knowledge to wisdom.
Knowledge Representation (KR) is basically a replacement for an actual
thing. It enables an entity to decide the end result by thinking rather than
acting.
Rules, frames, semantic networks and tagging are techniques of
representation that have come up from human information processing
theories.
Processing of natural languages gives machines the ability to read and
understand the languages spoken by humans.
There are two main important attributes in Knowledge Representation:
Instance and Isa.
A semantic network is a tool used in knowledge representation that consists
of a structure of semantic terms.
The frame problem is the challenge of representing the effects of action in
logic without having to represent explicitly a large number of intuitively
obvious non-effects.
Short-Answer Questions
1. What is problem space?
2. Give the definition of production system.
3. Define heuristic search.
4. What do you understand by the travelling salesman problem?
5. List the characteristics of heuristic search.
6. State the best first search.
7. What is branch and bound?
8. Define problem reduction.
9. Write a short note on dynamic constraint satisfaction problems.
10. How will you define means-ends analysis?
11. What role is played by knowledge in AI programs?
12. Write a short note on ‘artifacts of knowledge’.
13. Write a short note on ‘knowledge progression in AI systems’.
14. What do you understand by inferential efficiency and naturalness of
knowledge representation systems?
Long-Answer Questions
1. Explain how Big-O notation is used for measuring complexity of an algorithm.
2. Discuss briefly about the production systems. Give characteristics with the
help of examples.
3. What do you mean by heuristic search? Discuss heuristic search techniques
with the help of examples.
4. Describe the types of AI search techniques with the help of suitable example.
5. Describe the branch and bound algorithm technique with the help of suitable
example.
6. Explain briefly about the problem reduction. Give appropriate examples.
7. What is constraint satisfaction? Discuss briefly about the types of CSPs.
8. Give the name of architectures that enable the utilization of MEA.
9. Discuss the most difficult problems associated with knowledge
representation in AI.
10. What are the differences between procedural and declarative knowledge?
Give appropriate examples.
11. What do you understand by knowledge representation? Describe the various
methods used to represent knowledge in AI systems.
12. What are the advantages of using semantic networks in enterprises?
UNIT 3 PREDICATE LOGIC AND RULE BASED SYSTEM
3.0 INTRODUCTION
In Artificial Intelligence (AI), predicate logic provides the logic for dealing with
complex real-life scenarios. Prior to predicate logic, the popular way of knowledge
representation was propositional logic, which is suitable in situations where results
are either true or false, but not both. This limitation of propositional logic gives
way to the inception of predicate logic, using which you can not only represent the
existing facts but also derive new facts from the existing ones. However, the facts,
which are represented using predicate logic need to be true, otherwise the results
derived from the facts will be wrong. In classical logic, modus ponendo ponens
(Latin term for the way that affirms by affirming) is abbreviated as MP or Modus
Ponens. It is a valid and simple argument form and is also referred to as affirming
the antecedent or the law of detachment. It is closely related to another valid form
of argument, modus tollens.
Resolution is a procedure in which the statements are converted into a standard
form. Here, proofs are attempted to validate using refutation. One inference
procedure using resolution is known as refutation. Refutation is also known as
proof by contradiction or reductio ad absurdum. In other words, resolution attempts
to prove a statement by showing that the negation of that statement produces a
result that contradicts the known statements. Natural deduction methods perform
deduction in a manner similar to reasoning used by humans, e.g., in proving
mathematical theorems. Forward chaining and backward chaining are natural
deduction methods. Backtracking is used in many AI applications, and a number
of schemes have been devised to improve its efficiency.
Rule-based system, also known as expert system or production system, has
immense importance in the building of knowledge systems. In these systems, the
domain expertise is encoded in the form of ‘if–then’ rules. This enables a modular
portrayal of the knowledge, which facilitates its updating and maintenance. In this
unit, we learn about forward reasoning, backward reasoning, conflict resolution
and the use of non-backtracking. Forward reasoning, which is also known as forward
chaining, breaks a task down into manageable and understandable steps. In case
of backward chaining, the teaching process begins at the end of the sequence and
moves to the beginning. It is used when it seems easier to teach a student a task
from the last step instead of the first.
Matching is required between the current state and the preconditions of
rules for better searching. This searching involves choosing from the rules, which
can be applied at some particular point that can lead to a solution. Search control
knowledge is defined as knowledge regarding different paths that are most likely
to lead quickly to a goal state.
Backtracking and non-backtracking methods and algorithms are used by
AI researchers to develop various applications of AI, such as game playing, expert
system, robotics, etc. AI uses genetic programming, which is a technique for getting
programs to solve a task by generating random List Processing (LISP)
programs and selecting the fittest over many generations.
In this unit, you will learn about the overview of predicate logic, modus
ponens, resolution, natural deduction, dependency directed backtracking, rule
based systems, procedural vs declarative knowledge, forward or backward
reasoning, matching and conflict resolution, and the use of non-backtracking.
3.1 OBJECTIVES
After going through this unit, you will be able to:
Learn about the predicate logic
Represent simple facts in predicate logic
Define instance and is-a relationship
Discuss about the modus ponens
Elaborate on the resolution
Define natural deduction
Explain the dependency-directed backtracking
Learn about the rule based systems
Illustrate the procedural vs declarative knowledge
Analyse the forward vs backward reasoning
Understand the matching and conflict resolution
Discuss the use of non-backtracking
3.2 OVERVIEW OF PREDICATE LOGIC
From the preceding example, it can be easily concluded that Ram is not sitting
from the fact that Ram is running. Thus, it is seen that propositional logic is suitable
in situations in which the results are either true or false, but not both. However, to
the contrary of ease in use, propositional logic is not so powerful that it can represent
all types of facts and assertions used in computer science and mathematics.
Sometimes, it is also not ab1e to express certain types of relationships. To
demonstrate the limitations of propositional logic, let us consider the following
examples:
Let us consider the assertion, Tommy is a dog. In propositional logic, it can be
represented as follows:
TOMMYDOG
A similar sentence of this type can be, Maxi is a Dog, which can be represented in
propositional logic as follows:
MAXIDOG
However, from the above two representation of propositional logic, you cannot
differentiate between TOMMY and MAXI. You can represent them in a better
way as:
DOG(TOMMY)
DOG(MAXI)
Another limitation of propositional logic comes while representing the relationships
in sentences. For example, the assertion—All dogs have tails—can be represented
in propositional logic as follows:
TAILEDDOGS
However, from this representation, you cannot capture information about
the relationship that whether it is talking about two dogs or dogs in general.
Consider the assertion, x is greater than 10, where x is a variable. However,
from this assertion, you cannot ensure whether it is true or false, unless you know
the value of x. Therefore, it is not a proposition and proposition logic cannot deal
with such types of sentences.
Propositional logic also cannot capture the patterns involved in the logical
equivalences in the following sentences:
1. Not all dogs are male, which is equivalent to, some dogs are not male.
2. Not all integers are odd, which is equivalent to, some integers are not odd.
3. Not all bikes are expensive, which is equivalent to, some bikes are not
expensive.
In the preceding sentences, each of the propositions is considered
independent of the other in propositional logic. For example, if P represents that
not all dogs are male and Q represents that some dogs are not male, then there is
no mechanism in propositional logic to represent the fact that P is equivalent to Q.
Thus, from the various examples representing the limitations of propositional
logic, it can be inferred that what propositional logic chiefly lacks are variables
and quantification.
Basic Concepts of Predicate Logic
To overcome the limitations in propositional logic, you need to understand the
various basic concepts of predicate logic. These include:
Predicate
Terms
Quantifiers
Free and bound variables
Predicate
A predicate can be defined as a relation, which enables you to bind two atoms
together. For example, you can represent the assertion; Jimmy likes pastries, in
predicate logic as follows:
LIKES(Jimmy, pastries)
Here, the predicate is LIKES, which binds two atoms, Jimmy and pastries. The
two atoms represent the arguments for the predicate LIKES. In a generalized
form, this predicate can be represented as follows:
LIKES(x, y)
Where, x and y are variables representing x likes y. Moreover, you can also use a
function as an argument of a predicate. For example, Jimmy’s mother is Jack’s
mother, can be represented as follows:
MOTHER(mother(Jimmy), Jack)
Here, MOTHER is a predicate and mother (Jimmy) is a function, which
indicates Jimmy’s mother.
Terms
The terms are the arguments used in a predicate. For example, you can represent
Jimmy is Jack’s sister, in predicate logic as follows:
SISTER(Jimmy, Jack)
Where, Jimmy and Jack are terms for the predicate SISTER. You can also use a
function as a term. For example, in predicate logic, Jack’s sister is Sam’s sister,
can be represented as follows:
SISTER(sister(Jack), Sam)
Here, sister (Jack) is a function that indicates Jack’s sister. However, sister (Jack)
is an argument of the predicate, SISTER and hence it is also a term. You can
define the terms using the following rules:
A constant is a term.
A variable is a term.
If f is a function and x1, x2 and x3 are terms, then f(x1, x2, x3) is also a
term.
Quantifiers
A quantifier is defined as a symbol, which allows you to quantify a variable in a
logical expression. In other words, a quantifier allows you to declare or identify
the range or scope of the variables used in the logical expression. You can use
several quantifiers in a logical expression including some, much, many, few, little, a
lot, etc. However, basically in AI application, two primary quantifiers are used
with predicate logic. These are:
Universal quantifier: The universal quantifier in predicate logic allows
you to formalize the concept that something is true for everything or every
relevant thing. The symbol used to represent the universal quantifier is ∀.
For example, if x is a variable, then you can read ∀x as any one of the
following:
o For all x
o For each x
o For every x
Existential quantifier: The existential quantifier in predicate logic allows
you to formalize the concept that something is true for something or for at
least one relevant thing. The symbol used to represent the existential quantifier
is ∃. For example, if x is a variable, then you can read ∃x as any one of the
following:
o There exists an x
o For some x
o For at least one x
Table 3.1 shows the use of the universal and existential quantifiers in predicate
logic.
Table 3.1 Universal and Existential Quantifiers in Predicate Logic
You cannot put forward any statement to prove person(Pratap) as, according
to the given sentences, only man(Pratap) is known. Therefore, you cannot prove that
Pratap was not loyal to Rana. However, you can solve this problem by adding the
following statement:
9. All men are people
∀x: man(x) → person(x)
Now, you can easily prove that Pratap was not loyal to Rana. This process
of getting an answer by starting from the goal itself is known as backward chaining.
The reverse of this process is called forward chaining.
To understand the concept of backward chaining more clearly, let us consider
another example. Suppose, in this context, the following statements are available:
1. Jimmy likes all types of fruit.
∀x: fruit(x) → likes(Jimmy, x)
2. Mangoes are fruits.
fruit(Mangoes)
3. Anything anyone eats and is not killed by is a fruit.
∀x: ∀y: eats(x, y) ∧ ¬killedby(x, y) → fruit(y)
4. Jack eats an apple and is alive.
eats(Jack, apple) ∧ ¬killedby(Jack, apple)
(Here it is assumed that alive is the same as not killed by)
5. Sam eats everything Jack eats.
∀x: eats(Jack, x) → eats(Sam, x)
Now, consider that you need to answer whether Jimmy likes apple. You
can prove that Jimmy likes apple applying backward chaining, as shown in Figure
3.3.
Fig. 3.3 Example to Show the Use of Backward Chaining
3.2.2 Representing Instance and Isa Relationships
An instance is a binary predicate containing two arguments, where the first argument
is an object of the class represented by the second argument. For example, consider
the following predicate logic:
cat(Billy)
This statement represents a unary predicate containing a single argument
and the meaning of this statement is that Billy is a cat. However, you can use an
object of the class cat to represent this predicate as an instance, shown as follows:
Instance(Billy, cat)
Here, Billy is the object of the class cat.
However, an instance does not directly represent an isa predicate, where
the isa predicate is also a binary predicate used to simplify the logical representations.
To understand the instance and isa predicates and their relationships, let us consider
the first five statements of the first example representing simple facts in predicate
logic.
Pratap was a man.
Pratap was a Rajpoot.
All Rajpoots were Indians.
Rana was a ruler.
All Indians were either loyal to Rana or hated him.
Figure 3.4 shows the instance and isa predicates for the preceding statements
and the relationship between the predicates.
Fig. 3.4 Relationship between Instance and Isa Predicates
3.2.3 Modus Ponens
In classical logic, modus ponendo ponens (Latin term for the way that affirms
by affirming) is abbreviated as MP or Modus Ponens. It is a valid and simple
argument form and is also referred to as affirming the antecedent or the law of
detachment. It is closely related to another valid form of argument, modus tollens.
Modus ponens is a common rule of inference and takes the following form:
If P, then Q.
P.
Therefore, Q.
Formal Notation: The modus ponens rule may be written in sequent
notation as:
P → Q, P ⊢ Q
The argument form has two premises. The first premise is the ‘if–then’ or
conditional claim that P implies Q. The second premise is that P, the antecedent of
the conditional claim is true. These two premises can be logically concluded that
Q which is the consequent of the conditional claim and is true. In Artificial
Intelligence, modus ponens is also called forward chaining.
The following example is an argument that fits the form modus ponens:
If today is Monday, then I will go to work.
Today is Monday.
Therefore, I will go to work.
This is a valid argument but it has no bearing on whether any of the statements
in the argument is true for modus ponens for a sound argument. The premises
must be true for any true instances of the conclusion. An argument is valid but
unsound if one or more premises are false. If an argument is valid and all the
premises are true then the argument is termed sound argument. For example, one
may go to work on Wednesday because the reasoning for going to work is unsound.
The argument is only sound for Monday. A propositional argument that uses modus
ponens is deductive.
The Curry-Howard correspondence between proofs and programs relates
modus ponens to function application: if f is a function of type P → Q
and x is of type P, then f x is of type Q. The validity of modus ponens in classical
two-valued logic can be clearly demonstrated by use of a truth table.
p   q   p → q
T   T     T
T   F     F
F   T     T
F   F     T
In instances of modus ponens it is assumed that the premise p → q is true and
p is true. Therefore, whenever p → q is true and p is true, q must also be true.
3.2.4 Resolution
Resolution is a procedure in which the statements are converted into a standard
form. Here, proofs are attempted to validate using refutation. One inference
procedure using resolution is known as refutation. Refutation is also known as
proof by contradiction or reductio ad absurdum. In other words, resolution attempts
to prove a statement by showing that the negation of that statement produces a
result that contradicts the known statements.
Conversion to Clause Form
Suppose, we know that all Indians who know Amit either hate ABC or think
that anyone who hates anyone is crazy. We can represent this statement as follows:
∀x: [Indian(x) ∧ know(x, Amit)] →
[hate(x, ABC) ∨ (∀y: ∃z: hate(y, z) → thinkcrazy(x, y))]
This formula requires a complex matching process in a proof. It is very
important to match the pieces with the formula. The process of matching the pieces
with the formula would have been easier, if the formula were in a simpler form.
Basis of Resolution
The procedures in the resolutions are very easy, as the two clauses in the resolution
procedures are known as the parent clauses. The parent clauses are compared,
which gives rise to the new clauses. The new clause states ways in which the
interaction between the parent clauses must take place. This can be explained
with the help of the following example:
Summer ∨ Winter
¬Summer ∨ Hot
Since both the clauses are given, both must be true.
At any point, exactly one of Summer and ¬Summer will be true. If Summer is
true, then Hot must be true for the second clause to hold. If ¬Summer is true, then
Winter must be true for the first clause to hold. With the help of these two clauses,
we can therefore deduce:
Winter ∨ Hot
Taking two clauses that each contain the same literal is the way
resolution operates. The literal must appear in positive form in one clause and in
negative form in the other. A contradiction is found if resolution produces the
empty clause, as with:
Summer
¬Summer
These two clauses resolve to the empty clause. If there is any
contradiction in the clauses, resolution will eventually find it. If there is no
contradiction, the empty clause can never be produced.
Resolution in Propositional Logic
To explain the working of the resolution, first the procedures of the resolutions for
the propositional logic must be presented. To understand how the resolution works,
consider the following example:
All Indians are Asians.
Aryabhatta is an Indian.
Therefore, Aryabhatta is an Asian.
Or, more generally:
∀X, P(X) implies Q(X).
P(a).
Therefore, Q(a).
Resolution in prepositional logic can also be explained with the help of an
example. First, the axioms are converted in a clause form, shown as follows:
Given Axioms          Converted to Clause Form
A                     A
(E ∧ G) → J           ¬E ∨ ¬G ∨ J
(O ∨ Z) → G           ¬O ∨ G
                      ¬Z ∨ G
T                     T
After the axioms are converted into the clause form, J is negated,
producing ¬J, which is in a clause form. Then pairs of clauses are selected in
order to resolve them together. All the pairs cannot be resolved except those
which have the complimentary literals that will produce the empty clause.
Another way of viewing the resolution process is that all the clauses that are
true are taken and the new classes are developed in place of the original clauses
that are true.
Resolution in Predicate Logic
To prove things in predicate logic, you need two things. First, you need to determine
what inference rules are valid and second, you need to know a good proof
procedure that will allow proving the things with the inference rules in an efficient
manner. You can explain the resolution in predicate logic with the help of the
following example:
man(Amit)
¬man(x1) ∨ mortal(x1)
The literal man(Amit) can be unified with the literal ¬man(x1) with the
substitution Amit/x1, determining that for x1 = Amit, ¬man(Amit) is false. Without
this substitution, the two literals cannot be cancelled out. Therefore, resolution in
predicate logic depends on converting the statements into clause form and unifying
the variables.
From the following example, you can understand how the resolution can be
used to prove new things. You can use the resolutions to prove the things about
Amit. First, the statements have to be converted into the clauses.
Loyal(Amit, ABC)
From this example, many more resolutions could have been generated. The
actual goal of the preceding statement is to prove whether Amit hated ABC. In
that case, the statement would have been:
hate(Amit, ABC)
Question Answering
Question answering is basically a type of information retrieval in which a
system called Question Answering (QA) system is used to retrieve the answers
for the questions, which are written in the natural language. The question answering
is the most complex natural language processing techniques as compared to other
techniques. The QA system uses the text documents as their knowledge source.
The QA system adds different natural language techniques to create a single
processing technique and then uses the newly developed technique to search for
answers to the questions written in the natural language. The QA system contains
a question classifier module that is used to determine the types of questions and
answers.
3.2.5 Natural Deduction
Natural deduction methods perform deduction in a manner similar to reasoning
used by humans, e.g., in proving mathematical theorems. Forward chaining and
backward chaining are natural deduction methods. These are similar to the algorithms
described earlier for propositional logic, with extensions to handle variable bindings
and unification.
Backward chaining by itself is not complete, since it only handles Horn
clauses (clauses that have at most one positive literal). Not all clauses are Horn;
for example, ‘every person is male or female’ becomes ¬Person(x) ∨ Male(x) ∨
Female(x), which has two positive literals. Such clauses do not support backward
chaining.
Splitting can be used with back chaining to make it complete. Splitting makes
assumptions (e.g., ‘Assume x is Male’) and attempts to prove the theorem for
each case.
3.2.6 Dependency
Conceptual dependency theory is used as a model of natural language understanding
in artificial intelligence systems. Schank developed the model for representing
the knowledge conveyed in natural language. The representation is independent of
the particular words used in the input, i.e., two sentences which are identical in
meaning have a single representation. The system is also used to draw logical
inferences.
The dependency model uses the following basic representational tokens:
Real world objects, each with some attributes
Real world actions, each with attributes
Times
Locations
Self - Learning
Material 143
A set of conceptual transitions then acts on this representation: an ATRANS
is used to represent a transfer, such as ‘give’ or ‘take’, while a PTRANS is used to
act on locations, such as ‘move’ or ‘go’. An MTRANS represents mental acts,
such as ‘tell’, etc. For example, the sentence ‘Mohan gave a pen to Sudhir’ can be
represented as the action of an ATRANS on two real world objects, Mohan and
Sudhir.
A dependency in the Unified Modeling Language (UML) always exists
between two defined elements if a change in the definition of one results in a change
to the other. In UML, it is indicated using a dashed line pointing from the dependent
to the independent element.
If more than one dependent or independent element participates in the
dependency, then the arrows with their tails on the dependent elements are connected
to the tails of one or more arrows with their heads on the independent elements. A
small dot is placed on the junction point along with a note on the dependency.
Dependency is also a model-level relationship that describes the need to investigate
the model definition of the dependent element for possible changes if the model
definition of the independent element is changed.
A dependency is a semantic relationship in which a change to the independent
modelling element may affect the semantics of the dependent modelling
element. It also identifies a set of model elements that needs other model elements
for their specification or implementation. The arrow represents a dependency
specifying the direction of a relationship and not the direction of a process.
3.2.7 Dependency-Directed Backtracking
Backtracking is used in many AI applications, and a number of schemes have been
devised to improve its efficiency. Such schemes are termed dependency-directed
backtracking, or sometimes intelligent backtracking, and can be classified as follows:
Lookahead Schemes
These schemes are used to decide which variable to instantiate next or
which value to select from the consistent options.
Variable Ordering: This approach selects a variable so as to make
the rest of the problem easier to solve. Basically, this is done by selecting
the variable involved in the largest number of constraints.
Value Ordering: A value is selected for maximizing the number of options
available for future assignments.
Look-Back Schemes
In backtracking, look-back schemes are used to control the specific decisions of
where and how to go back in case of dead-ends. Basically, there are two basic
approaches:
Go Back to Source of Failure: It changes only those past decisions that
caused the error and leaves other past decisions unchanged.
Constraint Recording: It records the ‘reasons’ for the dead-end so that
they can be avoided in future search.
Dependency-directed backtracking is also used in truth-maintenance
systems. A variable is assigned some value and a justification for that
value is recorded. A default value is then assigned to some other variable and
justified. The system now checks whether the assignments violate any
constraint. If they do, the violation is recorded so that the two assignments are
never again accepted simultaneously. This record is used for justifying the choice
of some other value, and the process continues until a solution is found. Such
systems never perform redundant backtracking.
When you match the LHS on a database that consists of the start symbol
‘S’ it provides a generator for strings in the language. On the other hand, matching
on the RHS of the same set of rules provides a recognizer for the language. You
can also change the process slightly to get a top-down recognizer by interpreting
the elements in the LHS as goals to be achieved by the correct matching of elements
in the RHS. In the aforementioned case the rules ‘unwind.’ Thus you can use the
same set of rules in several manners. Note, however, that while doing so, you get
many different systems having characteristically different control structures and
behaviour. The organization and assessment of the set of rules is also a vital issue.
The basic scheme is the fixed, i.e., total ordering; however, elaborations quickly
grow more intricate. The term conflict resolution is used to refer to the process of
selecting a rule.
Database
In the most basic production system the database is just a collection of symbols
referring to the state of the world; however, the correct interpretation of these
symbols wholly depends generally on the nature of the application. The database
is interpreted as modelling the composition of a few memory mechanisms, i.e.,
Short-Term Memory (STM), wherein each symbol represents some ‘chunk’ of
knowledge for those systems intended to explore symbol-processing aspects of
human cognition.
Interpreter
The interpreter is the source of most of the variation that exists among different
systems; however, it may be seen, in fundamental terms, as a select-execute
loop in which one rule, which is applicable to the current state of the database, is
chosen and then executed. Its action brings modifications in the database and the
select phase restarts. Although selection is at times a process of choosing the first
rule that matches the current database, it becomes crystal clear why this cycle is
often termed as a recognize-act, or situation-action, loop. This alternation between
selection and execution is an essential characteristic of production system
architecture, which is totally responsible for one of its most fundamental elements.
By selecting each new rule for execution, based on the total contents of the
database, you effectively carry out a complete re-evaluation of the control state of
the system at every cycle. This is very different from the already set approaches in
which control flow is generally dependent on just a small fraction of the total
number of state variables, and is typically the decision of the process currently
executing. Production systems are thus sensitive to any change that takes place in
the entire environment, and highly responsive to such changes within the scope of
a single execution cycle. Of course, the price of such responsiveness is the calculation
time required for the re-evaluation.
Key Features of Rule-Based Systems
The following are the key features of the rule-based systems:
Practical human experience and expertise can often be captured in the form
of if/then rules.
Instead of standard programming control strategies, rule-based systems
are more flexible since those rules, which are appropriate in a certain situation,
are dynamically chosen and combined.
A rule-based system explains its results by recognizing the rules which
resulted in a specific solution and describing the conditions which lead to
that particular rule to be used.
Rule-Based System Architecture
The following is the rule-based system architecture:
A collection of facts
A collection of rules
An inference engine
Self - Learning
150 Material
You might want to:
See which new facts can be derived, and
Ask whether a fact is implicit in the knowledge base and already known
facts (Refer Figure 3.5).
Fig. 3.5 Rule-Based System Architecture: the inference engine matches working
memory against the production rules (pattern → action), triggers the appropriate
rules, and uses conflict resolution to select one rule to ‘fire’
Reject(P,c): Return true only if the partial candidate c is not worth completing.
Accept(P,c): Return true if c is a solution of P, and false otherwise.
First(P,c): Generate the first extension of candidate c.
Next(P,s): Generate the next alternative extension of a candidate, after the
extension s.
Output(P,c): Use the solution c of P, as appropriate to the application.
The backtracking algorithm reduces them to the call bt(root(P)), where bt
is the following recursive procedure:
procedure bt(c)
  if reject(P,c) then return
  if accept(P,c) then output(P,c)
  s ← first(P,c)
  while s ≠ Λ do
    bt(s)
    s ← next(P,s)
Algorithm
The algorithm for backtracking method is defined in the following way:
bool finished = FALSE; /* found all solutions yet? */
backtrack(int a[], int k, data input)
{
int c[MAXCANDIDATES]; /* candidates for next position */
int ncandidates; /* next position candidate count */
int i; /* counter */
if (is_a_solution(a,k,input))
process_solution(a,k,input);
else {
k = k+1;
construct_candidates(a,k,input,c,&ncandidates);
for (i=0; i<ncandidates; i++) {
a[k] = c[i];
backtrack(a,k,input);
if (finished) return; /* terminate early */
}
}
}
Non-Backtracking System
A non-backtracking system is the opposite of a backtracking system: it searches
for solutions to a computational problem without undoing earlier choices, and it
can support one-sided error with fixed probability. The uses of non-backtracking
are as follows:
Search plays an important role in knowledge discovery in databases, such
as Knowledge Discovery in Database (KDD) and data mining.
Non-backtracking system follows a search process to explore the useful
knowledge from given data.
It uses prune-search algorithm to determine the basic search techniques
and highlight their performance and complexity.
It is also used in systematic enumerative search methods, including best-
first search, depth-first branch-and-bound and iterative deepening, and in
neighborhood search methods, including gradient descent, artificial neural
networks, etc.
By exploiting the dependency information, it recovers that information lost
by backtracking and thus avoids the wasteful repetition of computation.
The non-backtracking algorithm maintains an (extended) dependency set
A = (U, D, F), which is defined in the algorithm, where U and F are sets of
pairs and D is a set of triplets.
It is used in functional data structures, which is also implemented in top-
down algorithm.
The non-backtracking Knuth-Morris-Pratt (KMP) algorithm is frequently used
with a parsing table in which each empty entry is filled with a pointer to a
special error routine. This algorithm provides a significant speed-up over
brute-force search. Brute-force string matching compares a given pattern
with all substrings of a given text; those comparisons between substring
and pattern proceed character by character until a mismatch is found.
Whenever a mismatch is found, the remaining character comparisons for that
substring are dropped and the next substring is selected immediately. KMP,
by contrast, precomputes shifts so that upon a mismatch it skips unnecessary
comparisons and never moves backwards in the text.
The non-backtracking method indirectly supports in the processing of
Augmented Transition Network (ATN) parsing. This network is used in
transmitting the data especially data trafficking for virtual world, applications
of multimedia, speech recognition, game playing etc.
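The non-backtracking KMP matching described above can be sketched in Python (function and variable names are illustrative); note that the scan index over the text only ever moves forward.

```python
def kmp_search(text, pattern):
    """Return index of first occurrence of pattern in text, or -1 (KMP)."""
    # Precompute the failure function: for each prefix of the pattern,
    # the length of the longest proper prefix that is also a suffix.
    prefix = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = prefix[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        prefix[i] = k
    # Scan the text without ever moving backwards in it.
    k = 0
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = prefix[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            return i - k + 1
    return -1
```

On a mismatch, the precomputed `prefix` table shifts the pattern instead of backing up in the text, which is exactly the saving over brute force.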
3.4 ANSWERS TO ‘CHECK YOUR PROGRESS’
3.5 SUMMARY
Predicate logic is one of the important knowledge representation languages,
which is concerned with deriving the real-world facts.
Predicate logic enables you to use variables and quantifiers for representing
the real-world facts as statements, which are written as wff’s.
An instance is a binary predicate containing two arguments, where the first
argument is an object of the class represented by the second argument.
In classical logic, modus ponendo ponens (Latin for ‘the way that affirms
by affirming’), abbreviated as MP or Modus Ponens, is a valid and simple
argument form and is also referred to as affirming the antecedent or the law
of detachment. It is closely related to another valid form of argument, modus
tollens.
The Curry-Howard correspondence between proofs and programs relates
modus ponens to function application: if f is a function of type
P → Q and x is of type P, then f x is of type Q.
Resolution is a procedure in which the statements are converted into a
standard form.
The resolution procedure is simple: at each step, two clauses, known as
the parent clauses, are resolved together.
Natural deduction methods perform deduction in a manner similar to
reasoning used by humans, e.g., in proving mathematical theorems. Forward
chaining and backward chaining are natural deduction methods.
Backtracking is used in many AI applications; a number of schemes have
been developed to improve its efficiency. Such schemes are termed
dependency-directed backtracking.
Lookahead schemes are used to control which variable is to be instantiated
next or which value is to be selected from the consistent options.
A rule-based system refers to a system of encoding an expert’s wisdom in
a relatively narrow area into an automated system.
A rule-based system can be considered as being similar to a multi-threaded
system.
Specificity is based on the number of conditions of the rules. From
the conflict set, the rule having the maximum number of conditions is
chosen, on the assumption that the rule with the most conditions has the
most relevance to the existing data.
A pure production system generally comprises three fundamental
components: a set of rules, a database and an interpreter for the rules.
A knowledge-based system represents, acquires and applies knowledge
for a specific objective.
In this approach of knowledge representation, knowledge is stored in the
form of procedures. The procedures contain the logic in the form of a code,
which specifies when and how the actions are to be performed. LISP is one
of the languages used for writing procedures.
In AI, modus ponens is also called forward reasoning. Modus ponens is a
common rule of inference, which maps sets of formulae to formulae.
Matching is required between the current state and the preconditions of
rules for better searching. This searching involves choosing from the rules,
which can be applied at some particular point that can lead to a solution.
Search control knowledge is defined as knowledge regarding different paths
that are most likely to lead quickly to a goal state.
Backtracking and non-backtracking methods and algorithms are used by
AI researchers to develop various applications of AI, such as game playing,
expert system, robotics, etc.
Short-Answer Questions
1. What is the difference between propositional logic and predicate logic?
2. Give some examples of propositional logic.
3. Why do you need predicate logic?
4. Define the term modus ponens.
5. What is resolution?
6. Name two methods used in natural deduction.
7. State the variable ordering.
8. What are the main features of rule-based systems?
9. How will you define procedural vs declarative knowledge?
10. Differentiate between forward and backward reasoning.
11. What is matching control knowledge?
12. Define the term list processing programming.
Long-Answer Questions
1. Explain resolution from the point of view of propositional logic and predicate
logic.
2. Give an example to represent simple facts with predicate logic.
3. Briefly explain modus ponens. Give appropriate examples.
4. Discuss the concept of resolution with the help of examples.
5. Explain briefly about natural deduction. Give appropriate examples.
6. Explain the role of an interpreter in a rule-based system.
7. Differentiate between procedural and declarative knowledge with the help
of example.
8. Illustrate the concept of forward and backward reasoning. Give appropriate
examples.
9. Analyse the matching and conflict resolution with the help of examples.
10. Describe the use of non-backtracking with the help of examples.
Self - Learning
Material 163
STRUCTURED KNOWLEDGE
REPRESENTATION AND
SEMANTIC NET
Structure
4.0 Introduction
4.1 Objectives
4.2 Semantic Nets
4.2.1 Frames
4.2.2 Slot Exceptions
4.2.3 Slot Values as Object
4.3 Handling Uncertainties
4.3.1 Probabilistic Reasoning
4.4 Use of Certainty Factor
4.5 Fuzzy Logic
4.6 Answers to ‘Check Your Progress’
4.7 Summary
4.8 Key Terms
4.9 Self-Assessment Questions and Exercises
4.10 Further Reading
4.0 INTRODUCTION
A semantic network, or frame network is a knowledge base that represents semantic
relations between concepts in a network. This is often used as a form of knowledge
representation. It is a directed or undirected graph consisting of vertices, which
represent concepts, and edges, which represent semantic relations between
concepts, mapping or connecting semantic fields. A semantic network may be
instantiated as, for example, a graph database or a concept map. Typical
standardized semantic networks are expressed as semantic triples. Semantic
networks are used in natural language processing applications such as semantic
parsing and word-sense disambiguation.
Frames are an artificial intelligence data structure used to divide knowledge
into substructures by representing ‘Stereotyped Situations’. They were proposed
by Marvin Minsky in his 1974 article ‘A Framework for Representing Knowledge’.
Frames are the primary data structure used in artificial intelligence frame language;
they are stored as ontologies of sets.
Slot and filler structures are types of data structures that are used to implement
property inheritance. In these structures, knowledge is represented using objects
and their attributes. Each object is connected with other objects or attributes using
a relation. For example, a national-team is an object and player John, who is also
an object, is a member of that team. In this example, the national-team and John
are connected to each other using a relation ‘is-a-member-of’. Conditional planning
is a way to deal with uncertainty when planning. It is a planning method for managing
bounded indeterminacy. It is a way to deal with uncertainty by checking what is
actually happening in the environment at predetermined points in the plan.
Probabilistic reasoning involves the use of probability and logic to deal with
uncertain situations. The result is a richer and more expressive formalism with a
broad range of possible application areas. Probabilistic logics attempt to find a
natural extension of traditional logic truth tables: the results they define are derived
through probabilistic expressions instead. A difficulty with probabilistic logics is
that they tend to multiply the computational complexities of their probabilistic and
logical components. Other difficulties include the possibility of counter-intuitive
results, such as those of Dempster–Shafer theory in evidence-based subjective
logic. The need to deal with a broad variety of contexts and issues has led to many
different proposals.
Certainty factors theory is an alternative to Bayesian reasoning which is
used when reliable statistical information is not available or the independence of
evidence cannot be assumed. Another methodology which is used to deal with
reasoning is fuzzy logic which is a multi-valued logic derived from fuzzy set theory.
In this unit, we will discuss these three tools of statistical reasoning in detail.
In this unit, you will learn about the semantic nets, frames, slot exceptions,
slot values as object, handling uncertainties, probabilistic reasoning, use of certainty
factor and fuzzy logic.
4.1 OBJECTIVES
After going through this unit, you will be able to:
Learn about the semantic nets
Explain about the frames
Analyse the slot exceptions
Elaborate on handling uncertainties
Discuss probabilistic reasoning
Illustrate the use of certainty factor
Define fuzzy logic
slots acquired through inheritance. The number of slots is limited by the
amount of RAM available. With the exception of the keywords isa and name,
which are reserved for usage in object patterns, the name of a slot can be
any symbol.
The class precedence list for the instance’s class is checked in order from
most specific to most generic to determine the set of slots for an instance (left to
right). A class’s superclasses are less specific than its subclasses. With the exception
of no-inherit slots, slots defined in any of the classes in the class precedence list
are assigned to the instance. With the exception of composite slots, if a slot is
inherited from multiple classes, the definition provided by the more specific class
takes precedence.
For example,
(defclass A (is-a USER)
  (slot fooA)
  (slot barA))
(defclass B (is-a A)
  (slot fooB)
  (slot barB))
A has the following class precedence list: A USER OBJECT. There will be
two slots in each instance of A: fooA and barA. B’s class precedence list is:
B A USER OBJECT. There will be four slots in each instance of B: fooB,
barB, fooA, and barA.
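Python’s method resolution order behaves like the class precedence list described above, so the same example can be sketched as follows (the Python rendering is an illustrative analogy, not CLIPS itself):

```python
class User:                 # plays the role of USER
    pass

class A(User):
    def __init__(self):
        self.fooA = None
        self.barA = None

class B(A):
    def __init__(self):
        super().__init__()  # inherit A's slots
        self.fooB = None
        self.barB = None

b = B()
# B's precedence list, most specific first: B, A, User (then object)
mro_names = [c.__name__ for c in B.__mro__]
```

As in the CLIPS example, an instance of B carries the four slots fooB, barB, fooA and barA, and the precedence order runs from the most specific class to the most generic.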
Facets make up slots in the same way that slots make up classes. Facets
describe a slot’s default value, storage, access, inheritance propagation, source of
other facets, pattern-matching reactivity, visibility to subclass message-handlers,
automatic creation of message-handlers to access the slot, the name of the message
to send to set the slot, and constraint information. With the exception of shared
slots, each object can still have its own value for a slot.
4.2.3 Slot Values as Object
In slot and filler structures, a slot is an object or attribute and filler is a value of any
data type such as integer and string, which a slot can take. Consider an example of
slot and filler structure, as shown in Figure 4.8. In this figure, Person, Adult-Male,
Baseball-Player, Fielder, Pitcher, Pee-Wee-Reese and Three-Finger-Brown are
objects. Whereas, Right, 6-11, 7-12, .352, Equal to Handed, .262, .106 are
attributes. Attributes and objects are considered as slots in a slot and filler structure,
which can be replaced by other values called fillers.
Slot and filler structures are divided into two categories, which are:
Weak Slot and Filler Structure: A slot and filler structure that does not
apply any rules on the content of the structure. Examples of weak and slot
filler structure are semantic net and frame.
Strong Slot and Filler Structure: A slot and filler structure in which links
between objects are based on rigid rules. Examples of strong slot and filler
structure are CD, scripts and CYCorp.
Δn = {P = (p1, p2, …, pn) : pi ≥ 0, Σi pi = 1}.
Following are some axiomatic characterizations of the measure of uncertainty
Hn(p1, p2, …, pn), which are used to get at its exact expression. For that, let X
and Y be two independent experiments with n and m values, respectively.
Let P = (p1, p2, …, pn) ∈ Δn be a probability distribution associated with X and
Q = (q1, q2, …, qm) ∈ Δm be a probability distribution associated with Y. This
leads us to write that Hnm(P * Q) = Hn(P) + Hm(Q), for all P ∈ Δn and Q ∈ Δm,
where P * Q = (p1q1, …, p1qm, p2q1, …, p2qm, …, pnq1, …, pnqm) ∈ Δnm.
Replacing pi h(pi) by f(pi), i = 1, 2, …, n, we get the equation
Hn(P) = Σi f(pi).
P(H1 | E) = (0.75 × 0.5) / (0.75 × 0.5 + 0.5 × 0.5)
= 0.375 / 0.625
= 0.6.
Thus, the prior probability, P(H1 ) which was 0.5, now becomes P(H1 | E),
which is 0.6.
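The update above can be checked numerically; a minimal sketch, with the likelihoods and priors taken from the worked example:

```python
def bayes_update(likelihoods, priors):
    """P(Hi|E) = P(E|Hi)P(Hi) / sum_j P(E|Hj)P(Hj)."""
    joint = [l * p for l, p in zip(likelihoods, priors)]
    total = sum(joint)
    return [j / total for j in joint]

# Two hypotheses with priors 0.5 each; P(E|H1) = 0.75, P(E|H2) = 0.5
posteriors = bayes_update([0.75, 0.5], [0.5, 0.5])
```

The first posterior reproduces the 0.6 obtained above, and the posteriors sum to 1 as required.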
The Bayesian probability calculus has been supported by several arguments,
such as the Cox axioms, the Dutch book argument, arguments based on decision
theory and de Finetti’s theorem.
Richard T. Cox proved that Bayesian updating follows from several axioms,
including two functional equations and the controversial hypothesis that probability
is a continuous function.
The Dutch book argument was proposed by de Finetti and is based on
betting. A Dutch book is made when a clever gambler places a set of bets that
guarantee a profit, no matter what the outcome is of the bets. If a bookmaker
follows the rules of the Bayesian calculus in the construction of his odds, a Dutch
book cannot be made.
However, Ian Hacking noted that traditional Dutch book arguments did not
specify Bayesian updating: they left open the possibility that non-Bayesian updating
rules could avoid Dutch books. In fact, there are non-Bayesian updating rules that
also avoid Dutch books. The additional hypotheses sufficient to specify (uniquely)
Bayesian updating are substantial, complicated, and unsatisfactory, according to
Bas van Fraassen’s book Laws and Symmetries.
A decision-theoretic justification of Bayesian methods was given by Abraham
Wald, who proved that every Bayesian procedure is admissible. Conversely, every
admissible statistical procedure is either a Bayesian procedure or a limit of Bayesian
procedures. Wald’s result also established the Bayesian formalism as a fundamental
technique in such areas of frequentist statistics as point estimation, hypothesis
testing, and confidence intervals.
Bayesian methods have been used for hundreds of years, so there are many
examples of Bayesian inference to scrutinize. Of the tens of thousands of papers
published using Bayesian methods, few criticisms have been made of implausible
priors in concrete applications. Such criticisms are themselves welcomed by
Bayesian statisticians, as part of the inevitable revisions of science. Nonetheless,
worries about the possible problems of Bayesian methods continue to appear.
Concerns have been raised that a Bayesian view could be problematic for scientific
judgements, since a Bayesian information processor tends to confirm already
established views and to suppress controversial views. Such worries have not so
far been accompanied by experimental evidence, nor have they published examples
of implausible priors that have led to practical problems.
4.3.1 Probabilistic Reasoning
Probabilistic reasoning is one of the problem-solving approaches that collect
evidence about a problem and modify their behaviour on the basis of the evidence.
In the probabilistic approach, PROSPECTOR, which is a representative of such
systems, is used to handle uncertainty. This approach uses Bayes’ theorem to
solve the problem of uncertainty, as shown in the following formula:
P(H|E) = P(E|H)P(H)/(Σi P(E|Hi)P(Hi))
In this formula, P refers to the probability function.
You can combine the evidence under the assumption of conditional
independence in the probability function. The formula that combines the evidence
in the probability function is as follows:
P(H|E1,E2) = P(E1|H)P(E2|H)P(H)/(Σi P(E1|Hi)P(E2|Hi)P(Hi))
You can define odds in the probability function using the following formula:
O(H)=P(H)/P(¬H)=P(H)/(1-P(H))
This code shows the odds of H in the probability function.
You can define a likelihood ratio of E with respect to H, as shown in the
code:
λ(E,H) = P(E|H)/P(E|¬H)
From this equation, odds-likelihood formulation of Bayes’ rule is derived,
as shown in the code:
O(H|E)= λ (E,H)O(H)
You can combine evidence with the odds-likelihood formulation by using
the following formula:
O(H|E1,E2)= λ(E2,H) λ(E1,H)O(H)
It is recommended to update odds rather than probabilities, since it is easier. You
can obtain the probability from the odds easily using the following formula:
P(H) = O(H)/(1 + O(H))
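A minimal sketch of this odds-likelihood bookkeeping (the function names and sample numbers are illustrative): convert the prior to odds, multiply by each likelihood ratio, and convert back.

```python
def odds(p):
    return p / (1 - p)        # O(H) = P(H) / (1 - P(H))

def prob(o):
    return o / (1 + o)        # P(H) = O(H) / (1 + O(H))

def update(prior_p, *ratios):
    """O(H|E1,...,En) = lam(E1,H) * ... * lam(En,H) * O(H)."""
    o = odds(prior_p)
    for lam in ratios:
        o *= lam
    return prob(o)

# Prior P(H) = 0.5, one piece of evidence with likelihood ratio 1.5
p = update(0.5, 1.5)
```

Combining several pieces of evidence is just repeated multiplication of the odds, which is why updating odds is easier than updating probabilities directly.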
When the information is transmitted using rules, the evidence derived from
the rule’s conclusion is uncertain. In this situation, the formula for probability with
evidence would be:
P(H|E’) = P(H|E)P(E|E’) + P(H|¬E)P(¬E|E’)
In the above code, E’ refers to the observed evidence and E refers to the
actual and absolute evidence.
P(H|E’) is calculated using a linear interpolation between two extreme
cases: P(H|E), when E is known to be true, and P(H|¬E), when E is known
to be false. The linear interpolation scheme uses three reference points:
When P(E|E’)=0, P(H|E’) = P(H|¬E)
When P(E|E’)=P(E), P(H|E’) = P(H)
When P(E|E’)=1, P(H|E’)=P(H|E)
The CF model is based on the following assumptions:
Faults or hypotheses are mutually exclusive and exhaustive.
CF = (MB − MD) / (1 − min(MB, MD))
The preceding formula gives CFs from –1.0 (total disbelief) to 1.0
(total belief). If MD is considered to be 0, CFs would range from 0 (complete
disbelief) to 1 (complete belief). If MD is not used, then CF will equal MB.
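A direct sketch of the CF formula (assuming min(MB, MD) < 1 so the denominator is nonzero):

```python
def certainty_factor(mb, md):
    """CF = (MB - MD) / (1 - min(MB, MD)); ranges from -1.0 to 1.0."""
    return (mb - md) / (1 - min(mb, md))
```

With MD = 0 the result equals MB, matching the remark above; conflicting belief and disbelief produce a negative CF.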
McAllister Scheme
David McAllister developed a technique for ‘certainty factors’ to be used in an
‘expert system’.
The function of a CF in the McAllister scheme is to determine the accuracy
or reliability of a hypothesis. A CF is neither a probability nor a truth value.
As per this scheme, a certainty factor is a number that varies from 0.0 to
1.0. A number 0.6 is assigned to a ‘suggestive evidence’. A number such as 0.8 is
assigned to a ‘strongly suggestive evidence’.
McAllister scheme allows addition of latest evidences. A positive sum would
increase certainty. The rule for addition of two positive certainty factors is as follows:
CFcombine(CFa, CFb) = CFa + CFb(1 − CFa)
where CFa and CFb are two certainty factors. The influence of the second
certainty factor is scaled down by the remaining uncertainty of the first, and
the result is added to the certainty of the first.
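For example, the combination rule for two positive certainty factors can be sketched as follows (the sample values 0.6 and 0.8 echo the suggestive and strongly suggestive levels mentioned earlier):

```python
def cf_combine(cf_a, cf_b):
    """CFcombine(CFa, CFb) = CFa + CFb * (1 - CFa), for CFa, CFb >= 0."""
    return cf_a + cf_b * (1 - cf_a)

combined = cf_combine(0.6, 0.8)   # suggestive + strongly suggestive evidence
```

Note that the rule is symmetric: combining in either order gives the same result, and each new positive piece of evidence increases the total certainty.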
Using the Model
In order to make the certainty factor model perform satisfactorily, the following
guidelines must be followed:
To minimize the occurrence of conflicting derivations for a single hypothesis,
the condition parts of production rules drawing opposite conclusions, must
be specified as ‘mutually exclusive’ as possible.
The several pieces of evidence pertaining to a single hypothesis must be
grouped in a way that the Boolean combinations of evidence mentioned in
separate production rules are as independent as possible, and the atomic
pieces of such a Boolean combination of evidence within a production rule
are as strongly correlated as possible.
The production rules must be specified in such a way that a chain of rules
that may arise while actually reasoning with the system is able to narrow the
focus of attention.
Disadvantages of certainty factors
The CF model suffers from the following two demerits:
1. The concept of modeling human uncertainty by means of numerical certainty
factors is controversial. Some people consider the formulae used for CF
model invalid.
2. This model needs more work from the user than the binary logic model. The
user should assign a CF to every probable answer. In case the user ignores
or forgets to assign a CF for a hypothesis, then the system would assume a
default value of 0 (meaning ‘do not know’) for that hypothesis. This may or
may not be a correct interpretation of the user’s belief.
Fuzzy Control
Fuzzy control makes use of the theory of fuzzy rules. By using a procedure that
was developed by Ebrahim Mamdani, the three following steps are used to create
a fuzzy controlled machine:
1. Fuzzification: This entails the use of membership functions to graphically
describe a situation.
2. Rule Evaluation: This involves the application of fuzzy rules.
3. Defuzzification: It entails the process of obtaining the crisp or actual results.
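The three steps can be sketched for a single-input heater controller; the triangular membership functions, the two rules and all numeric values are illustrative assumptions, not from the text.

```python
def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def control(temp):
    # 1. Fuzzification: degree to which temp is "cold" or "hot"
    cold = tri(temp, 0, 10, 20)
    hot = tri(temp, 15, 25, 35)
    # 2. Rule evaluation: IF cold THEN heater high (80); IF hot THEN heater low (20)
    # 3. Defuzzification: weighted average of the rule outputs (centroid of singletons)
    if cold + hot == 0:
        return 50.0            # default when no rule fires
    return (cold * 80 + hot * 20) / (cold + hot)

setting = control(12)
```

A clearly cold reading drives the heater toward the high setting, a clearly hot one toward the low setting, and intermediate temperatures blend the two rules.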
4.7 SUMMARY
Semantic nets help you to represent information using a set of nodes, which
are connected to each other by arcs. Each arc is directed and labelled,
which allows you to represent the relationship among nodes.
Partitioned semantic nets are a type of semantic nets that allow you to
represent quantified expressions using semantic nets.
Frames are also used to represent knowledge in the weak slot and filler
structures.
A frame is a collection of attributes and associated values to represent facts.
Attributes in the frames are called slots and the associated values are used
to define constraints, which are applied on the slots.
In slot and filler structures, a slot is an object or attribute and filler is a value
of any data type such as integer and string, which a slot can take.
Probabilistic reasoning is one of the problem-solving approaches that
collect evidence about a problem and modify their behaviour on the basis
of the evidence. In the probabilistic approach, PROSPECTOR, which is a
representative of such systems, is used to handle uncertainty.
The Certainty Factor (CF) model is the technique used to handle the
uncertainty in rule-based systems.
The first CF model was developed by Shortliffe and Buchanan in 1975 for
MYCIN.
MYCIN is an expert system used to diagnose and treat meningitis and
infections of the blood.
Shortliffe and Buchanan, who developed the concept of certainty factors,
had expert doctors express their level of certainty or uncertainty in MYCIN
and then determined the corresponding CFs from these levels.
Fuzzy logic is a variety of multi-valued logic that has been derived from the
fuzzy set theory. It has been used to deal with reasoning that is estimated
rather than precise.
Fuzzy logic variables may contain a truth value that can range from 0 to 1
and that is not constrained to the two truth values of classic propositional
logic.
Fuzzy logic can be implemented in various kinds of systems ranging from
simple, small, embedded micro-controllers to large, networked, multi-
channel PC or workstation-based data acquisition and control systems.
Fuzzy logic can also control systems that are non-linear and that would be
impossible to control mathematically. This will allow the control of systems
that would usually be considered to be unfeasible.
Fuzzy control makes use of the theory of fuzzy rules, using a procedure
developed by Ebrahim Mamdani.
4.8 KEY TERMS
Partitioned semantic nets: They are a type of semantic nets that allow
you to represent quantified expressions using semantic nets.
Frames: They are collections of attributes and associated values to represent
facts.
Weak slot and filler structure: It is a slot and filler structure, which does
not apply any rules on the content of the structure.
Strong slot and filler structure: It is a slot and filler structure in which the
links between objects are based on rigid rules.
Fuzzification: It is the process of transforming crisp values into grades of
membership for linguistic terms of fuzzy sets.
Certainty Factor (CF) model: Certainty Factor (CF) model is a technique
used to handle the uncertainty in rule-based systems.
Short-Answer Questions
1. How will you define the semantic nets?
2. What are frames?
3. Define a slot and filler structure.
4. What is a weak slot and filler structure?
5. What is a strong slot and filler structure?
6. Write a short note on:
(i) Logic programming
(ii) Probabilistic approach using Bayes’ theorem
7. What is Certainty Factor (CF)?
8. List the unique features of fuzzy logic.
9. What are the steps involved in the fuzzy approximation theorem?
10. How can fuzzy controlled machines be created?
Long-Answer Questions
1. Discuss briefly about the semantic nets with the help of relevant examples.
2. Elaborate on the frames. Give appropriate examples.
3. Briefly explain about the slot values as objects with the help of examples.
4. Explain with the help of an example, how Bayes’ theorem is useful in
probabilistic reasoning.
5. Describe the role of Certainty Factor (CF) model in statistical reasoning.
6. Explain the concepts of ‘Measure of Belief’ and ‘Measure of Disbelief’
with respect to the Certainty Factor (CF) model.
LEARNING AND EXPERT
SYSTEMS
Structure
5.0 Introduction
5.1 Objectives
5.2 Concept of Learning
5.2.1 Explanation Based Learning
5.3 Learning by Induction
5.4 Learning Automation
5.5 Learning in Neural Networks
5.6 Expert Systems
5.6.1 Need and Justification of Expert Systems
5.6.2 Stages of Expert Systems
5.6.3 Representing and Using Domain Knowledge
5.6.4 Functioning of MYCIN and Rule Induction (RI)
5.7 Answers to ‘Check Your Progress’
5.8 Summary
5.9 Key Terms
5.10 Self-Assessment Questions and Exercises
5.11 Further Reading
5.0 INTRODUCTION
In learning, Machine learning (ML) is a scientific discipline related to the design
and development of algorithms. Machine learning is very important in the research
for Artificial Intelligence (AI), in which unsupervised learning helps to find patterns
regarding the stream of input and supervised learning consists of both classification
and numerical regression. Rote learning consists of simply storing computed
information. A lot of AI programs significantly improve their performance with
the help of rote learning. Problem solving experience (analogy) involves
remembering the manner in which a problem is solved. Hence, when the same
problem re-occurs, you can solve it more efficiently.
Induction algorithms form another approach to machine learning. While neural
networks are highly mathematical in nature, induction approaches involve symbolic
data. Induction methods, which are characterized as ‘Learning by example’, begin
with a set of observations. They construct rules to account for the observations and
try to find general patterns that can fully explain the observations. Learning automation
is supported by various algorithms and programs, which are based on future attempts
or past actions and are useful in learning from established failures or successes.
A neural network is a system that can resolve paradigms that linear computing
cannot. Traditionally, it is used to describe a network or circuit of biological neurons.
It also refers to artificial neural networks that are made up of artificial neurons or
nodes.
Expert systems are widely used today to solve real-world problems in the
areas of medicine, law, construction and manufacturing. A successful expert system
is able to almost accurately mimic the way an expert applies his problem-solving
abilities while making a recommendation or drawing a conclusion with a high degree
of accuracy. Expert systems differ significantly from other computer program
architectures because they separate what is known about an application, called
domain knowledge, from the logic that controls how the knowledge is used, known
as inference procedures. Though expert systems cannot replace the experts, they
can assist those who are less knowledgeable in the subject domain by using the
knowledge of higher-level experts. These systems are also known as knowledge
based systems or decision support systems. This unit will discuss the working of
expert systems in detail.
In this unit, you will learn about the concept of learning, rote learning, learning
by taking advice, learning in problem solving, learning by induction, explanation
based learning, learning automation, learning in neural networks and expert systems.
5.1 OBJECTIVES
After going through this unit, you will be able to:
Understand the basic concept of learning
Explain the rote learning
Discuss the learning by taking advice
Analyse learning in problem solving
Define learning by induction
Learn about the learning automation
Illustrate the expert system
Understand the need and justification of expert systems
Explain the representing and using domain knowledge
Fig. 5.1 Graphical Representation of Failure-Driven Learning
Learning by being told refers to the simple interaction between human and
the AI student, but the interaction faces the problem of communication: the
teacher wants to teach in English, but the AI does not understand English. A
solution can be for the teacher to put the instructions into code. But this is
not preferable, as lengthy instructions will need a
lot of coding and it will even be time consuming.
Learning by exploration refers to gathering the information and not really
pursuing any goal. It tries to search for interesting information so that it can store
and learn from it.
Integrating the Approaches
The integrated approaches paradigm gives researchers the license to study isolated
problems and search for solutions that are both verifiable and useful. A paradigm
also gives researchers a mutual base to communicate with other fields, such as
decision theory and economics.
Active learning approaches are less partial to a given situation, so they are
more likely to provide surprising results.
The knowledge-intensive approach was developed using relevant knowledge
bases within the State, Operator And Result (SOAR) framework. This framework
supports representing and using complex information within a dynamic
environment. Knowledge acquisition
includes many different activities. Most of the learning activities are as follows:
Rote Learning
It consists of simply storing of computed information. A lot of AI programs
significantly improve their performance with the help of rote learning.
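In a program, rote learning amounts to simple caching: a computed value is stored verbatim so it never has to be recomputed. A minimal sketch (the `evaluate` function and its placeholder computation are illustrative, not taken from any particular system):

```python
# Rote learning as caching: computed values are stored verbatim so the
# computation never has to be repeated.
cache = {}

def evaluate(position):
    """Return a value for a game position, learning it by rote."""
    if position in cache:           # previously stored: simply retrieve it
        return cache[position]
    value = sum(position)           # placeholder for a costly computation
    cache[position] = value         # store the computed information
    return value

print(evaluate((1, 2, 2)))  # computed and stored: 5
print(evaluate((1, 2, 2)))  # retrieved by rote: 5
```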
Problem Solving Experience (Analogy)
It involves remembering the manner in which a problem is solved. Hence, when
the same problem re-occurs, you can solve it more efficiently.
Learning from Examples (Induction)
It is the manner of learning that involves stimuli without being given any explicit
rules.
Deductive Learning
It is done with the help of deductive inference steps: from known facts, new
facts and relationships are derived.
Inductive Learning
Inductive learning is a type of learning in which an agent tries to construct an
evaluation function. Much of inductive learning is supervised learning, in which
the examples are provided together with their classifications.
Decision Trees
A decision tree is an inductive learning structure in which every internal node
represents a test on one of the instance's properties and the branches from the
node are labeled with the possible outcomes of the test. Every leaf node is a
Boolean classifier for the input instance.
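Such a tree can be sketched directly as nested tests. In this illustrative example (the attribute names are invented for the sketch), each internal node names the attribute it tests, the branches are keyed by the test's outcomes, and the leaves hold Boolean classifications:

```python
# A tiny decision tree: internal nodes test an attribute, branches are
# labeled with outcomes, and leaves hold a Boolean classification.
tree = ("raining",                      # attribute tested at the root
        {True:  ("has_umbrella",        # branch for raining == True
                 {True: True, False: False}),
         False: True})                  # leaf for raining == False

def classify(node, instance):
    if not isinstance(node, tuple):     # leaf node: Boolean classifier
        return node
    attribute, branches = node
    return classify(branches[instance[attribute]], instance)

print(classify(tree, {"raining": True, "has_umbrella": False}))  # False
print(classify(tree, {"raining": False}))                        # True
```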
Connectionist Learning
The connectionist learning approach is modeled on the human brain viewed as
a massively parallel computer, in which small computational units feed simple,
low-bandwidth signals to one another, and from which intelligence arises. It tries
to imitate this behaviour at an abstract level with neural networks.
Neural Networks
Neural networks are a general-purpose representation for knowledge. These networks
are inspired by the neurons in the brain but do not truly imitate neurons
(Refer Figure 5.2). Artificial neural networks typically comprise far fewer units
than the approximately 10^11 neurons in the human brain, and the artificial
neurons, called units, are much simpler than their biological counterparts.
A neural network comprises a set of nodes connected to one another by links,
with a weight associated with every link. Some of the nodes receive inputs
through links, others receive them directly from the environment, and some of
the nodes send outputs out of the network. Usually, all the nodes share an
identical activation function and threshold value, and only the topology and
weights vary.
Network Structures
The two fundamental network structures are feed-forward networks and
recurrent networks. Feed-forward networks are directed acyclic graphs. Recurrent
networks contain loops and, as a result, can represent state. In Hopfield networks,
all connections are bidirectional with symmetric weights, all units have outputs
of 1 or –1, and the activation function is the sign function. Feed-forward networks
can be viewed as compositions of squashed linear functions. The inputs feed into
a layer of hidden units,
which can feed into further layers of hidden units, which finally feed into the
output layer. Each of the hidden units is a squashed linear function of its inputs.
Neural networks of this kind can take any real numbers as inputs, and
they produce a real number as output. For regression, it is typical for the
output units to be a linear function of their inputs. For classification, it is typical
for the output to be a sigmoid function of its inputs. For the hidden layers, there
is no point in making their output a linear function of their inputs, as
adding the extra layers would then give no extra functionality. The output of
each hidden unit is, therefore, a squashed linear function of its inputs (Refer Figure
5.3).
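A single forward pass through such a network can be sketched as follows. The weights are arbitrary illustrative values; each unit's bias is stored as the first weight, the one multiplied by a constant input of 1:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, hidden_w, output_w):
    """One hidden layer of squashed (sigmoid) linear units, linear output."""
    # each hidden unit: sigmoid of a weighted sum; w[0] is the bias weight,
    # i.e., the weight multiplied by a constant input of 1
    hidden = [sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], inputs)))
              for w in hidden_w]
    # a linear output unit over the hidden activations (regression case)
    return output_w[0] + sum(wi * hi for wi, hi in zip(output_w[1:], hidden))

# two inputs, two hidden units, one output; weights chosen arbitrarily
y = forward([1.0, 0.0],
            hidden_w=[[0.1, 0.4, -0.2], [-0.3, 0.2, 0.5]],
            output_w=[0.05, 0.7, -0.4])
print(round(y, 4))
```

For classification, the output unit would simply apply `sigmoid` to its weighted sum as well.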
Associated with a network are the parameters of all the linear functions.
These parameters can be adjusted simultaneously to minimize the prediction error
on the training examples.
The wi are weights. The weight inside a node is the weight that does not
depend on an input; it is the one multiplied by 1.
A major problem in building neural networks is deciding about the initial
topology. Usually, cross-validation techniques are used for determining when the
network size is right.
Perceptrons
Perceptrons refer to single-layer, feed-forward networks that were initially studied
in the 1950s. They can learn only linearly separable functions. Perceptrons are
trained by updating the weights on their links in response to the difference
between their output value and the correct output value. The learning rule for each
weight is as follows:
Wj ← Wj + α × Ij × Err
where α is a constant called the learning rate.
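A sketch of this training rule, applied to the linearly separable AND function (the data set, learning rate and epoch count are illustrative choices):

```python
# Perceptron training with the rule Wj <- Wj + alpha * Ij * Err, where
# Err is the difference between the correct and the actual output.
def output(weights, inputs):
    return 1 if sum(w * i for w, i in zip(weights, inputs)) > 0 else 0

def train(examples, alpha=0.1, epochs=25):
    weights = [0.0, 0.0, 0.0]                  # bias weight + two input weights
    for _ in range(epochs):
        for inputs, target in examples:
            x = [1] + inputs                   # constant input 1 for the bias
            err = target - output(weights, x)  # Err = correct - actual
            weights = [w + alpha * xi * err for w, xi in zip(weights, x)]
    return weights

and_examples = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w = train(and_examples)
print([output(w, [1] + i) for i, _ in and_examples])  # [0, 0, 0, 1]
```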
Bayesian Learning in Belief Networks
Bayesian learning maintains a number of hypotheses about the data, and every
hypothesis is weighted by its posterior probability whenever a prediction is made.
The basic theme is that instead of entertaining just one hypothesis, many should
be entertained and weighed according to their likelihoods.
Since updating and reasoning with many hypotheses can be intractable,
the most popular approximation is to use a single most probable hypothesis, i.e., an Hi of
H that maximizes P(Hi | D), where D is the data. This is termed the
Maximum A Posteriori (MAP) hypothesis HMAP:
P(X | D) ≈ P(X | HMAP) × P(HMAP | D)
To find HMAP, you apply Bayes' rule:
P(Hi | D) = [P(D | Hi) × P(Hi)] / P(D)
Since P(D) is fixed across the hypotheses, you only have to maximize the
numerator. The first term is the probability of the data set under the model Hi.
The second term is the prior probability assigned to the model.
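Because P(D) is common to all hypotheses, the MAP hypothesis can be found by maximizing P(D | Hi) × P(Hi). A toy sketch with two invented coin-flip hypotheses:

```python
# Choosing the MAP hypothesis: maximize P(D | Hi) * P(Hi), since P(D) is
# the same for every hypothesis.  Two invented coin-flip hypotheses.
hypotheses = {
    "fair":   {"prior": 0.7, "p_heads": 0.5},
    "biased": {"prior": 0.3, "p_heads": 0.9},
}

data = ["H", "H", "H", "T", "H"]        # the observed flips D

def likelihood(h, flips):
    """P(D | Hi): product of per-flip probabilities under hypothesis h."""
    p = hypotheses[h]["p_heads"]
    result = 1.0
    for flip in flips:
        result *= p if flip == "H" else (1 - p)
    return result

scores = {h: likelihood(h, data) * hypotheses[h]["prior"] for h in hypotheses}
h_map = max(scores, key=scores.get)
print(h_map)  # 'fair': 0.5**5 * 0.7 exceeds 0.9**4 * 0.1 * 0.3
```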
Belief Network Learning Problems
Four belief network learning problems are generally distinguished, depending on
whether the network's structure is known or unknown and whether the network
variables are observable or hidden.
Known Structure (Fully Observable): Here, only the conditional probability
tables need to be learned. These tables can be estimated from the
statistics of the sample data set.
Unknown Structure (Fully Observable): Here, the major problem is
reconstructing the network topology. The problem can be treated as a search
through structure space, and fitting the data for each candidate structure
reduces to the fixed-structure problem.
Known Structure (Hidden Variables): This is analogous to neural network
learning.
Unknown Structure (Hidden Variables): When certain variables
cannot be observed, it becomes problematic to apply the prior techniques
for recovering structure, since they would require averaging
over all the possible values of the unknown variables.
Reinforcement Learning
Reinforcement learning applies where the agent cannot compare the results of its
actions directly and must instead discover a successful behaviour from occasional
rewards. This form of learning is more difficult than supervised learning
because the agent is not told what the right action is; it only knows whether it
is doing well or poorly.
Following are the two basic types of information that an agent tries to learn:
Utility Function: The agent learns the utility of being in various
states and then chooses its actions so as to maximize the expected utility of
their outcomes.
Action-Value: The agent learns an action-value function giving
the expected utility of performing a given action in a given state. This is called
Q-learning and is a model-free approach.
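The standard model-free Q-learning update is Q(s, a) ← Q(s, a) + α[r + γ max Q(s', a') − Q(s, a)]. A sketch on an invented four-state chain with a goal at one end (states, rewards and parameters are all illustrative):

```python
import random

# Model-free Q-learning on an invented four-state chain (goal: state 3).
# Actions: 0 = move left, 1 = move right; reward 1.0 only on reaching goal.
alpha, gamma = 0.5, 0.9
Q = {(s, a): 0.0 for s in range(4) for a in (0, 1)}

def step(s, a):
    s2 = max(0, min(3, s + (1 if a == 1 else -1)))
    return s2, (1.0 if s2 == 3 else 0.0)

random.seed(0)
for _ in range(500):                        # training episodes
    s = 0
    while s != 3:
        a = random.choice((0, 1))           # explore with random actions
        s2, r = step(s, a)
        best_next = max(Q[(s2, 0)], Q[(s2, 1)])
        # the Q-learning update: no model of transitions is needed
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

policy = [max((0, 1), key=lambda a: Q[(s, a)]) for s in range(3)]
print(policy)  # the greedy policy: move right in every non-goal state
```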
– 1    – 1101    + 101101
A rule which can be induced from this data is that strings with an even
number of 1's are '+' and those with an odd number of 1's are '–'. This rule
allows the classification of previously unseen strings (for example, 1001 is '+').
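The induced rule is easy to state as code; this sketch simply counts the 1's in a string and checks their parity:

```python
# The induced rule: an even number of 1's means '+', an odd number '-'.
def classify(bits):
    return "+" if bits.count("1") % 2 == 0 else "-"

for s in ("1", "1101", "101101", "1001"):
    print(s, classify(s))   # -, -, +, + respectively
```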
Check Your Progress
1. What is the difference between association rule learning and decision
tree learning?
2. Define reinforcement learning.
3. What is rote learning?
4. How will you define the learning in problem solving?
5. Give the techniques for modelling the inductive learning process.
[Figure: a learning automaton adapting its parameters through delayed feedback from the environment]
[Figure: a general learning system in which input passes through a learning element, knowledge base, performance element and feedback element to produce output]
Two players are required for this game. In each turn, a player must remove at
least one token and may not remove tokens from more than one row. The player
who takes the last token is the loser and the opponent is the winner.
Let us consider how to compute the utility value of being in a state. The goal
state has a high utility value, say 1, but the other states must be assigned utility
values as well, and the computation starts from the known states. Assume the
agent reaches the goal S7 from S1 via state S2, and count how many times S2 is
visited. If, over 100 experiments, S2 is visited 5 times, its utility can be estimated
as 5/100 = 0.05. Suppose the agent moves from S1 to S2 or S6, but not via S5;
this gives a probability of 0.5. If it is in S5, it can move to S2, S4 or S6 with a
probability of 0.25 each. The key to learning automation is updating the utility
values. AI uses adaptive dynamic programming, in which the utility of state i,
denoted U(i), is computed using the following expression:
U(i) = R(i) + Σj Mij U(j)
where R(i) represents the reward of being in state i and Mij represents the
probability of a transition from state i to state j.
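The utility equation above can be solved by repeated substitution, starting from zero utilities. The rewards and transition probabilities below are illustrative, not taken from the NIM example:

```python
# Solving U(i) = R(i) + sum_j M[i][j] * U(j) by repeated substitution.
R = [0.0, -0.1, 1.0]                   # reward of being in each state
M = [[0.0, 0.8, 0.2],                  # M[i][j]: probability of i -> j
     [0.0, 0.0, 1.0],
     [0.0, 0.0, 0.0]]                  # state 2 is terminal

U = [0.0, 0.0, 0.0]
for _ in range(50):                    # iterate until the values settle
    U = [R[i] + sum(M[i][j] * U[j] for j in range(3)) for i in range(3)]

print([round(u, 3) for u in U])  # [0.92, 0.9, 1.0]
```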
The given equation is called a constraint equation. In practice, several
piles of sticks are used in the NIM game. The configuration of
the piles is represented by a monotone sequence of
integers, for example (1, 3, 5, 7) or (2, 2, 3, 9, 110). A player may
remove (in one turn) any number of sticks from one pile of his or her choice. Thus,
(1, 3, 5, 7) would become (1, 1, 3, 5) if the player were to remove 6 sticks from
the last pile. The player who takes the last stick loses. With these rules in mind,
the NIM game (1, 2, 2) can be represented by Figure 5.7.
[Fig. 5.7 Game Tree for the NIM Game (1, 2, 2), with alternating MAX (you) and MIN (opponent) levels and node labels showing the remaining piles]
As shown in Figure 5.7, the root contains the three piles 1, 2, 2. Suppose
you are the player who makes the first move: you may take one or two sticks.
After your move it is your opponent's turn, and the numbers in the nodes
represent the sticks that are left. The opponent then removes one or two sticks,
the resulting position is shown in the next nodes, and so on until one stick is
left. This is the complete game scenario for the NIM tree.
[Figure: components of an expert system, in which experts and knowledge engineers build the knowledge base and users interact with it through the inference engine, working memory, software and hardware]
Until the mid-1980s, expert systems were primarily developed using the
Lisp and Prolog artificial intelligence languages. However, since these languages
required long development times of up to about ten years, their usage eventually
decreased to a large extent. The systems developed now generally make use of
expert system shell programs.
5.6.1 Need and Justification of Expert Systems
Nowadays, expert systems are applied to diverse fields. The need for these systems
is rising mainly due to the following reasons:
1. Human beings get tired from physical or mental workload but expert systems
are diligent.
2. Human beings can forget crucial details of a problem, but expert systems
are programmed to take care of the minutest detail.
3. Human beings may sometimes be inconsistent or biased in their decisions,
but expert systems always follow logic.
4. Human beings have limited working memory and are therefore unable to
comprehend large amounts of data quickly, but expert systems can store,
manipulate and retrieve large amounts of data in seconds.
The various advantages, which justify the huge costs associated with expert
systems, are as follows:
1. Expert systems reproduce the knowledge and skills possessed by experts.
This reproduction enables wide distribution of the expertise, making it
available at a reasonable cost.
2. Expert systems are always consistent in their problem-solving abilities,
providing uniform answers at all times. There are no emotional or health
considerations that can vary their performance.
3. Expert systems provide (almost) complete accessibility. They work 24 hours
a day, every day, including weekends and holidays. They are never tired, nor
do they ever take rest.
4. Expert systems also help in preserving expertise in situations where the
turnover of employees or experts is very high.
5. Expert systems are capable of solving problems even where complete
or exact data do not exist. This is an important feature because
complete and accurate information on a problem is rarely available in
the real world.
The applications of expert systems can be categorized into the following seven
major classes:
1. Diagnosis and troubleshooting devices: Expert systems can be used to
deduce faults and suggest corrective actions for malfunctioning devices or
processes.
2. Planning and scheduling: Expert systems are used to set goals and
determine a set of actions to achieve those goals. Such systems are widely
used for airline scheduling of flights, manufacturing job-shop scheduling and
manufacturing process planning.
3. Configuration of manufactured objects from subassemblies: One of
the most important expert system applications includes configuration,
whereby a solution to a problem is synthesized from a given set of elements
related by a set of constraints. The configuration technique is used in different
industries like modular home building, manufacturing and complex engineering
design and manufacturing.
4. Financial decision making: Expert system techniques are widely used in
the financial services industry. These programs assist the bankers in
determining whether to make loans to businesses and individuals. Insurance
companies also use these systems to assess the risk presented by the
customers and determine a price for the insurance. Expert systems are used
in foreign exchange trading.
5. Knowledge publishing: The primary function of expert systems used in
this area is to deliver knowledge that is relevant to the user’s problem. The
two most widely distributed expert systems which are used for knowledge
publishing are as follows: One is an advisor which counsels a user on
appropriate grammatical usage in a text; the second one is a tax advisor that
accompanies a tax preparation program and advises the user on tax strategy,
tactics and individual tax policy.
6. Process monitoring and control: Expert systems can also be used to
analyze real-time data from physical devices and notice anomalies, predict
trends and control optimality and failure correction. These systems can be
found in the steel making and oil refining industries.
7. Design and manufacturing: These systems assist in the design of physical
devices and processes, starting from high-level conceptual design of abstract
entities to factory floor configuration of manufacturing processes.
5.6.2 Stages of Expert Systems
As shown in Figure 5.9, the development of expert systems generally involves
the following stages:
Task Analysis
Knowledge Acquisition
Prototype Development
Production system architecture
One of the most common examples of expert system architecture is the
production system. In this type of system, knowledge is represented in the form of
IF-THEN production rules: IF antecedent, THEN consequent. The following
example is taken from the knowledge base of an expert system for marketing
analysis.
If: The person has good spoken and written communication skills.
Then: The person is considered to have the ability to work as a teacher.
Each production rule in such a system represents a single piece of knowledge, and
sets of related production rules are used to achieve a goal. Expert systems of this
type conduct a session in which the system attempts to find the best goal using
information supplied by the user. The sequence of events comprises a question-and-answer
session. The two main methods of reasoning used in this architecture
are as follows:
1. Forward chaining: This method involves checking the condition part of a
rule to determine whether it is true or false. If the condition is true, then the
action part of the rule is also true. This procedure continues until a solution
is found or a dead-end is reached. Forward chaining is commonly referred
to as data-driven reasoning.
2. Backward chaining: This is the reverse of forward chaining. It is used to
backtrack from a goal to the paths that lead to the goal. It is very useful
when all outcomes are known and the number of possible outcomes is not
large. In this case, a goal is specified and the expert system tries to determine
what conditions are needed to arrive at the specified goal. Backward chaining
is thus also called goal-driven.
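The forward-chaining method described above can be sketched in a few lines: rules fire whenever all of their conditions are present in working memory, and their conclusions are added as new facts until nothing more can be derived. The rule base echoes the teacher example and is invented for illustration:

```python
# Forward chaining: fire every rule whose IF part holds in working memory,
# adding the THEN part as a new fact, until nothing new can be derived.
rules = [
    ({"good_spoken_communication", "good_written_communication"},
     "can_teach"),
    ({"can_teach", "has_degree"}, "qualified_teacher"),
]

facts = {"good_spoken_communication", "good_written_communication",
         "has_degree"}

changed = True
while changed:
    changed = False
    for conditions, conclusion in rules:
        if conditions <= facts and conclusion not in facts:
            facts.add(conclusion)       # condition true: assert the action
            changed = True

print(sorted(facts))
```

Backward chaining would instead start from the goal `qualified_teacher` and work backwards through the rules to find the conditions that must hold.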
Non-production system architecture
The non-production system architectures of certain expert systems do not use a rule
representation scheme. These systems employ more structured representation
schemes such as frames, decision trees or specialized networks like neural networks.
Some of these architectures are discussed below.
Frame architecture
Frames are structured sets of closely related knowledge, which may include an
object's or concept's name, the main attributes of the object, their corresponding
values and possibly some attached procedures. These values are stored in specified
slots of the frame, and individual frames are usually linked together.
Decision tree architecture
An expert system may also store information in the form of a decision tree, that is,
in a top-to-bottom manner. The values of the attributes of an object determine a
path to a leaf node in the tree, which contains the object's identification. Each
object attribute corresponds to a non-terminal node in the tree and each branch of
the decision tree corresponds to a set of values. New nodes and branches can be
added to the tree when additional attributes are needed to further discriminate
among new objects.
Blackboard system architecture
Blackboard architecture is a special type of knowledge-based system which uses
a form of opportunistic reasoning. H. Penny Nii (1986) aptly described the
blackboard problem-solving strategy through the following analogy.
‘Imagine a room with a large blackboard on which a group of experts are
piecing together a jigsaw puzzle. Each of the experts has some special knowledge
about solving puzzles, such as a border expert, a shapes expert, a colour expert,
etc. Each member examines his or her pieces and decides if they will fit into the
partially completed puzzle. Those members having appropriate pieces go up to
the blackboard and update the evolving solution. The whole puzzle can be solved
in complete silence with no direct communication among members of the group.
Each person is self-activating, knowing when to contribute to the solution. The
solution evolves in this incremental way, with experts contributing dynamically on
an opportunistic basis, that is, as the opportunity to contribute to the solution
arises.’
The objects on the blackboard are hierarchically organized into levels, which
facilitates analysis and solution. Information from one level serves as input to a set
of knowledge sources. The sources modify the knowledge and place it on the
same or different levels.
The blackboard architecture has been applied in the HEARSAY family of projects,
speech understanding systems developed to analyse complex scenes and to model
human cognitive processes.
Analogical reasoning architecture
Expert systems based on analogical architectures solve problems by finding similar
problems and their solutions and applying the known solution to the new problem,
possibly with some kind of modification.
These architectures require a large knowledge base having numerous
problem solutions. Previously encountered situations are stored as units in memory
and are content-indexed for rapid retrieval.
5.6.3 Representing and Using Domain Knowledge
An expert system requires a knowledge base in the domain in which it is developed
to solve the problems. This domain knowledge base must be such that the expert
system is able to use it efficiently. The most common representation of the
knowledge base in expert systems is a set of production rules. These production
rules are usually combined with a frame system, which provides a definition for
the objects that occur in the rules. Different expert systems operate on the rules
in different ways. For example, the following shows, in English, a rule from R1,
an expert system used to configure DEC VAX computers:
if: the most current and active context is distributing
    mass-bus devices, and
    there is a single-port disk drive, which has not been
    assigned to a mass-bus, and
    there are no unassigned dual-port disk drives, and
    the number of devices that each mass-bus should support
    is known, and
    there is a mass-bus that has been assigned to at least one
    disk drive and that should support the additional disk
    drives, and
    the type of cable needed to connect the disk drive to the
    previous device on the mass-bus is known
then: assign the disk drive to the mass-bus.
The above program, called R1, has a knowledge domain that contains a
set of actions to be taken for each circumstance. Also, it does not need to consider
all the possible alternatives, as it is responsible for design tasks and hence
does not require probabilistic information. Similarly, every expert system designed
for carrying out distinct tasks has its own knowledge domain. These systems
also make use of a reasoning mechanism, which is required in order to apply their
knowledge to a given problem. Since these systems are rule-based systems, they
use forward chaining, backward chaining or mixed chaining algorithms for
reasoning.
5.6.4 Functioning of MYCIN and Rule Induction (RI)
MYCIN was an early backward-chaining expert system that applied artificial
intelligence to identify bacteria causing severe infections, such
as bacteremia and meningitis. It recommended antibiotics, with the dosage adjusted
according to the patient's body weight. The system was also used for the diagnosis
of blood clotting diseases. It was developed over five or six years in the early 1970s
at Stanford University and written in Lisp as the doctoral dissertation of Edward
Shortliffe under the direction of Bruce G. Buchanan, Stanley N. Cohen and others.
MYCIN operated using a fairly simple inference engine and a knowledge
base of ~600 rules. It would query the physician running the program via a long
series of simple yes/no or textual questions. At the end, it provided a list of possible
culprit bacteria ranked from high to low based on the probability of each diagnosis,
its confidence in each diagnosis’ probability, the reasoning behind each diagnosis
(that is, MYCIN would also list the questions and rules which led it to rank a
diagnosis a particular way), and its recommended course of drug treatment. The
MYCIN Expert System used backward chaining technology to diagnose infections
based on symptoms and medical history and recommend treatment based on the
data received.
Rule induction is an area of machine learning in which formal rules are
extracted from a set of observations. The rules extracted may represent a full
scientific model of the data, or merely represent local patterns in the data. Rule
induction is a technique that creates “if–then–else”-type rules from a set of input
variables and an output variable. A typical rule induction technique, such as Quinlan’s
C5, can be used to select variables because, as part of its processing, it applies
information theory calculations in order to choose the input variables (and their
values) that are most relevant to the values of the output variables. Therefore, the
least related input variables and values get pruned and disappear from the tree.
Once the tree is generated, the variables chosen by the rule induction technique
can be noted in the branches and used as a subset for further processing and
analysis. Remember that the values of the output variable (the outcome of the rule)
are in the terminal (leaf) nodes of the tree. The rule induction technique also gives
additional information about the values and the variables: the ones higher up in the
tree are more general and apply to a wider set of cases, whereas the ones lower
down are more specific and apply to fewer cases.
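The information-theoretic selection step can be sketched as an entropy and information-gain calculation over a toy data set (the attributes and labels below are invented for the sketch):

```python
import math

# Information-gain calculation of the kind a rule-induction technique
# uses to pick the most relevant input variable.
def entropy(labels):
    total = len(labels)
    return -sum((labels.count(v) / total) * math.log2(labels.count(v) / total)
                for v in set(labels))

def info_gain(rows, labels, attr):
    """Reduction in label entropy from splitting on attribute `attr`."""
    gain = entropy(labels)
    for value in set(r[attr] for r in rows):
        subset = [l for r, l in zip(rows, labels) if r[attr] == value]
        gain -= (len(subset) / len(labels)) * entropy(subset)
    return gain

rows = [{"outlook": "sunny", "windy": True},
        {"outlook": "sunny", "windy": False},
        {"outlook": "rain",  "windy": True},
        {"outlook": "rain",  "windy": False}]
labels = ["no", "no", "yes", "yes"]

print(info_gain(rows, labels, "outlook"))  # 1.0: outlook decides the label
print(info_gain(rows, labels, "windy"))    # 0.0: windy is irrelevant here
```

The variable with the highest gain would be chosen for the next split; irrelevant variables get pruned, as described above.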
5.8 SUMMARY
Short-Answer Questions
1. What do you know about Bayesian networks?
2. List the various approaches of learning.
3. What is reinforcement learning?
4. Define inductive learning.
5. List the steps of Q-learning algorithm.
6. What are the advantages of a neural network?
7. How are expert systems different from conventional computer programs?
8. How can the huge costs associated with expert systems be justified?
9. What is the process of building up an expert system?
10. Distinguish between the decision tree and analogical reasoning architecture
of expert systems.
Long-Answer Questions
1. Discuss the theory and approaches of learning in detail.
2. Describe the various learning processes implemented in learning automation.
3. Write a short note on each of the following:
(i) Learning by Induction
(ii) Neural Networks
4. Explain the various components of a typical expert system.
5. Describe some applications of expert systems with the help of suitable
examples.
6. Elaborate on the production and non-production system architectures of
expert systems.
7. Discuss the functioning of MYCIN and rule induction. Give appropriate
examples.