CZ3005 Module 2 - Intelligent Agents and Search
Artificial Intelligence
https://personal.ntu.edu.sg/hanwangzhang/
Email: hanwangzhang@ntu.edu.sg
Office: N4-02c-87
Lesson Outline
[Diagram: an Agent receives Percepts from the Environment and acts on it through its Effectors (Actions).]
Rational Agents
• A rational agent is one that does the right thing
• Rational action: action that maximises the expected value
of an objective performance measure given the percept
sequence to date
• Rationality depends on:
• performance measure
• everything that the agent has perceived so far
• built-in knowledge about the environment
• actions that can be performed
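A minimal Python sketch of this percept-action loop (the class names and the reflex rule below are illustrative, not from the lecture):

```python
# Sketch of the agent abstraction: an agent maps the percept
# sequence seen so far to an action.  Names are illustrative.
class Agent:
    def __init__(self):
        self.percepts = []          # everything perceived so far

    def program(self, percept):
        """Override: choose an action from the percept history."""
        raise NotImplementedError

    def act(self, percept):
        self.percepts.append(percept)
        return self.program(percept)

class ReflexVacuum(Agent):
    """A simple reflex agent for the two-location vacuum world
    discussed later in this module."""
    def program(self, percept):
        location, dirty = percept
        if dirty:
            return "suck"
        return "right" if location == "A" else "left"

agent = ReflexVacuum()
print(agent.act(("A", True)))   # -> suck
print(agent.act(("A", False)))  # -> right
```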
Example: Google X2: Driverless Taxi
• Percepts: video, speed, acceleration, engine status, GPS,
radar, …
• Actions: steer, accelerate, brake, horn, display, …
• Goals (Measure): safety, reach destination, maximise
profits, obey laws, passenger comfort,…
• Environment: Singapore urban
streets, highways, traffic,
pedestrians, weather, customers, …
Image source: https://en.wikipedia.org/wiki/Waymo#/media/File:Waymo_Chrysler_Pacifica_in_Los_Altos,_2017.jpg
More Examples
Agent Type | Percepts | Actions | Goals | Environment
--- | --- | --- | --- | ---
Medical diagnosis system | Symptoms, findings, patient's answers | Questions, tests, treatments | Healthy patient, minimize costs | Patient, hospital
Part-picking robot | Pixels of varying intensity | Pick up parts and sort into bins | Place parts in correct bins | Conveyor belt with parts
Interactive English tutor | Typed words | Print exercises, suggestions, corrections | Maximize student's score on test | Set of students
Types of Environment
• Accessible (vs inaccessible): the agent's sensory apparatus gives it access to the complete state of the environment
• Deterministic (vs nondeterministic): the next state of the environment is completely determined by the current state and the actions selected by the agent
• Episodic (vs sequential): each episode is not affected by previously taken actions
• Static (vs dynamic): the environment does not change while the agent is deliberating
• Discrete (vs continuous): a limited number of distinct percepts and actions
Example: Driverless Taxi
Accessible? No. Some traffic information on the road is missing
Deterministic? No. Cars in front may turn right suddenly
Episodic? No. The current action depends on previous driving actions
Static? No. While the taxi moves, other cars are moving as well
Discrete? No. Speed, distance, and fuel consumption take values in real domains
Example: Chess
Accessible? Yes. All positions on the chessboard can be observed
Deterministic? Yes. The outcome of each move can be determined
Episodic? No. Each move depends on previous moves
Static? Yes, without a clock: while you are considering the next move, the opponent cannot move. Semi, with a clock: when time is up, you forfeit the move
Discrete? Yes. All positions and moves are in discrete domains
More Examples
[Table: further example environments classified as accessible, deterministic, episodic, static, and discrete.]
Idea
• Systematically considers the expected outcomes of
different possible sequences of actions that lead to
states of known value
• Choose the best one
• shortest journey from A to B?
• most cost-effective journey from A to B?
Design of Problem-Solving Agent
Steps
1. Goal formulation
2. Problem formulation (states, actions, goal test)
3. Solution (a sequence of legal actions to execute)
4. Search (algorithm of how to find a good solution)
• No knowledge → uninformed search
• Knowledge → informed search
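A problem formulation in the sense of step 2 can be written down directly in code. A minimal sketch (the class and method names are assumptions, not from the lecture):

```python
# Minimal sketch of a problem formulation: states, actions,
# goal test, and step costs.  Names are illustrative.
class SearchProblem:
    def __init__(self, initial, goal):
        self.initial = initial
        self.goal = goal

    def actions(self, state):
        """Legal actions available in `state`."""
        raise NotImplementedError

    def result(self, state, action):
        """State reached by performing `action` in `state`."""
        raise NotImplementedError

    def goal_test(self, state):
        return state == self.goal

    def step_cost(self, state, action, next_state):
        return 1  # default: every legal action costs 1
```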
Example: Romania
Goal: be in Bucharest
Problem:
– states: various cities
– actions: drive between cities
– goal test: city == Bucharest?
Solution:
– sequence of cities, e.g., Arad, Sibiu, Fagaras, Bucharest
[Map of Romania showing the cities Oradea, Neamt, Zerind, Iasi, Arad, Sibiu, Fagaras, Vaslui, Urziceni, Mehadia, Bucharest, Eforie, Dobreta, Craiova, and Giurgiu, and the roads connecting them.]
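As a concrete sketch of this formulation, the relevant part of the map can be encoded as a weighted graph. The distances below follow the standard AIMA Romania map (an assumption; only the roads needed for the Arad-Bucharest example are included):

```python
# Partial road map of Romania (distances in km).
ROADS = {
    "Arad": {"Zerind": 75, "Sibiu": 140, "Timisoara": 118},
    "Sibiu": {"Arad": 140, "Oradea": 151, "Fagaras": 99, "Rimnicu Vilcea": 80},
    "Fagaras": {"Sibiu": 99, "Bucharest": 211},
    "Rimnicu Vilcea": {"Sibiu": 80, "Pitesti": 97, "Craiova": 146},
    "Pitesti": {"Rimnicu Vilcea": 97, "Craiova": 138, "Bucharest": 101},
}

def path_cost(path):
    """Total driving distance of a path given as a list of cities."""
    return sum(ROADS[a][b] for a, b in zip(path, path[1:]))

print(path_cost(["Arad", "Sibiu", "Fagaras", "Bucharest"]))  # 450
```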
Example: Vacuum Cleaner Agent
• Robotic vacuum cleaners move autonomously
• Some can come back to a docking station to charge their batteries
• A few are able to empty their dust containers into the dock as well
Example: A Simple Vacuum World
Two locations, each location may or may not contain dirt, and the agent may
be in one location or the other
[Figure: the eight possible world states, numbered 1 to 8.]
States: 8 possible world states
Actions: left, right, and suck
Goal: clean up all dirt; two goal states, i.e., {7, 8}
Single-State Problem
– e.g.: start in #5
– Solution: right, suck
Multiple-State Problem
– Inaccessible world state (with limited sensory
information):
agent only knows which set of states it is in
– Known outcome of action (deterministic)
• Solution:
a path (connecting sets of states) that leads to a set of states all of which are
goal states
Example: Vacuum World (Multiple-State Version)
States: subset of the eight states
Operators: left, right, suck
Goal test: all states in state set have no dirt
Path cost: 1 per operator
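A minimal sketch of the belief-state idea, using a structured state (agent location, dirt left, dirt right) instead of the state numbers 1 to 8; this encoding is an assumption consistent with the figure:

```python
from itertools import product

# A world state is (agent_location, dirt_left, dirt_right),
# with locations "L"/"R".
def result(state, action):
    loc, dl, dr = state
    if action == "left":
        return ("L", dl, dr)
    if action == "right":
        return ("R", dl, dr)
    if action == "suck":
        return (loc, False, dr) if loc == "L" else (loc, dl, False)

def update(belief, action):
    """Deterministic actions map a set of states to a set of states."""
    return {result(s, action) for s in belief}

def all_clean(belief):
    return all(not dl and not dr for _, dl, dr in belief)

# No sensors: the agent could start in any of the 8 states.
belief = set(product("LR", [True, False], [True, False]))
for a in ["right", "suck", "left", "suck"]:
    belief = update(belief, a)
print(all_clean(belief))  # True: every reachable state is a goal state
```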
Example: 8-puzzle
• States: integer locations of tiles
• number of states = 9!
• Actions: move blank left, right, up, down
• Goal test: current state = goal state (given)
• Path cost: 1 per move
[Figure: a start state and the goal state of the 8-puzzle.]
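A minimal sketch of this formulation in Python (the row-major tuple encoding and the goal layout are assumptions):

```python
# 8-puzzle sketch: a state is a tuple of 9 tiles in row-major order,
# with 0 standing for the blank.  Moves shift the blank.
MOVES = {"up": -3, "down": 3, "left": -1, "right": 1}

def actions(state):
    """Legal blank moves that stay on the 3x3 board."""
    i = state.index(0)
    row, col = divmod(i, 3)
    legal = []
    if row > 0: legal.append("up")
    if row < 2: legal.append("down")
    if col > 0: legal.append("left")
    if col < 2: legal.append("right")
    return legal

def result(state, action):
    i = state.index(0)
    j = i + MOVES[action]
    s = list(state)
    s[i], s[j] = s[j], s[i]      # swap blank with neighbouring tile
    return tuple(s)

goal = (0, 1, 2, 3, 4, 5, 6, 7, 8)
start = (1, 0, 2, 3, 4, 5, 6, 7, 8)
print(result(start, "left") == goal)  # True: one move solves this start
```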
Real-World Problems
Route finding problems:
• Routing in computer networks
• Robot navigation
• Automated travel advisory
• Airline travel planning
Touring problems:
• Traveling Salesperson problem
• “Shortest tour": visit every city exactly once
Question to Think About
AlphaGo
https://forms.gle/dmFLBxgx7kBktFPdA
Search Algorithms
• Exploration of the state space by generating successors of already-explored states
• Frontier: candidate nodes for expansion
• Explored set: nodes that have already been expanded
[Figure: partial search tree on the Romania map: Arad is expanded into Sibiu, Timisoara, and Zerind; Sibiu is then expanded into Arad, Oradea, Fagaras, and Rimnicu Vilcea, reaching Bucharest further down.]
Search Strategies
• A strategy is defined by picking the order of node expansion.
• Strategies are evaluated along the following dimensions: completeness, time complexity, space complexity, and optimality.
[Figure: breadth-first expansion order on a small binary tree: A; then B, C; then D, E, F, G.]
[Figure: uniform-cost search on a small weighted graph from S to G, expanding the frontier node with the smallest path cost g at each step.]
Here we do not expand nodes that have been expanded.
Uniform-Cost Search
Expand the unexpanded node with the smallest path cost g, implemented with a priority queue
Complete: Yes
Time: # of nodes with path cost g ≤ cost of the optimal solution (eqv. # of nodes popped from the priority queue)
Space: # of nodes with path cost g ≤ cost of the optimal solution
Optimal: Yes
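A minimal uniform-cost search sketch in Python over the ROADS graph introduced earlier (priority queue ordered by g; stale duplicate queue entries are skipped when popped):

```python
import heapq

def uniform_cost_search(graph, start, goal):
    """Return (cost, path) for the cheapest path, or None."""
    frontier = [(0, start, [start])]   # priority queue ordered by g
    explored = set()
    while frontier:
        g, state, path = heapq.heappop(frontier)
        if state == goal:
            return g, path
        if state in explored:
            continue                   # skip stale queue entries
        explored.add(state)
        for nxt, cost in graph.get(state, {}).items():
            if nxt not in explored:
                heapq.heappush(frontier, (g + cost, nxt, path + [nxt]))
    return None

print(uniform_cost_search(ROADS, "Arad", "Bucharest"))
# (418, ['Arad', 'Sibiu', 'Rimnicu Vilcea', 'Pitesti', 'Bucharest'])
```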
Depth-First Search
Expand the deepest unexpanded node, which can be implemented with a Last-In-First-Out (LIFO) stack; backtrack only when no further expansion is possible.
[Figure: depth-first expansion order on a small binary tree, diving down the leftmost branch before backtracking.]
Depth-First Search
Denote
• b: branching factor
• d: maximum depth of the state space
Time: O(b^d)
Space: O(bd)
Optimal: No
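A matching depth-first sketch with an explicit LIFO stack, again on the ROADS graph (illustrative, not the lecture's code):

```python
def depth_first_search(graph, start, goal):
    """Return some path from start to goal, or None."""
    stack = [(start, [start])]
    explored = set()
    while stack:
        state, path = stack.pop()      # LIFO: deepest node first
        if state == goal:
            return path
        if state in explored:
            continue
        explored.add(state)
        for nxt in graph.get(state, {}):
            if nxt not in explored:
                stack.append((nxt, path + [nxt]))
    return None

# One possible path; the result depends on neighbour ordering and
# need not be the shortest route.
print(depth_first_search(ROADS, "Arad", "Bucharest"))
```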
Iterative Deepening Search
Iteratively increase the depth limit l of depth-limited search (DLS), one by one:
Limit = 0
Limit = 1
Limit = 2
Limit = 3
Iterative Deepening Search...
function ITERATIVE-DEEPENING-SEARCH(problem) returns a solution sequence
  inputs: problem, a problem
  for depth ← 0 to ∞ do
    if DEPTH-LIMITED-SEARCH(problem, depth) succeeds then return its result
  end
  return failure

Complete: Yes
Time: O(b^d), where d is the depth of the shallowest solution
Space: O(bd)
Optimal: Yes
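A minimal Python sketch of the pseudocode above on the ROADS graph (the cycle check along the current path is an added assumption to keep the sketch terminating):

```python
import itertools

def depth_limited_search(graph, state, goal, limit, path=None):
    path = path or [state]
    if state == goal:
        return path
    if limit == 0:
        return None                    # cutoff reached
    for nxt in graph.get(state, {}):
        if nxt not in path:            # avoid cycles along this path
            found = depth_limited_search(graph, nxt, goal, limit - 1,
                                         path + [nxt])
            if found:
                return found
    return None

def iterative_deepening_search(graph, start, goal):
    for depth in itertools.count():    # depth = 0, 1, 2, ...
        result = depth_limited_search(graph, start, goal, depth)
        if result:
            return result

print(iterative_deepening_search(ROADS, "Arad", "Bucharest"))
# ['Arad', 'Sibiu', 'Fagaras', 'Bucharest']  (shallowest solution)
```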
Summary (we make assumptions for optimality)

Criterion | Breadth-First | Uniform-Cost | Depth-First | Depth-Limited | Iterative Deepening | Bidirectional (if applicable)
--- | --- | --- | --- | --- | --- | ---
Time | b^d | b^⌈C*/ε⌉ | b^m | b^l | b^d | b^(d/2)
Space | b^d | b^⌈C*/ε⌉ | bm | bl | bd | b^(d/2)
Optimal | Yes | Yes | No | No | Yes | Yes
Complete | Yes | Yes | No | Yes, if l ≥ d | Yes | Yes

(b: branching factor; d: depth of the shallowest solution; m: maximum depth; l: depth limit; C*: optimal cost; ε: minimum step cost)
General Search
Uninformed search strategies:
• Systematic generation of new states (goal test)
• Inefficient (exponential space and time complexity)
Informed search strategies:
• Use problem-specific knowledge to decide the order of node expansion
• Best-first search: expand the most desirable unexpanded node
• Use an evaluation function to estimate the "desirability" of each node
Evaluation function
• Path-cost function g(n)
• Cost from the initial state to the current state (search node n)
• Information on the cost toward the goal is not required.
• "Heuristic" function h(n)
• Estimated cost of the cheapest path from n to a goal state
• The exact cost cannot be determined
• h(n) depends only on the state at that node (may not include history)
• h(n) is admissible if it is not larger than the real cost
Greedy Search
Expand the node that appears to be closest to the goal
• Evaluation function h(n): estimate of the cost from n to the goal
• function GREEDY-SEARCH(problem) returns solution
  return BEST-FIRST-SEARCH(problem, h)  // f(n) = h(n)
a) The initial state: Arad (h = 366)
Straight-line distance to Bucharest:
Arad 366, Bucharest 0, Craiova 160, Dobreta 242, Eforie 161, Fagaras 176, Giurgiu 77, Hirsova 151, Iasi 226, Lugoj 244, Mehadia 241, Neamt 234, Oradea 380, Pitesti 98, Rimnicu Vilcea 193, Sibiu 253, Timisoara 329, Urziceni 80, Vaslui 199, Zerind 374
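A minimal greedy best-first sketch using these straight-line distances as h, reusing the ROADS graph from earlier (only the cities reachable in this example are included in the heuristic table):

```python
import heapq

# Straight-line distances to Bucharest, from the table above.
H_SLD = {"Arad": 366, "Bucharest": 0, "Craiova": 160, "Fagaras": 176,
         "Oradea": 380, "Pitesti": 98, "Rimnicu Vilcea": 193,
         "Sibiu": 253, "Timisoara": 329, "Zerind": 374}

def greedy_search(graph, h, start, goal):
    """Best-first search with f(n) = h(n)."""
    frontier = [(h[start], start, [start])]
    explored = set()
    while frontier:
        _, state, path = heapq.heappop(frontier)
        if state == goal:
            return path
        if state in explored:
            continue
        explored.add(state)
        for nxt in graph.get(state, {}):
            if nxt not in explored:
                heapq.heappush(frontier, (h[nxt], nxt, path + [nxt]))
    return None

print(greedy_search(ROADS, H_SLD, "Arad", "Bucharest"))
# ['Arad', 'Sibiu', 'Fagaras', 'Bucharest'] -- found, but not optimal
```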
Example
b) After expanding Arad: the frontier contains Sibiu (253), Timisoara (329), and Zerind (374); greedy search expands Sibiu, the node with the smallest straight-line distance to Bucharest.
Example
c) Expanding the node with the smallest straight-line distance at each step leads to the path Arad, Sibiu, Fagaras, Bucharest.
Is this path optimal? Answer: No (the route through Rimnicu Vilcea and Pitesti is shorter).
Greedy Search...
• m: maximum depth of the search space
Complete: No
Time: O(b^m)
Space: O(b^m)
Optimal: No
A* Search
• Uniform-cost search
• g(n): cost to reach n (Past Experience)
• optimal and complete, but can be very
inefficient
• Greedy search
• h(n): cost from n to goal (Future Prediction)
• neither optimal nor complete, but cuts search
space considerably
A* Search
Idea: combine greedy search with uniform-cost search
Evaluation function: f(n) = g(n) + h(n)
(a) The initial state: Arad 366 = 0 + 366
(b) After expanding Arad: Sibiu 393 = 140 + 253, Timisoara 447 = 118 + 329, Zerind 449 = 75 + 374
(c) After expanding Sibiu: Arad 646 = 280 + 366, Oradea 671 = 291 + 380, Fagaras 415 = 239 + 176, Rimnicu Vilcea 413 = 220 + 193
Example: Route-finding from Arad to Bucharest
Best-first search with evaluation function f(n) = g(n) + h(n)
After expanding Rimnicu Vilcea (413): Craiova 526 = 366 + 160, Pitesti 415 = 317 + 98, Sibiu 553 = 300 + 253
After expanding Fagaras (415): Sibiu 591 = 338 + 253, Bucharest 450 = 450 + 0
Example: Route-finding from Arad to Bucharest
Bucharest (450) is not returned yet: Pitesti (415) has a smaller f and is expanded first, generating Bucharest 418 = 418 + 0 along with Craiova and Rimnicu Vilcea. A* then returns the optimal path Arad, Sibiu, Rimnicu Vilcea, Pitesti, Bucharest with cost 418.
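A minimal A* sketch combining the two, ordered by f(n) = g(n) + h(n); it reuses the ROADS graph and H_SLD table above and reproduces the 418-cost route in the trace:

```python
import heapq

def a_star_search(graph, h, start, goal):
    """Best-first search with f(n) = g(n) + h(n)."""
    frontier = [(h[start], 0, start, [start])]   # (f, g, state, path)
    best_g = {start: 0}
    while frontier:
        f, g, state, path = heapq.heappop(frontier)
        if state == goal:
            return g, path
        for nxt, cost in graph.get(state, {}).items():
            g2 = g + cost
            if g2 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g2      # found a cheaper route to nxt
                heapq.heappush(frontier,
                               (g2 + h[nxt], g2, nxt, path + [nxt]))
    return None

print(a_star_search(ROADS, H_SLD, "Arad", "Bucharest"))
# (418, ['Arad', 'Sibiu', 'Rimnicu Vilcea', 'Pitesti', 'Bucharest'])
```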
Another Example: Monte Carlo Tree Search (MCTS)
For Go, the search space is enormous: #Moves ≈ 250^150, i.e., (#stone positions per turn) raised to (#turns).
MCTS key steps: each node keeps statistics #Wins/#Games; these statistics drive selection and expansion, leaves are evaluated by simulated self-play, and the results are propagated back up the tree.
MCTS for Go is very expensive and slow.
Solution:
1. For selection/expansion, no longer use statistics (#Wins/#Games) but f = g + h
2. For simulation, no longer run real self-play but use f = g + h
MCTS update: f(n) = g(n) + h(n). No real play, just an estimation.
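A tiny illustrative contrast between the two scoring schemes. The UCB1 rule below is a standard MCTS selection formula, assumed here; the lecture itself only specifies win statistics (#Wins/#Games) versus f = g + h:

```python
import math

def ucb1(wins, games, parent_games, c=1.4):
    """Statistics-based score: exploit the win rate, but also
    explore rarely visited children."""
    return wins / games + c * math.sqrt(math.log(parent_games) / games)

def value_estimate(g, h):
    """AlphaGo-style shortcut: replace rollout statistics with an
    estimated value f = g + h (no real self-play needed)."""
    return g + h

children = {"move_a": (30, 100), "move_b": (5, 10)}   # (wins, games)
parent_games = 110
best = max(children, key=lambda m: ucb1(*children[m], parent_games))
print(best)  # move_b: lower win rate but far less explored
```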