Chapter3 - Search4
Lê Thanh Hương
School of Information and Communication Technology - HUST
Outline
Local beam search
Games and search
Why study games?
Relation of Games to Search
• Search – no adversary
➢ Solution is a (heuristic) method for finding the goal
➢ Heuristics and CSP techniques can find an optimal solution
➢ Evaluation function: estimate of cost from start to goal through a given node
➢ Examples: path planning, scheduling activities
• Games – adversary
➢ Solution is a strategy (a strategy specifies a move for every possible opponent reply)
➢ Time limits force an approximate solution
➢ Evaluation function: evaluates the "goodness" of a game position
➢ Examples: chess, checkers, Othello, backgammon
• Ignoring computational complexity, games are a perfect application for a complete search.
• Of course, ignoring complexity is a bad idea, so games are a good place to study resource-bounded search.
Types of Games
                       deterministic                   chance
perfect information    chess, checkers, go, othello    backgammon, monopoly
Minimax
Optimal strategies
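$$
\text{MINIMAX-VALUE}(n) =
\begin{cases}
\text{UTILITY}(n) & \text{if } n \text{ is a terminal node} \\
\max_{s \in \text{successors}(n)} \text{MINIMAX-VALUE}(s) & \text{if } n \text{ is a max node} \\
\min_{s \in \text{successors}(n)} \text{MINIMAX-VALUE}(s) & \text{if } n \text{ is a min node}
\end{cases}
$$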
Minimax algorithm
Properties of minimax
Problem of minimax search
Alpha-beta pruning:
• Remove branches that do not influence final decision
• Revisit example …
α-β pruning
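◼ Alpha value: the best value found so far for MAX, hence the highest value so far
◼ Beta value: the best value found so far for MIN, hence the lowest value so far
◼ At MAX level: compare the value V of the node to the beta value. If V ≥ beta, pass V to the parent node and BREAK (MIN will never allow play to reach this node)
◼ At MIN level: compare the value V of the node to the alpha value. If V ≤ alpha, pass V to the parent node and BREAK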
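A minimal sketch of alpha-beta pruning, using the standard cutoff tests (stop at a MAX node once v ≥ β, stop at a MIN node once v ≤ α). The nested-list tree encoding, the `visited` list used to record which leaves are examined, and the example values are assumptions made for illustration.

```python
import math


def alphabeta(node, is_max, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of `node` with alpha-beta pruning.

    A number is a terminal utility; a list holds successors (hypothetical
    encoding). `visited`, if given, collects the leaves actually examined.
    """
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)
        return node
    if is_max:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, False, alpha, beta, visited))
            if v >= beta:        # MIN above will never let play reach here
                break
            alpha = max(alpha, v)
        return v
    v = math.inf
    for child in node:
        v = min(v, alphabeta(child, True, alpha, beta, visited))
        if v <= alpha:           # MAX above already has a better option
            break
        beta = min(beta, v)
    return v


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
leaves = []
print(alphabeta(tree, True, visited=leaves))  # 3
print(leaves)                                 # [3, 12, 8, 2, 14, 5, 2]
```

In this example the leaves 4 and 6 are never examined: once the second MIN node has seen the value 2, it cannot beat the 3 that MAX already has.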
α-β pruning example
Properties of α-β
• A simple example of the value of reasoning about which computations are relevant (a
form of metareasoning)
Why is it called α-β?
• α is the value of the best (i.e., highest-value) choice found so far at any
choice point along the path for max
• If v is worse than α, max will avoid it
→ prune that branch
• Define β similarly for min
The α-β algorithm
Imperfect, real-time decisions
Cut-off search
• Change:
if TERMINAL-TEST(state) then return UTILITY(state)
into:
if CUTOFF-TEST(state,depth) then return EVAL(state)
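The substitution above can be sketched as a depth-limited minimax. Everything specific here is an assumption for illustration: the nested-list tree encoding, the depth limit passed as `limit`, and especially `eval_fn`, which uses the average of the leaves below a node as a crude stand-in for a real evaluation function (it does agree with UTILITY on terminal nodes).

```python
def leaves_below(node):
    """All terminal utilities in the subtree rooted at `node`."""
    if not isinstance(node, list):
        return [node]
    return [x for child in node for x in leaves_below(child)]


def eval_fn(node):
    """Hypothetical EVAL: average utility of the leaves below `node`."""
    vals = leaves_below(node)
    return sum(vals) / len(vals)


def cutoff_test(node, depth, limit):
    """Cut off at terminal nodes or once the depth limit is reached."""
    return not isinstance(node, list) or depth >= limit


def h_minimax(node, depth, is_max, limit):
    """Minimax with CUTOFF-TEST / EVAL in place of TERMINAL-TEST / UTILITY."""
    if cutoff_test(node, depth, limit):
        return eval_fn(node)
    values = [h_minimax(c, depth + 1, not is_max, limit) for c in node]
    return max(values) if is_max else min(values)


tree = [[3, 12, 9], [2, 4, 6], [14, 5, 2]]
print(h_minimax(tree, 0, True, limit=2))  # 3.0 (full search, exact value)
print(h_minimax(tree, 0, True, limit=1))  # 8.0 (cut off after one ply)
```

With the shallower limit the search returns an estimate (8.0) rather than the true minimax value (3.0), which is the trade-off that cut-off search accepts in exchange for speed.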
Heuristic evaluation (EVAL)
• Idea: produce an estimate of the expected utility of the game from a given
position.
• Requirements:
➢ EVAL should order terminal nodes in the same way as UTILITY.
➢ Computation must not take too long.
➢ For non-terminal states, EVAL should be strongly correlated with the actual chance of winning.
• Example (tic-tac-toe):
Evaluation value e(p) for each state p:
e(p) = (# open rows, columns, diagonals for MAX)
- (# open rows, columns, diagonals for MIN)
• A line is open for MAX if it contains no o; a line is open for MIN if it contains no x
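The open-lines evaluation can be written out as a short sketch. It assumes MAX plays x and MIN plays o, and it encodes a board as a 9-character string read row by row with "." for an empty cell; that encoding and the function names are illustrative choices, not taken from the slides.

```python
# The 8 lines of the board: 3 rows, 3 columns, 2 diagonals.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def open_lines(board: str, opponent: str) -> int:
    """Number of lines that contain no mark of `opponent`."""
    return sum(all(board[i] != opponent for i in line) for line in LINES)


def e(board: str) -> int:
    """e(p) = open lines for MAX (x) minus open lines for MIN (o)."""
    return open_lines(board, "o") - open_lines(board, "x")


print(e("o...x...."))  # x in the centre, o in a corner: 5 - 4 = 1
print(e(".o..x...."))  # x in the centre, o on an edge:  6 - 4 = 2
```

With x in the centre, every line through the centre is closed for MIN, so MIN keeps only the four border lines, while MAX loses three lines to a corner reply but only two to an edge reply.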
Reducing the state space of tic-tac-toe by exploiting the symmetry of the states
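[Figure: two-ply tic-tac-toe game tree after symmetry reduction]
Leaf values e(p) after MIN moves: 1, 0, 1, 0, -1 | 1, 2 | -1, 0, -1, 0, -2
Backed-up values at the MIN nodes: -1, 1, -2
Backed-up value at the root (MAX): 1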
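One way to see how much symmetry reduces the tic-tac-toe state space is to map every board to a canonical representative under the 8 rotations and reflections of the square. The sketch below assumes a 9-character string encoding of the board ("." for empty, read row by row); the function names are hypothetical.

```python
def transforms(board: str):
    """Yield the 8 rotations/reflections of a 9-character board string."""
    def rotate(s):   # quarter turn of the 3x3 grid
        return "".join(s[i] for i in (6, 3, 0, 7, 4, 1, 8, 5, 2))

    def mirror(s):   # left-right reflection
        return "".join(s[i] for i in (2, 1, 0, 5, 4, 3, 8, 7, 6))

    for _ in range(4):
        yield board
        yield mirror(board)
        board = rotate(board)


def canonical(board: str) -> str:
    """Smallest equivalent board, used as the class representative."""
    return min(transforms(board))


def successors(board: str, player: str) -> set:
    """Distinct positions (up to symmetry) after `player` moves."""
    return {canonical(board[:i] + player + board[i + 1:])
            for i in range(9) if board[i] == "."}


first = successors("." * 9, "x")
second = {s for b in first for s in successors(b, "o")}
print(len(first), len(second))  # 3 12
```

Without symmetry there are 9 first moves and 72 positions after two plies; with it there are only 3 (corner, edge, centre) and 12.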
Evaluation function example
Chess complexity
Deterministic games in practice
• Checkers: Chinook ended the 40-year reign of human world champion Marion Tinsley in 1994. It used a precomputed endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 444 billion positions.
• Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue searched 200 million positions per second, used a very sophisticated evaluation function, and applied undisclosed methods for extending some lines of search up to 40 ply.
• Othello: human champions refuse to compete against computers, which are too good.
• Go: human champions refuse to compete against computers, which are too bad. In Go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves.
Nondeterministic games
chance nodes
Backgammon
Expected minimax value
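$$
\text{EXPECTED-MINIMAX-VALUE}(n) =
\begin{cases}
\text{UTILITY}(n) & \text{if } n \text{ is a terminal node} \\
\max_{s \in \text{successors}(n)} \text{EXPECTED-MINIMAX-VALUE}(s) & \text{if } n \text{ is a max node} \\
\min_{s \in \text{successors}(n)} \text{EXPECTED-MINIMAX-VALUE}(s) & \text{if } n \text{ is a min node} \\
\sum_{s \in \text{successors}(n)} P(s) \cdot \text{EXPECTED-MINIMAX-VALUE}(s) & \text{if } n \text{ is a chance node}
\end{cases}
$$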
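The four cases of the expected minimax value translate directly into a recursive function. In this sketch a node is a tagged tuple, `("leaf", utility)`, `("max", [children])`, `("min", [children])` or `("chance", [(probability, child), ...])`; that encoding and the example tree are assumptions made for illustration.

```python
def expectiminimax(node):
    """Expected minimax value of a tagged-tuple game tree (hypothetical encoding)."""
    kind, payload = node
    if kind == "leaf":
        return payload                                   # UTILITY(n)
    if kind == "max":
        return max(expectiminimax(c) for c in payload)
    if kind == "min":
        return min(expectiminimax(c) for c in payload)
    if kind == "chance":
        # Probability-weighted average over the outcomes of the chance event.
        return sum(p * expectiminimax(c) for p, c in payload)
    raise ValueError(f"unknown node kind: {kind!r}")


# MAX chooses between two moves, each followed by a fair coin flip.
tree = ("max", [
    ("chance", [(0.5, ("min", [("leaf", 2), ("leaf", 6)])),
                (0.5, ("leaf", 4))]),
    ("chance", [(0.5, ("leaf", 0)),
                (0.5, ("leaf", 10))]),
])
print(expectiminimax(tree))  # 5.0
```

The first move is worth 0.5 · 2 + 0.5 · 4 = 3.0 and the second 0.5 · 0 + 0.5 · 10 = 5.0, so MAX prefers the second move even though it risks the worst outcome.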
Games of imperfect information