
Part 4. Game Playing

 Games. Why?
 Minimax search
 Alpha-beta pruning

[Image: Garry Kasparov and Deep Blue, 1997]
Why are people interested in games?

Some games:
 Ball games
 Card games
 Board games
 Computer games
 ...
Why study board games?

 One of the oldest sub-fields of AI
 An abstract and pure form of competition that seems to require intelligence
 Easy to represent the states and actions
 Very little world knowledge required!
 Game playing is a special case of a search problem, with some new requirements.
Game playing

 Up till now we have assumed that the situation is not going to change whilst we search. Game playing is not like this:
  The opponent introduces uncertainty.
  The opponent also wants to win.
 Game playing has been studied for a long time:
  Babbage (tic-tac-toe)
  Turing (chess)
Why study board games?

 "A chess (or draughts) playing computer would be proof of a machine doing something thought to require intelligence."
 Game playing research has contributed ideas on how to make the best use of time to reach good decisions when reaching optimal decisions is impossible.
 These ideas are applicable to tackling real-world search problems.
 Our limit here: two-person games, with no chance.
Why new techniques for games?

 The "contingency" problem:
  We don't know the opponent's move!
 The size of the search space:
  Chess: ~15 moves possible per state, 80 ply → 15^80 nodes in the tree
  Go: ~200 moves per state, 300 ply → 200^300 nodes in the tree
 Game playing algorithms:
  Search the tree only up to some depth bound
  Use an evaluation function at the depth bound
  Propagate the evaluation upwards in the tree
Types of games

                        Deterministic                               Chance
Perfect information     Chess, draughts, go, othello, tic-tac-toe   Backgammon, monopoly
Imperfect information                                               Bridge, poker, scrabble
Game Playing - Chess

 Shannon - March 9th 1949 - New York
 Size of the search space: ~10^120 games (an average game lasts 40 moves)
 At 200 million positions/second, it would take 10^100 years to evaluate all possible games
 Searching to depth 40 at one node per microsecond, it would take 10^90 years to make the first move
Game Playing - Chess

 1957: Newell and Simon predicted that a computer would be chess champion within ten years
 Simon: "I was a little far-sighted with chess, but there was no way to do it with machines that were as slow as the ones way back then"
 1958: the first computer to play chess was an IBM 704 - about one millionth the capacity of Deep Blue
 1967: Mac Hack competed successfully in human tournaments
 1983: "Belle" obtained expert status from the United States Chess Federation
 Mid 80s: scientists at Carnegie Mellon University started work on what was to become Deep Blue
 1989: the project moved to IBM
Game Playing - Chess

 May 11th 1997: Garry Kasparov lost a six-game match to Deep Blue, 3.5 to 2.5
  Two wins for Deep Blue, one win for Kasparov and three draws
 Many computer scientists were excited at this development
 But social scientists had their own feelings: a game is not just a game, it also has an emotional side
  Computers are fast, but have no feelings
Game Playing - Checkers (draughts)

 Arthur Samuel - 1952
 Written for an IBM 701
 1954: rewritten for an IBM 704 (10,000 words of main memory)
 Added a learning mechanism that learnt its own evaluation function
 Learnt the evaluation function by playing against itself
 After a few days it could beat its creator...
 ... and compete on equal terms with strong human players
Game Playing - Checkers (draughts)

 Jonathan Schaeffer - 1996
 Developed Chinook
 Uses alpha-beta search
 Plays a perfect end game by means of a database
 In 1992 Chinook won the US Open...
 ... and challenged for the world championship
 Dr Marion Tinsley had been world champion for over 40 years...
 ... losing only three games in all that time
 Against Chinook he suffered his fourth and fifth defeats...
 ... but ultimately won 21.5 to 18.5
Game Playing - Checkers (draughts)

 In August 1994 there was a re-match, but Marion Tinsley withdrew for health reasons
 Chinook became the official world champion
 Schaeffer claimed Chinook was rated at 2814
 The best human players are rated at 2632 and 2625
 Chinook did not include any learning mechanism
Game Playing - Checkers (draughts)

 Kumar - 2000
 "Learnt" how to play a good game of checkers
 The program used a population of games, with the best competing for survival
 Learning was done using a neural network, with the synapses being changed by an evolutionary strategy
 The best program beat a commercial application 6-0
 The program was presented at CEC 2000 (San Diego) and remained undefeated
Game Playing - Minimax

 Game playing: an opponent tries to thwart your every move
 1944: John von Neumann outlined a search method (minimax) that maximises your position whilst minimising your opponent's
 In order to implement it, we need a method of measuring how good a position is
  Often called a utility function (or payoff function)
  e.g. the outcome of a game: win 1, loss -1, draw 0
 Initially this will be a value that describes our position exactly
Game Playing - Minimax

 Restrictions:
  2 players: MAX (the computer) and MIN (the opponent)
  deterministic, perfect information
 Select a depth bound (say: 2) and an evaluation function:
  Construct the tree up to the depth bound
  Compute the evaluation function for the leaves
  Propagate the evaluation function upwards:
   taking minima at MIN nodes
   taking maxima at MAX nodes

[Tree diagram: a MAX root with three MIN children; the leaf groups (2, 5, 3), (1, 4) and (4, 3) give MIN values 2, 1 and 3, and MAX selects the move with value 3]
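The propagation scheme above can be sketched in a few lines of Python. This is a hedged illustration, not the slides' implementation: the dict-based state representation (children lists, leaf values) is an assumption made for the example.

```python
# Minimal minimax sketch for a two-player, deterministic,
# perfect-information game. States are plain dicts here (an assumed
# representation): leaves carry an evaluation value, inner nodes a
# list of child states.

def minimax(state, depth, maximizing):
    """Return the minimax value of `state`, searching `depth` plies."""
    if depth == 0 or not state["children"]:
        return state["value"]  # evaluation function at the depth bound
    values = [minimax(c, depth - 1, not maximizing) for c in state["children"]]
    return max(values) if maximizing else min(values)

# The tree from the slide: MIN values 2, 1, 3; MAX selects 3.
leaf = lambda v: {"value": v, "children": []}
node = lambda *cs: {"value": None, "children": list(cs)}

tree = node(node(leaf(2), leaf(5), leaf(3)),  # MIN takes the minimum -> 2
            node(leaf(1), leaf(4)),           # MIN -> 1
            node(leaf(4), leaf(3)))           # MIN -> 3

print(minimax(tree, 2, True))  # -> 3
```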
Example

[Tree diagram, depth 3: terminal groups (4, -5), (-5, 1), (-7, 2) and (-3, -8) give MAX values D = 4, E = 1, F = 2, G = -3; MIN values B = min(4, 1) = 1 and C = min(2, -3) = -3; the MAX root A selects the move with value 1]
Alpha-Beta Pruning

 A generally applied optimization of minimax.
 Instead of:
  first creating the entire tree (up to the depth bound)
  then doing all the propagation
 Interleave the generation of the tree and the propagation of values.
 Point:
  some of the values obtained in the tree provide information that other (not yet generated) parts are redundant and do not need to be generated.
Alpha-Beta idea:

 Principles:
  generate the tree depth-first, left-to-right
  propagate final values of nodes as initial estimates for their parent node

[Tree diagram: a MAX root with estimate 2; its left MIN child is finalized at 2 from leaves (2, 5); its right MIN child gets initial estimate 1 from its first leaf]

 The MIN value (1) is already smaller than the MAX value of the parent (2)
 The MIN value can only decrease further
 The MAX value is only allowed to increase
 There is no point in computing further below this node
Terminology:

 The (temporary) values at MAX nodes are ALPHA values
 The (temporary) values at MIN nodes are BETA values

[Same tree as before: the MAX root's value 2 is an alpha value; the MIN children's values 2 and 1 are beta values]
The Alpha-Beta principles (1):

 If an ALPHA value is greater than or equal to the BETA value of a descendant node: stop generating the children of that descendant.

[Tree diagram: the root's alpha value 2 meets the right MIN child's beta value 1, so that child's remaining children are not generated]
The Alpha-Beta principles (2):

 If a BETA value is smaller than or equal to the ALPHA value of a descendant node: stop generating the children of that descendant.

[Tree diagram: a MIN node's beta value (2) is less than or equal to the alpha value (3) of one of its descendant MAX nodes, so that descendant's remaining children are not generated]
Example

[Tree diagram: MAX root A; MIN node B <= 6; B's MAX children: D = 6 (from leaves H = 6, I = 5) and E >= 8 (first leaf J = 8, sibling K not yet generated); MIN node C not yet explored]
Alpha-Beta Pruning example

[Tree diagram: A >= 6; B = 6 with MAX children D = 6 (leaves 6, 5) and E >= 8 (leaf 8); C <= 2 with MAX children F = 2 (leaves 2, 1) and G]
Alpha-Beta Pruning example

[Tree diagram: as before, but with C's value settled: A >= 6; B = 6 (D = 6, E >= 8); C = 2 (F = 2, G not generated)]
Alpha-Beta Pruning example

[Tree diagram, final state: A = 6. At E the search is cut off (alpha cutoff: E >= 8 can only increase, while B's beta value 6 can only decrease). At C the search is cut off (beta cutoff: C <= 2 can only decrease, while A's alpha value 6 can only increase), so G is never generated]
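The pruning example above can be sketched as code. This is a hedged illustration reusing the dict-based tree representation assumed earlier; the values of the pruned leaf K and node G are placeholders, since the slides never evaluate them.

```python
# Alpha-beta sketch: `alpha` is the best value MAX can already
# guarantee on the current path, `beta` the best MIN can guarantee;
# once alpha >= beta, the remaining children cannot change the result.

def alphabeta(state, depth, alpha, beta, maximizing):
    if depth == 0 or not state["children"]:
        return state["value"]
    if maximizing:
        value = float("-inf")
        for child in state["children"]:
            value = max(value, alphabeta(child, depth - 1, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:   # cutoff: MIN will never let play reach here
                break
        return value
    else:
        value = float("inf")
        for child in state["children"]:
            value = min(value, alphabeta(child, depth - 1, alpha, beta, True))
            beta = min(beta, value)
            if alpha >= beta:   # cutoff: MAX already has a better option
                break
        return value

leaf = lambda v: {"value": v, "children": []}
node = lambda *cs: {"value": None, "children": list(cs)}

# The slide's tree: A -> (B, C); B -> (D, E); C -> (F, G).
# Leaf K under E and node G carry placeholder value 0: they are pruned.
tree = node(node(node(leaf(6), leaf(5)),   # D = 6
                 node(leaf(8), leaf(0))),  # E >= 8: alpha cutoff, K skipped
            node(node(leaf(2), leaf(1)),   # F = 2, so C <= 2
                 node(leaf(0))))           # G: beta cutoff, never generated
print(alphabeta(tree, 3, float("-inf"), float("inf"), True))  # -> 6
```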
Minimax with alpha-beta at work:

[Worked tree diagram, omitted: alpha-beta traverses a 27-leaf tree left to right, maintaining alpha and beta bounds at each node; only 11 static evaluations of leaf positions are needed, and the root's minimax value is 5]
"Deep" cut-offs

 For game trees with at least 4 MIN/MAX layers, the alpha-beta rules also apply to deeper levels: a bound established high in the tree can cut off search several layers further down.

[Tree diagram: an alpha value of 4 near the root cuts off a node two layers deeper as soon as a descendant's value drops to 2]
The gain - best case:

 If at every layer the best node is the left-most one:

[Tree diagram: only the thick-lined branches are explored]


Example of a perfectly ordered tree (tutorial: try to prune this and see what happens)

[Tree diagram: MAX root = 21; MIN children 21, 12, 3; their MAX children (21, 24, 27), (12, 15, 18), (3, 6, 9); leaf triples (21, 20, 19), (24, 23, 22), (27, 26, 25), (12, 11, 10), (15, 14, 13), (18, 17, 16), (3, 2, 1), (6, 5, 4), (9, 8, 7)]
Best-case gain pictured:

[Chart: number of static evaluations (10 to 100,000 and beyond) versus search depth (1 to 7) for branching factor b = 10; with no pruning the count grows as b^depth, while alpha-beta's best case grows far more slowly]

 Note: logarithmic scale.
 Conclusion: still exponential growth! (In the best case, alpha-beta examines on the order of b^(d/2) nodes instead of b^d, roughly doubling the searchable depth.)
 Worst case? For some trees alpha-beta prunes nothing; for some trees it is impossible to reorder the moves to obtain cut-offs.
The horizon effect

[Diagram: one line loses the queen just inside the depth bound; another line sacrifices a pawn first, pushing the loss of the queen beyond the horizon]

 horizon = the depth bound of minimax
 Because of the depth bound we prefer to delay disasters, although we don't prevent them!
 Solution: heuristic continuations
Heuristic Continuation

In situations that are identified as strategically crucial (e.g. king in danger, imminent piece loss, a pawn about to be promoted to a queen, ...), extend the search beyond the depth bound!

[Diagram: selected lines are searched past the depth bound]
Time bounds:

How do we play within reasonable time bounds? Even with a fixed depth bound, search times can vary strongly!

Solution: iterative deepening!
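A hedged sketch of how iterative deepening fits a time budget, reusing the dict-based tree representation assumed earlier. Here inner nodes carry a crude heuristic value of 0 that serves as the evaluation when the depth bound cuts the search off; a real engine would also check the clock inside the search and discard the unfinished iteration.

```python
import time

def minimax(state, depth, maximizing):
    # Same minimax sketch as before: evaluate at the depth bound or a leaf.
    if depth == 0 or not state["children"]:
        return state["value"]
    vals = [minimax(c, depth - 1, not maximizing) for c in state["children"]]
    return max(vals) if maximizing else min(vals)

def iterative_deepening(state, time_limit_s, max_depth=50):
    """Search to depth 1, 2, 3, ... until the time budget runs out;
    return the value of the deepest fully completed search."""
    deadline = time.monotonic() + time_limit_s
    best = None
    for depth in range(1, max_depth + 1):
        best = minimax(state, depth, True)
        if time.monotonic() >= deadline:
            break
    return best

leaf = lambda v: {"value": v, "children": []}
# Inner nodes take a heuristic estimate (0 here) plus their children.
node = lambda v, *cs: {"value": v, "children": list(cs)}
tree = node(0, node(0, leaf(2), leaf(5)),
               node(0, leaf(1), leaf(4)),
               node(0, leaf(4), leaf(3)))
print(iterative_deepening(tree, 1.0))  # -> 3
```

Each deeper iteration re-searches the shallower levels, but since the tree grows exponentially with depth, the repeated work is a small fraction of the total.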
Summary

 The alpha-beta algorithm computes the same result as minimax, but is more efficient because it prunes irrelevant branches.
 Usually the complete game tree is not expanded: search is cut off at some depth bound, and an evaluation function estimates the utility of a state there.
 So far, for a good and efficient search:
  select a good search method/technique
  provide information/heuristics where possible
  prune irrelevant branches
