6 Game

The document discusses the significance of games in artificial intelligence, particularly focusing on adversarial games like Chess and Go, which serve as compact environments for testing AI strategies against rational opponents. It outlines various types of games, the minimax algorithm for decision-making, and enhancements like alpha-beta pruning to improve efficiency in search. Additionally, it highlights notable achievements in AI game playing, including the victories of computer programs like Deep Blue and Chinook against human champions.

Uploaded by

nihar.dhurde
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
22 views53 pages

6 Game

The document discusses the significance of games in artificial intelligence, particularly focusing on adversarial games like Chess and Go, which serve as compact environments for testing AI strategies against rational opponents. It outlines various types of games, the minimax algorithm for decision-making, and enhancements like alpha-beta pruning to improve efficiency in search. Additionally, it highlights notable achievements in AI game playing, including the victories of computer programs like Deep Blue and Chinook against human champions.

Uploaded by

nihar.dhurde
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 53

Game Playing

Artificial Intelligence
Why Games?
▪ Games like Chess or Go are compact settings that mimic the uncertainty of
interacting with the natural world and with other rational agents
▪ For centuries humans have used them to exercise their intelligence
▪ Recently, there has been great success in building game programs that challenge
human supremacy

▪ Games vs. Search Problems
▪ "Unpredictable" opponent
▪ A solution is a strategy: it specifies a move for every possible opponent reply
▪ Time limits
▪ The search space is generally intractable, so agents cannot enumerate all possible winning moves
▪ Unlikely to reach the goal exactly, so we must approximate
Adversarial Games
Types of Games
▪ Many different kinds of games!

▪ Axes:
▪ Deterministic or stochastic?
▪ One, two, or more players?
▪ Zero sum?
▪ Perfect information (can you see the state)?

▪ Want algorithms for calculating a strategy (policy) which recommends a move from each state
Types of Game
Deterministic Games
▪ Many possible formalizations, one is:
▪ States: S (start at s0)
▪ Players: P={1...N} (usually take turns)
▪ Actions: A (may depend on player / state)
▪ Transition Function: S × A → S
▪ Terminal Test: S → {t, f}
▪ Terminal Utilities: S × P → ℝ

▪ Solution for a player is a policy: S → A
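The formalization above can be sketched as a handful of Python functions. This is a minimal toy game invented for illustration (not from the slides): two players alternate picking 1 or 2, the game ends after two moves, and player 1's utility is the sum of the picks, negated for player 2 so the game is zero-sum.

```python
# Minimal sketch of the game formalization: States, Players, Actions,
# Transition Function S x A -> S, Terminal Test S -> {t, f},
# Terminal Utilities S x P -> R. The game itself is a toy example.

def initial_state():
    return (1, ())            # (player to move, moves made so far)

def player(state):            # whose turn is it in this state?
    return state[0]

def actions(state):           # A (same in every state here)
    return [1, 2]

def result(state, action):    # transition function S x A -> S
    p, moves = state
    return (2 if p == 1 else 1, moves + (action,))

def terminal_test(state):     # S -> {t, f}: game ends after two moves
    return len(state[1]) == 2

def utility(state, p):        # S x P -> R, zero-sum
    total = sum(state[1])
    return total if p == 1 else -total
```

A solution for a player would then be a policy mapping each state to an action, e.g. `policy(state) = max(actions(state))` for player 1.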


Zero-Sum Games

▪ Zero-Sum Games
▪ Agents have opposite utilities (values on outcomes)
▪ Lets us think of a single value that one agent maximizes and the other minimizes
▪ Adversarial, pure competition

▪ General Games
▪ Agents have independent utilities (values on outcomes)
▪ Cooperation, indifference, competition, and more are all possible
Adversarial Search
Single-Agent Trees

(figure: single-agent search tree with leaf utilities 2 0 … 2 6 … 4 6)
Value of a State
▪ Value of a state: the best achievable outcome (utility) from that state
▪ Terminal states: the value is the known utility
▪ Non-terminal states: the value is the best value among the successor states

(figure: single-agent tree with leaf utilities 2 0 … 2 6 … 4 6)
Adversarial Game Trees

(figure: adversarial game tree with terminal utilities -20 -8 … -18 -5 … -10 +4 -20 +8)

Minimax Values
▪ States under the agent's control: the value is the maximum over the successors' values
▪ States under the opponent's control: the value is the minimum over the successors' values
▪ Terminal states: the value is the known utility

(figure: minimax tree with backed-up values -8, -5, -10, +8)
Tic-Tac-Toe Game Tree
Tic-tac-toe has at most 9! = 362,880 terminal nodes (fewer in practice, since many games end before the board fills)
Adversarial Search (Minimax)
▪ Deterministic, zero-sum games:
▪ Tic-tac-toe, chess, checkers
▪ One player maximizes the result
▪ The other minimizes the result

▪ Minimax search:
▪ A state-space search tree
▪ Players alternate turns
▪ Compute each node's minimax value: the best achievable utility against a rational (optimal) adversary
▪ Minimax values are computed recursively; terminal values are part of the game definition

(figure: minimax tree with root value 5 at a max node, min-node values 2 and 5, and terminal values 8, 2, 5, 6)
Minimax algorithm
▪ Minimax algorithm
▪ Perfect play for deterministic, 2-player games
▪ MAX tries to maximize its score
▪ MIN tries to minimize MAX's score
▪ Goal: MAX moves to the position with the highest minimax value
→ identify the best achievable payoff against best play
Minimax Implementation

def max-value(state):
    initialize v = -∞
    for each successor of state:
        v = max(v, min-value(successor))
    return v

def min-value(state):
    initialize v = +∞
    for each successor of state:
        v = min(v, max-value(successor))
    return v
Minimax Implementation (Dispatch)
def value(state):
if the state is a terminal state: return the state’s utility
if the next agent is MAX: return max-value(state)
if the next agent is MIN: return min-value(state)

def max-value(state):
    initialize v = -∞
    for each successor of state:
        v = max(v, value(successor))
    return v

def min-value(state):
    initialize v = +∞
    for each successor of state:
        v = min(v, value(successor))
    return v
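The dispatch pseudocode above can be made runnable with a minimal tree encoding (an assumption for this sketch: a plain number is a terminal utility, a list is the set of successors, and MAX and MIN strictly alternate):

```python
# Runnable sketch of dispatch-style minimax on a hand-built tree.
# A tree is either a number (terminal utility) or a list of subtrees.

def value(tree, is_max):
    if isinstance(tree, (int, float)):   # terminal state: return its utility
        return tree
    return max_value(tree) if is_max else min_value(tree)

def max_value(tree):
    v = float('-inf')
    for successor in tree:
        v = max(v, value(successor, is_max=False))
    return v

def min_value(tree):
    v = float('inf')
    for successor in tree:
        v = min(v, value(successor, is_max=True))
    return v

# The example tree used in these slides: three MIN nodes over the leaf
# groups [3, 12, 8], [2, 4, 6], [14, 5, 2]; the root is a MAX node.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(value(tree, is_max=True))  # -> 3
```

The MIN nodes back up 3, 2, and 2, and the MAX root picks 3.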
Minimax Example

(figure: minimax example tree — min-node values 3, 2, 2 over terminal values 3 12 8 | 2 4 6 | 14 5 2; root value 3)
Poll
What kind of search is Minimax Search?

A) BFS
B) DFS
C) UCS
D) A*
Minimax Example
Generalized minimax
▪ What if the game is not zero-sum, or has multiple players?

▪ Generalization of minimax:
▪ Terminals have utility tuples
▪ Node values are also utility tuples
▪ Each player maximizes its own component
▪ Can give rise to cooperation and competition dynamically…

(figure: three-player tree with terminal utility tuples 1,1,6  0,0,7  9,9,0  8,8,1  9,9,0  7,7,2  0,0,8  0,0,7; backed-up values 0,0,7  8,8,1  7,7,2  0,0,8 at the lowest internal level, then 8,8,1 and 7,7,2, and root value 8,8,1)
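The generalized scheme above can be sketched as a short recursion (each internal node belongs to one player, who picks the child whose value tuple is best in that player's own component; the tree below reproduces the slide's three-player example):

```python
# Sketch of generalized minimax with utility tuples: at each internal node
# the player to move picks the child maximizing their own tuple component.
# Player assignment alternates by depth (an assumption of this sketch).

def maxn(tree, num_players, player=0):
    """tree is a utility tuple (terminal) or a list of subtrees."""
    if isinstance(tree, tuple):
        return tree
    children = [maxn(sub, num_players, (player + 1) % num_players)
                for sub in tree]
    return max(children, key=lambda utilities: utilities[player])

# The slide's three-player tree: player 0 at the root, player 1 below,
# player 2 above the leaves.
tree = [
    [[(1, 1, 6), (0, 0, 7)], [(9, 9, 0), (8, 8, 1)]],
    [[(9, 9, 0), (7, 7, 2)], [(0, 0, 8), (0, 0, 7)]],
]
print(maxn(tree, 3))  # -> (8, 8, 1), matching the slide's root value
```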
Minimax Efficiency
▪ How efficient is minimax?
▪ Just like (exhaustive) DFS
▪ Time: O(b^m)
▪ Space: O(bm)

▪ Minimax solution
▪ Using the current state as the initial state, build the game tree uniformly to the maximal depth m (called the horizon) feasible within the time limit
▪ Evaluate the states of the leaf nodes
▪ Back up the results from the leaves to the root and pick the best action assuming the best play by MIN (worst for MAX)

▪ Example: for chess, b ≈ 35, m ≈ 100
▪ Exact solution is completely infeasible
▪ But do we need to explore the whole tree?
Resource Limits
▪ Problem: in realistic games, we cannot search to the leaves!

▪ Solution 1: bounded lookahead
▪ Search only to a preset depth limit or horizon
▪ Use an evaluation function for non-terminal positions
▪ The guarantee of optimal play is gone

▪ More plies make a BIG difference

▪ Example:
▪ Suppose we have 100 seconds and can explore 10K nodes/sec
▪ So we can check about 1M nodes per move
▪ For chess, b ≈ 35, so that reaches only about depth 4 – not so good

(figure: depth-limited tree with root max value 4, min values -2 and 4, and evaluated non-terminal positions -1, -2, 4, 9)
Depth Matters
▪ Evaluation functions are always imperfect
▪ Deeper search => better play (usually)
▪ Or, deeper search gives the same quality of play with a less accurate evaluation function
▪ An important example of the tradeoff between complexity of features and complexity of computation
Evaluation Functions
Evaluation Function
▪ Function E: state s → number E(s)
▪ E(s) is a heuristic: it estimates how favorable s is for MAX
▪ E(s) > 0 → s is favorable to MAX (the larger the better)
▪ E(s) < 0 → s is favorable to MIN
▪ E(s) = 0 → s is neutral
▪ Features may include:
▪ Number of pieces of each type
▪ Number of possible moves
▪ Number of squares controlled
▪ Why use backed-up values?
▪ At each non-leaf node n, the backed-up value is the value of the best state that MAX can reach at depth m if MIN plays well (by the same criterion as MAX applies to itself)
▪ If E is to be trusted in the first place, then the backed-up value is a better estimate of how favorable STATE(n) is than E(STATE(n))
Evaluation Functions
▪ Evaluation functions score non-terminals in depth-limited search

▪ Ideal function: returns the actual minimax value of the position

▪ In practice: typically a weighted linear sum of features:
  Eval(s) = w1 f1(s) + w2 f2(s) + … + wn fn(s)
▪ e.g. f1(s) = (num white queens – num black queens), etc.
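A weighted linear sum of features can be sketched in a few lines; the state representation, features, and weights below are illustrative assumptions (the weights are the conventional material values for queens and pawns):

```python
# Sketch of a weighted linear evaluation: Eval(s) = sum_i w_i * f_i(s).

def linear_eval(state, features, weights):
    return sum(w * f(state) for f, w in zip(features, weights))

# Hypothetical state: piece counts for White ('w') and Black ('b').
state = {'wQ': 1, 'bQ': 0, 'wP': 6, 'bP': 7}

features = [
    lambda s: s['wQ'] - s['bQ'],   # f1: queen difference
    lambda s: s['wP'] - s['bP'],   # f2: pawn difference
]
weights = [9.0, 1.0]               # conventional material weights

print(linear_eval(state, features, weights))  # -> 8.0
```

Positive scores favor MAX (White here); 9·(1) + 1·(−1) = 8.0.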


Example: A Heuristic for Tic-Tac-Toe

▪ Heuristic: E(n) = M(n) – O(n)
▪ M(n) = total possible winning lines for MAX
▪ O(n) = total possible winning lines for the opponent
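The heuristic above can be computed directly: a line is still a possible winning line for a player if the opponent has no mark on it. A minimal sketch (board encoding is an assumption: a 9-cell list, 'X' for MAX, 'O' for MIN, '.' for empty):

```python
# E(n) = M(n) - O(n): open winning lines for MAX minus those for MIN.

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def open_lines(board, player):
    opponent = 'O' if player == 'X' else 'X'
    return sum(1 for line in LINES
               if all(board[i] != opponent for i in line))

def E(board):                      # MAX plays 'X'
    return open_lines(board, 'X') - open_lines(board, 'O')

# X alone in the center: all 8 lines stay open for X, while O can only
# still win on the 4 lines avoiding the center, so E = 8 - 4 = 4.
board = list('....X....')
print(E(board))  # -> 4
```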
Example: 2-ply Minimax on Tic-Tac-Toe
The tic-tac-toe tree at horizon 2 shows the backed-up minimax values
Example: Tic-Tac-Toe (cont.)
2-ply minimax applied to one of two possible MAX second moves
Example: Tic-Tac-Toe (cont.)
2-ply minimax applied to X's move near the end of the game
Can we do better?
▪ Yes! Much better!
▪ Pruning: suppose the left subtree has backed up the value 3 to the root, and the second MIN node has already seen a leaf with value -1. That MIN node's value is ≤ -1, so the rest of its subtree can't have any effect on the value that will be backed up to the root.

(figure: root value ≥ 3; left MIN child = 3; right MIN child ≤ -1, with its remaining subtree pruned)
Game Tree Pruning
Intuition: prune the branches that can't be chosen

(figure: the minimax example tree — min values 3, 2, 2 over leaves 3 12 8 | 2 4 6 | 14 5 2)
Alpha-Beta Pruning Example
α = best option so far from any MAX node on this path

(figure: alpha-beta trace — the first MIN child backs up β = 3, so the root has α = 3; the second MIN child stops at β = 2; the third updates β = 14, then β = 5, and finally returns 2; leaves 3 12 8 2 14 5 2 are visited)

▪ We can prune when the current MIN node's value can't be higher than 2 while the parent MAX node has already seen something larger in another branch
▪ The order of generation matters: more pruning is possible if good moves come first
Alpha-Beta Implementation

α: MAX’s best option on the path to the root
β: MIN’s best option on the path to the root

def max-value(state, α, β):
    initialize v = -∞
    for each successor of state:
        v = max(v, value(successor, α, β))
        if v ≥ β return v
        α = max(α, v)
    return v

def min-value(state, α, β):
    initialize v = +∞
    for each successor of state:
        v = min(v, value(successor, α, β))
        if v ≤ α return v
        β = min(β, v)
    return v
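The alpha-beta pseudocode above becomes runnable with the same tree encoding as before (a number is a terminal utility, a list is the set of successors); the visit log shows exactly which leaves get pruned on the slides' example tree:

```python
# Runnable sketch of alpha-beta, instrumented to record evaluated leaves.

visited = []

def value(tree, alpha, beta, is_max):
    if isinstance(tree, (int, float)):     # terminal state
        visited.append(tree)
        return tree
    return (max_value if is_max else min_value)(tree, alpha, beta)

def max_value(tree, alpha, beta):
    v = float('-inf')
    for successor in tree:
        v = max(v, value(successor, alpha, beta, is_max=False))
        if v >= beta:
            return v                       # cutoff: MIN above won't allow this
        alpha = max(alpha, v)
    return v

def min_value(tree, alpha, beta):
    v = float('inf')
    for successor in tree:
        v = min(v, value(successor, alpha, beta, is_max=True))
        if v <= alpha:
            return v                       # cutoff: MAX above won't pick this
        beta = min(beta, v)
    return v

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
root = value(tree, float('-inf'), float('inf'), is_max=True)
print(root)     # -> 3
print(visited)  # leaves 4 and 6 were pruned: [3, 12, 8, 2, 14, 5, 2]
```

The second MIN node returns as soon as leaf 2 shows its value cannot exceed the root's α = 3, so leaves 4 and 6 are never evaluated.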
Alpha-Beta Implementation

Node Type                     | LEAF    | MAX   | MIN
Update                        | –       | α     | β
Return Value                  | Utility | α     | β
Condition                     | –       | α ≥ β | α ≥ β
If Condition True Then Return | –       | β     | α
Alpha-Beta Pruning Example
α = best option so far from any MAX node on this path (default -∞)
β = best option so far from any MIN node on this path (default +∞)

(figure: alpha-beta trace on the example tree — the root starts with α = -∞, β = +∞; the first MIN child sets β = 3 and returns 3, so the root sets α = 3; the second MIN child sets β = 2, the cutoff α ≥ β triggers, and it returns α; the third MIN child sets β = 14, then β = 5, and after seeing leaf 2 the cutoff α ≥ β triggers again; leaves 3 12 8 2 14 5 2 are visited, leaves 4 and 6 are pruned)

Minimax Quiz
What is the value of the top node?
A) 10
B) 100
C) 2
D) 4
Alpha Beta Quiz
Which branches are pruned?
A) e, l
B) g, l
C) g, k, l
D) g, n
For Practice
Visit this Website: https://pascscha.ch/info2/abTreePractice/
Alpha-Beta Pruning Properties

(figure: a max node over min nodes with leaves 10, 10, 0)
Alpha-Beta Pruning Properties
▪ This pruning has no effect on the minimax value computed for the root!

▪ Values of intermediate nodes might be wrong
▪ Important: children of the root may have the wrong value
▪ So the most naïve version won't let you do action selection

▪ Good child ordering improves the effectiveness of pruning

▪ With "perfect ordering":
▪ Time complexity drops to O(b^(m/2))
▪ Doubles the solvable depth!
▪ Full search of, e.g., chess is still hopeless…

▪ This is a simple example of metareasoning (computing about what to compute)

(figure: a max node over min nodes with leaves 10, 10, 0)

Additional Refinements
▪ Waiting for quiescence: continue the search until no drastic change occurs from one level to the next.

▪ Secondary search: after choosing a move, search a few more levels beneath it to be sure it still looks good.

▪ Book moves: for some parts of the game (especially opening and end moves), keep a catalog of the best moves to make.
Checkers: Tinsley vs. Chinook

Name: Marion Tinsley
Profession: taught mathematics
Hobby: checkers
Record: over 42 years he lost only 3 games of checkers
World champion for over 40 years

1994: Mr. Tinsley suffered his 4th and 5th losses, against Chinook
Chinook
▪ First computer to become official world champion of Checkers!
Chess: Kasparov vs. Deep Blue

                  Kasparov             Deep Blue
Height            5'10"                6'5"
Weight            176 lbs              2,400 lbs
Age               34 years             4 years
Computers         50 billion neurons   32 RISC processors + 256 VLSI chess engines
Speed             2 pos/sec            200,000,000 pos/sec
Knowledge         Extensive            Primitive
Power Source      Electrical/chemical  Electrical
Ego               Enormous             None

1997: Deep Blue wins by 3 wins, 1 loss, and 2 draws
Chess: Kasparov vs. Deep Junior

Deep Junior
▪ 8 CPUs, 8 GB RAM, Win 2000
▪ 2,000,000 pos/sec
▪ Available at $100

August 2, 2003: match ends in a 3/3 tie!
Go: Goemate vs. ??
Name: Chen Zhixing
Profession: retired
Computer skills: self-taught programmer
Author of Goemate (arguably the best Go program available today)

Gave Goemate a 9-stone handicap and still easily beat the program, thereby winning $15,000

Go has too high a branching factor for existing search techniques. Current and future software must rely on huge databases and pattern-recognition techniques.
Game Playing State-of-the-Art
▪ Checkers: 1950: first computer player. 1994: first computer champion: Chinook ended the 40-year reign of human champion Marion Tinsley using a complete 8-piece endgame database. 2007: Checkers solved!

▪ Chess: 1997: Deep Blue defeats human champion Garry Kasparov in a six-game match. Deep Blue examined 200M positions per second and used very sophisticated evaluation and undisclosed methods for extending some lines of search up to 40 ply. Current programs are even better, if less historic.

▪ Go: human champions are now starting to be challenged by machines. In Go, b > 300! Classic programs use pattern knowledge bases, but big recent advances use Monte Carlo (randomized) expansion methods.
▪ Go: 2016: AlphaGo defeats the human champion. Uses Monte Carlo Tree Search and a learned evaluation function.

▪ Pacman
