
CS 5/7320 Artificial Intelligence

Adversarial Search and Games
AIMA Chapter 5

Slides by Michael Hahsler, with figures from the AIMA textbook.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. "Reflected Chess pieces" by Adrian Askew.
Games
• Games typically confront the agent with a competitive (adversarial) environment affected by an opponent (strategic environment).
• Games are episodic.
• We will focus on planning for
  • two-player zero-sum games with
  • deterministic game mechanics and
  • perfect information (i.e., a fully observable environment).
• We call the two players:
  1) Max, who tries to maximize his utility.
  2) Min, who tries to minimize Max's utility since it is a zero-sum game.
Definition of a Game
• Definition:
  𝑠0 … the initial state (position, board, hand).
  𝐴𝑐𝑡𝑖𝑜𝑛𝑠(𝑠) … legal moves in state 𝑠.
  𝑅𝑒𝑠𝑢𝑙𝑡(𝑠, 𝑎) … transition model.
  𝑇𝑒𝑟𝑚𝑖𝑛𝑎𝑙(𝑠) … test for terminal states.
  𝑈𝑡𝑖𝑙𝑖𝑡𝑦(𝑠) … utility for player Max for terminal states.
• State space: a graph defined by the initial state and the transition function; it contains all reachable states (e.g., chess positions).
• Game tree: a search tree superimposed on the state space. A complete game tree follows every sequence of moves from the current state to a terminal state (where the game ends).
Example: Tic-tac-toe

𝑠0 … empty board.
𝐴𝑐𝑡𝑖𝑜𝑛𝑠(𝑠) … play any empty square.
𝑅𝑒𝑠𝑢𝑙𝑡(𝑠, 𝑎) … the player's symbol (x/o) is placed on the chosen empty square.
𝑇𝑒𝑟𝑚𝑖𝑛𝑎𝑙(𝑠) … did a player win, or is the game a draw?
𝑈𝑡𝑖𝑙𝑖𝑡𝑦(𝑠) … +1 if x wins, -1 if o wins, and 0 for a draw. Utility is only defined for terminal states.

Here player x is Max and player o is Min.
Note: This game still uses a goal-based agent that plans actions to reach a winning terminal state!
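To make this formulation concrete, here is a minimal Python sketch of the tic-tac-toe definition above. This is my own illustration under the stated definitions (not code from the slides or the AIMA repository); the board encoding and the helper names WIN_LINES and to_move are assumptions. The later code sketches in this deck reuse this small interface.

```python
# Minimal sketch of the game definition for tic-tac-toe (assumed encoding):
# the board is a tuple of 9 cells holding 'x', 'o', or None; 'x' (Max) moves first.

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]

def initial_state():
    return (None,) * 9                       # s0: empty board

def to_move(s):
    return 'x' if s.count('x') == s.count('o') else 'o'

def actions(s):
    return [i for i, cell in enumerate(s) if cell is None]   # play any empty square

def result(s, a):
    board = list(s)
    board[a] = to_move(s)                    # place the mover's symbol
    return tuple(board)

def winner(s):
    for i, j, k in WIN_LINES:
        if s[i] is not None and s[i] == s[j] == s[k]:
            return s[i]
    return None

def terminal(s):
    return winner(s) is not None or all(cell is not None for cell in s)

def utility(s):
    # Only meaningful for terminal states: +1 if x (Max) wins, -1 if o wins, 0 for a draw.
    return {None: 0, 'x': +1, 'o': -1}[winner(s)]
```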
Tic-tac-toe: Partial Game Tree

[Figure: partial game tree for tic-tac-toe showing states (nodes), actions (results), and redundant paths; the number of nodes per level grows as 1, 9, 9×8, …]

Note: This game tree has no cycles.

The state space size (number of possible boards) is much smaller than
3⁹ = 19,683 states.

Terminal states have a known utility. However, the complete game tree is much larger than the state space because the same state (board) can be reached in different subtrees (redundant paths). The game tree here is a little smaller than
9 + 9×8 + 9×8×7 + ⋯ + 9! = 986,409 nodes.
Methods for Adversarial Games

Exact Methods
• Model as nondeterministic actions: The opponent is seen as part of an environment with nondeterministic actions. Non-determinism is the result of the unknown moves by the opponent. We consider all possible moves by the opponent.
• Find optimal decisions: Minimax search and Alpha-Beta pruning, where each player plays optimally to the end of the game.

Heuristic Methods (game tree is too large or the search takes too long)
• Heuristic Alpha-Beta Tree Search:
  a. Cut off the game tree and use a heuristic for utility.
  b. Forward Pruning: ignore poor moves.
• Monte Carlo Tree Search: Estimate the utility of a state by simulating complete games and averaging the utility.
Nondeterministic Actions

Recall AND-OR Search from AIMA Chapter 4


Recall: Nondeterministic Actions
For planning, we do not know what the opponent's moves will be. We have already modeled this issue using nondeterministic actions.

The outcome of actions in the environment is nondeterministic, so the transition model needs to describe the uncertainty about the opponent's behavior. Each action consists of the move by the player and all possible (i.e., nondeterministic) responses by the opponent.

Example transition:
𝑅𝑒𝑠𝑢𝑙𝑡𝑠(𝑠1, 𝑎) = {𝑠2, 𝑠4, 𝑠5}
i.e., action 𝑎 in 𝑠1 can lead to one of several states (which is called a belief state of the agent).
Recall: AND-OR DFS Search Algorithm

[Figure: the AND-OR graph search pseudocode from AIMA Chapter 4, annotated. The returned conditional plan corresponds to nested if-then-else statements.]

Annotations on the algorithm:
• OR search: check all possible actions (my moves); don't follow loops.
• AND search: go through all states that can result from the opponent's moves; abandon the subtree if a loss is found.
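Since the pseudocode figure is not reproduced here, the following is a hedged Python sketch of AND-OR depth-first search adapted to a game, assuming the tic-tac-toe interface above. The results(s, a) helper (the set of states the opponent's replies can lead to) and the win-only success criterion follow the annotations, but this is not the textbook's exact algorithm.

```python
def results(s, a):
    # Belief state: all states the opponent's replies to action a can lead to.
    s_after = result(s, a)
    if terminal(s_after):
        return [s_after]
    return [result(s_after, b) for b in actions(s_after)]

def or_search(s, path=()):
    # OR node: we pick one action; don't follow loops.
    if terminal(s):
        return 'win' if utility(s) == +1 else None
    if s in path:
        return None
    for a in actions(s):                     # check all possible actions (my moves)
        if and_search(results(s, a), path + (s,)):
            return a                         # first action of a winning conditional plan
    return None

def and_search(states, path):
    # AND node: every state the opponent's moves can produce must still be winnable;
    # abandon the subtree as soon as one of them cannot be won.
    return all(or_search(s, path) is not None for s in states)
```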
Tic-tac-toe: AND-OR Search

We play MAX and decide on our actions (OR). MIN's actions introduce non-determinism (AND).

[Figure: AND-OR tree for tic-tac-toe with alternating OR (MAX) and AND (MIN) levels at depths (plies) 0-3.]

Pick an action that leads to a subtree with only win leaves.

Objective: Find a subtree that has only win leaf nodes (utility +1). We can abandon a subtree if we find a single loss (utility -1).

We call always playing the best move playing optimally. Since we consider all of the opponent's moves in the AND stage, we also include MIN's best move. This means we consider MIN playing optimally.
Optimal Decisions
Minimax Search and Alpha-Beta Pruning
Idea: Minimax Decision
• Assign each state 𝑠 a minimax value that reflects the utility realized if both players play optimally from 𝑠 to the end of the game:

  𝑀𝑖𝑛𝑖𝑚𝑎𝑥(𝑠) =
    𝑈𝑡𝑖𝑙𝑖𝑡𝑦(𝑠)                                       if 𝑡𝑒𝑟𝑚𝑖𝑛𝑎𝑙(𝑠)
    max_{𝑎 ∈ 𝐴𝑐𝑡𝑖𝑜𝑛𝑠(𝑠)} 𝑀𝑖𝑛𝑖𝑚𝑎𝑥(𝑅𝑒𝑠𝑢𝑙𝑡(𝑠, 𝑎))       if 𝑚𝑜𝑣𝑒 = 𝑀𝑎𝑥
    min_{𝑎 ∈ 𝐴𝑐𝑡𝑖𝑜𝑛𝑠(𝑠)} 𝑀𝑖𝑛𝑖𝑚𝑎𝑥(𝑅𝑒𝑠𝑢𝑙𝑡(𝑠, 𝑎))       if 𝑚𝑜𝑣𝑒 = 𝑀𝑖𝑛

• This is a recursive definition which can be solved from the terminal states backwards.
• The optimal decision for Max is the action that leads to the state with the largest minimax value, that is, the largest possible utility if both players play optimally.
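The recursive definition translates almost directly into code. A minimal sketch under the assumed tic-tac-toe interface from above:

```python
# Minimax search (depth-first, to terminal states). max_value/min_value back up
# minimax values; minimax_decision returns the optimal action for Max.

def max_value(s):
    if terminal(s):
        return utility(s)
    return max(min_value(result(s, a)) for a in actions(s))

def min_value(s):
    if terminal(s):
        return utility(s)
    return min(max_value(result(s, a)) for a in actions(s))

def minimax_decision(s):
    # Assumes it is Max's ('x') turn in state s.
    return max(actions(s), key=lambda a: min_value(result(s, a)))
```

For example, minimax_decision(initial_state()) would return an optimal opening move for x, though it searches the full game tree to do so.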
Minimax Search: Back-up Minimax Values

[Figure: a game tree with alternating max and min levels; each node is annotated with its minimax value (MV). The max levels represent the OR search (find the action that leads to the best value); the min levels represent the AND search.]

Pick the action that leads to the largest MV.

Determine the MVs using a bottom-up strategy:
• Max always picks the action that has the largest value.
• Min always picks the action that has the smallest value.

Approach: Follow the tree to each terminal node and back up the minimax values.

Note: This is just a generalization of the AND-OR Tree Search and returns the first action of the conditional plan.
Exercise: Simple 2-Ply Game

[Figure: a two-ply game tree. Max chooses among actions 𝑎1, 𝑎2, 𝑎3; at each resulting Min node, Min chooses among actions 𝑎1, 𝑎2, 𝑎3. The terminal utilities for Max are, left to right: 2, 0, 5, -5, -2, 7, 5, -7, 4. A row "Utility for Min" is left blank to be filled in.]

• What are the terminal state utilities for Min?
• Compute all MVs (minimax values).
• How do we traverse the game tree? What is the Big-O notation for time and space?
  (b: max branching factor, m: max depth of the tree)
• What is the optimal action for Max?
Issue: Game Tree Size
• Minimax search traverses the complete game tree using DFS!
  Space complexity: 𝑂(𝑏𝑚)
  Time complexity: 𝑂(𝑏^𝑚)
  (b: max branching factor, m: max depth of the tree)

• A fast solution is only feasible for very simple games with a small branching factor!

• Example: Tic-tac-toe
  𝑏 = 9, 𝑚 = 9 → 𝑂(9⁹) = 𝑂(387,420,489)
  Since 𝑏 decreases from 9 to 8, 7, …, the actual size is smaller:
  9 + 9×8 + 9×8×7 + ⋯ + 9! = 986,409 nodes

• We need to reduce the search space! → Game tree pruning

Alpha-Beta Pruning
• Idea: Do not search parts of the tree if they do not make a difference to the outcome.

• Observations:
  • min(3, 𝑥, 𝑦) can never be more than 3.
  • max(5, min(3, 𝑥, 𝑦, …)) does not depend on the values of 𝑥 or 𝑦.
  • Minimax search applies alternating min and max.

• Approach: maintain bounds [𝛼, 𝛽] for the minimax value and prune subtrees (i.e., don't follow actions) that do not affect the current minimax value bounds.
  • Alpha is used by Max and means "𝑀𝑖𝑛𝑖𝑚𝑎𝑥(𝑠) is at least 𝛼."
  • Beta is used by Min and means "𝑀𝑖𝑛𝑖𝑚𝑎𝑥(𝑠) is at most 𝛽."
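A sketch of alpha-beta search under the same assumed interface: 𝛼 tracks the value Max can already guarantee on the current path and 𝛽 the value Min can already guarantee; branches that fall outside [𝛼, 𝛽] are pruned.

```python
import math

def ab_max_value(s, alpha, beta):
    if terminal(s):
        return utility(s)
    v = -math.inf
    for a in actions(s):
        v = max(v, ab_min_value(result(s, a), alpha, beta))
        if v >= beta:          # Min already has a better option elsewhere
            return v           # prune the remaining actions
        alpha = max(alpha, v)
    return v

def ab_min_value(s, alpha, beta):
    if terminal(s):
        return utility(s)
    v = math.inf
    for a in actions(s):
        v = min(v, ab_max_value(result(s, a), alpha, beta))
        if v <= alpha:         # Max already has a better option elsewhere
            return v           # prune the remaining actions
        beta = min(beta, v)
    return v

def alpha_beta_decision(s):
    # Assumes it is Max's turn in state s.
    return max(actions(s),
               key=lambda a: ab_min_value(result(s, a), -math.inf, math.inf))
```

Alpha-beta returns the same decision and value as plain minimax; it only avoids exploring subtrees that cannot change them.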
Example: Alpha-Beta Search

[Figure: a game tree annotated with [𝛼, 𝛽] intervals. Max updates 𝛼 ("the utility is at least 𝛼"); Min updates 𝛽 ("the utility is at most 𝛽"). The first subtree is fully evaluated with 𝑣 = 3, so Max's bound becomes [3, +∞]. In the second subtree, Min finds a move with value 𝑣 = 2, so 𝑣 ≤ 2 there: the utility of that subtree cannot be more than 2, but we can already get 3 from the first subtree, so the rest of that subtree is pruned. Once a subtree is fully evaluated, its interval has length 0 (𝛼 = 𝛽).]

Alpha-Beta search = minimax search + pruning (𝑣 is the backed-up minimax value):
• Found a better action? Abandon a subtree if Max finds an action that has more value than the best-known move Min has in another subtree.
• Found a better action? Abandon a subtree if Min finds an action that has less value than the best-known move Max has in another subtree.
Move Ordering for Alpha-Beta Search
• Idea: Pruning is more effective if good alpha-beta bounds can be found in the first few checked subtrees.

• Move ordering for DFS = check good moves for Min and Max first.

• We need expert knowledge or some heuristic to determine what a good move is.

• Issue: Optimal decision algorithms still scale poorly even when using alpha-beta pruning with move ordering.
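As a tiny illustration (an assumed expert heuristic for tic-tac-toe, not from the slides): prefer the center, then the corners, then the edges, and feed the sorted actions to alpha-beta search so that good bounds are found early.

```python
# Assumed move-ordering heuristic for tic-tac-toe: center > corners > edges.
SQUARE_PRIORITY = {4: 2, 0: 1, 2: 1, 6: 1, 8: 1, 1: 0, 3: 0, 5: 0, 7: 0}

def ordered_actions(s):
    # Use in place of actions(s) inside alpha-beta search (best squares first).
    return sorted(actions(s), key=lambda a: SQUARE_PRIORITY[a], reverse=True)
```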
Exercise: Simple 2-Ply Game

[Figure: a two-ply game tree with an [𝛼, 𝛽] interval at the Max root and at each of the three Min nodes reached by actions 𝑎1, 𝑎2, 𝑎3; each Min node again has actions 𝑎1, 𝑎2, 𝑎3. The terminal utilities for Max are, left to right: 2, -5, 5, 7, 0, 2, 5, -7, -4.]

• Find the [𝛼, 𝛽] intervals for all nodes.
• What part of the tree can be pruned?
• What would be the optimal move ordering?
Heuristic Alpha-Beta Tree Search
Cutting off search
Reduce the search cost by restricting the search depth:
1. Stop the search at a non-terminal node.
2. Use a heuristic evaluation function 𝐸𝑣𝑎𝑙(𝑠) to approximate the utility for that node/state.

Needed properties of the evaluation function:
▪ Fast to compute.
▪ 𝐸𝑣𝑎𝑙(𝑠) ∈ [𝑈𝑡𝑖𝑙𝑖𝑡𝑦(𝑙𝑜𝑠𝑠), 𝑈𝑡𝑖𝑙𝑖𝑡𝑦(𝑤𝑖𝑛)]
▪ Correlated with the actual chance of winning (e.g., using features of the state).

Examples:
1. A weighted linear function
   𝐸𝑣𝑎𝑙(𝑠) = 𝑤1𝑓1(𝑠) + 𝑤2𝑓2(𝑠) + ⋯ + 𝑤𝑛𝑓𝑛(𝑠)
   where 𝑓𝑖 is a feature of the state (e.g., # of pieces captured in chess).
2. A deep neural network trained on complete games.
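A sketch of heuristic alpha-beta search with a depth cutoff ("look ahead"). The evaluation function below is an assumed weighted linear feature function for tic-tac-toe (win lines still open for x minus win lines still open for o), included only so the example is self-contained; it is not from the slides.

```python
import math

def eval_state(s):
    # Weighted linear evaluation: f1 = win lines still open for x, f2 = for o.
    open_x = sum(1 for line in WIN_LINES if all(s[i] != 'o' for i in line))
    open_o = sum(1 for line in WIN_LINES if all(s[i] != 'x' for i in line))
    return 0.1 * open_x - 0.1 * open_o       # stays inside (Utility(loss), Utility(win))

def h_alphabeta(s, depth, alpha, beta, maximizing):
    if terminal(s):
        return utility(s)
    if depth == 0:
        return eval_state(s)                 # cut off: approximate with the heuristic
    if maximizing:
        v = -math.inf
        for a in actions(s):
            v = max(v, h_alphabeta(result(s, a), depth - 1, alpha, beta, False))
            if v >= beta:
                return v                     # prune
            alpha = max(alpha, v)
        return v
    else:
        v = math.inf
        for a in actions(s):
            v = min(v, h_alphabeta(result(s, a), depth - 1, alpha, beta, True))
            if v <= alpha:
                return v                     # prune
            beta = min(beta, v)
        return v
```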
Heuristic Alpha-Beta Tree Search: Cutting off search

[Figure: a game tree cut off at depth (ply) 2. HMV = heuristic minimax value. The cut-off nodes at depth 2 are assigned Eval values, the HMVs at depth 1 are backed up from them, and at depth 0 we pick the action with the highest HMV.]

Eval = heuristic to estimate the minimax value/utility of the state.

Cutting the search off at depth 2 is also called searching with a "look ahead" of 2.
Forward pruning

To save time, we can prune moves that appear bad.

There are many ways move quality can be evaluated:
• Low heuristic value.
• Low evaluation value after a shallow search (cut-off search).
• Past experience.

Issue: We may prune important moves.

Heuristic Alpha-Beta Tree Search: Example for Forward Pruning

[Figure: a game tree whose root actions are scored with a shallow cut-off search (depth 2, HMVs backed up from Eval values); actions with a low HMV are pruned (marked x), and complete alpha-beta search is performed only on the remaining actions.]

1. Perform cut-off search.
2. Choose the n best actions using the heuristic minimax value and prune the rest.
3. Explore the chosen actions using regular Alpha-Beta Tree Search with move ordering. (A code sketch follows below.)
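A sketch of the three steps above, reusing h_alphabeta from the previous sketch; the number of kept moves n and the two search depths are illustrative assumptions, not values from the slides.

```python
import math

def forward_pruned_decision(s, n=3, shallow_depth=2, deep_depth=8):
    # Step 1: shallow cut-off search to score every root action.
    scored = [(h_alphabeta(result(s, a), shallow_depth - 1, -math.inf, math.inf, False), a)
              for a in actions(s)]
    # Step 2: keep only the n best actions and prune the rest.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    candidates = [a for _, a in scored[:n]]
    # Step 3: deeper alpha-beta search on the remaining candidates only.
    return max(candidates,
               key=lambda a: h_alphabeta(result(s, a), deep_depth - 1,
                                         -math.inf, math.inf, False))
```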
Monte Carlo Tree Search (MCTS)
Idea
• Approximate 𝑬𝒗𝒂𝒍(𝒔) as the average utility of several simulation runs to a terminal state (called playouts).

• Playout policy: How do we choose moves during the simulation runs? Example playout policies:
  • Random.
  • Heuristics for good moves developed by experts.
  • Learn good moves from self-play (e.g., with deep neural networks). We will talk about this when we talk about "Learning from Examples."

• Typically used for problems with
  • a high branching factor (many possible moves make the tree very wide), or
  • unknown or hard-to-define evaluation functions.
Pure Monte Carlo Search
Find the next best move.

• Method:
  1. Simulate 𝑁 playouts from the current state.
  2. Select the move that results in the highest win percentage.

• Optimality guarantee: Converges to optimal play for stochastic games as 𝑁 increases.

• Typical strategy for 𝑁: Do as many playouts as you can given the available time budget for the move.
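A sketch of pure Monte Carlo search with a random playout policy, reusing the assumed interface; utilities are counted from Max's point of view and it is assumed that Max is to move.

```python
import random

def random_playout(s):
    # Play both sides randomly until a terminal state is reached.
    while not terminal(s):
        s = result(s, random.choice(actions(s)))
    return utility(s)                        # +1 win, 0 draw, -1 loss (for Max)

def pure_monte_carlo_decision(s, n_playouts=100):
    def average_utility(a):
        s_after = result(s, a)
        return sum(random_playout(s_after) for _ in range(n_playouts)) / n_playouts
    return max(actions(s), key=average_utility)
```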
Playout Selection Strategy

[Figure: a partial game tree; Max could start a playout at any of several frontier states. Which one should it choose?]

Issue: Pure Monte Carlo Search spends a lot of time creating playouts for bad moves.
Better: Select the starting state for playouts so that we focus on important parts of the game tree (i.e., good moves).

This presents the following tradeoff:
• Exploration: perform more playouts from states that currently have no or few playouts.
• Exploitation: perform more playouts from states that have done well, to get more accurate estimates.
Selection using Upper Confidence Bounds (UCB1)

𝑈𝐶𝐵1(𝑛) = 𝑈(𝑛)/𝑁(𝑛) + 𝐶 · √( log 𝑁(𝑃𝑎𝑟𝑒𝑛𝑡(𝑛)) / 𝑁(𝑛) )

where
• 𝑛 … node in the game tree
• 𝑈(𝑛) … total utility of all playouts going through node 𝑛
• 𝑁(𝑛) … number of playouts through 𝑛
• 𝐶 … tradeoff constant ≈ 2 (can be optimized using experiments)

The first term is the average utility (= exploitation). The second term is high for nodes with few playouts relative to the parent node (= exploration) and goes to 0 for large 𝑁(𝑛).

Selection strategy: Select the node with the highest UCB1 score.
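The same score as a Python function. The node fields (U, N, parent, children) are my assumption; a matching Node class appears in the MCTS sketch further below. The default C = 1.4 is an arbitrary starting point that, as the slide notes, should be tuned experimentally.

```python
import math

def ucb1(node, C=1.4):
    if node.N == 0:
        return math.inf                      # always try unvisited nodes first
    exploitation = node.U / node.N           # average utility of playouts through node
    exploration = C * math.sqrt(math.log(node.parent.N) / node.N)
    return exploitation + exploration

def select_child(node):
    # Selection strategy: descend to the child with the highest UCB1 score.
    return max(node.children, key=ucb1)
```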


Monte Carlo Tree Search (MCTS)
Pure Monte Carlo search always starts playouts from a given state. Monte Carlo Tree Search builds a partial game tree and can start playouts from any state (node) in that tree.

Important considerations:
• We can use UCB1 as the selection strategy to decide what part of the tree we should focus on for the next playout. This balances exploration and exploitation.
• We typically can only store a small part of the game tree, so we do not store the complete playout runs.
[Figure: MCTS on a partial game tree whose levels alternate between White and Black; each node is labeled with its wins/playouts counts. Select the leaf with the highest UCB1 score (UCB1 selection favors the win percentage more and more as the playout counts grow), run a playout from it, and update the counts along the path back to the root. Note: the simulation path itself is not recorded, to preserve memory!]
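Below is a compact, hedged sketch of the full MCTS loop (selection with UCB1, expansion, random simulation, back-propagation of the counts), reusing the game interface, random_playout, and select_child defined above. Storing each node's utility from the viewpoint of the player who moved into it is my own simplification for this two-player zero-sum setting.

```python
import random

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children = []
        self.untried = [] if terminal(state) else list(actions(state))
        self.U, self.N = 0.0, 0              # total playout utility and playout count

def mcts_decision(root_state, n_iterations=1000):
    root = Node(root_state)
    for _ in range(n_iterations):
        node = root
        # 1. Selection: follow the highest UCB1 scores while fully expanded.
        while not node.untried and node.children:
            node = select_child(node)
        # 2. Expansion: add one new child node if possible.
        if node.untried:
            a = node.untried.pop(random.randrange(len(node.untried)))
            child = Node(result(node.state, a), parent=node)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new node (path itself not stored).
        value = random_playout(node.state)   # utility for Max: +1 / 0 / -1
        # 4. Back-propagation: update counts along the path back to the root,
        #    crediting each node from the viewpoint of the player who moved into it.
        while node is not None:
            node.N += 1
            node.U += value if to_move(node.state) == 'o' else -value
            node = node.parent
    # Play the most-explored move (a common, robust choice).
    best = max(root.children, key=lambda c: c.N)
    return next(a for a in actions(root_state) if result(root_state, a) == best.state)
```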
Online Play Using MCTS
• Search and update a partial tree to use up the time budget for the move.
• Keep the relevant subtree from move to move and expand from there.

[Figure: a partial game tree with wins/playouts counts on alternating White and Black levels. Do the move with the highest playout count; after the move, keep the relevant subtree and continue to explore/exploit from it.]
Stochastic Games
Games With Random Events
Stochastic Games
• The game includes a "random action" 𝑟 (e.g., dice, dealt cards).
• Add chance nodes that calculate the expected value.

Example: Backgammon.
Expectiminimax
• The game includes a "random action" 𝑟 (e.g., dice, dealt cards).
• For chance nodes we calculate the expected minimax value.

  𝐸𝑥𝑝𝑒𝑐𝑡𝑖𝑚𝑖𝑛𝑖𝑚𝑎𝑥(𝑠) =
    𝑈𝑡𝑖𝑙𝑖𝑡𝑦(𝑠)                                              if 𝑡𝑒𝑟𝑚𝑖𝑛𝑎𝑙(𝑠)
    max_{𝑎 ∈ 𝐴𝑐𝑡𝑖𝑜𝑛𝑠(𝑠)} 𝐸𝑥𝑝𝑒𝑐𝑡𝑖𝑚𝑖𝑛𝑖𝑚𝑎𝑥(𝑅𝑒𝑠𝑢𝑙𝑡(𝑠, 𝑎))        if 𝑚𝑜𝑣𝑒 = 𝑀𝑎𝑥
    min_{𝑎 ∈ 𝐴𝑐𝑡𝑖𝑜𝑛𝑠(𝑠)} 𝐸𝑥𝑝𝑒𝑐𝑡𝑖𝑚𝑖𝑛𝑖𝑚𝑎𝑥(𝑅𝑒𝑠𝑢𝑙𝑡(𝑠, 𝑎))        if 𝑚𝑜𝑣𝑒 = 𝑀𝑖𝑛
    Σ_𝑟 𝑃(𝑟) 𝐸𝑥𝑝𝑒𝑐𝑡𝑖𝑚𝑖𝑛𝑖𝑚𝑎𝑥(𝑅𝑒𝑠𝑢𝑙𝑡(𝑠, 𝑟))                    if 𝑚𝑜𝑣𝑒 = 𝐶ℎ𝑎𝑛𝑐𝑒

• Options:
  • Use the Minimax algorithm. Issue: this scales only for tiny problems! The search tree size explodes if the number of "random actions" is large. Think of drawing cards for poker!
  • Cut off the search and approximate Expectiminimax with an evaluation function.
  • Perform Monte Carlo Tree Search.
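A hedged sketch of Expectiminimax. The interface here is hypothetical: node_type(s) classifies a state as a 'max', 'min', or 'chance' node, and chance_outcomes(s) yields (probability, resulting state) pairs for the random action 𝑟; terminal, utility, actions, and result are as before.

```python
def expectiminimax(s):
    if terminal(s):
        return utility(s)
    kind = node_type(s)                      # hypothetical: 'max', 'min', or 'chance'
    if kind == 'max':
        return max(expectiminimax(result(s, a)) for a in actions(s))
    if kind == 'min':
        return min(expectiminimax(result(s, a)) for a in actions(s))
    # Chance node: expected value over the random outcomes r with probability P(r).
    return sum(p * expectiminimax(s_next) for p, s_next in chance_outcomes(s))
```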
Conclusion

Nondeterministic actions:
• The opponent is seen as part of an environment with nondeterministic actions. Non-determinism is the result of the unknown moves by the opponent. All possible moves are considered.

Optimal decisions:
• Minimax search and Alpha-Beta pruning, where each player plays optimally to the end of the game.
• Chance nodes and Expectiminimax for stochastic games.

Heuristic Alpha-Beta Tree Search:
• Cut off the game tree and use a heuristic evaluation function for utility (based on state features).
• Forward Pruning: ignore poor moves.
• State of the art: learn the heuristic from data using MCTS.

Monte Carlo Tree Search:
• Simulate complete games and calculate the proportion of wins.
• Use modified UCB1 scores to expand the partial game tree.
• Learn the playout policy using self-play and deep learning.
