Commit 8588d0c

Convert mdp.py to third edition.
1 parent 5d67fb5 commit 8588d0c

1 file changed: +3 -6 lines changed
mdp.py

Lines changed: 3 additions & 6 deletions
@@ -6,19 +6,16 @@
 dictionary of {state:number} pairs. We then define the value_iteration
 and policy_iteration algorithms."""
 
-# (Written for the second edition of AIMA; expect some discrepanciecs
-# from the third edition until this gets reviewed.)
-
 from utils import *
 
 class MDP:
     """A Markov Decision Process, defined by an initial state, transition model,
     and reward function. We also keep track of a gamma value, for use by
     algorithms. The transition model is represented somewhat differently from
-    the text. Instead of T(s, a, s') being a probability number for each
-    state/action/state triplet, we instead have T(s, a) return a list of (p, s')
+    the text. Instead of P(s' | s, a) being a probability number for each
+    state/state/action triplet, we instead have T(s, a) return a list of (p, s')
     pairs. We also keep track of the possible states, terminal states, and
-    actions for each state. [page 615]"""
+    actions for each state. [page 646]"""
 
     def __init__(self, init, actlist, terminals, gamma=.9):
         update(self, init=init, actlist=actlist, terminals=terminals,
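
The docstring touched by this commit says that T(s, a) returns a list of (p, s') pairs rather than a single probability P(s' | s, a). The following is a minimal sketch of that representation, not code from mdp.py: the TRANSITIONS table, the state and action names, and the helpers P and expected_value are all hypothetical and exist only to illustrate the idea.

```python
# Hypothetical two-state example; the states, action, and numbers are
# illustrative assumptions and do not come from mdp.py.
TRANSITIONS = {
    ("sunny", "wait"): [(0.8, "sunny"), (0.2, "rainy")],
    ("rainy", "wait"): [(0.4, "sunny"), (0.6, "rainy")],
}


def T(state, action):
    """Return a list of (probability, next_state) pairs for (state, action)."""
    return TRANSITIONS.get((state, action), [])


def P(next_state, state, action):
    """Recover P(s' | s, a) by summing the matching entries of T(s, a)."""
    return sum(p for p, s1 in T(state, action) if s1 == next_state)


def expected_value(state, action, U):
    """Expected utility of the successor state, given a {state: number} dict U."""
    return sum(p * U[s1] for p, s1 in T(state, action))


if __name__ == "__main__":
    U = {"sunny": 1.0, "rainy": 0.0}
    print(P("rainy", "sunny", "wait"))
    print(expected_value("sunny", "wait", U))
```

One likely reason to return a list of pairs is that algorithms such as value iteration can loop over the successors of (s, a) directly, without enumerating every state and looking up zero-probability entries.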
