MIT14 12F12 Chapter12
Repeated Games
In real life, most games are played within a larger context, and actions in a given situation
affect not only the present situation but also the future situations that may arise. When
a player acts in a given situation, he takes into account not only the implications of his
actions for the current situation but also their implications for the future. If the players
are patient and the current actions have significant implications for the future, then the
considerations about the future may take over. This may lead to a rich set of behavior
that may seem to be irrational when one considers the current situation alone. Such
ideas are captured by repeated games, in which a "stage game" is played repeatedly.
The stage game is repeated regardless of what has been played in the previous games.
This chapter explores the basic ideas in the theory of repeated games and applies them
to a variety of economic problems. As it turns out, it is important whether the game is
repeated finitely or infinitely many times.
played in each previous play. A strategy then prescribes what player $i$ plays at each $t$ as a
function of the plays at dates $0, \ldots, t-1$. More precisely, let us call the outcomes of the
previous stage games a history, which will be a sequence $(a_0, \ldots, a_{t-1})$. A strategy in
the repeated game prescribes a strategy of the stage game for each history $(a_0, \ldots, a_{t-1})$
at each date $t$.
For example, consider a situation in which two players play the Prisoners’ Dilemma
game,

              C        D
      C     5, 5     0, 6
      D     6, 0     1, 1          (12.1)

twice. In that case, $T = \{0, 1\}$ and $G$ is the Prisoners’ Dilemma game. The repeated
game, $G^T$, can be represented in the extensive-form as
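[Figure: extensive form of the twice-repeated Prisoners’ Dilemma. In each round Player 1 and then Player 2 choose C or D. The terminal payoffs (Player 1, Player 2) are listed below by first-round and second-round outcome.]

      First round     Second round:  (C,C)     (C,D)     (D,C)     (D,D)
      (C,C)                          10, 10    5, 11     11, 5     6, 6
      (C,D)                          5, 11     0, 12     6, 6      1, 7
      (D,C)                          11, 5     6, 6      12, 0     7, 1
      (D,D)                          6, 6      1, 7      7, 1      2, 2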
after a history of plays in the initial round. For example, after (C, C) in the initial
round, we have subgame

              C          D
      C     10, 10     5, 11
      D     11, 5      6, 6

where we add 5 to each player’s payoffs, corresponding to the payoff that he gets from
playing (C, C) in the first round. Recall that adding a constant to a player’s payoff
does not change the preferences in a game, and hence the set of equilibria in this game
is the same as the original Prisoners’ Dilemma game, which possesses the unique Nash
equilibrium of (D, D). This equilibrium is depicted in the figure. Likewise, in each proper
subgame, we add some constant to the players’ payoffs, and hence we have (D, D) as
the unique Nash equilibrium at each of these subgames.
Therefore, the actions in the last round are independent of what is played in the
initial round. Hence, the players will ignore the future and play the game as if there is
no future game, each playing D. Indeed, given the behavior in the last round, the game
in the initial round reduces to

              C        D
      C     6, 6     1, 7
      D     7, 1     2, 2

where we add 1 to each player’s payoffs, accounting for his payoff in the last round. The
unique equilibrium of this reduced game is (D, D). This leads to a unique subgame-
perfect equilibrium: At each history, each player plays D.
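The backward-induction argument above can be checked mechanically. The following Python sketch is an illustration only and is not part of the original notes; the helper names `pure_nash` and `shift` are hypothetical. It enumerates the pure-strategy Nash equilibria of each last-round subgame (the stage game plus the first-round payoffs) and of the reduced first-round game (the stage game plus 1 for each player).

```python
from itertools import product

# Stage game (12.1): (row action, column action) -> (payoff 1, payoff 2).
PD = {("C", "C"): (5, 5), ("C", "D"): (0, 6),
      ("D", "C"): (6, 0), ("D", "D"): (1, 1)}
ACTIONS = ["C", "D"]


def pure_nash(game, actions):
    """Return all pure-strategy Nash equilibria of a two-player game."""
    equilibria = []
    for r, c in product(actions, actions):
        u1, u2 = game[(r, c)]
        # Player 1 cannot gain by changing the row, Player 2 the column.
        best1 = all(u1 >= game[(r2, c)][0] for r2 in actions)
        best2 = all(u2 >= game[(r, c2)][1] for c2 in actions)
        if best1 and best2:
            equilibria.append((r, c))
    return equilibria


def shift(game, k1, k2):
    """Add the constants k1 and k2 to the payoffs of Players 1 and 2."""
    return {a: (u1 + k1, u2 + k2) for a, (u1, u2) in game.items()}


# Last-round subgame after each first-round outcome h: stage game + payoffs at h.
last_round = {h: pure_nash(shift(PD, *PD[h]), ACTIONS) for h in PD}

# Reduced first-round game: stage game + the last-round payoff 1 for each player.
reduced = shift(PD, 1, 1)
first_round = pure_nash(reduced, ACTIONS)

print(last_round)
print(first_round)
```

Every subgame, as well as the reduced game, has (D, D) as its only pure-strategy equilibrium, which is what the unraveling argument relies on.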
What would happen for arbitrary $n$? The answer remains the same. In the last
day, $n$, independent of what has been played in the previous rounds, there is a unique
Nash equilibrium for the resulting subgame: Each player plays D. Hence, the actions
at day $n - 1$ do not have any effect on what will be played in the next day. Then, we
can consider the subgame as a separate game of the Prisoners’ Dilemma. Indeed, the
reduced game for any subgame starting at $n - 1$ is

              C                                       D
      C     $5 + 1 + \pi_1,\; 5 + 1 + \pi_2$        $0 + 1 + \pi_1,\; 6 + 1 + \pi_2$
      D     $6 + 1 + \pi_1,\; 0 + 1 + \pi_2$        $1 + 1 + \pi_1,\; 1 + 1 + \pi_2$

where $\pi_i$ is the sum of the payoffs of $i$ from the previous plays at dates $0, \ldots, n - 2$. Here
we add $\pi_i$ for these payoffs and 1 for the last round payoff, all of which are independent
of what happens at date $n - 1$. This is another version of the Prisoners’ Dilemma, which
has the unique Nash equilibrium of (D, D). Proceeding in this way all the way back to
date 0, we find out that there is a unique subgame-perfect equilibrium: At each $t$ and
for each history of previous plays, each player plays D.
That is to say, although there are many repetitions in the game and the stakes in
the future may be high, any plan of actions other than playing myopically everywhere
unravels, as players cannot commit to any plan of action in the last round. This is
indeed a general result.
Theorem 12.1 Let $T$ be finite and assume that $G$ has a unique subgame-perfect equilibrium
$s^*$. Then, $G^T$ has a unique subgame-perfect equilibrium, and according to this
equilibrium $s^*$ is played at each date independent of the history of the previous plays.
The proof of this result is left as a straightforward exercise. The result can be
illustrated by another important example. Consider the following Entry-Deterrence
game, where an entrant (Player 1) decides whether to enter a market or not, and the
incumbent (Player 2) decides whether to fight or accommodate the entrant if he enters.

      1 --Enter--> 2 --Acc.--> (1,1)
      |            |
      X            Fight
      |            |
    (0,2)       (-1,-1)                    (12.2)

Consider the game where the Entry-Deterrence game is repeated twice, and all the
previous actions are observed. This game is depicted in the following figure.
12.1. FINITELY-REPEATED GAMES 203
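[Figure: extensive form of the twice-repeated Entry-Deterrence game. After each first-day outcome the stage game is played again; the terminal payoffs (Player 1, Player 2) are listed below.]

      First day          Second day:  X          (Enter, Acc.)    (Enter, Fight)
      X                               (0,4)      (1,3)            (-1,1)
      (Enter, Acc.)                   (1,3)      (2,2)            (0,0)
      (Enter, Fight)                  (-1,1)     (0,0)            (-2,-2)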
As depicted in the extensive form, in the repeated game, at $t = 1$, there are three
possible histories: X, (Enter, Acc.), and (Enter, Fight). A strategy of Player 1 assigns
an action, which has to be either Enter or X, to be played at $t = 0$ and an action to be
played at $t = 1$ for each possible outcome at $t = 0$. In total, we need to determine 4
actions in order to define a strategy for Player 1. Similarly for Player 2.
Note that after each outcome of the first play, the Entry-Deterrence game is
played again, where the payoff from the first play is added to each outcome. Since a
player’s preferences do not change when we add a number to his utility function, each
of the three games played on the second “day” is the same as the stage game (namely,
the Entry-Deterrence game above). The stage game has a unique subgame perfect
equilibrium, where the incumbent accommodates the entrant and the entrant enters the
market. In that case, each of the three games played on the second day has only this
equilibrium as its subgame perfect equilibrium. This is depicted in the following.
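[Figure: the twice-repeated game, with the equilibrium (Enter, Acc.) marked in each of the three second-day subgames.]

Adding the second-day equilibrium payoffs (1,1) to each first-day outcome reduces the game to

      1 --Enter--> 2 --Acc.--> (2,2)
      |            |
      X            Fight
      |            |
    (1,3)        (0,0)

which again has (Enter, Acc.) as its unique subgame-perfect equilibrium.

[Figure: the twice-repeated game with the resulting equilibrium marked on both days.]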
This can be generalized for arbitrary $n$ as above. All these examples show that in
certain important games, no matter how high the stakes are in the future, the consid-
erations about the future will not affect the current actions, as the future outcomes do
not depend on the current actions. In the rest of the lectures we will show that these
are very peculiar examples. In general, in many subgame-perfect equilibria, the patient
players will take a long-term view, and their decisions will be determined mainly by the
future considerations.
Indeed, if the stage game has more than one equilibrium, then in the repeated game
we may have some subgame-perfect equilibria where, in some stages, players play some
actions that are not played in any subgame-perfect equilibrium of the stage game. This
is because the equilibrium to be played on the second day can be conditioned on the
play on the first day, in which case the “reduced game” for the first day is no longer
the same as the stage game, and thus may have some different equilibria. I will now
illustrate this using an example in Gibbons. (See Exercises 1 and 2 at the end of the
chapter before proceeding.)
Take $T = \{0, 1\}$ and the stage game be

              L        M        R
      L     1, 1     5, 0     0, 0
      M     0, 5     4, 4     0, 0
      R     0, 0     0, 0     3, 3

Notice that a strategy in the repeated game prescribes what the player plays at $t = 0$ and
what he plays at $t = 1$ conditional on the history of the play at $t = 0$. There are 9 such
histories, such as (L, L), (L, M), etc. A strategy of Player 1 is defined by determining
an action (L, M, or R) for $t = 0$, and determining an action for each of these histories at
$t = 1$. (There will be 10 actions in total.) Consider the following strategy profile:

      Play (M, M) at $t = 0$; at $t = 1$, play (R, R) if (M, M) was played at $t = 0$,
      and play (L, L) otherwise.

Under this profile a Nash equilibrium of the stage game is played at $t = 1$ after every
history, and the reduced game for $t = 0$ is

              L        M        R
      L     2, 2     6, 1     1, 1
      M     1, 6     7, 7     1, 1
      R     1, 1     1, 1     4, 4

Here, we add 3 to the payoffs at (M, M) (for it leads to (R, R) in the second round) and
add 1 for the payoffs at the other strategy profiles, for they lead to (L, L) in the second
round. Clearly, (M, M) is a Nash equilibrium in the reduced game, showing that the
above strategy profile is a subgame-perfect Nash equilibrium. In summary, players can
coordinate on different equilibria in the second round conditional on the behavior in the
first round, and the players may play non-equilibrium (or even irrational) strategies in
the first round, if those strategies lead to a better equilibrium later.
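The construction of a reduced game from a continuation rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the notes' own method: the action labels L, M, R follow Gibbons, and the helper names `pure_nash` and `reduced_game` are hypothetical. The sketch builds the first-round reduced game for a rule that maps each first-round outcome to the stage-game equilibrium played afterwards, and then lists its pure-strategy Nash equilibria.

```python
from itertools import product

ACTIONS = ["L", "M", "R"]
# Stage game: (Player 1 action, Player 2 action) -> (payoff 1, payoff 2).
STAGE = {("L", "L"): (1, 1), ("L", "M"): (5, 0), ("L", "R"): (0, 0),
         ("M", "L"): (0, 5), ("M", "M"): (4, 4), ("M", "R"): (0, 0),
         ("R", "L"): (0, 0), ("R", "M"): (0, 0), ("R", "R"): (3, 3)}


def pure_nash(game, actions):
    """Return all pure-strategy Nash equilibria of a two-player game."""
    result = []
    for r, c in product(actions, actions):
        u1, u2 = game[(r, c)]
        if (all(u1 >= game[(r2, c)][0] for r2 in actions)
                and all(u2 >= game[(r, c2)][1] for c2 in actions)):
            result.append((r, c))
    return result


def reduced_game(stage, continuation):
    """Add to each first-round outcome the payoffs of the profile played next."""
    reduced = {}
    for a, (u1, u2) in stage.items():
        v1, v2 = stage[continuation(a)]
        reduced[a] = (u1 + v1, u2 + v2)
    return reduced


stage_equilibria = pure_nash(STAGE, ACTIONS)

# Rule 1: reward (M, M) with (R, R); play (L, L) after anything else.
rule1 = reduced_game(STAGE, lambda a: ("R", "R") if a == ("M", "M") else ("L", "L"))
# Rule 2: reward (M, L) with (R, R); play (L, L) after anything else.
rule2 = reduced_game(STAGE, lambda a: ("R", "R") if a == ("M", "L") else ("L", "L"))

print(stage_equilibria)
print(pure_nash(rule1, ACTIONS))
print(pure_nash(rule2, ACTIONS))
```

A continuation rule is admissible only if it always selects a stage-game equilibrium, here (L, L) or (R, R). A first-round profile can then be supported exactly when it is a Nash equilibrium of the corresponding reduced game.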
When there are multiple subgame-perfect Nash equilibria in the stage game, a large
number of outcome paths can result in a subgame-perfect Nash equilibrium of the re-
peated game even if it is repeated just twice. But not all outcome paths can be a result
of a subgame-perfect Nash equilibrium. In the following, I will illustrate why some of
the paths can and some paths cannot emerge in an equilibrium in the above example.
Can (( ) ( )) be an outcome of a subgame-perfect Nash equilibrium? The
answer is No. This is because in any Nash equilibrium, the players must play a Nash
equilibrium of the stage game in the last period on the path of equilibrium. Since ( )
is not a Nash equilibrium of the stage game (( ) ( )) cannot emerge in any Nash
equilibrium, let alone in a subgame-perfect Nash equilibrium.
Can ((M, M), (L, L)) be an outcome of a subgame-perfect Nash equilibrium in pure
strategies? The answer is No. Although (L, L) is a Nash equilibrium of the stage game,
in a subgame-perfect Nash equilibrium, a Nash equilibrium of the stage game must
be played after every play in the first round. In particular, after (L, M), the play is
either (L, L) or (R, R), yielding 6 or 8, respectively, for Player 1. Since he gets only 5
from ((M, M), (L, L)), he has an incentive to deviate to L in the first period. (What
about if we consider mixed subgame-perfect Nash equilibria or non-subgame-perfect
Nash equilibria?)
Can ((M, L), (R, R)) be an outcome of a subgame-perfect Nash equilibrium in pure
strategies? As should be clear from the previous discussion, the answer would be Yes
if and only if (L, L) is played after every play of the first period except for (M, L). In that
case, the reduced game for the first period is

              L        M        R
      L     2, 2     6, 1     1, 1
      M     3, 8     5, 5     1, 1
      R     1, 1     1, 1     4, 4

Since (M, L) is indeed a Nash equilibrium of the reduced game, the answer is Yes. It is
the outcome of the following subgame-perfect Nash equilibrium: Play (M, L) in the first
round; in the second round, play (R, R) if (M, L) is played in the first round and play
(L, L) otherwise.
12.2. INFINITELY REPEATED GAMES WITH OBSERVED ACTIONS 207
As an exercise, check also if (( ) ( )) or (( ) ( ) ( )) can be an
outcome of a subgame-perfect Nash equilibrium in pure strategies (in twice and thrice
repeated games, respectively).
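The present value of a payoff stream $\pi = (\pi_0, \pi_1, \ldots)$ under discount factor $\delta$ is
$$PV(\pi; \delta) = \sum_{t=0}^{\infty} \delta^t \pi_t = \pi_0 + \delta \pi_1 + \cdots + \delta^t \pi_t + \cdots,$$
and its average value is
$$(1 - \delta)\, PV(\pi; \delta) \equiv (1 - \delta) \sum_{t=0}^{\infty} \delta^t \pi_t.$$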
Hence, the analysis does not change whether one uses or , but using is
simpler. In repeated games considered here, each player maximizes the present value
of the payoff stream he gets from the stage games, which will be played indefinitely.
Since the average value is simply a linear transformation of the present value, one can
also use average values instead of present values. Such a choice sometimes simplifies the
expressions without affecting the analyses.
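As a numerical illustration of the two notions, here is a minimal Python sketch. It is an assumption-laden illustration rather than part of the notes: the function names are hypothetical, and an infinite stream is approximated by a long finite one, which is harmless when the discounted tail is negligible.

```python
def present_value(stream, delta):
    """Discounted sum of a finite payoff stream: sum over t of delta**t * stream[t]."""
    return sum(delta ** t * payoff for t, payoff in enumerate(stream))


def average_value(stream, delta):
    """Average value: (1 - delta) times the present value."""
    return (1 - delta) * present_value(stream, delta)


# A constant stream of 5 has average value (approximately) 5.
constant = [5] * 2000
avg_constant = average_value(constant, 0.9)

# The alternating stream 6, 0, 6, 0, ... at delta = 0.5 has present value
# 6 / (1 - delta**2) = 8 and average value 4.
alternating = [6, 0] * 100
pv_alternating = present_value(alternating, 0.5)
avg_alternating = average_value(alternating, 0.5)

print(avg_constant, pv_alternating, avg_alternating)
```

The first check reflects the fact that the average value of a constant stream is just the stage payoff, which is what makes average values convenient for comparing streams across discount factors.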
In the repeated Prisoner’s Dilemma, the possible histories are $t$-tuples of (C, C), (C, D), (D, C),
and (D, D), such as
      (C, C) (C, D) (D, C) (D, D) · · · (C, C)
where $t$ varies. A history at the beginning of date $t$ is denoted by $h = (a_0, \ldots, a_{t-1})$,
where $a_{t'}$ is the outcome of the stage game in round $t'$; $h$ is empty when $t = 0$. For example,
in the repeated prisoners’ dilemma, ((C, C), (C, D)) is a history for $t = 2$. In the repeated
entry-deterrence game, (X, X) is a history for $t = 2$.
A strategy in a repeated game, once again, determines a strategy in the stage game
for each history and for each . The important point is that the strategy in the stage
game at a given date can vary by histories. Here are some possible strategies in the
repeated Prisoner’s Dilemma game:
Grim: Play C at $t = 0$; thereafter, play C if both players have always played C in the past, and play D otherwise.
Naively Cooperate: Always play C (no matter what happened in the past).
Tit-for-Tat: Play C at $t = 0$, and at each $t > 0$, play whatever the other
player played at $t - 1$.
Note that strategy profiles (Grim, Grim), (Naively Cooperate, Naively Cooperate)
and (Tit-for-Tat, Tit-for-Tat) all lead to the same outcome path:¹
      ((C, C), (C, C), (C, C), . . .)
Nevertheless, they are quite distinct strategy profiles. Indeed, (Naively Cooperate,
Naively Cooperate) is not even a Nash equilibrium (why?), while (Grim, Grim) is a
subgame-perfect Nash equilibrium for large values of $\delta$. On the other hand, while
(Tit-for-Tat, Tit-for-Tat) is a Nash equilibrium for large values of $\delta$, it is not subgame-perfect.
All these will be clear momentarily.
$$U_i(z \mid s^*, h) = u_i(z) + \delta\, PV_{i,t+1}(h, z \mid s^*),$$
where $u_i(z)$ is the stage-game payoff of player $i$ at $z$ in the original stage game, and
$PV_{i,t+1}(h, z \mid s^*)$ is the present value of player $i$ at $t + 1$ from the payoff stream that results
when all players follow $s^*$ starting with the history $(h, z) = (a_0, \ldots, a_{t-1}, z)$, which is a
history at the beginning of date $t + 1$. Note that $U_i(z \mid s^*, h)$ is the time $t$ present value
of the payoff stream that results when the outcome of the stage game is $z$ in round $t$ and
everybody sticks to the strategy profile $s^*$ from the next period on. Note also that the
only difference between the original stage game and the augmented stage game is that
the payoff in the augmented game is $U_i(z \mid s^*, h)$ while the payoff in the original game is
$u_i(z)$.

¹ Make sure that you can compute the outcome path for each strategy profile above.
The single-deviation principle now states that a strategy profile in the repeated game is
subgame-perfect if it always yields a subgame-perfect Nash equilibrium in the augmented
stage game:
Note that $s_i^*(h)$ is what player $i$ is supposed to play at the stage game after history
$h$ at date $t$ according to $s^*$. Hence, $s_i^*(h)$ is a strategy in the stage game as well as
a strategy in the augmented stage game. Therefore, $(s_1^*(h), \ldots, s_n^*(h))$ is a strategy
profile in the augmented stage game, and a potential subgame-perfect Nash equilibrium.
Note also that, in order to show that $s^*$ is a subgame-perfect Nash equilibrium, one
must check for all histories $h$ and dates $t$ that $s^*$ yields a subgame-perfect Nash
equilibrium in the augmented stage game. Conversely, in order to show that $s^*$ is not a
subgame-perfect Nash equilibrium, one only needs to find one history (and date) for
which $s^*$ does not yield a subgame-perfect Nash equilibrium in the augmented stage
game. Finally, although the above result considers a pure strategy profile $s^*$, the same
result is true for mixed strategies. The result is stated that way for clarity. The rest of
this section is devoted to illustrating the single-deviation principle on the infinitely repeated
Entry Deterrence and Prisoners’ Dilemma games.
At any given stage, the entrant enters the market if and only if the incumbent
has accommodated the entrant at some time in the past. The incumbent
accommodates the entrant if and only if he has accommodated the entrant
before.²
Using the single-deviation principle, we will now show that for large values of $\delta$, this is a
subgame-perfect Nash equilibrium. The strategy profile puts the histories in two groups:
1. the histories at which there was an entry and the incumbent has accommodated,
i.e., the histories that contain an entry (Enter, Acc.), and
2. all the other histories, i.e., the histories that do not contain the entry (Enter, Acc.) at any
date.
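$$V_A = 1 + \delta + \delta^2 + \cdots = 1/(1 - \delta).$$
That is, for every outcome $z \in \{X, (\text{Enter}, \text{Acc.}), (\text{Enter}, \text{Fight})\}$, $PV_{i,t+1}(h, z \mid s^*) = V_A$. Hence, the
augmented stage game for $h$ and $s^*$ is the Entry-Deterrence game with the following payoffs:

      Player 1 plays X:                           $(0 + \delta V_A,\; 2 + \delta V_A)$
      Player 1 enters, Player 2 accommodates:     $(1 + \delta V_A,\; 1 + \delta V_A)$
      Player 1 enters, Player 2 fights:           $(-1 + \delta V_A,\; -1 + \delta V_A)$
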
² This is a switching strategy, where initially the incumbent fights whenever there is an entry and the
entrant never enters. If the incumbent happens to accommodate an entrant, they switch to the new
regime where the entrant enters the market no matter what the incumbent does after the switching,
and the incumbent always accommodates the entrant.
For example, if the incumbent accommodates the entrant at $t$, his present value (at
$t$) will be $1 + \delta V_A$; and if he fights his present value will be $-1 + \delta V_A$, and so on.
This is another version of the Entry-Deterrence game, where the constant $\delta V_A$ is added
to the payoffs. The strategy profile $s^*$ yields (Enter, Accommodate) for round $t$ at
$h$. According to the single-deviation principle, (Enter, Accommodate) must be a subgame-
perfect equilibrium of the augmented stage game here. This is indeed the case, and $s^*$
passes the single-deviation test for such histories.
Now for some date $t$ consider a history $h = (a_0, \ldots, a_{t-1})$ in the second group, where
the incumbent has never accommodated the entrant before, i.e., $a_{t'}$ differs from (Enter, Acc.) for
all $t' < t$. Towards constructing the augmented stage game for $h$, first consider the outcome
$z$ = (Enter, Acc.) at $t$. In that case, at the beginning of $t + 1$, the history is $(h, z)$, which
includes (Enter, Acc.) as in the previous paragraph. Hence, according to $s^*$, Player 1 enters and
Player 2 accommodates at $t + 1$, yielding a history that contains (Enter, Acc.) for the next period.
Therefore, in the continuation game, all histories are in the first group (containing (Enter, Acc.)),
and the play is (Enter, Accommodate) at every $t' > t$, resulting in the outcome path
((Enter, Acc.), (Enter, Acc.), . . .). Starting from $t + 1$, each player gets 1 for each date, resulting in the
present value of $PV_{i,t+1}(h, z \mid s^*) = V_A$. Now consider another outcome $z \in \{X, (\text{Enter}, \text{Fight})\}$
in period $t$. The continuation play for other outcomes is quite different now. At the
beginning of $t + 1$, the history $(h, z)$ is either $(h, X)$ or $(h, (\text{Enter}, \text{Fight}))$. Since $h$ does not contain
(Enter, Acc.), neither does $(h, z)$. Hence, according to $s^*$, at $t + 1$, Player 1 exits, and Player 2
would have chosen Fight if there were an entry, yielding outcome X for period $t + 1$.
Consequently, at any $t' > t + 1$, the history is $(h, z, X, \ldots, X)$, and Player 1 chooses to
exit at $t'$ according to $s^*$. This results in the outcome path $(X, X, \ldots)$. Therefore,
starting from $t + 1$, Player 1 gets 0 and Player 2 gets 2 every day, yielding present values
of $PV_{1,t+1}(h, z \mid s^*) = 0$ and
$$PV_{2,t+1}(h, z \mid s^*) = V_F = 2 + 2\delta + 2\delta^2 + \cdots = 2/(1 - \delta).$$
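The augmented stage game for this history is the Entry-Deterrence game with the following payoffs:

      Player 1 plays X:                           $(0 + \delta \cdot 0,\; 2 + \delta V_F)$
      Player 1 enters, Player 2 accommodates:     $(1 + \delta V_A,\; 1 + \delta V_A)$
      Player 1 enters, Player 2 fights:           $(-1 + \delta \cdot 0,\; -1 + \delta V_F)$
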
At this history the strategy profile prescribes (X, Fight), i.e., the entrant does not
enter, and if he enters, the incumbent fights. The single-deviation principle then requires
that (X, Fight) is a subgame-perfect equilibrium of the above augmented stage game.
Since X is a best response to Fight, we only need to ensure that Player 2 weakly prefers
Fight to Accommodate after the entry in the above game. For this, we must have
$$-1 + \delta V_F \ge 1 + \delta V_A.$$
Substitution of the definitions of $V_A$ and $V_F$ in this inequality shows that this is equivalent
to
$$\delta \ge 2/3.$$
We have considered all possible histories, and when $\delta \ge 2/3$, the strategy profile
has passed the single-deviation test. Therefore, when $\delta \ge 2/3$, the strategy profile is a
subgame-perfect equilibrium.
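For completeness, here is one way to carry out the substitution. This is a sketch that writes $V_A = 1/(1-\delta)$ for the present value of receiving 1 every day and $V_F = 2/(1-\delta)$ for the present value of receiving 2 every day:

```latex
\begin{align*}
-1 + \delta V_F \ge 1 + \delta V_A
  &\iff \delta\,(V_F - V_A) \ge 2 \\
  &\iff \frac{\delta}{1-\delta} \ge 2 \\
  &\iff \delta \ge 2\,(1-\delta) \\
  &\iff \delta \ge 2/3 .
\end{align*}
```

The second step uses $V_F - V_A = 1/(1-\delta)$, and the third uses $1 - \delta > 0$.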
On the other hand, when $\delta < 2/3$, $s^*$ is not a subgame-perfect Nash equilibrium. To
show this it suffices to consider one history at which $s^*$ fails the single-deviation test. For
a history in the second group, the augmented stage game is as above, and (X, Fight)
is not a subgame-perfect equilibrium of this game, as $1 + \delta V_A > -1 + \delta V_F$.
equilibrium of the augmented stage game for $h$ for every history $h$. This simplifies the
analysis substantially because one only needs to compute the payoffs without deviation
and with unilateral deviations in order to check whether the strategy profile is a Nash
equilibrium.
As an example, consider the infinitely repeated Prisoner’s Dilemma game in (12.1).
Consider the strategy profile (Grim, Grim). There are two kinds of histories we need to
consider separately for this strategy profile.
1. Cooperation: Histories in which D has never been played by any player.
2. Defection: Histories in which D has been played by someone at some date.
First consider a Cooperation history for some $t$. Now if both players play C, then
according to (Grim, Grim), from $t + 1$ on each player will play C forever. This yields the
present value of
$$V_C = 5 + 5\delta + 5\delta^2 + \cdots = 5/(1 - \delta)$$
at $t + 1$. If any player plays D, then from $t + 1$ on, all the histories will be Defection
histories and each will play D forever. This yields the present value of
$$V_D = 1 + \delta + \delta^2 + \cdots = 1/(1 - \delta)$$
at $t + 1$. Now, at $t$, if they both play C, then the payoff of each player will be $5 + \delta V_C$.
If Player 1 plays D while Player 2 is playing C, then Player 1 gets $6 + \delta V_D$, and Player
2 gets $0 + \delta V_D$. Hence, the augmented stage game at the given history is

              C                                        D
      C     $5 + \delta V_C,\; 5 + \delta V_C$        $0 + \delta V_D,\; 6 + \delta V_D$
      D     $6 + \delta V_D,\; 0 + \delta V_D$        $1 + \delta V_D,\; 1 + \delta V_D$

To pass the single-deviation test, (C, C) must be a Nash equilibrium of this game.⁴ (That
is, we fix a player’s action at C and check if the other player has an incentive to deviate.)
⁴ It is important to note that we do not need to know all the payoffs in the reduced game. For
example, for this history we only need to check if (C, C) is a Nash equilibrium of the reduced game, and
hence we do not need to compute the payoffs from (D, D). In this example, it was easy to compute.
In general, it may be time consuming to compute the payoffs for all strategy profiles. In that case, it
will save a lot of time to ignore the strategy profiles in which more than one player deviates from the
prescribed behavior at $t$.
This is the case if and only if
$$5 + \frac{5\delta}{1 - \delta} \ge 6 + \frac{\delta}{1 - \delta},$$
i.e.,
$$\delta \ge 1/5.$$
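The test at a Cooperation history can also be run numerically. The Python sketch below is an illustration only (the function names are hypothetical): it builds the augmented stage game of (Grim, Grim) for a given discount factor, using the Prisoners' Dilemma payoffs of (12.1), and checks whether (C, C) survives unilateral deviations. Exact fractions are used so that the boundary case is not blurred by rounding.

```python
from fractions import Fraction


def grim_augmented_game(delta):
    """Augmented stage game of (Grim, Grim) at a Cooperation history."""
    v_c = 5 / (1 - delta)  # present value of (C, C) forever
    v_d = 1 / (1 - delta)  # present value of (D, D) forever
    return {
        ("C", "C"): (5 + delta * v_c, 5 + delta * v_c),
        ("C", "D"): (0 + delta * v_d, 6 + delta * v_d),
        ("D", "C"): (6 + delta * v_d, 0 + delta * v_d),
        ("D", "D"): (1 + delta * v_d, 1 + delta * v_d),
    }


def cooperation_passes(delta):
    """True if (C, C) is a Nash equilibrium of the augmented stage game."""
    game = grim_augmented_game(delta)
    player1_ok = game[("C", "C")][0] >= game[("D", "C")][0]
    player2_ok = game[("C", "C")][1] >= game[("C", "D")][1]
    return player1_ok and player2_ok


for delta in (Fraction(19, 100), Fraction(1, 5), Fraction(9, 10)):
    print(delta, cooperation_passes(delta))
```

At Defection histories every payoff of the stage game is shifted by the same constant, so (D, D) remains a Nash equilibrium for every discount factor and no further check is needed there.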
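
              C                                                                      D
      C     $5 + \frac{5\delta}{1-\delta},\; 5 + \frac{5\delta}{1-\delta}$          $0 + \frac{6\delta}{1-\delta^2},\; 6 + \frac{6\delta^2}{1-\delta^2}$
      D     $6 + \frac{6\delta^2}{1-\delta^2},\; 0 + \frac{6\delta}{1-\delta^2}$    $1 + \frac{\delta}{1-\delta},\; 1 + \frac{\delta}{1-\delta}$
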
patient, their long-term incentives take over, and a large set of behavior may result in
equilibrium. Indeed, for any given feasible and "individually rational" payoff vector and
for sufficiently large values of $\delta$, there exists some subgame perfect equilibrium that
yields the payoff vector as the average value of the payoff stream. This fact is called the
Folk Theorem. This section is devoted to presenting a basic version of the folk theorem and
illustrating its proof.
Throughout this section, it is assumed that the stage game is a simultaneous action
game $(N, A, u)$ where $N = \{1, \ldots, n\}$ is the set of players, $A = A_1 \times \cdots \times A_n$ is a
finite set of strategy profiles, and $u_i : A \to \mathbb{R}$ are the stage-game utility functions.
for some probability distribution $p : A \to [0, 1]$ on $A$. Note that $V$ is the smallest convex
set that contains all payoff vectors $(u_1(a), \ldots, u_n(a))$ from pure strategy profiles in the
stage game. A payoff vector $v$ is said to be feasible iff $v \in V$. Throughout this section,
$V$ is assumed to be $n$-dimensional.
For a visual illustration consider the Prisoners’ Dilemma game in (12.1). The set $V$
is plotted in Figure 12.1. Since there are two players, $V$ contains pairs $v = (v_1, v_2)$. The
payoff vectors from pure strategies are (1, 1), (5, 5), (6, 0), and (0, 6). The set $V$ is the
diamond shaped area that lies between the lines that connect these four points.
Note that for every strategy profile $s$ in the repeated game, the average payoff vector
from $s$ is in $V$.⁷ This also implies that the same is true for mixed strategy profiles in the
repeated game. Conversely, if the players can collectively randomize on strategy profiles
in the repeated games, all vectors $v \in V$ could be obtained as average payoff vectors.
(See also the end of the section.)

[Figure 12.1: the set $V$ for the Prisoners’ Dilemma; both axes have ticks at 0, 1, 5, and 6.]

⁷ Indeed, the average payoff vector can be written as
$$U(s) = \sum_{a \in A} p(a)\, (u_1(a), \ldots, u_n(a)),$$
where
$$p(a) = (1 - \delta) \sum_{t \in T_a} \delta^t$$
and $T_a$ is the set of dates at which $a$ is played on the outcome path of $s$. Clearly,
$$\sum_{a \in A} p(a) = (1 - \delta) \sum_{a \in A} \sum_{t \in T_a} \delta^t = (1 - \delta) \sum_{t \in T} \delta^t = 1.$$
12.3. FOLK THEOREM 219
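Here, the other players try to minimize the payoff of player $i$ by choosing a pure strategy
$a_{-i}$ for themselves, knowing that player $i$ will play a best response to $a_{-i}$. Then, the
harshest punishment they could inflict on $i$ is the pure-strategy minmax payoff
$\underline{v}_i^{\mathrm{pure}} = \min_{a_{-i}} \max_{a_i} u_i(a_i, a_{-i})$. For example, in the prisoners’ dilemma
game, $\underline{v}_i^{\mathrm{pure}} = 1$ because $i$ gets a maximum of 6 if the other player plays C and gets a maximum
of 1 if the other player plays D.
Observe that in any pure-strategy Nash equilibrium $s^*$ of the repeated game, the
average payoff of player $i$ is at least $\underline{v}_i^{\mathrm{pure}}$. To see this, suppose that the average payoff of
$i$ is less than $\underline{v}_i^{\mathrm{pure}}$ in $s^*$. Now consider the strategy $\hat{s}_i$, such that for each history $h$, $\hat{s}_i(h)$
is a stage-game best response to $s_{-i}^*(h)$, i.e.,
$$u_i\big(\hat{s}_i(h), s_{-i}^*(h)\big) = \max_{a_i \in A_i} u_i\big(a_i, s_{-i}^*(h)\big).$$
Since
$$\max_{a_i \in A_i} u_i\big(a_i, s_{-i}^*(h)\big) \ge \underline{v}_i^{\mathrm{pure}}$$
for every $h$, this implies that the average payoff from $(\hat{s}_i, s_{-i}^*)$ is at least $\underline{v}_i^{\mathrm{pure}}$, giving player
$i$ an incentive to deviate.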
A lower bound for the average payoff from a mixed strategy Nash equilibrium is given
by the minmax payoff, defined as
$$\underline{v}_i = \min_{\alpha_{-i}} \max_{a_i \in A_i} \sum_{a_{-i} \in A_{-i}} u_i(a_i, a_{-i}) \prod_{j \ne i} \alpha_j(a_j) \qquad (12.4)$$
where $\alpha_j$ is a mixed strategy of $j$ in the stage game. Similarly to pure strategies one can
show that the average payoff of player $i$ is at least $\underline{v}_i$ in any Nash equilibrium (mixed
or pure). Note that, by definition, $\underline{v}_i \le \underline{v}_i^{\mathrm{pure}}$. The inequality can be strict. For example, in
the matching penny game

                Head       Tail
      Head     -1, 1      1, -1
      Tail      1, -1    -1, 1

the pure-strategy minmax payoff is 1 while the minmax payoff is 0. (This is obtained
when $\alpha_j(\text{Head}) = \alpha_j(\text{Tail}) = 1/2$.) For the sake of exposition, it is assumed that
$(\underline{v}_1, \ldots, \underline{v}_n) \in V$.
A payoff vector $v$ is said to be individually rational iff $v_i \ge \underline{v}_i$ for every $i \in N$.
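The two minmax notions can be compared numerically. The Python sketch below is an illustration under simplifying assumptions, not a general solver: it handles a single opponent, the mixed-strategy version only covers an opponent with two actions, and it searches the opponent's mixing probability on a grid. The function names are hypothetical.

```python
def pure_minmax(payoff, own_actions, other_actions):
    """min over the opponent's pure actions of the player's best-response payoff."""
    return min(max(payoff(a, b) for a in own_actions) for b in other_actions)


def mixed_minmax_two_actions(payoff, own_actions, other_actions, grid=1000):
    """Grid search over the opponent's probability q on other_actions[0]."""
    first, second = other_actions
    best = None
    for k in range(grid + 1):
        q = k / grid
        # Player's best expected payoff against the mixture (q, 1 - q).
        value = max(q * payoff(a, first) + (1 - q) * payoff(a, second)
                    for a in own_actions)
        best = value if best is None else min(best, value)
    return best


# Player 1's payoffs in the Prisoners' Dilemma (12.1): (own action, other's action).
PD_U1 = {("C", "C"): 5, ("C", "D"): 0, ("D", "C"): 6, ("D", "D"): 1}
# Player 1's payoffs in the matching penny game.
MP_U1 = {("Head", "Head"): -1, ("Head", "Tail"): 1,
         ("Tail", "Head"): 1, ("Tail", "Tail"): -1}

pd_pure = pure_minmax(lambda a, b: PD_U1[(a, b)], ["C", "D"], ["C", "D"])
mp_pure = pure_minmax(lambda a, b: MP_U1[(a, b)], ["Head", "Tail"], ["Head", "Tail"])
mp_mixed = mixed_minmax_two_actions(lambda a, b: MP_U1[(a, b)],
                                    ["Head", "Tail"], ["Head", "Tail"])

print(pd_pure, mp_pure, mp_mixed)
```

In the matching penny game the grid contains $q = 1/2$, at which both of the player's actions yield an expected payoff of 0, so the search reproduces the gap between the pure-strategy minmax payoff and the minmax payoff.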
Theorem 12.3 (Folk Theorem) Let $v \in V$ be such that $v_i > \underline{v}_i$ for every player $i$.
Then, there exists $\bar{\delta} \in (0, 1)$ such that for every $\delta > \bar{\delta}$ there exists a subgame-perfect
equilibrium of the repeated game under which the average value of each player $i$ is $v_i$.
Moreover, if $v_i > \underline{v}_i^{\mathrm{pure}}$ for every $i$ above, then the subgame-perfect equilibrium above is in
pure strategies.
The Folk Theorem states that any strictly individually rational and feasible payoff
vector can be supported in subgame perfect Nash equilibrium when the players are
sufficiently patient. Since all equilibrium payoff vectors need to be individually rational
and feasible, the Folk Theorem provides a rough characterization of the equilibrium
payoff vectors when players are patient: the set of all feasible and individually rational
payoff vectors.
I will next illustrate the main idea of the proof for a special case. Assume that, in the
theorem, $v = (u_1(a^*), \ldots, u_n(a^*))$ for some $a^* \in A$ and there exists a Nash equilibrium
$\hat{a}$ of the stage game such that $v_i > u_i(\hat{a})$ for every $i$. In the prisoners’ dilemma example,
$a^*$ = (C, C), yielding $v$ = (5, 5), and $\hat{a}$ = (D, D), yielding payoff vector (1, 1). Recall
that in that case one could obtain $v$ from strategy profile (Grim, Grim), which is a
subgame-perfect Nash equilibrium when $\delta \ge 1/5$. The main idea here is a generalization
of the Grim strategy. Consider the following strategy profile $s^*$ of the repeated game:
$$(1 - \delta)\, u_i(a) + \delta\, u_i(\hat{a})$$
because the players will switch to $\hat{a}$ after any such play. Then, $a^*$ is a Nash equilibrium
of the augmented stage game if and only if
$$v_i \ge (1 - \delta) \max_{a_i} u_i\big(a_i, a_{-i}^*\big) + \delta\, u_i(\hat{a}) \qquad (12.5)$$
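Condition (12.5) is linear in the discount factor, so it can be solved for the smallest discount factor that supports the target payoff. The Python sketch below is an illustration; the function names and argument names are hypothetical, and it assumes that the target payoff strictly exceeds the punishment payoff, as in the special case above.

```python
from fractions import Fraction


def critical_delta(target, best_deviation, punishment):
    """Smallest delta with target >= (1 - delta) * best_deviation + delta * punishment.

    Assumes target > punishment (the punishment equilibrium is strictly worse).
    """
    if best_deviation <= target:
        return Fraction(0)  # no profitable one-shot deviation at any delta
    return Fraction(best_deviation - target, best_deviation - punishment)


def satisfies_12_5(delta, target, best_deviation, punishment):
    """Direct check of inequality (12.5) for one player."""
    return target >= (1 - delta) * best_deviation + delta * punishment


# Prisoners' Dilemma (12.1): target 5 from (C, C), best deviation 6, punishment 1.
pd_threshold = critical_delta(5, 6, 1)
print(pd_threshold)
print(satisfies_12_5(Fraction(1, 5), 5, 6, 1))
print(satisfies_12_5(Fraction(1, 10), 5, 6, 1))
```

For the Prisoners' Dilemma the threshold agrees with the condition found for (Grim, Grim) by the single-deviation test.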
( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) · · · ( ) ( ) ( ) ( ) · · ·
Here, I approximated $v$ by time averaging. When $\delta$ is large, one can obtain each $v$ exactly
by time averaging.⁸

⁸ For mathematically oriented students: imagine writing each weight $p(a) \in [0, 1]$ in base $1/\delta$.

      3, 3     0, 0     0, 0
      0, 0     2, 2     1, 0
      0, 0     0, 1     0, 0

(a) Find a lower bound for the average payoff of each player in all pure strategy
Nash equilibria. Prove indeed that the payoff of a player is at least in
every pure-strategy Nash equilibrium.
Solution: Note that the pure strategy minmax payoff of each player is 1.
Hence, the payoff of a player cannot be less than . Indeed, if a player
mirrors what the other player is supposed to play in any history at which the
other player plays or according to the equilibrium and play if the other
player is supposed to play at the history, then his payoff would be at least
. Since he plays a best response in equilibrium, his payoff is at least that
amount. This lower bound is tight. For = 2 1, consider the strategy
profile
Play ( ) for the first periods and ( ) for the last periods; if any player
deviates from this path, play ( ) forever.
Note that the payoff from this strategy profile is . To check that this is
a Nash equilibrium, note that the best possible deviation is to play play
forever, which yields , giving no incentive to deviate. Note also that the
equilibrium here is not subgame-perfect.
if is even. Note that the total payoff of each player from this path is + 1.
Consider the following strategy profile.
Play according to the above path; if any player deviates from this path at
any ≤ 2 − 1, switch to ∗ [ − − 1] for the remaining ( − − 1)-times
repeated game; if any player deviates from this path at any 2, remain
on the path.
This is a subgame-perfect Nash equilibrium. There are three classes of histo-
ries to check. First, consider a history in which some player deviated from the
path at some 0 ≤ 2. In that case, the strategy profile already prescribes
to follow the subgame-perfect Nash equilibrium ∗ [ − 0 − 1] of the subgame
that starts from 0 + 1, which remains subgame perfect at the current sub-
game as well. Second, consider a history in which no player has deviated from
the path at any 0 ≤ 2 and take 2. In the continuation game, the
above strategy profile prescribes: play ( ) every day if is odd and play
( ) every day but the last day and play ( ) on the last day if is even.
Since ( ) and ( ) are Nash equilibria of the stage game, this is clearly a
subgame-perfect equilibrium of the remaining game. Finally, take ≤ 2
and consider any on-the path history. Now, a player’s payoff is + 1 if he
follows the strategy profile. If he deviates at , he gets at most 1 at and
( − − 1) + 1 ≤ from the next period on, where ( − − 1) + 1 is his
payoff from ∗ [ − − 1]. His total payoff cannot exceed + 1, and he has
no incentive to deviate.
2. Consider the infinitely repeated prisoners’ dilemma game of (12.1) with discount
factor $\delta = 0.999$.
(a) Find a subgame-perfect Nash equilibrium in pure strategies under which the
average payoff of each player is in between 1.1 and 1.2. Verify that your
strategy profile is indeed a subgame-perfect Nash equilibrium.
Solution: Take any $\hat{t}$ with $(1 - \delta^{\hat{t}}) + 5\delta^{\hat{t}} = 1 + 4\delta^{\hat{t}} \in (1.1, 1.2)$, e.g., any $\hat{t}$
between 2994 and 3687. Consider the strategy profile
      Play (D, D) at any $t < \hat{t}$ and (C, C) at $\hat{t}$ and thereafter. If any player deviates
      from this path, play (D, D) forever.
Note that the average value of each player is $(1 - \delta^{\hat{t}}) + 5\delta^{\hat{t}} \in (1.1, 1.2)$. To
check that it is a subgame-perfect Nash equilibrium, first take any on-path
history with date $t \ge \hat{t}$. At that history, the average value of each player is
5. If a player deviates, then his average value is only $6(1 - \delta) + \delta = 1.005$.
Hence, he has no incentive to deviate. For $t < \hat{t}$, the average value is
$$(1 - \delta^{\hat{t} - t}) + 5\delta^{\hat{t} - t} \ge (1 - \delta^{\hat{t}}) + 5\delta^{\hat{t}} > 1.1.$$

$$5\delta^{\hat{t} - t} \ge 5\delta^{\hat{t}} \ge 0.999^{1608} \cdot 5 \cong 1.0006$$
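The range of switching dates in part (a) can be recovered by brute force. The Python sketch below is an illustration with hypothetical names: it evaluates the average value $1 + 4\delta^{\hat{t}}$ at $\delta = 0.999$ for each integer switching date and collects the dates for which it lies strictly between 1.1 and 1.2. It also evaluates the average value from a one-shot deviation once cooperation has started.

```python
DELTA = 0.999


def average_value(t_hat, delta=DELTA):
    """Average value of (D, D) before t_hat and (C, C) from t_hat on."""
    return (1 - delta ** t_hat) + 5 * delta ** t_hat


# Integer switching dates whose average value lies strictly in (1.1, 1.2).
admissible = [t for t in range(0, 10000) if 1.1 < average_value(t) < 1.2]
first_date, last_date = admissible[0], admissible[-1]

# Average value from deviating at a date t >= t_hat: 6 today, then 1 forever.
deviation_value = 6 * (1 - DELTA) + DELTA

print(first_date, last_date, deviation_value)
```

The admissible integer dates run from 2995 to 3687, consistent with the range quoted in the solution, and the deviation value is far below the on-path value of 5.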
3. [Midterm 2, 2006] Two firms, 1 and 2, play the following infinitely repeated game
in which all the previous plays are observed, and each player tries to maximize
the discounted sum of his or her profits at the stage games where the discount
factor is $\delta = 0.99$. At each date $t$, simultaneously, each firm $i$ selects a price
$p_i \in \{0.01, 0.02, \ldots, 0.99, 1\}$. If $p_1 = p_2$, then each firm sells 1 unit of the good;
otherwise, the cheaper firm sells 2 units and the more expensive firm sells 0 units.
Producing the good does not cost anything to firms. Find a subgame-perfect
equilibrium in which the average value of Firm 1 is at least 1.4. (Check that the
strategy profile you construct is indeed a subgame-perfect equilibrium.)
Solution: (There are several such strategy profiles; I will show one of them.) In
order for the average value to exceed 1.4, the present value must exceed 140. We
can get average value of approximately 1.5 for player 1 by alternating between
(099 1), which yields (198 0), and (1 1), which yields (1 1). The average value
of that payoff stream for player 1 is
198 + ∼
= 149
1+
Here is a SPE with such equilibrium play: At even dates play (099 1) and at odd
226 CHAPTER 12. REPEATED GAMES
dates play (1 1); if any player ever deviates from this scheme, then play (001 001)
forever.
We use the single-deviation principle to check that this is a SPE. First note that
in "deviation" mode, they play a Nash equilibrium of the stage game forever, and
it passes the single-deviation test. Now, consider an even $t$ and a history where
there has not been any deviation. Player 1 has no incentive to deviate: if he follows
the strategy, he will get the payoff stream 1.98, 1, 1.98, 1, 1.98, . . . ; if he deviates,
he will get $x$, 0.01, 0.01, . . . where $x \le 1.96$ ($x = 1$ for upward deviation). For
player 2: if he plays according to the strategy, he will get the payoff stream of 0,
1, 0, 1, 0, 1, . . . with present value of
\[ \frac{\delta}{1 - \delta^2} \cong 49.75. \]
If he deviates, he will get $x$, 0.01, 0.01, . . . where $x \le 1.96$. (The best deviation is
$p_2 = 0.98$.) This yields present value of at most
\[ 1.96 + 0.01 \, \frac{\delta}{1 - \delta} = 2.95, \]
which is less than 49.75.
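The single-deviation checks for the alternating scheme can also be run numerically. The sketch below is illustrative rather than part of the original solution: prices are stored in cents to avoid floating-point comparisons, and all function names are hypothetical. For each firm and each date parity it compares the on-path present value with the best one-shot deviation followed by (0.01, 0.01) forever.

```python
DELTA = 0.99
PRICES = range(1, 101)  # prices in cents: 0.01, 0.02, ..., 1.00


def profit(own, other):
    """Stage profit in dollars for a firm charging `own` cents against `other` cents."""
    if own == other:
        return own / 100                      # each firm sells 1 unit
    return 2 * own / 100 if own < other else 0.0


def alternating_value(first, second, delta=DELTA):
    """Present value of the payoff stream first, second, first, second, ..."""
    return (first + delta * second) / (1 - delta ** 2)


# Present value, seen from the deviation date, of earning 0.01 in every later period.
PUNISHMENT = DELTA * 0.01 / (1 - DELTA)


def best_deviation_value(own, other):
    """Best one-shot deviation payoff today plus the punishment path afterwards."""
    return max(profit(p, other) for p in PRICES if p != own) + PUNISHMENT


def single_deviation_report():
    """(on-path value, best deviation value) for each firm at even and odd dates."""
    high, mid = profit(99, 100), profit(100, 100)   # 1.98 and 1.00
    return {
        ("firm 1", "even"): (alternating_value(high, mid), best_deviation_value(99, 100)),
        ("firm 2", "even"): (alternating_value(0.0, mid), best_deviation_value(100, 99)),
        ("firm 1", "odd"): (alternating_value(mid, high), best_deviation_value(100, 100)),
        ("firm 2", "odd"): (alternating_value(mid, 0.0), best_deviation_value(100, 100)),
    }


report = single_deviation_report()
for key, (on_path, deviation) in report.items():
    print(key, round(on_path, 2), round(deviation, 2))
print("average value of firm 1:", round((1 - DELTA) * report[("firm 1", "even")][0], 4))
```

In every case the on-path value should be far above the deviation value, which reflects how patient the firms are at $\delta = 0.99$.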
4. [Midterm 2, 2011] Alice and Bob are a couple, playing the infinitely repeated game
with the following stage game and discount factor $\delta$. Every day, simultaneously,
Alice and Bob spend $x_A \in [0, 1]$ and $x_B \in [0, 1]$ fraction of their time in their
relationship, respectively, receiving the stage payoffs $u_A = \ln(x_A + x_B) + 1 - x_A$
and $u_B = \ln(x_A + x_B) + 1 - x_B$, respectively. (Alice and Bob are denoted by $A$
and $B$, respectively.) For each of the strategy profiles below, find the conditions
on the parameters for which the strategy profile is a subgame-perfect equilibrium.
(a) Both players spend all of their time in their relationship (i.e. $x_A = x_B = 1$)
until somebody deviates; the deviating player spends 1 and the other player
spends 0 thereafter. (Find the range of $\delta$.)
Solution: Since (1 0) and (0 1) are Nash equilibria of the stage game, there
is no incentive to deviate at any history with previous deviation by one player.
Now consider any other history, in which they both are supposed to spend 1.
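If a player follows the strategy, his average payoff is
\[ \ln 2. \]
If he deviates and spends $x < 1$ instead, his average payoff is
\[ (\ln(1 + x) + 1 - x)(1 - \delta), \]
which is maximized at $x = 0$, yielding $1 - \delta$. Hence, there is no incentive to deviate iff
\[ \ln 2 \ge 1 - \delta, \]
where the values on the left- and right-hand sides of the inequality are the average
values from following the strategy profile and from the best deviation, respectively.
One can write this as a lower bound on the discount factor:
\[ \delta \ge 1 - \ln 2. \]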
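As a quick numerical illustration of part (a), the sketch below compares $\ln 2$ with the best deviation payoff over a grid of deviations. The function names and the grid search are illustrative assumptions, not part of the original solution; the deviation payoff uses the fact that the deviator gets 0 in every later period.

```python
import math


def deviation_value(x, delta):
    """Average payoff from spending x today (partner spends 1) and getting 0 afterwards."""
    return (1 - delta) * (math.log(1 + x) + 1 - x)


def passes_part_a(delta, grid=1000):
    """Compare ln 2 with the best deviation over a grid of x in [0, 1)."""
    best_deviation = max(deviation_value(k / grid, delta) for k in range(grid))
    return math.log(2) >= best_deviation


print("threshold 1 - ln 2 =", 1 - math.log(2))
print(passes_part_a(0.31), passes_part_a(0.30))
```

The threshold is roughly 0.307, so a discount factor of 0.31 should pass the check while 0.30 should not.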
forever. (Find the set of inequalities that must be satisfied by the parameters
$\ln(e) = 1$; $\ln(x) \le x - 1$; $\ln(xy) = \ln x + \ln y$
Solution: Since $(\tilde x_i, 1 - \tilde x_i)$ is a Nash equilibrium of the stage game, there
is no incentive to deviate at state $D_i$ (the divorce state in which $i$ is the guilty
party) for any $i \in \{A, B\}$. In state $M$ (marriage), the
average payoff from following the strategy profile is $\ln 2$. If a player $i$ deviates
at state $M$, the next state is $D_i$ (as in part (a)), which gives the average payoff
of $1 - \tilde x_i$ to $i$. Hence, as in part (a), the average payoff from the best deviation
is $1 - \delta + \delta(1 - \tilde x_i) = 1 - \delta \tilde x_i$. Therefore, there is no incentive to deviate at
state $M$ iff $\ln 2 \ge 1 - \delta \tilde x_i$, i.e.
\[ \delta \tilde x_i \ge 1 - \ln 2. \tag{12.6} \]
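On the other hand, in state $E$ (engagement), the average payoff from following the strategy
is
\[ V_E = (1 - \delta)(\ln(2\hat x) + 1 - \hat x) + \delta \ln 2. \]
The best deviation is to spend $1 - \hat x$, which yields $\hat x$ in the current period and
leaves the state at $E$, for an average payoff of $(1 - \delta)\hat x + \delta V_E$. Hence, there is no
incentive to deviate iff
\[ V_E \ge (1 - \delta)\hat x + \delta V_E, \]
which simplifies to
\[ V_E \ge \hat x. \]
By substituting the value of $V_E$, one can write this condition as
\[ \ln 2 + (1 - \delta)(\ln \hat x + 1 - \hat x) \ge \hat x. \tag{12.7} \]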
Remark 12.1 One can make the strategy profile above a subgame-perfect Nash
equilibrium by varying the parameters $\hat x$, $\tilde x_A$, $\tilde x_B$, and $\delta$. For a fixed
$(\hat x, \tilde x_A, \tilde x_B)$, both conditions bound the discount factor from below, yielding
\[ \delta \ge \max\left\{ \frac{1 - \ln 2}{\tilde x_A}, \; \frac{1 - \ln 2}{\tilde x_B}, \; 1 - \frac{\hat x - \ln 2}{\ln \hat x + 1 - \hat x} \right\}. \]
(To see this, observe that $\ln \hat x + 1 - \hat x < 0$.) Of course, when $\delta$ is fixed, the
above conditions can also be interpreted as bounds on $\tilde x_i$ and $\hat x$. First, the
contribution of the guilty party in the divorce state cannot be too low:
\[ \tilde x_i \ge \frac{1 - \ln 2}{\delta}. \]
For otherwise, the parties deviate and marriage cannot be sustained. Second,
the above lower bound on $\delta$ also gives an absolute upper bound on the effort
level during the engagement. Since $\delta < 1$ and $\ln \hat x + 1 - \hat x < 0$, the condition
on $\delta$ implies that
\[ \hat x < \ln 2 \cong 0.693. \]
For otherwise, the lower bound on $\delta$ would exceed 1. That is, one must start
small, as engagement may never turn into marriage otherwise. Of course,
one could also skip the engagement altogether.
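The bounds in the remark can be explored numerically. The following sketch is an illustration only: the parameter names `x_hat`, `x_tilde_a`, and `x_tilde_b` stand for the engagement effort and the two divorce-state contributions and are naming assumptions, as are the function names. It computes the lower bound on the discount factor and checks conditions (12.6) and (12.7) directly.

```python
import math

LN2 = math.log(2)


def delta_lower_bound(x_hat, x_tilde_a, x_tilde_b):
    """Smallest discount factor satisfying (12.6) for both players and (12.7)."""
    g = math.log(x_hat) + 1 - x_hat          # negative for x_hat in (0, 1)
    return max((1 - LN2) / x_tilde_a,
               (1 - LN2) / x_tilde_b,
               1 - (x_hat - LN2) / g)


def conditions_hold(delta, x_hat, x_tilde_a, x_tilde_b):
    """Check (12.6) for both players and (12.7) at the given parameters."""
    marriage = all(delta * x >= 1 - LN2 for x in (x_tilde_a, x_tilde_b))
    engagement = LN2 + (1 - delta) * (math.log(x_hat) + 1 - x_hat) >= x_hat
    return marriage and engagement


print(delta_lower_bound(0.5, 0.5, 0.5))   # binding constraint comes from (12.6)
print(delta_lower_bound(0.7, 1.0, 1.0))   # x_hat above ln 2: bound exceeds 1
```

Choosing an engagement effort above $\ln 2$ pushes the bound above 1, which matches the observation that one must start small.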
5. [Final, 2001] This question is about a milkman and a customer. At any day, with
the given order,
(a) Assume that this is repeated for 100 days, and each player tries to maximize
the sum of his or her stage payoffs. Find all subgame-perfect equilibria of this
game.
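\[ V = p - c. \]
The best deviation for him (at any history on the path of equilibrium play)
is to choose $m = 0$ (and not being able to sell thereafter). In that case, his
average value is
\[ V_d = p(1 - \delta) + \delta \cdot 0 = p(1 - \delta). \]
Hence, he has no incentive to deviate iff
\[ p - c \ge p(1 - \delta), \]
i.e.,
\[ p \ge c/\delta. \]
In order for the customer to buy on the equilibrium path, it must also be true
that $p \le v$. Therefore,
\[ v \ge p \ge c/\delta. \]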
6. [Midterm 2 Make up, 2006] Since the British officer had a thick pen when he drew
the border, the border of Iraq and Kuwait is disputed. Unfortunately, the border
passes through an important oil field. In each year, simultaneously, each of these
countries decides whether to extract high ($H$) or low ($L$) amount of oil from this
field. Extracting high amount of oil from the common field hurts the other country.
In addition, Iraq has the option of attacking Kuwait ($W$), which is costly for both
countries. The stage game is as follows (Iraq chooses the row, Kuwait the column; Iraq's payoff is listed first):
         H          L
H      2, 2       4, 1
L      1, 4       3, 3
W     −1, −1     −1, −2
Consider the infinitely repeated game with this stage game and with discount
factor $\delta = 0.9$.
(a) Find a subgame perfect Nash equilibrium in which each country extracts low
($L$) amount of oil every year on the equilibrium path.⁹
Solution: Consider the strategy profile
Play $(L, L)$ until somebody deviates and play $(H, H)$ thereafter.
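A minimal numerical check of this trigger profile, offered as a sketch: it assumes the payoffs from the stage game above (3 from mutual low extraction, 4 from extracting high against low, 2 in the punishment phase), and the constant and function names are hypothetical. On the path, a country compares the cooperative payoff with a one-shot gain followed by punishment forever.

```python
DELTA = 0.9
COOPERATE = 3    # stage payoff when both countries extract low
TEMPTATION = 4   # stage payoff from extracting high against low
PUNISHMENT = 2   # stage payoff when both countries extract high


def trigger_profile_holds(delta=DELTA):
    """On-path check: cooperating forever versus deviating once and being punished."""
    deviation = (1 - delta) * TEMPTATION + delta * PUNISHMENT
    return COOPERATE >= deviation


print(trigger_profile_holds())      # at delta = 0.9
print(trigger_profile_holds(0.4))   # an impatient case
```

The punishment phase needs no separate check in this sketch because mutual high extraction is a Nash equilibrium of the stage game.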
(b) Find a subgame perfect Nash equilibrium in which Iraq extracts high ($H$)
amount of oil and Kuwait extracts low ($L$) amount of oil every year on the
equilibrium path.
Solution: Consider the following ("Carrot and Stick") strategy profile¹⁰
There are two states: War and Peace. The game starts at state Peace. In
state Peace, they play $(H, L)$; they remain in Peace if $(H, L)$ is played and
switch to War otherwise. In state War, they play $(W, H)$; they switch to
Peace if $(W, H)$ is played and remain in War otherwise.
The vector of average values is $(4, 1)$ in state Peace and $(-1, -1)(1 - \delta) +
\delta(4, 1) = (5\delta - 1, 2\delta - 1)$ in War. Note that both countries strictly prefer
⁹ That is, an outside observer would observe that each country extracts low amount of oil every year.
¹⁰ See the next chapter for more on Carrot and Stick strategies.
\[ 2(1 - \delta) + \delta[2\delta - 1] \le 1, \]
i.e., $\delta \ge 1/2$, which is indeed the case. In state War, Kuwait clearly has no
incentive to deviate. In that state, Iraq could possibly benefit from deviating
to $H$, getting $2(1 - \delta) + \delta(5\delta - 1)$. It does not have an incentive to deviate
if
\[ 5\delta - 1 \ge 2(1 - \delta) + \delta(5\delta - 1), \]
i.e.,
\[ 5\delta - 1 \ge 2, \]
which holds for $\delta = 0.9$.
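The single-deviation test for a two-state profile of this kind can be automated. The sketch below is an illustration under explicit assumptions: the action labels H, L, W are assumed names for high extraction, low extraction, and attack; the transition rule (Peace after the prescribed play, War after anything else) is assumed for both states; and all identifiers are hypothetical.

```python
DELTA = 0.9

# Stage payoffs (Iraq, Kuwait). Labels H, L, W are assumed names for
# high extraction, low extraction and attack.
PAYOFF = {
    ("H", "H"): (2, 2), ("H", "L"): (4, 1),
    ("L", "H"): (1, 4), ("L", "L"): (3, 3),
    ("W", "H"): (-1, -1), ("W", "L"): (-1, -2),
}
ACTIONS = (("H", "L", "W"), ("H", "L"))          # Iraq's and Kuwait's actions
PLAY = {"Peace": ("H", "L"), "War": ("W", "H")}  # prescribed play in each state


def next_state(state, profile):
    """Assumed transition rule: Peace after prescribed play, War otherwise."""
    return "Peace" if profile == PLAY[state] else "War"


def state_values(delta=DELTA):
    """Average values (Iraq, Kuwait) from following the profile in each state."""
    peace = PAYOFF[PLAY["Peace"]]
    war = tuple((1 - delta) * w + delta * p
                for w, p in zip(PAYOFF[PLAY["War"]], peace))
    return {"Peace": peace, "War": war}


def passes_single_deviation_test(delta=DELTA):
    """True if no player gains from a one-shot deviation in either state."""
    values = state_values(delta)
    for state, prescribed in PLAY.items():
        for player in (0, 1):
            for action in ACTIONS[player]:
                profile = list(prescribed)
                profile[player] = action
                profile = tuple(profile)
                value = ((1 - delta) * PAYOFF[profile][player]
                         + delta * values[next_state(state, profile)][player])
                if value > values[state][player] + 1e-12:
                    return False
    return True


print(state_values())
print(passes_single_deviation_test())
```

Enumerating every action in every state is more than the hand argument needs, but it guards against overlooking a deviation such as Iraq attacking during Peace.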
7. [Selected from Midterms 2 in years 2001 and 2002] Below, there are pairs of stage
games and strategy profiles. For each pair, check whether the strategy profile is a
subgame-perfect Nash equilibrium of the infinitely repeated game with the given
stage game and discount factor $\delta = 0.99$.
Strategy profile: Until some player deviates, Player 1 plays and Player
the players play a Nash equilibrium forever. Hence, we only need to check
that no player has any incentive to deviate on the path of equilibrium. Player
current period and gets zero thereafter. If he sticks to his equilibrium strategy,
\[ V_1 = 2 - \delta = 1.01 \]
\[ V_1 = 2\delta - 1 = 0.98 \]
12.5 Exercises
1. How many strategies are there in the twice-repeated prisoners' dilemma game?
2. Suppose that the stage game is a two-player game in which each player has $m$
strategies. How many strategies does each player have in an $n$-times repeated game?
4. Show that in any Nash equilibrium ∗ of the repeated game, the average payoff of
player is at least .
5. [Homework 4, 2011] Consider the infinitely repeated game with discount factor
$\delta = 0.99$ and the following stage game (in which the players are trading favors):
          Give       Keep
Give      1, 1      −1, 2
Keep      2, −1      0, 0
(a) Find a subgame perfect equilibrium under which the average expected payoff
of Player 1 is at least 1.33. Verify that your strategy profile is indeed a
subgame-perfect Nash equilibrium.
(b) Find a subgame-perfect equilibrium under which the average expected payoff
of Player 1 is at least 1.49. Verify that your strategy profile is indeed a
subgame-perfect Nash equilibrium.
6. [Midterm 2, 2011] Consider the 100-times repeated game with the following stage
game:
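[Figure: game tree of the stage game. Player 1 first chooses between I and X; X ends the game with payoffs (2, 0). After I, Player 1 chooses between a and b, and Player 2 chooses between L and R. Payoffs (Player 1, Player 2): (a, L) → (5, 1); (a, R) → (0, x); (b, L) → (x, 0); (b, R) → (1, 6).]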
where $x$ is either 0 or 6.
(a) Find the set of pure-strategy subgame-perfect equilibria of the stage game
for each $x \in \{0, 6\}$.
(b) Take $x = 6$. What is the highest payoff Player 2 can get in a subgame-perfect
equilibrium of the repeated game?
(c) Take $x = 0$. Find a subgame-perfect equilibrium of the repeated game in
which Player 2 gets more than 300 (i.e. more than 3 per day on average).
7. [Midterm 2, 2011] Consider an infinitely repeated game in which the stage game is
as in the previous problem. Take the discount factor $\delta = 0.99$ and $x = 6$. For each
strategy profile below, check whether it is a subgame-perfect Nash equilibrium.
(a) They play ( ) everyday until somebody deviates; they play ( ) there-
after.
(b) There are three states: , 1, and 2, where the play is ( ), ( ), and
( ), respectively. The game starts at state . After state , it switches
to state 1 if the play is ( ) and to state 2 if the play is ( ); it stays
8. [Midterm 2 Make Up, 2011] Consider an infinitely repeated game in which the
discount factor is $\delta = 0.9$ and the stage game is
 4, 4      0, 5      0, 0
 5, 0      3, 3     −1, 0
 2, 2      1, 1     −2, 0
 0, 0      0, −1    −3, −2
For each payoff vector below, find a subgame perfect equilibrium of the
repeated game in which the average discounted payoff is that vector. Verify that the
strategy profile you identified is indeed a subgame perfect equilibrium.
9. [Midterm 2 Make Up, 2011] Consider the infinitely repeated game with the stage
game in the previous problem and the discount factor $\delta \in (0, 1)$. For each of the
strategy profiles below, find the conditions on the discount factor for which the
strategy profile is a subgame-perfect equilibrium.
(a) At $t = 0$, they play ( ). At each $t$, they play ( ) if the play at $t - 1$ is
( ) or if the play at $t - 2$ is not ( ). Otherwise, they play ( ).
(b) There are 4 states: ( ), ( ), ( ), and ( ). At each state (1 2 ), the
play is (1 2 ). The game starts at state ( ). For any with (1 2 ), the
state at + 1 is
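10. [Homework 4, 2011] Consider the $n$-times repeated game with the following stage
game.
[Figure: game tree of the stage game with players A, B, and C. A chooses between X and I; X yields the payoff vector (1, 0, 0). After I, B and C each choose between L and R.]
(a) For $n = 2$, what is the largest payoff A can get in a subgame-perfect Nash
equilibrium in pure strategies?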
11. [Homework 4, 2011] Consider the infinitely repeated game with discount factor
$\delta \in (0, 1)$ and the stage game in the previous problem. For each of the strategy
profiles below, find the range of $\delta$ under which the strategy profile is a
subgame-perfect Nash equilibrium.
(a) A always plays . B and C both play until somebody deviates and play
thereafter.
(b) A plays I and B and C rotate between ( ), ( ), and ( ) until somebody
deviates; they play ( ) thereafter.
(Note that the outcome is ( ) ( ) ( ) ( ) ( ) .)
12. [Homework 4, 2007] Seagulls love shellfish. In order to break the shell, they need
to fly high up and drop the shellfish. The problem is the other seagulls on the
beach are kleptoparasites, and they steal the shellfish if they can reach it first. This
question tells the story of two seagulls, named Irene and Jonathan, who live in a
crowded beach where it is impossible to drop the shellfish and get it before some
other gull steals it. The possible dates are $t = 0, 1, 2, 3, \ldots$ with no upper bound.
Every day, simultaneously, Irene and Jonathan choose one of the two actions: "Up"
or "Down". Up means to fly high up with the shellfish and drop it next to the
other seagull's nest, and Down means to stay down in the nest. Up costs $c > 0$,
but if the other seagull is down, it eats the shellfish, getting payoff $v$. That is,
we consider the infinitely repeated game with the following stage game
          Up          Down
Up      −c, −c      −c, v
Down     v, −c       0, 0
and discount factor $\delta \in (0, 1)$.¹¹ For each strategy profile below, find the set of
discount factors under which the strategy profile is a subgame-perfect equilibrium.
(a) Irrespective of the history, Irene plays Up in the even dates and Down in the
odd dates; Jonathan plays Up in the odd dates and Down in the even dates.
(b) Irene plays Up in the even dates and Down in the odd dates while Jonathan
plays the other way around until someone fails to go Up in a day that he is
supposed to do so. They both stay Down thereafter.
(c) For $n$ days Irene goes Up and Jonathan stays Down; in the next $n$ days
Jonathan goes Up and Irene stays Down. This continues back and forth until
someone deviates. They both stay Down thereafter.
(d) Irene goes Up on "Sundays", i.e., at $t = 0, 7, 14, 21, \ldots$, and stays Down on
the other days, while Jonathan goes Up every day except for Sundays, when
he rests Down, until someone deviates; they both stay Down thereafter.
(e) At $t = 0$, Irene goes Up and Jonathan stays Down, and then they alternate.
If a seagull $i$ fails to go Up at a history when $i$ is supposed to go Up, then
the next day $i$ goes Up and the other seagull stays Down, and they keep
alternating thereafter until someone fails to go Up when it is supposed to do
so. (For example, given the history, if Irene is supposed to go Up at $t$ but
¹¹ Evolutionarily speaking, the discounted sum is the fitness of the genes, which determine the behavior.
13. [Homework 4, 2007] Consider the infinitely repeated game, between Alice and Bob,
with the following stage game:
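[Figure: game tree of the stage game. Alice chooses between Hire and Fire; Fire yields payoffs (0, 0). After Hire, Bob chooses between Work and Shirk. Payoffs (Alice, Bob): Work → (2, 2); Shirk → (−1, 3).]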
The discount factor is $\delta = 0.9$. (Fire does not mean that the game ends.) For each
strategy profile below, check if it is a subgame-perfect equilibrium. If it is not a
SPE for $\delta = 0.9$, find the set of discount factors under which it is a SPE.
(a) Alice Hires if and only if there is no Shirk in the history. Bob Works if and
only if there is no Shirk in the history.
(b) Alice Hires unless Bob (was hired and) Shirked in the previous period, in
which case she Fires. Bob always Works.
(c) There are three states: Employment, Punishment for Alice, and Punishment
for Bob. In the Employment state, Alice Hires and Bob Works. In the
Punishment state for Alice, Alice Hires but Bob Shirks. In the Punishment
state for Bob, Alice Fires, and Bob would have worked if Alice Hired him. The
game starts in Employment state. At any state, if only one player fails to play
what s/he is supposed to play at that state, then we go to the Punishment
state for that player in the next period; otherwise we go to the Employment
state in the next period.
14. [Midterm 2, 2007] Consider the infinitely repeated game with the following stage
game
             Chicken     Lion
Chicken       3, 3       1, 4
Lion          4, 1       0, 0
and discount factor $\delta = 0.99$. For each strategy profile below check if it is a
subgame-perfect equilibrium. (You need to state your arguments clearly; you will
not get any points for Yes or No answers.)
(a) There are two states: Cooperation and Fight. The game starts in the Cooper-
ation state. In Cooperation state, each player plays Chicken. If both players
play Chicken, then they remain in the Cooperation state; otherwise they go
to the Fight state in the next period. In the Fight state, both play Lion, and
they go back to the Cooperation state in the following period (regardless of
the actions).
(b) There are three states: Cooperation, P1 and P2. The game starts in the Co-
operation state. In the Cooperation state, each player plays Chicken. If they
play (Chicken, Chicken) or (Lion, Lion), then they remain in the Cooperation
state in the next period. If player $i$ plays Lion while the other player plays
Chicken, then in the next period they go to P$i$ state. In P$i$ state player $i$ plays
Chicken while the other player plays Lion; they then go back to Cooperation
state (regardless of the actions).
15. [Midterm 2 Make Up, 2007] Alice has two sons, Bob and Colin. Every day, she is to
choose between letting them play with the toys ("Play") or making them visit their
grandmother ("Visit"). If she makes them visit their grandmother, each of them
gets 1. If she lets them play, then Bob and Colin simultaneously choose between
Grab and Share, which leads to the payoffs as in the following table, where the
third entry is the payoff of Alice:
Consider the infinitely repeated game in which the above game is the stage game and
the discount factor is $\delta = 0.9$. For each strategy profile below check if it is a
subgame-perfect equilibrium. Show your work.
(a) There are three states: Share, and . In Share state Alice lets
them play and Bob and Colin both share. In state (resp.
state), Alice lets them play, and Bob (resp. Colin) shares while the other
brother grabs. The game starts in Share state. If Bob (resp. Colin) does
not play what he is supposed to play while the other player plays what he is
supposed to play, then the next day we go to (resp. ) state; we
go to Share state next day otherwise.
(b) There are two states: Play and Visit. The game starts in the Play state. In
the Play state, Alice lets them play, and both sons share. In Play state, if
everybody does what they are supposed to do, we remain in Play state; we
go to Visit state next day otherwise. In the Visit state, Alice makes them
visit their grandmother, and they would both Grab if she let them play. In
the Visit state, they automatically go back to Play state next day.
16. [Homework 4, 2006] Alice has a restaurant, and Bob is a potential customer. Each
day Alice is to decide whether to use high quality supply (High) or low quality
supply (Low) to make the food, and Bob is to decide whether to buy or not at
price $p \in [1, 3]$. (At the time Bob buys the food, he cannot tell if it is of high
quality, but after buying he knows whether it was high or low quality.) The payoffs
for a given day is as follows.
The discount rate is $\delta = 0.99$. For each of the following strategy profiles, find the
range of $p \in [1, 3]$ for which the strategy profile is a subgame-perfect equilibrium.
(a) There are two states: Trade and No-trade. The game starts at Trade state.
In Trade state, Alice uses High quality supply, and Bob Buys. If in the Trade
state Alice uses Low quality supply, then they go to the No-Trade state, in
which for $n$ days Alice uses Low quality supply and Bob Skips. At the end of
the $n$th day, independent of what happens, they go back to the Trade state.
(b) Alice is to use High quality supply in the even days, $t = 0, 2, 4, \ldots$, and Low
quality supply in the odd days, $t = 1, 3, 5, \ldots$; Bob is to Buy every day. If
anyone deviates from this program, then in the rest of the game Alice uses
Low quality and Bob Skips.¹²
17. [Homework 4, 2006] In question 1, take $p = 2$, and check whether each of the
following is a subgame-perfect equilibrium. [We assume here that Bob somehow
can check whether the food was good in the previous day even if he did not buy it.]
(a) Everyday Alice uses High quality supply. Bob buys the product in the first
day. Afterwards, Bob buys the product if and only if Alice has used High
quality supply in the previous day.
(b) There are two states: Trade and Punishment. The game starts at Trade state.
In Trade state, Alice uses High quality supply, and Bob Buys. In Trade state
if Alice uses Low quality, then we go to Punishment state. In Punishment
state, Alice uses High quality supply, and Bob Skips. In Punishment state, if
Alice uses Low quality supply or Bob Buys, then we remain in the Punishment
state; otherwise we go to Trade state.
18. [Homework 4, 2006] In an eating club, there are $n > 2$ members. Each day, each
member $i$ is to decide how much to eat, denoted by $x_i$, and the payoff of $i$ for that
day is
\[ \sqrt{x_i} - \frac{x_1 + \cdots + x_n}{n}. \]
For $\delta = 0.99$, check if either of the following strategy profiles is a subgame-perfect
equilibrium. [If you solve the problem for $n = 3$, you will get 80%.]
(a) Each player eats $x = 1/4$ units until somebody eats more than 1/4; thereafter
each eats $x = n^2/4$ units.
¹² That is, at any $t_0$, Alice will use Low quality supply and Bob will Skip in any of the following
cases: (i) Alice used Low quality supply at an even date $t < t_0$, or (ii) she used High quality supply at
an odd date $t < t_0$, or (iii) Bob Skipped at some date $t < t_0$.
(b) Each player eats $x = 1/4$ units until somebody eats more than 1/4; thereafter
each eats $x = n^2$ units.
19. [Homework 4, 2006] Each day Alice and Bob receive 1 dollar. Alice makes an offer
$x$ to Bob, and Bob accepts or rejects the offer, where $x \in \{0.01, 0.02, \ldots, 0.98, 0.99\}$.
If Bob accepts the offer Alice gets $1 - x$ and Bob gets $x$. If Bob rejects the offer, then
they both get 0. Find the values of $\delta$ for which the following is a subgame-perfect
equilibrium, where $\bar x \in \{0.01, 0.02, \ldots, 0.98, 0.99\}$ is fixed.
At $t = 0$, Alice offers $\bar x$ and Bob accepts Alice's offer, $x$, if and only if $x \ge \bar x$. They
keep doing this until Bob deviates from this program (i.e. until Bob accepts an
offer $x < \bar x$, or Bob rejects an offer $x \ge \bar x$). Thereafter, Alice offers $x = 0.01$ and
Bob accepts any offer.
20. [Homework 3, 2004] Consider a Firm and a Worker. The firm first decides whether
to pay a wage $w > 0$ to the worker (hire him), and then the worker is to decide
whether to work, which costs him $c > 0$ and produces $\pi$ to the firm where $\pi > w > c$.
The payoffs are as follows:
                     Firm       Worker
pay, work            π − w      w − c
pay, shirk           −w         w
don’t pay, work      π          −c
don’t pay, shirk     0          0
(b) Now consider the game in which this stage game is repeated infinitely many times and
the players discount the future with $\delta$. The following are strategy profiles
for this repeated game. For each of them, check if it is a subgame-perfect
Nash equilibrium for large values of $\delta$, and if so, find the lowest discount rate
that makes the strategy profile a subgame-perfect equilibrium.
i. No matter what happens, the firm always pays and the worker works.
ii. At any time $t$, the worker works if he is paid at $t$, and the firm always
pays.
iii. At $t = 0$, the firm pays and the worker works. At any time $t > 0$, the
firm pays if and only if the worker worked at all previous dates, and the
worker works if and only if he has worked at all previous dates.
iv. At $t = 0$, the firm pays and the worker works. At any time $t > 0$, the
firm pays if and only if the worker worked at all previous dates at which
the firm paid, and the worker works if and only if he is paid at $t$ and he
has worked at all previous dates at which he was paid.
v. There are two states: Employment, and Unemployment. The game starts
at Employment. In this state, the firm pays, and the worker works if and
only if he has been paid at this date. If the worker shirks we go to
Unemployment state; otherwise we stay in Employment. In Unemployment
the firm does not pay and the worker shirks. After $T > 0$ days of
Unemployment we always go back to Employment. (Your answer should cover
each $T > 0$.)
21. Stage Game: Alice and Bob simultaneously choose contributions $a \in [0, 1]$ and
$b \in [0, 1]$, respectively, and get payoffs $u_A = 2b - a$ and $u_B = 2a - b$, respectively.
(a) (5 points) Find the set of rationalizable strategies in the Stage Game above.
(b) (10 points) Consider the infinitely repeated game with the Stage Game above
and with discount factor $\delta \in (0, 1)$. For each $\delta$, find the maximum $(a^*, b^*)$
such that there exists a subgame-perfect equilibrium of the repeated game
in which Alice and Bob contribute $a^*$ and $b^*$, respectively, on the path of
equilibrium.
(c) (10 points) In part (b), now assume that at the beginning of each period $t$
one of the players (Alice at periods $t = 0, 2, 4, \ldots$ and Bob at periods
$t = 1, 3, 5, \ldots$) offers a stream of contributions $a = (a_t, a_{t+1}, \ldots)$ and $b =
(b_t, b_{t+1}, \ldots)$ for Alice and Bob, respectively, and the other player accepts or
rejects. If the offer is accepted then the game ends leading the automatic
contributions $a = (a_t, a_{t+1}, \ldots)$ and $b = (b_t, b_{t+1}, \ldots)$ from period $t$ on. If the
offer is rejected, they play the Stage Game and proceed to the next period.
ˆ ˆ such that the following is a subgame-perfect
Find ( ), ( ), and
equilibrium:
∗ : When it is Alice’s turn, Alice offers ( ) and ( ) and Bob
accepts an offer ( ) if and only if (1 − ) [2 − + (2+1 − +1 ) + · · · ] ≥
2 − When it is Bob’s turn, Bob offers ( ) and ( )
and Alice accepts an offer ( ) if and only if (1 − ) [2 − + (2
³ +1´− +1 ) + · · · ] ≥
2 − If there is no agreement, in the stage game they play ˆ ˆ .
Verify that ∗ is a subgame perfect equilibrium for the values that you found.
(If you find it easier, you can consider only the constant streams of contribu-
tions = ( ) and = ( ).)
22. [Selected from Midterms 2 and make up exams in years 2002 and 2004] Below,
there are pairs of stage games and strategy profiles. For each pair, check whether
the strategy profile is a subgame-perfect equilibrium of the game in which the
stage game is repeated infinitely many times. Each agent tries to maximize the
discounted sum of his expected payoffs in the stage game, and the discount rate is
$\delta = 0.99$. (Clearly explain your reasoning in each case.)
(a) Stage Game: There are 2 players. Each player, simultaneously, decides
23. [Midterm 2 Make Up, 2001] Consider the infinitely repeated game with the Pris-
oners’ Dilemma game
        C         D
C     4, 4      0, 5
D     5, 0      1, 1
as its stage game. Each agent tries to maximize the discounted sum of his expected
payoffs in the stage game with discount rate $\delta$.
(a) What is the lowest discount rate $\delta$ such that there exists a subgame perfect
equilibrium in which each player plays C on the path of equilibrium play?
[Hint: Note that a player can always guarantee himself an average payoff of
1 by playing D forever.]
(b) For sufficiently large values of $\delta$, construct a subgame-perfect equilibrium in
which any agent’s action at any date $t$ only depends on the play at dates $t - 1$
and $t - 2$, and in which each player plays C on the path of equilibrium play.
MIT OpenCourseWare
http://ocw.mit.edu
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.