
Eco514 - Game Theory
Lecture 11: Subgame Perfection
Marciano Siniscalchi
October 21, 1999

Introduction

The notion of subgame perfection is the cornerstone of the theory of extensive games. It embodies its key intuitions and provides a vivid example of the difficulties inherent in such a theory. But, above all, it has proved to be extremely profitable in a variety of applications. Moreover, it has spawned a huge theoretical literature which has attempted (often successfully) to overcome the limitations and/or expand the domain of application of subgame-perfect equilibrium.

Subgame perfection

The basic intuition is apparent in the entry deterrence game of Figure 1.

[Figure 1 shows the extensive form: Player 1 chooses N (payoffs 0,2) or E; after E, Player 2 chooses f (payoffs -1,-1) or a (payoffs 1,1). Its normal form is:]

           f         a
   N      0,2       0,2
   E     -1,-1      1,1

Figure 1: The Entry Deterrence game and its normal form.

Player 1 is a potential entrant who may decide to Enter or Not Enter a market; following entry, the incumbent, Player 2, decides whether to Fight or Acquiesce.
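As a quick illustration (not part of the original notes), the following sketch enumerates the pure-strategy Nash equilibria of the bimatrix in Figure 1; the payoff dictionary and function name are my own.

```python
# Pure-strategy Nash equilibria of the Entry Deterrence normal form (Figure 1).
# Payoffs are (Player 1, Player 2); the names below are illustrative.

payoffs = {
    ("N", "f"): (0, 2), ("N", "a"): (0, 2),
    ("E", "f"): (-1, -1), ("E", "a"): (1, 1),
}
rows, cols = ["N", "E"], ["f", "a"]

def is_pure_nash(r, c):
    u1, u2 = payoffs[(r, c)]
    # No profitable unilateral deviation for either player.
    return all(payoffs[(r2, c)][0] <= u1 for r2 in rows) and \
           all(payoffs[(r, c2)][1] <= u2 for c2 in cols)

print([rc for rc in payoffs if is_pure_nash(*rc)])  # [('N', 'f'), ('E', 'a')]
```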

The normal form of the Entry Deterrence game has two pure Nash equilibria, namely (N, f) and (E, a). However, the standard reasoning goes, the first equilibrium is somewhat unreasonable: after all, if Player 1 were to Enter, Player 2 would not want to fight, ex-post. Thus, the (N, f) equilibrium is supported by a non-credible threat. For this reason, we wish to rule it out.

Reinhard Selten proposed the following. First, note that any non-terminal history $h$ in a game with observable actions defines a subgame, namely the game defined by all histories of which $h$ is a subhistory. For instance, in the simple game of Figure 1, there is a subgame starting at $(E)$, corresponding to the histories $(E)$, $(E, a)$ and $(E, f)$. The only other subgame is the whole game, which may be viewed as beginning after $\emptyset$, the initial history.

A subgame-perfect equilibrium of the original game is then a profile of strategies which induces Nash equilibria in every subgame. Thus, the (N, f) equilibrium is not subgame-perfect, because f is not a Nash equilibrium (trivially, an optimal strategy) in the subgame starting at $(E)$.

We are ready for a few formal definitions.

Definition 1 Fix an extensive game with observable actions $\Gamma = (N, A, H, Z, P, (U_i)_{i \in N})$ and a non-terminal history $h \in H \setminus Z$. The subgame starting at $h$ is the extensive game with observable actions $\Gamma(h) = (N, A, H|_h, Z|_h, P|_h, (U_i|_h)_{i \in N})$, where $H|_h = \{h' : (h, h') \in H\}$ is the set of continuation histories, i.e. sequences of action profiles $h'$ such that $(h, h')$ is a history in $H$, and $Z|_h$ is defined analogously; $P|_h$ is the restriction of the player correspondence $P$ to continuation histories (formally, for every $h' \in H|_h$, $P|_h(h') = P((h, h'))$); and each $U_i|_h$ is defined analogously.

Similarly, for any strategy profile $s = (s_i)_{i \in N} \in S$ in the game $\Gamma$, $s|_h = (s_i|_h)_{i \in N}$ is the strategy profile induced by $s$ in $\Gamma(h)$, obtained by letting $s_i|_h(h') = s_i((h, h'))$ for every $h' \in H|_h$ and $i \in P|_h(h')$.

Note that, according to the definition, the whole game $\Gamma$ corresponds to the subgame $\Gamma(\emptyset)$.

Definition 2 Consider a game $\Gamma$ with observable actions. A strategy profile $s \in S$ is a subgame-perfect equilibrium of $\Gamma$ iff, for every non-terminal history $h \in H \setminus Z$, $s|_h$ is a Nash equilibrium of $\Gamma(h)$.

Since $\Gamma = \Gamma(\emptyset)$, any subgame-perfect equilibrium (or SPE for short) is necessarily also a Nash equilibrium. The example given above shows that the converse is false.

Whenever one introduces a solution concept, the first order of business is to establish the existence of a solution according to the concept under consideration. In finite games with perfect information, this is easy to achieve using the backward induction algorithm.
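Before turning to existence, here is a small sketch, not from the original notes, that checks Definition 2 directly on the Entry Deterrence game: it verifies that (N, f) induces a profile that is not a Nash equilibrium in the subgame starting at (E), whereas (E, a) passes the test in every subgame. The history encoding and helper names are my own; only the payoffs come from Figure 1.

```python
from itertools import product

# Entry Deterrence game of Figure 1. Histories are tuples of actions.
# Payoffs are (u1, u2); all names here are illustrative, not from the notes.
terminal = {("N",): (0, 2), ("E", "f"): (-1, -1), ("E", "a"): (1, 1)}
actions  = {(): ["N", "E"], ("E",): ["f", "a"]}   # choices at non-terminal histories
player   = {(): 1, ("E",): 2}                     # who moves at each history

def outcome(h, strat):
    """Follow the strategy profile from history h until a terminal history."""
    while h not in terminal:
        h = h + (strat[h],)
    return terminal[h]

def strategies(i):
    """All pure strategies of player i: one action at each history i controls."""
    hs = [h for h in actions if player[h] == i]
    return [dict(zip(hs, choice)) for choice in product(*(actions[h] for h in hs))]

def is_spe(strat):
    """Definition 2: strat must induce a Nash equilibrium in every subgame.
    (Changing actions at histories outside the subgame cannot affect the
    outcome from h, so deviations can be enumerated over the whole game.)"""
    for h in actions:                              # every non-terminal history
        for i in (1, 2):
            for dev_i in strategies(i):
                dev = dict(strat); dev.update(dev_i)
                if outcome(h, dev)[i - 1] > outcome(h, strat)[i - 1]:
                    return False                   # profitable deviation in the subgame at h
    return True

print(is_spe({(): "N", ("E",): "f"}))   # False: f is not optimal in the subgame after E
print(is_spe({(): "E", ("E",): "a"}))   # True
```

For larger games this brute-force check is exponential; in perfect-information games the backward induction procedure below is the practical alternative.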

Proposition 0.1 Every finite extensive game with perfect information has a SPE.

Proof: We argue by induction on the length of histories. Let $L = \max\{\ell(h) : h \in H \setminus Z\}$.

(L) Pick any $h$ with $\ell(h) = L$. Then $\Gamma(h)$ is a single-person decision problem; let $b_i(h)$ denote the optimal choice of $i = P(h)$ at $h$, and define $o(h) = (h, b_i(h))$, the corresponding outcome, and $c(h) = b_i(h)$, the continuation history induced by $b_i$ at $h$.

($\ell$) Assume that the procedure has been carried out for all histories of length $\ell + 1, \ldots, L$, and consider a history $h$ with $\ell(h) = \ell$. Consider the decision problem faced by Player $i = P(h)$ and defined by the action set $A_i(h)$, and by the assumption that action $a \in A_i(h)$ leads to outcome $o((h, a))$. Denote by $b_i(h)$ Player $i$'s choice at $h$, and define $o(h) = o((h, b_i(h)))$ and $c(h) = (b_i(h), c((h, b_i(h))))$.

The procedure ends in finitely many steps. It should be clear that the resulting profile $b = (b_i)_{i \in N}$ is a SPE. ■

Note that the only reason why the argument does not generalize to games with observable actions is that we insist on choosing pure actions in each subgame.

Backward induction is viewed as the natural extension of dynamic programming to interactive environments. From a purely formal standpoint, this view is of course correct. However, the interpretation of the two procedures is rather different; we return to this point below.

The analogy with dynamic programming suggests that a version of the one-deviation principle might hold. This is indeed the case.

A preliminary definition: say that a game has finite horizon iff it has no infinite histories. Thus, a finite game has finite horizon, but not conversely.

Proposition 0.2 Consider an extensive game $\Gamma$ with observable actions and finite horizon. A strategy profile $s$ is a SPE of $\Gamma$ iff there exist no player $i \in N$, history $h$ such that $i \in P(h)$, and action $a' \neq s_i(h)$ with the property that the continuation strategy $s_i'|_h$ defined by

$s_i'|_h(\emptyset) = a'$ and $s_i'|_h(h') = s_i((h, h'))$ for all $h' \in H|_h \setminus \{\emptyset\}$

is a profitable deviation in $\Gamma(h)$.

Proof: Clearly, if $s$ is a SPE, $s_i'|_h$ cannot be a profitable deviation from $s_i|_h$ in $\Gamma(h)$. Thus, assume there is no profitable one-time deviation, but that $s$ is not a SPE. Then there must exist a player $i \in N$, a history $h$ and at least one continuation strategy $s_i''|_h$ such that $s_i''|_h$ is a profitable deviation from $s_i|_h$ in $\Gamma(h)$.

Note that any profitable deviation $s_i''|_h$ must differ from $s_i|_h$ at more than one history (a profitable deviation that differed from $s_i|_h$ at a single history $\hat{h}$ would induce a profitable one-time deviation in the subgame starting at $(h, \hat{h})$, contradicting our assumption). Thus, pick a profitable deviation $s_i'|_h$ which differs from $s_i|_h$ the least, i.e. such that no other profitable deviation differs from $s_i|_h$ at fewer histories (if there is more than one such deviation, pick one arbitrarily). This is well-defined because the game has a finite horizon.

Now let $h'$ be the longest (continuation) history of $\Gamma(h)$ at which $s_i'|_h$ differs from $s_i|_h$, i.e. such that $s_i'|_h(h') \neq s_i|_h(h')$. Again, this is well-defined because the game has a finite horizon.

Finally, consider the sub-subgame of $\Gamma(h)$ starting at $h'$, that is, $\Gamma((h, h'))$. By construction, the restriction of $s_i'|_h$ to this sub-subgame differs from $s_i|_{(h, h')}$ only at its empty history, i.e. it constitutes a one-time deviation. Moreover, it must be a profitable deviation in $\Gamma((h, h'))$, because otherwise we could find a profitable deviation in $\Gamma(h)$ which differs from $s_i|_h$ at fewer histories than $s_i'|_h$. Contradiction. ■
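To make the backward-induction procedure in the proof of Proposition 0.1 concrete, here is a small sketch, not part of the original notes, that runs it on a finite perfect-information tree represented as nested tuples and dictionaries; the representation and names are my own. Applied to the Entry Deterrence game of Figure 1 it returns the SPE (E, a).

```python
# Backward induction on a finite game with perfect information.
# A tree is either a terminal payoff tuple, or (player, {action: subtree}).
# This representation and the function name are illustrative assumptions.

def backward_induction(tree):
    """Return (strategy, payoff): an optimal action at every non-terminal
    history of the tree, and the payoff vector these choices induce."""
    if not isinstance(tree[1], dict):            # terminal node: a payoff tuple
        return {}, tree
    player, branches = tree
    strategy, values = {}, {}
    for action, subtree in branches.items():
        sub_strategy, payoff = backward_induction(subtree)
        values[action] = payoff
        # Index the continuation choices by the history that follows `action`.
        strategy.update({(action,) + h: a for h, a in sub_strategy.items()})
    best = max(values, key=lambda a: values[a][player - 1])  # maximize own payoff
    strategy[()] = best                          # the choice b_i(h) at this history
    return strategy, values[best]

# Entry Deterrence (Figure 1): Player 1 moves first, Player 2 moves after E.
entry_deterrence = (1, {"N": (0, 2),
                        "E": (2, {"f": (-1, -1), "a": (1, 1)})})
print(backward_induction(entry_deterrence))
# ({('E',): 'a', (): 'E'}, (1, 1)): Player 2 acquiesces, Player 1 enters.
```

This is exactly the recursion in steps (L) and ($\ell$) above: optimal choices at the longest histories are computed first, and each earlier mover then optimizes against the continuation values already computed.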

Subgame Perfection: A Critical Look

The prototypical criticism of SPE is based on the Centipede game, depicted in Figure 2.

[Game tree of Figure 2: Player 1 chooses d1 (payoffs 1,0) or a1; Player 2 then chooses D (0,2) or A; Player 1 then chooses d2 (3,0) or a2 (2,2).]

Figure 2: A Three-Legged Centipede

The argument goes as follows. According to SPE, both players should go down at each history. However, consider Player 2's predicament: he understands this, and he also understands that Player 1 should understand this. Say that Player 2 is certain that Player 1 is rational; then, if in the course of the game he is called upon to move, how should he interpret the fact that Player 1 chose a1, despite the fact that (1) she is rational and (2) she understands that Player 2 will go down?

If this argument sounds too informal to be right as stated, you are absolutely correct! However, it does contain at least a grain of truth.
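As a check on the claim that SPE requires both players to go down at every history, the following sketch, not part of the original notes, runs backward induction on the Centipede with the payoffs as read from Figure 2 (an assumption on my part).

```python
# Backward induction on the three-legged Centipede. Payoffs are read from
# Figure 2, so treat them as an assumption: d1 -> (1,0); (a1,D) -> (0,2);
# (a1,A,d2) -> (3,0); (a1,A,a2) -> (2,2).

# Last node: Player 1 compares d2 and a2 by her own payoff.
last = max({"d2": (3, 0), "a2": (2, 2)}.items(), key=lambda kv: kv[1][0])
# Middle node: Player 2 compares D with A, where A leads to Player 1's choice above.
mid = max({"D": (0, 2), "A": last[1]}.items(), key=lambda kv: kv[1][1])
# First node: Player 1 compares d1 with a1, where a1 leads to Player 2's choice.
first = max({"d1": (1, 0), "a1": mid[1]}.items(), key=lambda kv: kv[1][0])

print(first[0], mid[0], last[0])   # d1 D d2: down at every history
```

So the backward-induction (hence SPE) prescription is indeed down at every node, with outcome (1, 0); this is exactly the prediction the informal argument above takes issue with.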

But consider also the game in Figure 3, which is perhaps even more striking.

[Game tree of Figure 3: Player 1 chooses d1 (payoffs 2,2) or a1; Player 2 then chooses D (1,1) or A; Player 1 then chooses d2 (0,0) or a2 (3,3).]

Figure 3: Deviations from SPE?

Here (d1d2, D) is not a SPE, but it is a Nash equilibrium. It fails the test of perfection because, upon being reached (contrary to the equilibrium prescription!), Player 2 "should" really expect Player 1 to continue with a2, not d2, because "Player 2 is certain that Player 1 is rational." But this argument is even more ambiguous!

Theorists have fiercely argued over these and similar examples. I cannot do justice to either side of the debate without a full-blown model of interactive beliefs specifically developed to account for dynamically evolving beliefs. However, for the time being, let me propose a few informal but rigorous observations.

First, in a dynamic game, statements such as "Player 2 is certain that Player 1 is rational" are meaningless. One must specify when, in the course of the game, that statement is assumed to be true: when, for example, Player 2 is certain that his opponent is rational, or when Player 2 is certain that Player 1 expects Player 2 to choose D.

The reason is that, in a dynamic game (and especially in a game with observable actions!), players acquire new information as the play unfolds. This information may lead them to update, i.e. "refine" their beliefs, or it may lead them to completely revise them.

For example, Player 2 might be certain, at the initial history, that Player 1 chooses d1 at the initial history. However, if Player 1 chooses a1 instead, Player 2 observes this, so it would be impossible for him to remain certain, after the history (a1), that Player 1 chooses d1. Player 2 has to revise his beliefs.

Reasoning along these lines, the argument against backward induction in the Centipede game can be reformulated more convincingly. Suppose that, at the beginning of the game, both players' beliefs are consistent with the unique SPE. Moreover, suppose that, at the beginning of the game, both players are certain that their opponent is rational (you can assume that there is common certainty of this: it is not going to matter).

Could it be the case that Player 1 chooses a1 at the initial history, contrary to the SPE prediction? The answer is, surprisingly, yes! For instance, this will be the unique best response of Player 1 if she is certain at the initial node that:

(1) Player 2 is certain, at the initial node, that Player 1 will play d1d2, and that Player 1 is rational (this much we had already assumed).

(2) However, Player 2, upon observing a1, will revise his beliefs and become certain at (a1) that Player 1 is actually irrational and will choose a2 after (a1, A).

(3) Player 2 is rational (again, this much was already part of our assumptions).

Note that (2)+(3) imply that Player 2, if he has to choose, will pick A and not D. But then, of course, Player 1 will rationally choose a1 herself!

The key point is that, even if players' beliefs are concentrated on the equilibrium at the beginning of the game, it can still be the case that the way they revise their beliefs leads them to act differently, both off and on the equilibrium path.

To put it differently, in order to justify SPE one has to make assumptions about how players revise their beliefs.

Characterizing backward induction in perfect-information games has become a little cottage industry, so I will spare you the references.

A similar point applies to the game in Figure 3: the Nash equilibrium (d1d2, D) is supported by the assumption that, upon being reached, Player 2 becomes certain that Player 1 is irrational. The reason I mention it is that, in the Centipede game, (d1d2, D) is the only Nash equilibrium, so normal-form analysis already yields the "standard prediction" without having to rely on a potentially problematic extensive-form analysis. The game in Figure 3 does not have this feature, so it might be regarded as a better example (Phil Reny should be credited for this).
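To back up this last point, here is a small sketch, not from the original notes, that enumerates the pure Nash equilibria of the reduced normal forms of the two games, using the payoffs as read from Figures 2 and 3 (an assumption on my part): the Centipede has a single pure equilibrium, while the game of Figure 3 also has an equilibrium in which Player 1 continues.

```python
# Pure Nash equilibria of the reduced normal forms of Figures 2 and 3.
# Player 1's reduced strategies: d1, a1d2, a1a2; Player 2's: D, A.
# Payoffs are read off the figures and should be treated as assumptions.

centipede = {                      # Figure 2
    ("d1", "D"): (1, 0), ("d1", "A"): (1, 0),
    ("a1d2", "D"): (0, 2), ("a1d2", "A"): (3, 0),
    ("a1a2", "D"): (0, 2), ("a1a2", "A"): (2, 2),
}
figure3 = {                        # Figure 3
    ("d1", "D"): (2, 2), ("d1", "A"): (2, 2),
    ("a1d2", "D"): (1, 1), ("a1d2", "A"): (0, 0),
    ("a1a2", "D"): (1, 1), ("a1a2", "A"): (3, 3),
}

def pure_nash(game):
    rows = {r for r, _ in game}
    cols = {c for _, c in game}
    return [(r, c) for (r, c) in game
            if all(game[(r2, c)][0] <= game[(r, c)][0] for r2 in rows)
            and all(game[(r, c2)][1] <= game[(r, c)][1] for c2 in cols)]

print(pure_nash(centipede))   # [('d1', 'D')]: the unique pure equilibrium, matching the SPE
print(pure_nash(figure3))     # [('d1', 'D'), ('a1a2', 'A')]: the non-SPE profile and the SPE
```

This is the sense in which normal-form analysis alone already pins down the standard prediction in the Centipede, but not in the game of Figure 3.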