On the Convergence of Reinforcement Learning
This paper examines the convergence of payoffs and strategies in Erev and Roth's model of reinforcement learning. When all players use this rule, it eliminates iteratively dominated strategies, and in two-person constant-sum games average payoffs converge to the value of the game. Strategies converge in constant-sum games with unique equilibria if the equilibria are pure or, in 2 × 2 games, also if they are mixed. The long-run behaviour of the learning rule is governed by equations related to Maynard Smith's version of the replicator dynamic. Properties of the learning rule against general opponents are also studied. In particular, it is shown that the rule guarantees that the lim sup of a player's average payoffs is at least his minmax payoff.
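As a rough illustration of the kind of rule the abstract refers to, the sketch below simulates a basic cumulative-payoff reinforcement scheme in a two-person constant-sum game: each player keeps one propensity per action, plays each action with probability proportional to its propensity, and adds the realised payoff to the propensity of the action just played. This is only a minimal sketch under stated assumptions, not the paper's exact specification. The function names, the initial propensities of 1.0, the example payoff matrix, and the requirement that all payoffs be strictly positive are all assumptions made for the example.

```python
import random


def mixed_strategy(propensities):
    """Choice probabilities proportional to propensities (assumed rule)."""
    total = sum(propensities)
    return [q / total for q in propensities]


def reinforce(propensities, action, payoff):
    """Add the realised payoff to the propensity of the action just played."""
    updated = list(propensities)
    updated[action] += payoff
    return updated


def simulate(row_payoffs, constant, rounds, seed=0):
    """Both players use the reinforcement rule in a constant-sum game.

    row_payoffs[i][j] is the row player's payoff; the column player
    receives constant - row_payoffs[i][j]. Payoffs are assumed to be
    strictly positive so that propensities stay positive.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(row_payoffs), len(row_payoffs[0])
    q_row = [1.0] * n_rows  # hypothetical initial propensities
    q_col = [1.0] * n_cols
    total_row_payoff = 0.0
    for _ in range(rounds):
        # Sample each player's action with weights given by propensities.
        i = rng.choices(range(n_rows), weights=q_row)[0]
        j = rng.choices(range(n_cols), weights=q_col)[0]
        u_row = row_payoffs[i][j]
        u_col = constant - u_row
        q_row = reinforce(q_row, i, u_row)
        q_col = reinforce(q_col, j, u_col)
        total_row_payoff += u_row
    average = total_row_payoff / rounds
    return average, mixed_strategy(q_row), mixed_strategy(q_col)


if __name__ == "__main__":
    # Hypothetical 2 x 2 game: payoffs sum to 4, and the value is 2.
    game = [[3.0, 1.0], [1.0, 3.0]]
    avg, p_row, p_col = simulate(game, constant=4.0, rounds=20000)
    print("average row payoff:", avg)
    print("row strategy:", p_row)
    print("column strategy:", p_col)
```

In this example game the value to the row player is 2, so the row player's average payoff can be compared against that number over long runs. A single simulated path is only suggestive and does not substitute for the convergence results stated in the abstract.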
Date of creation: 01 Mar 2002
Provider web page: http://www.economics.ox.ac.uk/
References:
- Sergiu Hart & Andreu Mas-Colell, 1999. "A general class of adaptative strategies," Economics Working Papers 373, Department of Economics and Business, Universitat Pompeu Fabra.
- J.-F. Laslier & R. Topol & B. Walliser, 1999. "A behavioral learning process in games," THEMA Working Papers 99-03, THEMA (THéorie Economique, Modélisation et Applications), Université de Cergy-Pontoise.
- Ed Hopkins, 2001. "Two Competing Models of How People Learn in Games," NajEcon Working Paper Reviews.
- Erev, Ido & Roth, Alvin E, 1998. "Predicting How People Play Games: Reinforcement Learning in Experimental Games with Unique, Mixed Strategy Equilibria," American Economic Review, American Economic Association, vol. 88(4), pages 848-81, September.
- T. Borgers & R. Sarin, 2010. "Learning Through Reinforcement and Replicator Dynamics," Levine's Working Paper Archive 380, David K. Levine.
- Borgers, Tilman & Sarin, Rajiv, 1997. "Learning Through Reinforcement and Replicator Dynamics," Journal of Economic Theory, Elsevier, vol. 77(1), pages 1-14, November.
- Tilman Börgers & Rajiv Sarin. "Learning Through Reinforcement and Replicator Dynamics," ELSE working papers 051, ESRC Centre on Economics Learning and Social Evolution.
- Josef Hofbauer & Karl H. Schlag, 2000. "Sophisticated imitation in cyclic games," Journal of Evolutionary Economics, Springer, vol. 10(5), pages 523-543.
- Sergiu Hart & Andreu Mas-Colell, 1997. "A Simple Adaptive Procedure Leading to Correlated Equilibrium," Game Theory and Information 9703006, EconWPA, revised 24 Mar 1997.
- Sergiu Hart & Andreu Mas-Colell, 2000. "A Simple Adaptive Procedure Leading to Correlated Equilibrium," Econometrica, Econometric Society, vol. 68(5), pages 1127-1150, September.
- S. Hart & A. Mas-Colell, 2010. "A Simple Adaptive Procedure Leading to Correlated Equilibrium," Levine's Working Paper Archive 572, David K. Levine.
- Sergiu Hart & Andreu Mas-Colell, 1996. "A simple adaptive procedure leading to correlated equilibrium," Economics Working Papers 200, Department of Economics and Business, Universitat Pompeu Fabra, revised Dec 1996.
- Martin Posch, 1997. "Cycling in a stochastic learning algorithm for normal form games," Journal of Evolutionary Economics, Springer, vol. 7(2), pages 193-207.
- Colin Camerer & Teck-Hua Ho, 1999. "Experience-weighted Attraction Learning in Normal Form Games," Econometrica, Econometric Society, vol. 67(4), pages 827-874, July.
- Kuan, Chung-Ming & White, Halbert, 1994. "Adaptive Learning with Nonlinear Dynamics Driven by Dependent Processes," Econometrica, Econometric Society, vol. 62(5), pages 1087-1114, September.
- Gale, John & Binmore, Kenneth G. & Samuelson, Larry, 1995. "Learning to be imperfect: The ultimatum game," Games and Economic Behavior, Elsevier, vol. 8(1), pages 56-90.
- Benaim, Michel & Hirsch, Morris W., 1999. "Mixed Equilibria and Dynamical Systems Arising from Fictitious Play in Perturbed Games," Games and Economic Behavior, Elsevier, vol. 29(1-2), pages 36-72, October.
- Rustichini, Aldo, 1999. "Optimal Properties of Stimulus--Response Learning Models," Games and Economic Behavior, Elsevier, vol. 29(1-2), pages 244-273, October.
- Arthur, W Brian, 1993. "On Designing Economic Agents That Behave Like Human Agents," Journal of Evolutionary Economics, Springer, vol. 3(1), pages 1-22, February.
Handle: RePEc:oxf:wpaper:96