Attainability of Boundary Points under Reinforcement Learning
This paper investigates the properties of the most common form of reinforcement learning (the "basic model" of Erev and Roth, American Economic Review, 88, 848-881, 1998). Stochastic approximation theory has been used to analyse the local stability of fixed points under this learning process. However, as we show, when such points are on the boundary of the state space (for example, pure strategy equilibria), standard results from the theory of stochastic approximation do not apply. We offer what we believe to be the correct treatment of boundary points, and provide a new and more general result: this model of learning converges with zero probability to fixed points which are unstable under the Maynard Smith or adjusted version of the evolutionary replicator dynamics. For two-player games, these are the fixed points that are linearly unstable under the standard replicator dynamics.
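As a rough illustration of the kind of learning rule the abstract refers to, the Python sketch below simulates a propensity-based reinforcement rule for a two-player game. It assumes the usual reading of the basic model: each player holds a positive propensity for every action, plays each action with probability proportional to its propensity, and adds the realised payoff to the propensity of the action just played. The function names, the initial propensity of 1.0, and the requirement that payoffs be strictly positive are assumptions made for this sketch, not details taken from the paper.

```python
import random


def choose(propensities, rng):
    """Pick an action index with probability proportional to its propensity."""
    return rng.choices(range(len(propensities)), weights=propensities)[0]


def simulate(payoffs_row, payoffs_col, periods, seed=0, initial=1.0):
    """Simulate the reinforcement rule for a two-player game (illustrative only).

    payoffs_row[i][j] and payoffs_col[i][j] are the payoffs to the row and
    column player when row plays i and column plays j. They are assumed to be
    strictly positive so that propensities never decrease.
    Returns the mixed strategies implied by the final propensities.
    """
    rng = random.Random(seed)
    q_row = [initial] * len(payoffs_row)
    q_col = [initial] * len(payoffs_row[0])
    for _ in range(periods):
        i = choose(q_row, rng)
        j = choose(q_col, rng)
        # Only the action actually played is reinforced, by the payoff received.
        q_row[i] += payoffs_row[i][j]
        q_col[j] += payoffs_col[i][j]
    x = [q / sum(q_row) for q in q_row]
    y = [q / sum(q_col) for q in q_col]
    return x, y


if __name__ == "__main__":
    # Hypothetical 2x2 coordination game with strictly positive payoffs.
    A = [[3.0, 1.0], [1.0, 2.0]]
    B = [[3.0, 1.0], [1.0, 2.0]]
    print(simulate(A, B, periods=5000, seed=1))
```

Because every propensity starts positive and never falls, the choice probabilities stay strictly inside the simplex after any finite number of periods. Boundary points such as pure strategy equilibria can therefore only be approached in the limit, which is the case the abstract singles out as needing separate treatment.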
Date of creation: Jul 2003
Provider web page: http://www.econ.ed.ac.uk/
- Sarin, Rajiv & Vahid, Farshid, 1999. "Payoff Assessments without Probabilities: A Simple Dynamic Model of Choice," Games and Economic Behavior, Elsevier, vol. 28(2), pages 294-309, August.
- Ellison, Glenn & Fudenberg, Drew, 2000. "Learning Purified Mixed Equilibria," Journal of Economic Theory, Elsevier, vol. 90(1), pages 84-115, January.
- Glenn Ellison & Drew Fudenberg, 1998. "Learning Purified Mixed Equilibria," Harvard Institute of Economic Research Working Papers 1817, Harvard - Institute of Economic Research.
- Ed Hopkins, 2001. "Two Competing Models of How People Learn in Games," NajEcon Working Paper Reviews.
- Laslier, J.-F. & Topol, R. & Walliser, B., 1999. "A Behavioral Learning Process in Games," 99-03, Paris X - Nanterre, U.F.R. de Sc. Ec. Gest. Maths Infor.
- Monderer, Dov & Shapley, Lloyd S., 1996. "Potential Games," Games and Economic Behavior, Elsevier, vol. 14(1), pages 124-143, May.
- Duffy, John & Hopkins, Ed, 2005. "Learning, information, and sorting in market entry games: theory and evidence," Games and Economic Behavior, Elsevier, vol. 51(1), pages 31-62, April.
- John Duffy & Ed Hopkins, 2001. "Learning, Information and Sorting in Market Entry Games: Theory and Evidence," ESE Discussion Papers 78, Edinburgh School of Economics, University of Edinburgh.
- John Duffy & Ed Hopkins, 2010. "Learning, Information and Sorting in Market Entry Games: Theory and Evidence," Levine's Working Paper Archive 506439000000000355, David K. Levine.
- T. Borgers & R. Sarin, 2010. "Learning Through Reinforcement and Replicator Dynamics," Levine's Working Paper Archive 380, David K. Levine.
- Borgers, Tilman & Sarin, Rajiv, 1997. "Learning Through Reinforcement and Replicator Dynamics," Journal of Economic Theory, Elsevier, vol. 77(1), pages 1-14, November.
- Tilman Börgers & Rajiv Sarin, . "Learning Through Reinforcement and Replicator Dynamics," ELSE working papers 051, ESRC Centre on Economics Learning and Social Evolution.
- Sandholm, William H., 2002. "Evolutionary Implementation and Congestion Pricing," Review of Economic Studies, Wiley Blackwell, vol. 69(3), pages 667-689, July.
- Martin Posch, 1997. "Cycling in a stochastic learning algorithm for normal form games," Journal of Evolutionary Economics, Springer, vol. 7(2), pages 193-207.
- Arthur, W. Brian, 1993. "On Designing Economic Agents That Behave Like Human Agents," Journal of Evolutionary Economics, Springer, vol. 3(1), pages 1-22, February.
- Erev, Ido & Roth, Alvin E., 1998. "Predicting How People Play Games: Reinforcement Learning in Experimental Games with Unique, Mixed Strategy Equilibria," American Economic Review, American Economic Association, vol. 88(4), pages 848-881, September.
- Roth, Alvin E. & Erev, Ido, 1995. "Learning in extensive-form games: Experimental data and simple dynamic models in the intermediate term," Games and Economic Behavior, Elsevier, vol. 8(1), pages 164-212.
- Hofbauer, Josef & Hopkins, Ed, 2005. "Learning in perturbed asymmetric games," Games and Economic Behavior, Elsevier, vol. 52(1), pages 133-152, July.
- Colin Camerer & Teck-Hua Ho, 1999. "Experience-weighted Attraction Learning in Normal Form Games," Econometrica, Econometric Society, vol. 67(4), pages 827-874, July.
- Rustichini, Aldo, 1999. "Optimal Properties of Stimulus-Response Learning Models," Games and Economic Behavior, Elsevier, vol. 29(1-2), pages 244-273, October.
Handle: RePEc:edn:esedps:79