Reinforcement learning and the power law of practice: some analytical results
Erev and Roth (1998), among others, provide a comprehensive analysis of experimental evidence on learning in games, based on a stochastic model of learning that accounts for two main elements: the Law of Effect (positive reinforcement of actions that perform well) and the Power Law of Practice (learning curves tend to be steeper initially). This note complements that literature by providing an analytical study of the properties of such learning models. Specifically, the paper shows that: (a) up to an error term, the stochastic process is driven by a system of discrete-time difference equations of the replicator type. This carries an analogy with Börgers and Sarin (1997), where reinforcement learning accounts only for the Law of Effect. (b) If the trajectories of the system of replicator equations converge sufficiently fast, then the probability that all realizations of the learning process over a possibly infinite span of time lie within a given small distance of the solution path of the replicator dynamics becomes, from some time on, arbitrarily close to one. Fast convergence, in the form of exponential convergence, is shown to hold for any strict Nash equilibrium of the underlying game.
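The basic mechanics described above can be illustrated with a minimal simulation sketch of the Erev-Roth reinforcement rule for a single player with two actions. This is an assumption-laden toy example, not the paper's model: attractions accumulate realized payoffs (Law of Effect), choice probabilities are proportional to attractions, and because total attractions grow over time each new payoff shifts probabilities by an ever-smaller step (Power Law of Practice). The payoff values are hypothetical.

```python
import random

# Hedged sketch of a basic Erev-Roth-style reinforcement rule, one player,
# two actions. Not the paper's full model; payoffs below are illustrative.
random.seed(0)

payoff = [1.0, 2.0]   # hypothetical payoffs: action 1 strictly better
q = [1.0, 1.0]        # initial attractions (equal propensities)

for t in range(5000):
    total = q[0] + q[1]
    p0 = q[0] / total                     # choice probability of action 0
    a = 0 if random.random() < p0 else 1  # sample an action
    q[a] += payoff[a]                     # Law of Effect: reinforce by payoff
    # Power Law of Practice: since total grows each round, the per-round
    # shift in probabilities shrinks roughly like 1/total.

p1 = q[1] / (q[0] + q[1])
print(round(p1, 2))
```

As the note's result (a) suggests, the expected motion of such a process resembles a discrete-time replicator dynamic: the probability mass drifts toward the better-performing action, here action 1.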
Date of creation: 01 Jan 2002
Provider: Highfield, Southampton SO17 1BJ
Web page: http://www.economics.soton.ac.uk/
References:
- Martin Posch, 1997. "Cycling in a stochastic learning algorithm for normal form games," Journal of Evolutionary Economics, Springer, vol. 7(2), pages 193-207.
- Roth, Alvin E. & Erev, Ido, 1995. "Learning in extensive-form games: Experimental data and simple dynamic models in the intermediate term," Games and Economic Behavior, Elsevier, vol. 8(1), pages 164-212.
- Mookherjee, Dilip & Sopher, Barry, 1997. "Learning and Decision Costs in Experimental Constant Sum Games," Games and Economic Behavior, Elsevier, vol. 19(1), pages 97-132, April.
- Erev, Ido & Roth, Alvin E, 1998. "Predicting How People Play Games: Reinforcement Learning in Experimental Games with Unique, Mixed Strategy Equilibria," American Economic Review, American Economic Association, vol. 88(4), pages 848-881, September.
- Nick Feltovich, 2000. "Reinforcement-Based vs. Belief-Based Learning Models in Experimental Asymmetric-Information Games," Econometrica, Econometric Society, vol. 68(3), pages 605-642, May.
- Colin Camerer & Teck-Hua Ho, 1999. "Experience-weighted Attraction Learning in Normal Form Games," Econometrica, Econometric Society, vol. 67(4), pages 827-874, July.
- Young, H Peyton, 1993. "The Evolution of Conventions," Econometrica, Econometric Society, vol. 61(1), pages 57-84, January.
- Arthur, W Brian, 1993. "On Designing Economic Agents That Behave Like Human Agents," Journal of Evolutionary Economics, Springer, vol. 3(1), pages 1-22, February.
- Sarin, Rajiv & Vahid, Farshid, 2001. "Predicting How People Play Games: A Simple Dynamic Model of Choice," Games and Economic Behavior, Elsevier, vol. 34(1), pages 104-122, January.
- Ken Binmore & Larry Samuelson, 1999. "Evolutionary Drift and Equilibrium Selection," Review of Economic Studies, Oxford University Press, vol. 66(2), pages 363-393.
- Kaniovski Yuri M. & Young H. Peyton, 1995. "Learning Dynamics in Games with Stochastic Perturbations," Games and Economic Behavior, Elsevier, vol. 11(2), pages 330-363, November.
- Rustichini, Aldo, 1999. "Optimal Properties of Stimulus-Response Learning Models," Games and Economic Behavior, Elsevier, vol. 29(1-2), pages 244-273, October.
Handle: RePEc:stn:sotoec:0203