This file is part of IDEAS, which uses RePEc data



On the Convergence of Reinforcement Learning

Author Info
Alan Beggs


Abstract

This paper examines the convergence of payoffs and strategies in Erev and Roth's model of reinforcement learning. When all players use this rule, it eliminates iteratively dominated strategies, and in two-person constant-sum games average payoffs converge to the value of the game. In constant-sum games with a unique equilibrium, strategies converge if the equilibrium is pure; in 2 × 2 games they also converge if it is mixed. The long-run behaviour of the learning rule is governed by equations related to Maynard Smith's version of the replicator dynamic. Properties of the learning rule against general opponents are also studied. In particular, it is shown that the rule guarantees that the lim sup of a player's average payoffs is at least his minmax payoff.
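
To make the learning rule concrete, the following is a minimal simulation sketch and not the paper's own code. It assumes the basic cumulative-propensity form of Erev and Roth's rule, which the abstract does not spell out: each player keeps a positive propensity per action, plays each action with probability proportional to its propensity, and adds the realised payoff to the propensity of the action just played. The function names (`choose`, `simulate`), the initial propensities of 1.0, and the example game (a matching-pennies variant with strictly positive payoffs and constant sum 3, so the value to the row player is 1.5) are illustrative assumptions.

```python
import random


def choose(propensities, rng):
    """Return an action index drawn with probability proportional to its propensity."""
    threshold = rng.random() * sum(propensities)
    cumulative = 0.0
    for action, q in enumerate(propensities):
        cumulative += q
        if threshold < cumulative:
            return action
    return len(propensities) - 1  # guard against floating-point round-off


def simulate(row_payoffs, constant, rounds, seed=0):
    """Let both players use the reinforcement rule in a two-person constant-sum game.

    row_payoffs[i][j] is the row player's payoff; the column player receives
    constant - row_payoffs[i][j]. All payoffs are assumed strictly positive,
    so propensities stay positive (an assumption of this sketch).
    """
    rng = random.Random(seed)
    q_row = [1.0] * len(row_payoffs)      # assumed initial propensities
    q_col = [1.0] * len(row_payoffs[0])
    row_total = 0.0
    for _ in range(rounds):
        i = choose(q_row, rng)
        j = choose(q_col, rng)
        u_row = row_payoffs[i][j]
        u_col = constant - u_row
        q_row[i] += u_row                 # reinforce only the action just played
        q_col[j] += u_col
        row_total += u_row
    return {
        "row_average_payoff": row_total / rounds,
        "row_strategy": [q / sum(q_row) for q in q_row],
        "col_strategy": [q / sum(q_col) for q in q_col],
        "q_row": q_row,
        "q_col": q_col,
    }


if __name__ == "__main__":
    game = [[2.0, 1.0], [1.0, 2.0]]       # hypothetical example game, value 1.5
    result = simulate(game, constant=3.0, rounds=20000, seed=1)
    print(result["row_average_payoff"], result["row_strategy"], result["col_strategy"])
```

Running the script prints the row player's average payoff and both players' current mixed strategies, which can be compared informally with the value of the game. Because the effective step size shrinks as propensities accumulate, a finite run is only suggestive of the limiting behaviour described in the abstract.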

Download Info

File URL: http://www.economics.ox.ac.uk/research/WP/PDF/paper096.pdf
File Format: application/pdf
Download Restriction: no

Publisher Info
Paper provided by University of Oxford, Department of Economics in its series Economics Series Working Papers with number 096.

Date of creation: 2002
Handle: RePEc:oxf:wpaper:096

Contact details of provider:
Postal: Manor Rd. Building, Oxford, OX1 3UQ
Web page: http://www.economics.ox.ac.uk/

For technical questions regarding this item, or to correct its listing, contact: (Mark George).

Related research
Keywords: reinforcement learning games

Find related papers by JEL classification:
C72 - Mathematical and Quantitative Methods - - Game Theory and Bargaining Theory - - - Noncooperative Games
D83 - Microeconomics - - Information, Knowledge, and Uncertainty - - - Search, Learning, and Information

Cited by:

  1. Antonella Ianni, 2007. "Learning Strict Nash Equilibria through Reinforcement," Economics Working Papers ECO2007/21, European University Institute.
  2. Roger Waldeck & Eric Darmon, 2006. "Can boundedly rational sellers learn to play Nash?," Journal of Economic Interaction and Coordination, Springer, vol. 1(2), pages 147-169, November.
  3. Ed Hopkins & Martin Posch, 2003. "Attainability of Boundary Points under Reinforcement Learning," Levine's Bibliography 506439000000000350, UCLA Department of Economics.
  4. Josef Hofbauer & Ed Hopkins, 2004. "Learning in Perturbed Asymmetric Games," ESE Discussion Papers 53, Edinburgh School of Economics, University of Edinburgh.


This page was last updated on 2008-11-17.


This information is provided to you by IDEAS at the Department of Economics, College of Liberal Arts and Sciences, University of Connecticut using RePEc data on a server sponsored by the Society for Economic Dynamics.