A reinforcement learning process in extensive form games
The CPR ("cumulative proportional reinforcement") learning rule stipulates that an agent chooses a move with a probability proportional to the cumulative payoff she obtained in the past with that move. Previously considered for strategies in normal form games (Laslier, Topol and Walliser, Games and Econ. Behav., 2001), the CPR rule is here adapted for actions in perfect information extensive form games. The paper shows that the action-based CPR process converges with probability one to the (unique) subgame perfect equilibrium.
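As an illustration of the rule described in the abstract, the following is a minimal sketch of an action-based CPR process on a small perfect information game tree. It is one plausible reading of the abstract, not the paper's own formulation: the tree `TREE`, the function names, the initial cumulative payoff of 1.0 for every action, and the assumption that all payoffs are strictly positive are hypothetical choices made for this example.

```python
import random

def cpr_choice(cumulative, rng):
    """Pick an action with probability proportional to its cumulative payoff."""
    actions = list(cumulative)
    weights = [cumulative[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]

# Hypothetical two-stage game: player 0 moves at "root", player 1 at "n1".
# Leaves are payoff tuples (player 0, player 1), assumed strictly positive.
TREE = {
    "root": {"player": 0, "moves": {"L": (2.0, 2.0), "R": "n1"}},
    "n1": {"player": 1, "moves": {"l": (1.0, 1.0), "r": (3.0, 3.0)}},
}

def init_cumulative(tree, initial=1.0):
    """Give every action at every node the same positive starting weight (an assumption)."""
    return {node: {m: initial for m in spec["moves"]} for node, spec in tree.items()}

def play_round(tree, cumulative, rng):
    """Play the game once, then reinforce each action taken along the path."""
    path, node = [], "root"
    while isinstance(node, str):  # internal nodes are named by strings
        move = cpr_choice(cumulative[node], rng)
        path.append((node, move))
        node = tree[node]["moves"][move]
    payoffs = node  # a leaf: tuple of payoffs, one per player
    for visited, move in path:
        # The action used at a node is credited with the payoff of that node's player.
        cumulative[visited][move] += payoffs[tree[visited]["player"]]
    return path, payoffs

if __name__ == "__main__":
    rng = random.Random(0)
    cumulative = init_cumulative(TREE)
    for _ in range(10000):
        play_round(TREE, cumulative, rng)
    print(cumulative)
```

In this hypothetical tree, backward induction selects `r` at `n1` and `R` at `root`, so the printed cumulative payoffs can be compared informally against that profile. A finite simulation only illustrates the process and does not substitute for the convergence result stated in the abstract.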
Volume (Year): 33 (2005)
Issue (Month): 2 (06)
Contact details of provider: Web page: http://www.springer.com
Order information: Web: http://www.springer.com/economics/economic+theory/journal/182/PS2
Handle: RePEc:spr:jogath:v:33:y:2005:i:2:p:219-227