Reinforcement Learning Dynamics in Social Dilemmas

Authors

  • Segismundo S. Izquierdo
  • Luis R. Izquierdo
  • Nicholas M. Gotts

Abstract

In this paper we replicate and advance Macy and Flache's (2002; Proc. Natl. Acad. Sci. USA, 99, 7229–7236) work on the dynamics of reinforcement learning in 2×2 (2-player 2-strategy) social dilemmas. In particular, we provide further insight into the solution concepts that they describe, illustrate some recent analytical results on the dynamics of their model, and discuss the robustness of such results to occasional mistakes made by players in choosing their actions (i.e. trembling hands). It is shown here that the dynamics of their model are strongly dependent on the speed at which players learn. With high learning rates the system quickly reaches its asymptotic behaviour; on the other hand, when learning rates are low, two distinctively different transient regimes can be clearly observed. It is shown that the inclusion of small quantities of randomness in players' decisions can change the dynamics of the model dramatically.
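The kind of dynamics the abstract describes can be illustrated with a minimal sketch of a Bush-Mosteller style reinforcement learner in a Prisoner's Dilemma. Everything specific here is an assumption for illustration and is not stated in the abstract: the payoff values, the aspiration level of 2.0, the initial cooperation probability of 0.5, the function names, and the choice to reinforce the action actually played after a trembling-hand error with probability `epsilon`.

```python
import random

# Illustrative Prisoner's Dilemma payoffs (assumed, not taken from the abstract):
# key is (own action, other's action); Temptation > Reward > Punishment > Sucker.
PAYOFF = {("C", "C"): 3.0, ("C", "D"): 0.0, ("D", "C"): 4.0, ("D", "D"): 1.0}


def stimulus(payoff, aspiration, payoffs=PAYOFF):
    """Normalised stimulus in [-1, 1]; positive when the payoff exceeds the aspiration."""
    scale = max(abs(p - aspiration) for p in payoffs.values())
    return (payoff - aspiration) / scale


def update(p_coop, action, s, rate):
    """Bush-Mosteller style update of the probability to cooperate."""
    # Work with the probability of the action that was just played.
    p_act = p_coop if action == "C" else 1.0 - p_coop
    if s >= 0:
        p_act += rate * s * (1.0 - p_act)  # satisfied: move towards 1
    else:
        p_act += rate * s * p_act          # dissatisfied: move towards 0
    return p_act if action == "C" else 1.0 - p_act


def choose(p_coop, epsilon, rng):
    """Pick an action; with probability epsilon the hand trembles and flips it."""
    intended = "C" if rng.random() < p_coop else "D"
    if rng.random() < epsilon:
        return "D" if intended == "C" else "C"
    return intended


def simulate(rounds, rate, aspiration=2.0, epsilon=0.0, seed=0):
    """Return the per-round cooperation probabilities of both players."""
    rng = random.Random(seed)
    p = [0.5, 0.5]  # assumed initial propensities
    history = []
    for _ in range(rounds):
        a = [choose(p[0], epsilon, rng), choose(p[1], epsilon, rng)]
        pay = [PAYOFF[(a[0], a[1])], PAYOFF[(a[1], a[0])]]
        p = [update(p[i], a[i], stimulus(pay[i], aspiration), rate)
             for i in range(2)]
        history.append(tuple(p))
    return history


if __name__ == "__main__":
    fast = simulate(rounds=1000, rate=0.5)
    slow = simulate(rounds=1000, rate=0.01)
    noisy = simulate(rounds=1000, rate=0.5, epsilon=0.05)
    print(fast[-1], slow[-1], noisy[-1])
```

Running `simulate` with a high and a low `rate`, and with a small positive `epsilon`, gives a rough way to explore how learning speed and occasional mistakes affect the trajectories; it is not a replication of the paper's experiments.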

Suggested Citation

  • Segismundo S. Izquierdo & Luis R. Izquierdo & Nicholas M. Gotts, 2008. "Reinforcement Learning Dynamics in Social Dilemmas," Journal of Artificial Societies and Social Simulation, Journal of Artificial Societies and Social Simulation, vol. 11(2), pages 1-1.
  • Handle: RePEc:jas:jasssj:2007-11-2

    Download full text from publisher

    File URL: http://jasss.soc.surrey.ac.uk/11/2/1/1.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Karandikar, Rajeeva & Mookherjee, Dilip & Ray, Debraj & Vega-Redondo, Fernando, 1998. "Evolving Aspirations and Cooperation," Journal of Economic Theory, Elsevier, vol. 80(2), pages 292-331, June.
    2. Glenn Ellison, 2000. "Basins of Attraction, Long-Run Stochastic Stability, and the Speed of Step-by-Step Evolution," Review of Economic Studies, Oxford University Press, vol. 67(1), pages 17-45.
    3. Fernando Vega-Redondo & Frédéric Palomino, 1999. "Convergence of aspirations and (partial) cooperation in the prisoner's dilemma," International Journal of Game Theory, Springer;Game Theory Society, vol. 28(4), pages 465-488.
    4. Mookherjee, Dilip & Sopher, Barry, 1994. "Learning Behavior in an Experimental Matching Pennies Game," Games and Economic Behavior, Elsevier, vol. 7(1), pages 62-91, July.
    5. Margaret Edwards & Sylvie Huet & François Goreaud & Guillaume Deffuant, 2003. "Comparing an Individual-Based Model of Behaviour Diffusion with Its Mean Field Aggregate Approximation," Journal of Artificial Societies and Social Simulation, Journal of Artificial Societies and Social Simulation, vol. 6(4), pages 1-9.
    6. Youngse Kim, 1999. "Satisficing and optimality in 2×2 common interest games," Economic Theory, Springer;Society for the Advancement of Economic Theory (SAET), vol. 13(2), pages 365-375.
    7. Duffy, John, 2006. "Agent-Based Models and Human Subject Experiments," Handbook of Computational Economics, in: Leigh Tesfatsion & Kenneth L. Judd (ed.), Handbook of Computational Economics, edition 1, volume 2, chapter 19, pages 949-1011, Elsevier.
    8. Mookherjee, Dilip & Sopher, Barry, 1997. "Learning and Decision Costs in Experimental Constant Sum Games," Games and Economic Behavior, Elsevier, vol. 19(1), pages 97-132, April.
    9. Borgers, Tilman & Sarin, Rajiv, 1997. "Learning Through Reinforcement and Replicator Dynamics," Journal of Economic Theory, Elsevier, vol. 77(1), pages 1-14, November.
    10. Binmore, K. & Samuelson, L., 1993. "An Economist's Perspective on the Evolution of Norms," Working papers 9323, Wisconsin Madison - Social Systems.
    11. Roth, Alvin E. & Erev, Ido, 1995. "Learning in extensive-form games: Experimental data and simple dynamic models in the intermediate term," Games and Economic Behavior, Elsevier, vol. 8(1), pages 164-212.
    12. Sylvie Huet & Margaret Edwards & Guillaume Deffuant, 2007. "Taking into Account the Variations of Neighbourhood Sizes in the Mean-Field Approximation of the Threshold Model on a Random Network," Journal of Artificial Societies and Social Simulation, Journal of Artificial Societies and Social Simulation, vol. 10(1), pages 1-10.
    13. J. Gary Polhill & Luis R. Izquierdo, 2005. "Lessons Learned from Converting the Artificial Stock Market to Interval Arithmetic," Journal of Artificial Societies and Social Simulation, Journal of Artificial Societies and Social Simulation, vol. 8(2), pages 1-2.
    14. José Manuel Galán & Luis R. Izquierdo, 2005. "Appearances Can Be Deceiving: Lessons Learned Re-Implementing Axelrod's 'Evolutionary Approach to Norms'," Journal of Artificial Societies and Social Simulation, Journal of Artificial Societies and Social Simulation, vol. 8(3), pages 1-2.
    15. Erev, Ido & Bereby-Meyer, Yoella & Roth, Alvin E., 1999. "The effect of adding a constant to all payoffs: experimental investigation, and implications for reinforcement learning models," Journal of Economic Behavior & Organization, Elsevier, vol. 39(1), pages 111-128, May.
    16. Andreas Flache & Michael W. Macy, 2002. "Stochastic Collusion and the Power Law of Learning," Journal of Conflict Resolution, Peace Science Society (International), vol. 46(5), pages 629-653, October.
    17. John G. Cross, 1973. "A Stochastic Learning Model of Economic Behavior," The Quarterly Journal of Economics, Oxford University Press, vol. 87(2), pages 239-266.
    18. Yan Chen & Fang-Fang Tang, 1998. "Learning and Incentive-Compatible Mechanisms for Public Goods Provision: An Experimental Study," Journal of Political Economy, University of Chicago Press, vol. 106(3), pages 633-662, June.
    19. Izquierdo, Luis R. & Izquierdo, Segismundo S. & Gotts, Nicholas M. & Polhill, J. Gary, 2007. "Transient and asymptotic dynamics of reinforcement learning in games," Games and Economic Behavior, Elsevier, vol. 61(2), pages 259-276, November.
    20. Bendor, Jonathan & Mookherjee, Dilip & Ray, Debraj, 2001. "Reinforcement Learning in Repeated Interaction Games," The B.E. Journal of Theoretical Economics, De Gruyter, vol. 1(1), pages 1-44, March.
    21. Luis R. Izquierdo & J. Gary Polhill, 2006. "Is Your Model Susceptible to Floating-Point Errors?," Journal of Artificial Societies and Social Simulation, Journal of Artificial Societies and Social Simulation, vol. 9(4), pages 1-4.

    Citations


    Cited by:

    1. José Manuel Galán & Luis R. Izquierdo & Segismundo S. Izquierdo & José Ignacio Santos & Ricardo del Olmo & Adolfo López-Paredes & Bruce Edmonds, 2009. "Errors and Artefacts in Agent-Based Modelling," Journal of Artificial Societies and Social Simulation, Journal of Artificial Societies and Social Simulation, vol. 12(1), pages 1-1.
    2. repec:eee:apmaco:v:320:y:2018:i:c:p:485-494 is not listed on IDEAS
    3. Luis R. Izquierdo & Segismundo S. Izquierdo & José Manuel Galán & José Ignacio Santos, 2009. "Techniques to Understand Computer Simulations: Markov Chain Analysis," Journal of Artificial Societies and Social Simulation, Journal of Artificial Societies and Social Simulation, vol. 12(1), pages 1-6.
    4. Dan Miodownik & Britt Cartrite & Ravi Bhavnani, 2010. "Between Replication and Docking: 'Adaptive Agents, Political Institutions, and Civic Traditions' Revisited," Journal of Artificial Societies and Social Simulation, Journal of Artificial Societies and Social Simulation, vol. 13(3), pages 1-1.
