IDEAS home Printed from https://ideas.repec.org/a/plo/pcbi00/1005034.html

Reinforcement Learning Explains Conditional Cooperation and Its Moody Cousin

Author

Listed:
  • Takahiro Ezaki
  • Yutaka Horita
  • Masanori Takezawa
  • Naoki Masuda

Abstract

Direct reciprocity, or repeated interaction, is a main mechanism to sustain cooperation under social dilemmas involving two individuals. For larger groups and networks, which are probably more relevant to understanding and engineering our society, experiments employing repeated multiplayer social dilemma games have suggested that humans often show conditional cooperation behavior and its moody variant. Mechanisms underlying these behaviors largely remain unclear. Here we provide a proximate account for this behavior by showing that individuals adopting a type of reinforcement learning, called aspiration learning, phenomenologically behave as conditional cooperators. By definition, individuals are satisfied if and only if the obtained payoff is larger than a fixed aspiration level. They reinforce actions that have resulted in satisfactory outcomes and anti-reinforce those yielding unsatisfactory outcomes. The results obtained in the present study are general in that they explain extant experimental results obtained for both so-called moody and non-moody conditional cooperation, prisoner’s dilemma and public goods games, and well-mixed groups and networks. In contrast to previous theory, individuals are assumed to have no access to information about what other individuals are doing, so they cannot explicitly use conditional cooperation rules. In this sense, myopic aspiration learning, in which the unconditional propensity to cooperate is modulated in every discrete time step, explains conditional behavior of humans. Aspiration learners showing (moody) conditional cooperation obeyed a noisy GRIM-like strategy. This is different from Pavlov, a reinforcement learning strategy promoting mutual cooperation in two-player situations.

Author Summary: Laboratory experiments using human participants have shown that, in groups or contact networks, humans often behave as conditional cooperators or their moody variant. Although conditional cooperation in dyadic interaction is well understood, mechanisms underlying these behaviors in groups or networks beyond a pair of individuals largely remain unclear. In this study, we show that players adopting a type of reinforcement learning exhibit these conditional cooperation behaviors. The results are general in the sense that the model explains experimental results to date obtained in various situations. It explains moody conditional cooperation, which is a recently discovered behavioral trait of humans, in addition to traditional conditional cooperation. It also explains experimental results obtained with both the prisoner’s dilemma and public goods games and with different population structures. Crucially, our model assumes that individuals do not have access to information about what other individuals are doing, so they cannot explicitly condition their behavior on how many others have previously cooperated. Thus, our results provide a proximate and unified understanding of these experimentally observed patterns.
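
The aspiration-learning rule described in the abstract (satisfaction when the payoff exceeds a fixed aspiration level, reinforcement of satisfactory actions, anti-reinforcement of unsatisfactory ones) can be illustrated with a minimal sketch. The abstract does not give the update equations, so the form below is an assumption: a Bush-Mosteller-style update with a tanh stimulus. The names `stimulus`, `update_cooperation_prob`, `sensitivity`, and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def stimulus(payoff, aspiration, sensitivity=1.0):
    """Signed satisfaction in (-1, 1): positive iff payoff > aspiration.

    The tanh form and the `sensitivity` parameter are illustrative assumptions.
    """
    return math.tanh(sensitivity * (payoff - aspiration))


def update_cooperation_prob(p, cooperated, payoff, aspiration, sensitivity=1.0):
    """Return the next probability of cooperating.

    p          -- current probability of cooperating, in [0, 1]
    cooperated -- True if the player cooperated in the last round
    The player sees only its own action and payoff, not what others did.
    """
    s = stimulus(payoff, aspiration, sensitivity)
    if cooperated:
        if s >= 0:
            # Satisfied after cooperating: reinforce cooperation.
            return p + (1.0 - p) * s
        # Unsatisfied after cooperating: anti-reinforce cooperation.
        return p + p * s
    if s >= 0:
        # Satisfied after defecting: reinforce defection (lower p).
        return p - p * s
    # Unsatisfied after defecting: anti-reinforce defection (raise p).
    return p - (1.0 - p) * s


if __name__ == "__main__":
    p = 0.5
    # Cooperated and earned more than the aspiration level: p goes up.
    print(update_cooperation_prob(p, True, payoff=3.0, aspiration=1.0))
    # Cooperated and earned less than the aspiration level: p goes down.
    print(update_cooperation_prob(p, True, payoff=0.0, aspiration=1.0))
```

Because the update uses only the player's own last action and payoff, any dependence of cooperation on the number of cooperating neighbors has to arise indirectly, through the payoff those neighbors generate. That is the sense in which the abstract says conditional cooperation emerges without explicit conditioning.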

Suggested Citation

  • Takahiro Ezaki & Yutaka Horita & Masanori Takezawa & Naoki Masuda, 2016. "Reinforcement Learning Explains Conditional Cooperation and Its Moody Cousin," PLOS Computational Biology, Public Library of Science, vol. 12(7), pages 1-13, July.
  • Handle: RePEc:plo:pcbi00:1005034
    DOI: 10.1371/journal.pcbi.1005034

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005034
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1005034&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pcbi.1005034?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    Citations

    Cited by:

    1. You, Tao & Yang, Haochun & Wang, Jian & Zhang, Peng & Chen, Jinchao & Zhang, Ying, 2023. "Cooperative behavior under the influence of multiple experienced guiders in Prisoner’s dilemma game," Applied Mathematics and Computation, Elsevier, vol. 458(C).
    2. Castañeda, Gonzalo & Chávez-Juárez, Florian & Guerrero, Omar A., 2018. "How do governments determine policy priorities? Studying development strategies through spillover networks," Journal of Economic Behavior & Organization, Elsevier, vol. 154(C), pages 335-361.
    3. Xiaofeng Wang, 2021. "Costly Participation and The Evolution of Cooperation in the Repeated Public Goods Game," Dynamic Games and Applications, Springer, vol. 11(1), pages 161-183, March.
    4. Han, Xu & Zhao, Xiaowei & Xia, Haoxiang, 2022. "Hybrid learning promotes cooperation in the spatial prisoner’s dilemma game," Chaos, Solitons & Fractals, Elsevier, vol. 164(C).
    5. Jia, Danyang & Li, Tong & Zhao, Yang & Zhang, Xiaoqin & Wang, Zhen, 2022. "Empty nodes affect conditional cooperation under reinforcement learning," Applied Mathematics and Computation, Elsevier, vol. 413(C).
    6. You, Tao & Zhang, Hailun & Zhang, Ying & Li, Qing & Zhang, Peng & Yang, Mei, 2022. "The influence of experienced guider on cooperative behavior in the Prisoner’s dilemma game," Applied Mathematics and Computation, Elsevier, vol. 426(C).
    7. Wolfram Barfuss & Janusz Meylahn, 2022. "Intrinsic fluctuations of reinforcement learning promote cooperation," Papers 2209.01013, arXiv.org, revised Feb 2023.
    8. Geng, Yini & Liu, Yifan & Lu, Yikang & Shen, Chen & Shi, Lei, 2022. "Reinforcement learning explains various conditional cooperation," Applied Mathematics and Computation, Elsevier, vol. 427(C).
    9. Takahiro Ezaki & Naoki Masuda, 2017. "Reinforcement learning account of network reciprocity," PLOS ONE, Public Library of Science, vol. 12(12), pages 1-8, December.
    10. Molnar, Grant & Hammond, Caroline & Fu, Feng, 2023. "Reactive means in the iterated Prisoner’s dilemma," Applied Mathematics and Computation, Elsevier, vol. 458(C).
    11. Guo, Yujie & Zhang, Liming & Li, Haihong & Dai, Qionglin & Yang, Junzhong, 2023. "Network adaption based on environment feedback promotes cooperation in co-evolutionary games," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 617(C).
    12. Yang, Zhengzhi & Zheng, Lei & Perc, Matjaž & Li, Yumeng, 2024. "Interaction state Q-learning promotes cooperation in the spatial prisoner's dilemma game," Applied Mathematics and Computation, Elsevier, vol. 463(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Duffy, John, 2006. "Agent-Based Models and Human Subject Experiments," Handbook of Computational Economics, in: Leigh Tesfatsion & Kenneth L. Judd (ed.), Handbook of Computational Economics, edition 1, volume 2, chapter 19, pages 949-1011, Elsevier.
    2. Zhijian Wang & Yanran Zhou & Jaimie W. Lien & Jie Zheng & Bin Xu, 2016. "Extortion Can Outperform Generosity in the Iterated Prisoners' Dilemma," Levine's Bibliography 786969000000001297, UCLA Department of Economics.
    3. Yali Dong & Cong Li & Yi Tao & Boyu Zhang, 2015. "Evolution of Conformity in Social Dilemmas," PLOS ONE, Public Library of Science, vol. 10(9), pages 1-12, September.
    4. Martin G. Kocher & Peter Martinsson & Kristian Ove R. Myrseth & Conny E. Wollbrant, 2017. "Strong, bold, and kind: self-control and cooperation in social dilemmas," Experimental Economics, Springer;Economic Science Association, vol. 20(1), pages 44-69, March.
    5. Gächter, Simon & Renner, Elke, 2018. "Leaders as role models and ‘belief managers’ in social dilemmas," Journal of Economic Behavior & Organization, Elsevier, vol. 154(C), pages 321-334.
    6. Josephine G. Gatua, 2021. "Information and cooperation in preventive health behavior: The case of bed net use in rural Kenya," Health Economics, John Wiley & Sons, Ltd., vol. 30(9), pages 2124-2143, September.
    7. Kölle, Felix & Quercia, Simone, 2021. "The influence of empirical and normative expectations on cooperation," Journal of Economic Behavior & Organization, Elsevier, vol. 190(C), pages 691-703.
    8. Martinsson, Peter & Pham-Khanh, Nam & Villegas-Palacio, Clara, 2013. "Conditional cooperation and disclosure in developing countries," Journal of Economic Psychology, Elsevier, vol. 34(C), pages 148-155.
    9. Ernesto Reuben & Sigrid Suetens, 2012. "Revisiting strategic versus non-strategic cooperation," Experimental Economics, Springer;Economic Science Association, vol. 15(1), pages 24-43, March.
    10. Weber, Till O. & Schulz, Jonathan F. & Beranek, Benjamin & Lambarraa-Lehnhardt, Fatima & Gächter, Simon, 2023. "The behavioral mechanisms of voluntary cooperation across culturally diverse societies: Evidence from the US, the UK, Morocco, and Turkey," Journal of Economic Behavior & Organization, Elsevier, vol. 215(C), pages 134-152.
    11. Tobias Cagala & Ulrich Glogowsky & Veronika Grimm & Johannes Rincke, 2019. "Public Goods Provision with Rent-extracting Administrators," The Economic Journal, Royal Economic Society, vol. 129(620), pages 1593-1617.
    12. Baader, Malte & Gächter, Simon & Lee, Kyeongtae & Sefton, Martin, 2022. "Social Preferences and the Variability of Conditional Cooperation," IZA Discussion Papers 15523, Institute of Labor Economics (IZA).
    13. E. J. Anderson & T. D. H. Cau, 2009. "Modeling Implicit Collusion Using Coevolution," Operations Research, INFORMS, vol. 57(2), pages 439-455, April.
    14. Simon Gaechter & Elke Renner, 2014. "Leaders as Role Models for the Voluntary Provision of Public Goods," Discussion Papers 2014-11, The Centre for Decision Research and Experimental Economics, School of Economics, University of Nottingham.
    15. Vanessa Mertins & Andrea B Schote & Wolfgang Hoffeld & Michele Griessmair & Jobst Meyer, 2011. "Genetic Susceptibility for Individual Cooperation Preferences: The Role of Monoamine Oxidase A Gene (MAOA) in the Voluntary Provision of Public Goods," PLOS ONE, Public Library of Science, vol. 6(6), pages 1-9, June.
    16. John Realpe-Gómez & Daniele Vilone & Giulia Andrighetto & Luis G. Nardin & Javier A. Montoya, 2018. "Learning Dynamics and Norm Psychology Supports Human Cooperation in a Large-Scale Prisoner’s Dilemma on Networks," Games, MDPI, vol. 9(4), pages 1-14, November.
    17. Akay, Alpaslan & Karabulut, Gökhan & Martinsson, Peter, 2011. "The Effect of Religion on Cooperation and Altruistic Punishment: Experimental Evidence from Public Goods Experiments," IZA Discussion Papers 6179, Institute of Labor Economics (IZA).
    18. Martorana, Marco F. & Mazza, Isidoro, 2012. "Adaptive voting: an empirical analysis of participation and choice," MPRA Paper 36165, University Library of Munich, Germany.
    19. Heymann, D. & Kawamura, E. & Perazzo, R. & Zimmermann, M.G., 2014. "Behavioral heuristics and market patterns in a Bertrand–Edgeworth game," Journal of Economic Behavior & Organization, Elsevier, vol. 105(C), pages 124-139.
    20. Martorana, Marco & Mazza, Isidoro, 2010. "Satisfaction and adaptation in voting behavior: an empirical exploration," DEMQ Working Paper Series 2010/6, University of Catania, Department of Economics and Quantitative Methods.
