Reinforcement Learning Explains Conditional Cooperation and Its Moody Cousin


Author

Listed:

Takahiro Ezaki
Yutaka Horita
Masanori Takezawa
Naoki Masuda


Abstract

Direct reciprocity, or repeated interaction, is a main mechanism for sustaining cooperation under social dilemmas involving two individuals. For larger groups and networks, which are probably more relevant to understanding and engineering our society, experiments employing repeated multiplayer social dilemma games have suggested that humans often show conditional cooperation behavior and its moody variant. Mechanisms underlying these behaviors largely remain unclear. Here we provide a proximate account for this behavior by showing that individuals adopting a type of reinforcement learning, called aspiration learning, phenomenologically behave as conditional cooperators. By definition, individuals are satisfied if and only if the obtained payoff is larger than a fixed aspiration level. They reinforce actions that have resulted in satisfactory outcomes and anti-reinforce those yielding unsatisfactory outcomes. The results obtained in the present study are general in that they explain extant experimental results obtained for both so-called moody and non-moody conditional cooperation, prisoner’s dilemma and public goods games, and well-mixed groups and networks. Unlike in previous theory, individuals are assumed to have no access to information about what other individuals are doing, so they cannot explicitly use conditional cooperation rules. In this sense, myopic aspiration learning, in which the unconditional propensity to cooperate is modulated in every discrete time step, explains the conditional behavior of humans. Aspiration learners showing (moody) conditional cooperation obeyed a noisy GRIM-like strategy. This differs from Pavlov, a reinforcement learning strategy that promotes mutual cooperation in two-player situations.

Author Summary: Laboratory experiments using human participants have shown that, in groups or contact networks, humans often behave as conditional cooperators or their moody variant. Although conditional cooperation in dyadic interaction is well understood, mechanisms underlying these behaviors in groups or networks beyond a pair of individuals largely remain unclear. In this study, we show that players adopting a type of reinforcement learning exhibit these conditional cooperation behaviors. The results are general in the sense that the model explains experimental results to date obtained in various situations. It explains moody conditional cooperation, which is a recently discovered behavioral trait of humans, in addition to traditional conditional cooperation. It also explains experimental results obtained with both the prisoner’s dilemma and public goods games and with different population structures. Crucially, our model assumes that individuals do not have access to information about what other individuals are doing, so they cannot explicitly condition their behavior on how many others have previously cooperated. Thus, our results provide a proximate and unified understanding of these experimentally observed patterns.
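
The aspiration-learning rule described in the abstract can be sketched in a few lines. This is only a minimal illustration, assuming a Bush-Mosteller-style update in which a tanh-shaped stimulus measures satisfaction; the function name, the `sensitivity` parameter, and the example payoffs are hypothetical and do not come from the abstract.

```python
import math


def update_cooperation_probability(p, cooperated, payoff, aspiration,
                                   sensitivity=1.0):
    """Return the probability of cooperating after one learning step.

    Assumed form (not stated in the abstract): the stimulus is
    tanh(sensitivity * (payoff - aspiration)), so it is positive when the
    player is satisfied and negative when the player is unsatisfied.
    """
    stimulus = math.tanh(sensitivity * (payoff - aspiration))
    if cooperated:
        if stimulus >= 0:
            # Satisfied after cooperating: reinforce cooperation.
            return p + (1.0 - p) * stimulus
        # Unsatisfied after cooperating: anti-reinforce cooperation.
        return p + p * stimulus
    if stimulus >= 0:
        # Satisfied after defecting: reinforce defection.
        return p - p * stimulus
    # Unsatisfied after defecting: anti-reinforce defection.
    return p - (1.0 - p) * stimulus


# Hypothetical payoffs around an aspiration level of 1.0.
after_good_cooperation = update_cooperation_probability(0.5, True, 3.0, 1.0)
after_bad_cooperation = update_cooperation_probability(0.5, True, 0.0, 1.0)
after_good_defection = update_cooperation_probability(0.5, False, 3.0, 1.0)
after_bad_defection = update_cooperation_probability(0.5, False, 0.0, 1.0)
```

The update takes only the player's own last action and payoff as input, which matches the abstract's assumption that individuals cannot observe what others are doing. Any apparent conditioning on others' behavior would have to arise indirectly through the payoffs.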

Suggested Citation

Takahiro Ezaki & Yutaka Horita & Masanori Takezawa & Naoki Masuda, 2016. "Reinforcement Learning Explains Conditional Cooperation and Its Moody Cousin," PLOS Computational Biology, Public Library of Science, vol. 12(7), pages 1-13, July.

Handle: RePEc:plo:pcbi00:1005034
DOI: 10.1371/journal.pcbi.1005034



Citations


Cited by:

You, Tao & Yang, Haochun & Wang, Jian & Zhang, Peng & Chen, Jinchao & Zhang, Ying, 2023. "Cooperative behavior under the influence of multiple experienced guiders in Prisoner’s dilemma game," Applied Mathematics and Computation, Elsevier, vol. 458(C).
Castañeda, Gonzalo & Chávez-Juárez, Florian & Guerrero, Omar A., 2018. "How do governments determine policy priorities? Studying development strategies through spillover networks," Journal of Economic Behavior & Organization, Elsevier, vol. 154(C), pages 335-361.
- Omar A. Guerrero & Gonzalo Castañeda & Florian Chávez-Juárez, 2019. "How do governments determine policy priorities? Studying development strategies through spillover networks," Papers 1902.00432, arXiv.org.
Xiaofeng Wang, 2021. "Costly Participation and The Evolution of Cooperation in the Repeated Public Goods Game," Dynamic Games and Applications, Springer, vol. 11(1), pages 161-183, March.
Han, Xu & Zhao, Xiaowei & Xia, Haoxiang, 2022. "Hybrid learning promotes cooperation in the spatial prisoner’s dilemma game," Chaos, Solitons & Fractals, Elsevier, vol. 164(C).
Jia, Danyang & Li, Tong & Zhao, Yang & Zhang, Xiaoqin & Wang, Zhen, 2022. "Empty nodes affect conditional cooperation under reinforcement learning," Applied Mathematics and Computation, Elsevier, vol. 413(C).
You, Tao & Zhang, Hailun & Zhang, Ying & Li, Qing & Zhang, Peng & Yang, Mei, 2022. "The influence of experienced guider on cooperative behavior in the Prisoner’s dilemma game," Applied Mathematics and Computation, Elsevier, vol. 426(C).
Wolfram Barfuss & Janusz Meylahn, 2022. "Intrinsic fluctuations of reinforcement learning promote cooperation," Papers 2209.01013, arXiv.org, revised Feb 2023.
Geng, Yini & Liu, Yifan & Lu, Yikang & Shen, Chen & Shi, Lei, 2022. "Reinforcement learning explains various conditional cooperation," Applied Mathematics and Computation, Elsevier, vol. 427(C).
Takahiro Ezaki & Naoki Masuda, 2017. "Reinforcement learning account of network reciprocity," PLOS ONE, Public Library of Science, vol. 12(12), pages 1-8, December.
Molnar, Grant & Hammond, Caroline & Fu, Feng, 2023. "Reactive means in the iterated Prisoner’s dilemma," Applied Mathematics and Computation, Elsevier, vol. 458(C).
Guo, Yujie & Zhang, Liming & Li, Haihong & Dai, Qionglin & Yang, Junzhong, 2023. "Network adaption based on environment feedback promotes cooperation in co-evolutionary games," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 617(C).
Yang, Zhengzhi & Zheng, Lei & Perc, Matjaž & Li, Yumeng, 2024. "Interaction state Q-learning promotes cooperation in the spatial prisoner's dilemma game," Applied Mathematics and Computation, Elsevier, vol. 463(C).

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Duffy, John, 2006. "Agent-Based Models and Human Subject Experiments," Handbook of Computational Economics, in: Leigh Tesfatsion & Kenneth L. Judd (ed.), Handbook of Computational Economics, edition 1, volume 2, chapter 19, pages 949-1011, Elsevier.
- John Duffy, 2004. "Agent-Based Models and Human Subject Experiments," Computational Economics 0412001, University Library of Munich, Germany.
Zhijian Wang & Yanran Zhou & Jaimie W. Lien & Jie Zheng & Bin Xu, 2016. "Extortion Can Outperform Generosity in the Iterated Prisoners' Dilemma," Levine's Bibliography 786969000000001297, UCLA Department of Economics.
Yali Dong & Cong Li & Yi Tao & Boyu Zhang, 2015. "Evolution of Conformity in Social Dilemmas," PLOS ONE, Public Library of Science, vol. 10(9), pages 1-12, September.
Martin G. Kocher & Peter Martinsson & Kristian Ove R. Myrseth & Conny E. Wollbrant, 2017. "Strong, bold, and kind: self-control and cooperation in social dilemmas," Experimental Economics, Springer;Economic Science Association, vol. 20(1), pages 44-69, March.
- Kocher, Martin G. & Martinsson, Peter & Myrseth, Kristian Ove R. & Wollbrant, Conny, 2012. "Strong, Bold, and Kind: Self-Control and Cooperation in Social Dilemmas," Working Papers in Economics 523, University of Gothenburg, Department of Economics, revised 02 Apr 2013.
- Martin G. Kocher & Peter Martinsson & Kristian Ove R. Myrseth & Conny Wollbrant, 2013. "Strong, Bold, and Kind: Self-Control and Cooperation in Social Dilemmas," CESifo Working Paper Series 4200, CESifo.
- Kocher, Martin G. & Martinsson, Peter & Myrseth, Kristian Ove R. & Wollbrant, Conny E., 2017. "Strong, bold, and kind: self-control and cooperation in social dilemmas," Munich Reprints in Economics 55035, University of Munich, Department of Economics.
- Martin G. Kocher & Peter Martinsson & Kristian Ove R. Myrseth & Conny Wollbrant, 2012. "Strong, bold, and kind: Self-control and cooperation in social dilemmas," ESMT Research Working Papers ESMT-12-01 (R1), ESMT European School of Management and Technology, revised 28 Mar 2013.
- Kocher, Martin G. & Martinsson, Peter & Myrseth, Kristian Ove R. & Wollbrant, Conny, 2012. "Strong, Bold, and Kind: Self-Control and Cooperation in Social Dilemmas," Discussion Papers in Economics 12706, University of Munich, Department of Economics.
Gächter, Simon & Renner, Elke, 2018. "Leaders as role models and ‘belief managers’ in social dilemmas," Journal of Economic Behavior & Organization, Elsevier, vol. 154(C), pages 321-334.
Josephine G. Gatua, 2021. "Information and cooperation in preventive health behavior: The case of bed net use in rural Kenya," Health Economics, John Wiley & Sons, Ltd., vol. 30(9), pages 2124-2143, September.
Kölle, Felix & Quercia, Simone, 2021. "The influence of empirical and normative expectations on cooperation," Journal of Economic Behavior & Organization, Elsevier, vol. 190(C), pages 691-703.
- Felix Kölle & Simone Quercia, 2021. "The Influence of Empirical and Normative Expectations on Cooperation," ECONtribute Discussion Papers Series 099, University of Bonn and University of Cologne, Germany.
Martinsson, Peter & Pham-Khanh, Nam & Villegas-Palacio, Clara, 2013. "Conditional cooperation and disclosure in developing countries," Journal of Economic Psychology, Elsevier, vol. 34(C), pages 148-155.
- Martinsson, Peter & Pham-Khanh, Nam & Villegas-Palacio, Clara, 2012. "Conditional Cooperation and Disclosure in Developing Countries," Working Papers in Economics 541, University of Gothenburg, Department of Economics.
Ernesto Reuben & Sigrid Suetens, 2012. "Revisiting strategic versus non-strategic cooperation," Experimental Economics, Springer;Economic Science Association, vol. 15(1), pages 24-43, March.
- Reuben, E. & Suetens, S., 2009. "Revisiting Strategic versus Non-strategic Cooperation," Discussion Paper 2009-22, Tilburg University, Center for Economic Research.
- Reuben, E. & Suetens, S., 2009. "Revisiting Strategic versus Non-strategic Cooperation," Other publications TiSEM 4ed16b68-4a46-4565-a6ba-6, Tilburg University, School of Economics and Management.
- Reuben, Ernesto & Suetens, Sigrid, 2009. "Revisiting Strategic versus Non-Strategic Cooperation," IZA Discussion Papers 4107, Institute of Labor Economics (IZA).
Weber, Till O. & Schulz, Jonathan F. & Beranek, Benjamin & Lambarraa-Lehnhardt, Fatima & Gächter, Simon, 2023. "The behavioral mechanisms of voluntary cooperation across culturally diverse societies: Evidence from the US, the UK, Morocco, and Turkey," Journal of Economic Behavior & Organization, Elsevier, vol. 215(C), pages 134-152.
- Till O. Weber & Jonathan F. Schulz & Benjamin Beranek & Fatima Lambarraa-Lehnhardt & Simon Gaechter, 2023. "The Behavioral Mechanisms of Voluntary Cooperation across Culturally Diverse Societies: Evidence from the US, the UK, Morocco, and Turkey," CESifo Working Paper Series 10637, CESifo.
- Weber, Till O. & Schulz, Jonathan F. & Beranek, Benjamin & Lambarraa-Lehnhardt, Fatima & Gächter, Simon, 2023. "The Behavioral Mechanisms of Voluntary Cooperation across Culturally Diverse Societies: Evidence from the US, the UK, Morocco, and Turkey," IZA Discussion Papers 16415, Institute of Labor Economics (IZA).
Tobias Cagala & Ulrich Glogowsky & Veronika Grimm & Johannes Rincke, 2019. "Public Goods Provision with Rent-extracting Administrators," The Economic Journal, Royal Economic Society, vol. 129(620), pages 1593-1617.
- Tobias Cagala & Ulrich Glogowsky & Veronika Grimm & Johannes Rincke, 2017. "Public Goods Provision with Rent-Extracting Administrators," CESifo Working Paper Series 6801, CESifo.
Baader, Malte & Gächter, Simon & Lee, Kyeongtae & Sefton, Martin, 2022. "Social Preferences and the Variability of Conditional Cooperation," IZA Discussion Papers 15523, Institute of Labor Economics (IZA).
- Malte Baader & Simon Gaechter & Kyeongtae Lee & Martin Sefton, 2022. "Social preferences and the variability of conditional cooperation," Discussion Papers 2022-13, The Centre for Decision Research and Experimental Economics, School of Economics, University of Nottingham.
- Malte Baader & Simon Gaechter & Kyeongtae Lee & Martin Sefton, 2022. "Social Preferences and the Variability of Conditional Cooperation," CESifo Working Paper Series 9924, CESifo.
E. J. Anderson & T. D. H. Cau, 2009. "Modeling Implicit Collusion Using Coevolution," Operations Research, INFORMS, vol. 57(2), pages 439-455, April.
Simon Gaechter & Elke Renner, 2014. "Leaders as Role Models for the Voluntary Provision of Public Goods," Discussion Papers 2014-11, The Centre for Decision Research and Experimental Economics, School of Economics, University of Nottingham.
- Simon Gaechter & Elke Renner, 2014. "Leaders as Role Models for the Voluntary Provision of Public Goods," CESifo Working Paper Series 5049, CESifo.
- Gächter, Simon & Renner, Elke, 2014. "Leaders as Role Models for the Voluntary Provision of Public Goods," IZA Discussion Papers 8580, Institute of Labor Economics (IZA).
Vanessa Mertins & Andrea B Schote & Wolfgang Hoffeld & Michele Griessmair & Jobst Meyer, 2011. "Genetic Susceptibility for Individual Cooperation Preferences: The Role of Monoamine Oxidase A Gene (MAOA) in the Voluntary Provision of Public Goods," PLOS ONE, Public Library of Science, vol. 6(6), pages 1-9, June.
John Realpe-Gómez & Daniele Vilone & Giulia Andrighetto & Luis G. Nardin & Javier A. Montoya, 2018. "Learning Dynamics and Norm Psychology Supports Human Cooperation in a Large-Scale Prisoner’s Dilemma on Networks," Games, MDPI, vol. 9(4), pages 1-14, November.
Akay, Alpaslan & Karabulut, Gökhan & Martinsson, Peter, 2011. "The Effect of Religion on Cooperation and Altruistic Punishment: Experimental Evidence from Public Goods Experiments," IZA Discussion Papers 6179, Institute of Labor Economics (IZA).
Martorana, Marco F. & Mazza, Isidoro, 2012. "Adaptive voting: an empirical analysis of participation and choice," MPRA Paper 36165, University Library of Munich, Germany.
Heymann, D. & Kawamura, E. & Perazzo, R. & Zimmermann, M.G., 2014. "Behavioral heuristics and market patterns in a Bertrand–Edgeworth game," Journal of Economic Behavior & Organization, Elsevier, vol. 105(C), pages 124-139.
- Daniel Heymann & Enrique Kawamura & Roberto Perazzo & Martin Zimmermann, 2011. "Behavioral Heuristics and Market Patterns in a Bertrand-Edgeworth Game," Working Papers 108, Universidad de San Andres, Departamento de Economia, revised Mar 2011.
Martorana, Marco & Mazza, Isidoro, 2010. "Satisfaction and adaptation in voting behavior: an empirical exploration," DEMQ Working Paper Series 2010/6, University of Catania, Department of Economics and Quantitative Methods.
- Martorana, Marco Ferdinando & Mazza, Isidoro, 2010. "Satisfaction and adaptation in voting behavior: an empirical exploration," MPRA Paper 29135, University Library of Munich, Germany, revised Jan 2011.

