
Intrinsic fluctuations of reinforcement learning promote cooperation

Author

Listed:
  • Wolfram Barfuss
  • Janusz Meylahn

Abstract

In this work, we ask and answer what makes classical temporal-difference reinforcement learning with epsilon-greedy strategies cooperative. Cooperating in social dilemma situations is vital for animals, humans, and machines. While evolutionary theory has revealed a range of mechanisms promoting cooperation, the conditions under which agents learn to cooperate are contested. Here, we demonstrate which individual elements of the multi-agent learning setting lead to cooperation, and how. We use the iterated Prisoner's Dilemma with one-period memory as a testbed. Each of the two learning agents learns a strategy that conditions its next action choice on both agents' action choices in the last round. We find that, next to a strong weighting of future rewards, a low exploration rate, and a small learning rate, it is primarily the intrinsic stochastic fluctuations of the reinforcement learning process that double the final rate of cooperation, to up to 80%. Thus, inherent noise is not a necessary evil of the iterative learning process; it is a critical asset for the learning of cooperation. However, we also point out the trade-off between a high likelihood of cooperative behavior and achieving it in a reasonable amount of time. Our findings are relevant for purposefully designing cooperative algorithms and regulating undesired collusive effects.
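
The setting described in the abstract can be illustrated with a minimal sketch: two independent learners play the iterated Prisoner's Dilemma, each conditioning on the joint action of the last round and choosing actions epsilon-greedily. This is not the authors' implementation. Tabular Q-learning is used here as one representative temporal-difference method, and the payoff values, the default parameters, and the names `epsilon_greedy` and `run` are assumptions made for illustration only.

```python
import random

# Assumed Prisoner's Dilemma payoffs (T > R > P > S); not taken from the source.
R, S, T, P = 3.0, 0.0, 5.0, 1.0
C, D = 0, 1  # cooperate, defect
PAYOFF = {(C, C): (R, R), (C, D): (S, T), (D, C): (T, S), (D, D): (P, P)}
# One-period memory: the state is the joint action of the previous round.
STATES = [(a, b) for a in (C, D) for b in (C, D)]


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.choice((C, D))
    best = max(q_row)
    return rng.choice([a for a in (C, D) if q_row[a] == best])


def run(steps=10000, alpha=0.05, gamma=0.95, epsilon=0.01, seed=0):
    """Simulate two Q-learners; return the fraction of mutually cooperative rounds."""
    rng = random.Random(seed)
    # Each agent keeps its own Q-table over (state, action) pairs.
    q = [{s: [0.0, 0.0] for s in STATES} for _ in range(2)]
    state = (rng.choice((C, D)), rng.choice((C, D)))
    mutual_coop = 0
    for _ in range(steps):
        actions = tuple(epsilon_greedy(q[i][state], epsilon, rng) for i in range(2))
        rewards = PAYOFF[actions]
        next_state = actions
        for i in range(2):
            # Temporal-difference (Q-learning) update with learning rate alpha
            # and discount factor gamma.
            td_target = rewards[i] + gamma * max(q[i][next_state])
            q[i][state][actions[i]] += alpha * (td_target - q[i][state][actions[i]])
        mutual_coop += actions == (C, C)
        state = next_state
    return mutual_coop / steps
```

Calling `run` with different seeds gives different trajectories, because both exploration and tie-breaking are stochastic. The discount factor `gamma`, exploration rate `epsilon`, and learning rate `alpha` correspond to the three parameters named in the abstract; the sketch makes no claim about reproducing the reported cooperation rates.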

Suggested Citation

  • Wolfram Barfuss & Janusz Meylahn, 2022. "Intrinsic fluctuations of reinforcement learning promote cooperation," Papers 2209.01013, arXiv.org, revised Feb 2023.
  • Handle: RePEc:arx:papers:2209.01013

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2209.01013
    File Function: Latest version
    Download Restriction: no


    Citations



    Cited by:

    1. Jay Armas & Wout Merbis & Janusz Meylahn & Soroush Rafiee Rad & Mauricio J. del Razo, 2023. "Risk aversion promotes cooperation," Papers 2306.05971, arXiv.org.
    2. Ding, Zhen-Wei & Zheng, Guo-Zhong & Cai, Chao-Ran & Cai, Wei-Ran & Chen, Li & Zhang, Ji-Qiang & Wang, Xu-Ming, 2023. "Emergence of cooperation in two-agent repeated games with reinforcement learning," Chaos, Solitons & Fractals, Elsevier, vol. 175(P1).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Werner, Tobias, 2021. "Algorithmic and human collusion," DICE Discussion Papers 372, Heinrich Heine University Düsseldorf, Düsseldorf Institute for Competition Economics (DICE).
    2. Martin, Simon & Rasch, Alexander, 2022. "Collusion by algorithm: The role of unobserved actions," DICE Discussion Papers 382, Heinrich Heine University Düsseldorf, Düsseldorf Institute for Competition Economics (DICE).
    3. Frédéric Marty & Thierry Warin, 2023. "Deciphering Algorithmic Collusion: Insights from Bandit Algorithms and Implications for Antitrust Enforcement," CIRANO Working Papers 2023s-26, CIRANO.
    4. Simon Martin & Alexander Rasch, 2022. "Collusion by Algorithm: The Role of Unobserved Actions," CESifo Working Paper Series 9629, CESifo.
    5. Aleksandar B. Todorov, 2022. "Algorithmic pricing and concerted behaviour – competitive challenges?," Economic Thought journal, Bulgarian Academy of Sciences - Economic Research Institute, issue 1, pages 90-107.
    6. Thomas Loots & Arnoud V. den Boer, 2023. "Data‐driven collusion and competition in a pricing duopoly with multinomial logit demand," Production and Operations Management, Production and Operations Management Society, vol. 32(4), pages 1169-1186, April.
    7. João E. Gata, 2019. "Controlling Algorithmic Collusion: short review of the literature, undecidability, and alternative approaches," Working Papers REM 2019/77, ISEG - Lisbon School of Economics and Management, REM, Universidade de Lisboa.
    8. Marcel Wieting & Geza Sapi, 2021. "Algorithms in the Marketplace: An Empirical Analysis of Automated Pricing in E-Commerce," Working Papers 21-06, NET Institute.
    9. Yiquan Gu & Leonardo Madio & Carlo Reggiani, 2019. "Exclusive Data, Price Manipulation and Market Leadership," CESifo Working Paper Series 7853, CESifo.
    10. Stefano Colombo & Aldo Pignataro, 2022. "Information accuracy and collusion," Journal of Economics & Management Strategy, Wiley Blackwell, vol. 31(3), pages 638-656, August.
    11. Peter Seele & Claus Dierksmeier & Reto Hofstetter & Mario D. Schultz, 2021. "Mapping the Ethicality of Algorithmic Pricing: A Review of Dynamic and Personalized Pricing," Journal of Business Ethics, Springer, vol. 170(4), pages 697-719, May.
    12. Michele Bisceglia & Jorge Padilla, 2023. "On sellers' cooperation in hybrid marketplaces," Journal of Economics & Management Strategy, Wiley Blackwell, vol. 32(1), pages 207-222, January.
    13. Florian Peiseler & Alexander Rasch & Shiva Shekhar, 2022. "Imperfect information, algorithmic price discrimination, and collusion," Scandinavian Journal of Economics, Wiley Blackwell, vol. 124(2), pages 516-549, April.
    14. Hans-Theo Normann & Martin Sternberg, 2021. "Human-Algorithm Interaction: Algorithmic Pricing in Hybrid Laboratory Markets," Discussion Paper Series of the Max Planck Institute for Research on Collective Goods 2021_11, Max Planck Institute for Research on Collective Goods, revised 13 Apr 2022.
    15. Nunan, Daniel & Di Domenico, MariaLaura, 2022. "Value creation in an algorithmic world: Towards an ethics of dynamic pricing," Journal of Business Research, Elsevier, vol. 150(C), pages 451-460.
    16. Axel Gautier & Ashwin Ittoo & Pieter Cleynenbreugel, 2020. "AI algorithms, price discrimination and collusion: a technological, economic and legal perspective," European Journal of Law and Economics, Springer, vol. 50(3), pages 405-435, December.
    17. Werner, Tobias, 2023. "Algorithmic and Human Collusion," VfS Annual Conference 2023 (Regensburg): Growth and the "sociale Frage" 277573, Verein für Socialpolitik / German Economic Association.
    18. Karsten T. Hansen & Kanishka Misra & Mallesh M. Pai, 2021. "Frontiers: Algorithmic Collusion: Supra-competitive Prices via," Marketing Science, INFORMS, vol. 40(1), pages 1-12, January.
    19. Brias, Antoine & Munch, Stephan B., 2021. "Ecosystem based multi-species management using Empirical Dynamic Programming," Ecological Modelling, Elsevier, vol. 441(C).
    20. Lucila Porto, 2022. "Q-Learning algorithms in a Hotelling model," Asociación Argentina de Economía Política: Working Papers 4587, Asociación Argentina de Economía Política.
