
Symmetric equilibrium of multi-agent reinforcement learning in repeated prisoner’s dilemma

Authors

  • Usui, Yuki
  • Ueda, Masahiko

Abstract

We investigate the repeated prisoner’s dilemma game in which both players alternately use reinforcement learning to obtain their optimal memory-one strategies. We theoretically solve the simultaneous Bellman optimality equations of reinforcement learning. We find that, among all deterministic memory-one strategies, the Win-Stay Lose-Shift strategy, the Grim strategy, and the strategy that always defects can form a symmetric equilibrium of the mutual reinforcement learning process.
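
The claim admits a compact numerical check. The sketch below is not the paper's derivation: it brute-forces the same Bellman-optimality condition over all 16 deterministic memory-one strategies, under assumed payoffs (T, R, P, S) = (1.5, 1, 0, -0.5), chosen so that 2R > T + P (the regime in which Win-Stay Lose-Shift can be stable), and an assumed discount factor of 0.9. When the co-player fixes a memory-one rule p, the learner faces a Markov decision process whose state is the previous round's joint outcome; p forms a symmetric equilibrium of mutual learning when the action p prescribes attains the optimal Q-value in every state.

    import itertools

    # Illustrative payoffs and discount factor (assumptions, not the paper's):
    # temptation, reward, punishment, sucker, with 2R > T + P.
    T, R, P, S = 1.5, 1.0, 0.0, -0.5
    DELTA = 0.9

    # States are the previous round's joint outcome (my move, their move);
    # 0 = cooperate, 1 = defect.
    STATES = [(0, 0), (0, 1), (1, 0), (1, 1)]

    def payoff(a, b):
        """Stage payoff to the player choosing a against b."""
        return [[R, S], [T, P]][a][b]

    def opp_action(p, s):
        """Co-player's move under rule p: it conditions on the previous
        outcome seen from its own side, i.e. with the roles swapped."""
        mine, theirs = s
        return p[STATES.index((theirs, mine))]

    def optimal_values(p):
        """Value iteration for the 4-state MDP faced by a learner whose
        co-player fixes the deterministic memory-one rule p."""
        V = {s: 0.0 for s in STATES}
        for _ in range(1000):  # DELTA**1000 is negligible, so this converges
            V = {s: max(payoff(a, opp_action(p, s))
                        + DELTA * V[(a, opp_action(p, s))] for a in (0, 1))
                 for s in STATES}
        return V

    def is_symmetric_rl_equilibrium(p):
        """p satisfies the Bellman optimality equation against itself:
        in every state, the prescribed action attains the optimal Q-value."""
        V = optimal_values(p)
        for s in STATES:
            q = {a: payoff(a, opp_action(p, s))
                    + DELTA * V[(a, opp_action(p, s))] for a in (0, 1)}
            if q[p[STATES.index(s)]] < max(q.values()) - 1e-9:
                return False
        return True

    # Scan all 16 deterministic memory-one rules, indexed by STATES order.
    NAMES = {(0, 1, 1, 0): "Win-Stay Lose-Shift",
             (0, 1, 1, 1): "Grim",
             (1, 1, 1, 1): "All-D"}
    for p in itertools.product((0, 1), repeat=4):
        if is_symmetric_rl_equilibrium(p):
            print(p, NAMES.get(p, "unnamed"))

With these parameters the scan should single out the three rules named in the abstract, (0, 1, 1, 0), (0, 1, 1, 1), and (1, 1, 1, 1); whether ties admit further weak equilibria depends on the payoff choice, which is why this is a sketch rather than a substitute for the paper's parametric analysis.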

Suggested Citation

  • Usui, Yuki & Ueda, Masahiko, 2021. "Symmetric equilibrium of multi-agent reinforcement learning in repeated prisoner’s dilemma," Applied Mathematics and Computation, Elsevier, vol. 409(C).
  • Handle: RePEc:eee:apmaco:v:409:y:2021:i:c:s0096300321004598
    DOI: 10.1016/j.amc.2021.126370

Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0096300321004598
    Download restriction: full text for ScienceDirect subscribers only.

    File URL: https://libkey.io/10.1016/j.amc.2021.126370?utm_source=ideas
    LibKey link: if access is restricted and your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item.

    As access to this document is restricted, you may want to search for a different version of it.

    References listed on IDEAS

    1. Christian Hilbe & Krishnendu Chatterjee & Martin A. Nowak, 2018. "Partners and rivals in direct reciprocity," Nature Human Behaviour, Nature, vol. 2(7), pages 469-477, July.
    2. Imhof, Lorens & Nowak, Martin & Fudenberg, Drew, 2007. "Tit-for-Tat or Win-Stay, Lose-Shift?," Scholarly Articles 3200671, Harvard University Department of Economics.
    3. Marc Harper & Vincent Knight & Martin Jones & Georgios Koutsovoulos & Nikoleta E Glynatsi & Owen Campbell, 2017. "Reinforcement learning produces dominant strategies for the Iterated Prisoner’s Dilemma," PLOS ONE, Public Library of Science, vol. 12(12), pages 1-33, December.

    Citations

Citations are extracted by the CitEc Project; subscribe to its RSS feed for this item.

    Cited by:

    1. Wang, Xianjia & Yang, Zhipeng & Liu, Yanli & Chen, Guici, 2023. "A reinforcement learning-based strategy updating model for the cooperative evolution," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 618(C).
    2. Masahiko Ueda, 2022. "Controlling Conditional Expectations by Zero-Determinant Strategies," SN Operations Research Forum, Springer, vol. 3(3), pages 1-22, September.
    3. Ueda, Masahiko, 2023. "Memory-two strategies forming symmetric mutual reinforcement learning equilibrium in repeated prisoners’ dilemma game," Applied Mathematics and Computation, Elsevier, vol. 444(C).
    4. Ding, Zhen-Wei & Zheng, Guo-Zhong & Cai, Chao-Ran & Cai, Wei-Ran & Chen, Li & Zhang, Ji-Qiang & Wang, Xu-Ming, 2023. "Emergence of cooperation in two-agent repeated games with reinforcement learning," Chaos, Solitons & Fractals, Elsevier, vol. 175(P1).
    5. Yuan, Hairui & Meng, Xinzhu, 2022. "Replicator dynamics of the Hawk-Dove game with different stochastic noises in infinite populations," Applied Mathematics and Computation, Elsevier, vol. 430(C).
    6. Wolfram Barfuss & Janusz Meylahn, 2022. "Intrinsic fluctuations of reinforcement learning promote cooperation," Papers 2209.01013, arXiv.org, revised Feb 2023.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ding, Zhen-Wei & Zheng, Guo-Zhong & Cai, Chao-Ran & Cai, Wei-Ran & Chen, Li & Zhang, Ji-Qiang & Wang, Xu-Ming, 2023. "Emergence of cooperation in two-agent repeated games with reinforcement learning," Chaos, Solitons & Fractals, Elsevier, vol. 175(P1).
    2. Molnar, Grant & Hammond, Caroline & Fu, Feng, 2023. "Reactive means in the iterated Prisoner’s dilemma," Applied Mathematics and Computation, Elsevier, vol. 458(C).
    3. Yohsuke Murase & Seung Ki Baek, 2021. "Friendly-rivalry solution to the iterated n-person public-goods game," PLOS Computational Biology, Public Library of Science, vol. 17(1), pages 1-17, January.
    4. Masahiko Ueda & Toshiyuki Tanaka, 2020. "Linear algebraic structure of zero-determinant strategies in repeated games," PLOS ONE, Public Library of Science, vol. 15(4), pages 1-13, April.
    5. Christopher Lee & Marc Harper & Dashiell Fryer, 2015. "The Art of War: Beyond Memory-one Strategies in Population Games," PLOS ONE, Public Library of Science, vol. 10(3), pages 1-16, March.
    6. Laura Schmid & Farbod Ekbatani & Christian Hilbe & Krishnendu Chatterjee, 2023. "Quantitative assessment can stabilize indirect reciprocity under imperfect information," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    7. Drew Fudenberg & David G. Rand & Anna Dreber, 2012. "Slow to Anger and Fast to Forgive: Cooperation in an Uncertain World," American Economic Review, American Economic Association, vol. 102(2), pages 720-749, April.
    8. Evans, Alecia & Sesmero, Juan Pablo, 2022. "Noisy Payoffs in an Infinitely Repeated Prisoner’s Dilemma – Experimental Evidence," 2022 Annual Meeting, July 31-August 2, Anaheim, California 322434, Agricultural and Applied Economics Association.
    9. Chang, Shuhua & Zhang, Zhipeng & Wu, Yu’e & Xie, Yunya, 2018. "Cooperation is enhanced by inhomogeneous inertia in spatial prisoner’s dilemma game," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 490(C), pages 419-425.
    10. Kurokawa, Shun, 2019. "How memory cost, switching cost, and payoff non-linearity affect the evolution of persistence," Applied Mathematics and Computation, Elsevier, vol. 341(C), pages 174-192.
    11. Yongkui Liu & Xiaojie Chen & Lin Zhang & Long Wang & Matjaž Perc, 2012. "Win-Stay-Lose-Learn Promotes Cooperation in the Spatial Prisoner's Dilemma Game," PLOS ONE, Public Library of Science, vol. 7(2), pages 1-8, February.
    12. Duffy, Sean & Naddeo, JJ & Owens, David & Smith, John, 2016. "Cognitive load and mixed strategies: On brains and minimax," MPRA Paper 71878, University Library of Munich, Germany.
    13. Zhang, Huanren, 2018. "Errors can increase cooperation in finite populations," Games and Economic Behavior, Elsevier, vol. 107(C), pages 203-219.
    14. Ma, Yin-Jie & Jiang, Zhi-Qiang & Podobnik, Boris, 2022. "Predictability of players’ actions as a mechanism to boost cooperation," Chaos, Solitons & Fractals, Elsevier, vol. 164(C).
    15. El-Seidy, Essam & Soliman, Karim.M., 2016. "Iterated symmetric three-player prisoner’s dilemma game," Applied Mathematics and Computation, Elsevier, vol. 282(C), pages 117-127.
    16. Ochea, Marius-Ionut, 2013. "Evolution of repeated prisoner's dilemma play under logit dynamics," Journal of Economic Dynamics and Control, Elsevier, vol. 37(12), pages 2483-2499.
    17. Xiaofeng Wang, 2021. "Costly Participation and The Evolution of Cooperation in the Repeated Public Goods Game," Dynamic Games and Applications, Springer, vol. 11(1), pages 161-183, March.
    18. Wang, Tao & Chen, Zhigang & Li, Kenli & Deng, Xiaoheng & Li, Deng, 2014. "Memory does not necessarily promote cooperation in dilemma games," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 395(C), pages 218-227.
    19. Ueda, Masahiko, 2023. "Memory-two strategies forming symmetric mutual reinforcement learning equilibrium in repeated prisoners’ dilemma game," Applied Mathematics and Computation, Elsevier, vol. 444(C).
    20. Hahnel, Ulf J.J. & Fell, Michael J., 2022. "Pricing decisions in peer-to-peer and prosumer-centred electricity markets: Experimental analysis in Germany and the United Kingdom," Renewable and Sustainable Energy Reviews, Elsevier, vol. 162(C).

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:eee:apmaco:v:409:y:2021:i:c:s0096300321004598. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do so here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link it to an item in RePEc, you can help with this form.

    If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Catherine Liu (email available below). General contact details of provider: https://www.journals.elsevier.com/applied-mathematics-and-computation .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.