
Cooperation dynamics driven by reinforcement learning with interactive diversity in structured populations

Authors
  • An, Tianbo
  • Zhang, Huizhen
  • Zhang, Zhanshuo
  • Liu, Guanghui
  • Li, Jiayu
  • Chen, Liangyu
  • Wang, Zhen

Abstract

In reality, individuals tend to make different decisions depending on their relationships with, and the behavior of, each neighbor. Based on this observation, the paper explores the evolution of cooperative behavior when agents learn a separate action for each neighbor through reinforcement learning. Simulation experiments show that our model raises the level of cooperation compared with results that consider only the agent’s own behavior. This is because agents tend to adopt cooperative strategies toward their neighbors while avoiding exploitation, thus promoting the steady expansion of cooperation. Notably, we find that agents do not always choose the action with the highest expected reward. Therefore, we classify the behavior strategies of the agents into 16 types, corresponding to all possible combinations of actions selected in different states. We observe that agents adopting one specific behavior strategy tend to dominate the evolutionary process: when they cooperate, they switch to defection in the next round regardless of the opponent’s action; conversely, when they defect, they switch to cooperation in the next round, again independent of the opponent’s behavior. These agents are typically distributed among others with different strategy types, playing a bridging and buffering role. By facilitating the expansion of neighboring agents, they contribute to the spread of cooperative behavior and ultimately enhance the overall level of cooperation in the population. Similar phenomena are also observed under specific initial distributions (e.g., ALLC, ALLD). Next, the hyperparameters of reinforcement learning are analyzed, and the results show that cooperation is easier to maintain and expand when agents make decisions based on past experience and fully consider potential future rewards.
We also compare this model with a control model that adopts the assumption of interactive homogeneity, and further examine the impact of different network structures on the evolution of cooperation. Finally, we introduce a memory mechanism for agents as an extended analysis of the model.
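
The count of 16 strategy types and the dominant alternating strategy can be illustrated with a minimal Python sketch. It rests on assumptions not stated in the abstract: that a state is the pair (agent's own last action, that neighbor's last action), giving four states, and that learning follows the standard tabular Q-learning rule with a learning rate `alpha` and discount factor `gamma`. The names `STATES`, `enumerate_strategies`, `is_alternator`, and `q_update` are hypothetical and are not taken from the paper.

```python
from itertools import product

ACTIONS = ("C", "D")  # cooperate, defect

# Assumption: a state is (own last action, neighbor's last action).
# This yields 4 states; the paper's exact state definition may differ.
STATES = tuple(product(ACTIONS, repeat=2))


def enumerate_strategies():
    """Return every deterministic map from state to next action (2**4 = 16)."""
    return [dict(zip(STATES, choice))
            for choice in product(ACTIONS, repeat=len(STATES))]


def is_alternator(strategy):
    """True if the next action always differs from the agent's own last
    action, whatever the neighbor did (C -> D and D -> C)."""
    return all(strategy[(own, opp)] != own for own, opp in STATES)


def q_update(q, state, action, reward, next_state, alpha, gamma):
    """Standard tabular Q-learning update (an assumed learning rule).

    q maps state -> {action: value}. Under interactive diversity, one such
    table would be kept per neighbor (also an assumption of this sketch).
    """
    best_next = max(q[next_state].values())
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


strategies = enumerate_strategies()
alternators = [s for s in strategies if is_alternator(s)]
```

Under these assumptions exactly one of the 16 maps matches the behavior described in the abstract. In `q_update`, a larger `alpha` weights new outcomes more heavily against accumulated estimates, and a larger `gamma` weights future rewards more heavily; how these settings relate to the paper's hyperparameter findings depends on its exact definitions.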

Suggested Citation

  • An, Tianbo & Zhang, Huizhen & Zhang, Zhanshuo & Liu, Guanghui & Li, Jiayu & Chen, Liangyu & Wang, Zhen, 2025. "Cooperation dynamics driven by reinforcement learning with interactive diversity in structured populations," Chaos, Solitons & Fractals, Elsevier, vol. 201(P2).
  • Handle: RePEc:eee:chsofr:v:201:y:2025:i:p2:s0960077925013219
    DOI: 10.1016/j.chaos.2025.117308

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0960077925013219
    Download Restriction: Full text for ScienceDirect subscribers only


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Xie, Kai & Szolnoki, Attila, 2026. "Reinforcement learning in evolutionary game theory: A brief review of recent developments," Applied Mathematics and Computation, Elsevier, vol. 510(C).
    2. Huang, Yijie, 2025. "The evolution of cooperation in multi-games with reinforcement learning," Chaos, Solitons & Fractals, Elsevier, vol. 201(P2).
    3. Liu, Xinyu & Jin, Wei & Chen, Guanrong & Tang, Changbing & Qian, Youhua & Jin, Weifeng, 2025. "Small groups nurturing collective wisdom: The punishment-prediction reinforcement learning mechanism for multi-group cooperation," Chaos, Solitons & Fractals, Elsevier, vol. 201(P1).
    4. Lee, Hsuan-Wei & Weng, Yi-Ning, 2025. "Granular Q-learning adaptation boosts collective welfare in multi-agent Prisoner’s Dilemma," Chaos, Solitons & Fractals, Elsevier, vol. 199(P1).
    5. Wang, Chengjie & Deng, Juan & Zhao, Hui & Li, Li, 2024. "Effect of Q-learning on the evolution of cooperation behavior in collective motion: An improved Vicsek model," Applied Mathematics and Computation, Elsevier, vol. 482(C).
    6. Shen, Shaofei & Zhang, Xuejun & Xu, Aobo & Duan, Taisen, 2024. "An adaptive exploration mechanism for Q-learning in spatial public goods games," Chaos, Solitons & Fractals, Elsevier, vol. 189(P1).
    7. Yang, Zhengzhi & Zheng, Lei & Perc, Matjaž & Li, Yumeng, 2024. "Interaction state Q-learning promotes cooperation in the spatial prisoner's dilemma game," Applied Mathematics and Computation, Elsevier, vol. 463(C).
    8. Wu, Binjie & Shen, Shaofei & Wang, Jiafeng & Wan, Haibin, 2025. "Q-learning promotes the evolution of fairness and generosity in the ultimatum game," Chaos, Solitons & Fractals, Elsevier, vol. 200(P2).
    9. Huang, Chaochao & Wang, Chaoqian, 2024. "Memory-based involution dilemma on square lattices," Chaos, Solitons & Fractals, Elsevier, vol. 178(C).
    10. Wang, Weining & Shang, Lihui & Wu, Yipeng & Hu, Mingjian & Wang, Weiyu, 2025. "Bio-inspired mechanism promotes cooperation in spatial public goods games," Chaos, Solitons & Fractals, Elsevier, vol. 200(P3).
    11. Yang, Yujin & Zhao, Dawei & Wang, Juan, 2025. "Evolution of cooperation in spatial public goods games driven by reinforcement learning and environmental feedback," Chaos, Solitons & Fractals, Elsevier, vol. 199(P1).
    12. Li, Yipeng & Hu, Xiangyue & Jin, Xing & Zhang, Huizhen & Yang, Jiajia & Wang, Zhen, 2025. "Environmental information perception enhances cooperation in stochastic public goods games via Q-learning," Applied Mathematics and Computation, Elsevier, vol. 504(C).
    13. Lin, Jiaying & Yang, Junzhong, 2025. "Emotion-coupled Q-learning with cognitive bias enhances cooperation in evolutionary prisoner’s dilemma games," Chaos, Solitons & Fractals, Elsevier, vol. 200(P1).
    14. Mangold, Gustavo C. & Vainstein, Mendeli H. & Fernandes, Heitor C.M., 2025. "Dilution, diffusion and symbiosis in spatial prisoner’s dilemma with reinforcement learning," Chaos, Solitons & Fractals, Elsevier, vol. 201(P3).
    15. Lv, Shaojie & Li, Jiaying & Zhao, Changheng, 2025. "Reinforcement learning in spatial public goods games with environmental feedbacks," Chaos, Solitons & Fractals, Elsevier, vol. 195(C).
    16. Zheng, Guozhong & Zhang, Jiqiang & Deng, Shengfeng & Cai, Weiran & Chen, Li, 2024. "Evolution of cooperation in the public goods game with Q-learning," Chaos, Solitons & Fractals, Elsevier, vol. 188(C).
    17. Zhang, Yongqiang & Zheng, Zehao & Zhang, Xiaoming & Ma, Jinlong, 2025. "Dynamic punishment-reputation synergy drives cooperation in spatial public goods game," Applied Mathematics and Computation, Elsevier, vol. 506(C).
    18. Zhang, Lan & Li, Yuqin & Xie, Yuan & Feng, Yuee & Huang, Changwei, 2025. "The combined effects of conformity and reinforcement learning on the evolution of cooperation in public goods games," Chaos, Solitons & Fractals, Elsevier, vol. 193(C).
    19. Zhang, Yali & Lu, Yikang & Jin, Haoyu & Dong, Yuting & Du, Chunpeng & Shi, Lei, 2024. "The impact of dynamic reward on cooperation in the spatial public goods game," Chaos, Solitons & Fractals, Elsevier, vol. 187(C).
    20. Lin, Jiaying & Long, Pinduo & Liang, Jinfeng & Dai, Qionglin & Li, Haihong & Yang, Junzhong, 2025. "The coevolution of cooperation: Integrating Q-learning and occasional social interactions in evolutionary games," Chaos, Solitons & Fractals, Elsevier, vol. 194(C).
