Printed from https://ideas.repec.org/p/ems/eureri/7323.html

A Theoretical Analysis of Cooperative Behavior in Multi-Agent Q-learning

Author

Listed:
  • Waltman, L.
  • Kaymak, U.

Abstract

A number of experimental studies have investigated whether cooperative behavior may emerge in multi-agent Q-learning. In some studies cooperative behavior did emerge, in others it did not. This report provides a theoretical analysis of this issue. The analysis focuses on multi-agent Q-learning in iterated prisoner’s dilemmas. It is shown that under certain assumptions cooperative behavior may emerge when multi-agent Q-learning is applied in an iterated prisoner’s dilemma. An important consequence of the analysis is that multi-agent Q-learning may result in non-Nash behavior. It is found experimentally that the theoretical results derived in this report are quite robust to violations of the underlying assumptions.
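The setting described in the abstract can be illustrated with a minimal sketch of two independent Q-learners repeatedly playing a prisoner's dilemma. This is not the report's own model: the stateless Q-tables, the epsilon-greedy exploration, the payoff values, and the parameters `alpha`, `gamma`, and `epsilon` are all illustrative assumptions, and the names `PAYOFF`, `choose`, and `simulate` are hypothetical.

```python
import random

# Assumed prisoner's dilemma payoffs (not taken from the report), with the
# usual ordering temptation > reward > punishment > sucker.
# PAYOFF[(my_action, other_action)] is my reward; 0 = cooperate, 1 = defect.
PAYOFF = {(0, 0): 3.0, (0, 1): 0.0, (1, 0): 5.0, (1, 1): 1.0}


def choose(q, epsilon, rng):
    """Epsilon-greedy choice over a two-entry Q-table (ties broken at random)."""
    if rng.random() < epsilon or q[0] == q[1]:
        return rng.randrange(2)
    return 0 if q[0] > q[1] else 1


def simulate(rounds=5000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Run two stateless Q-learners; return their Q-tables and the
    fraction of rounds in which both agents cooperated."""
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]  # one Q-table per agent
    mutual_coop = 0
    for _ in range(rounds):
        a0 = choose(q[0], epsilon, rng)
        a1 = choose(q[1], epsilon, rng)
        rewards = (PAYOFF[(a0, a1)], PAYOFF[(a1, a0)])
        for i, (a, r) in enumerate(zip((a0, a1), rewards)):
            # Stateless update: Q(a) <- Q(a) + alpha * (r + gamma * max Q - Q(a))
            q[i][a] += alpha * (r + gamma * max(q[i]) - q[i][a])
        if a0 == 0 and a1 == 0:
            mutual_coop += 1
    return q, mutual_coop / rounds


if __name__ == "__main__":
    q_tables, coop_rate = simulate()
    print("Q-tables:", q_tables)
    print("Mutual cooperation rate:", coop_rate)
```

Whether mutual cooperation appears in such a simulation depends on the assumed parameters and exploration scheme, so the output says nothing about the report's theoretical results.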

Suggested Citation

  • Waltman, L. & Kaymak, U., 2006. "A Theoretical Analysis of Cooperative Behavior in Multi-Agent Q-learning," ERIM Report Series Research in Management ERS-2006-006-LIS, Erasmus Research Institute of Management (ERIM), the joint research institute of the Rotterdam School of Management, Erasmus University, and the Erasmus School of Economics (ESE) at Erasmus University Rotterdam.
  • Handle: RePEc:ems:eureri:7323

    Download full text from publisher

    File URL: https://repub.eur.nl/pub/7323/ERS%202006%20006%20LIS.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Kandori Michihiro & Rob Rafael, 1995. "Evolution of Equilibria in the Long Run: A General Theory and Applications," Journal of Economic Theory, Elsevier, vol. 65(2), pages 383-414, April.
    2. Christina Fang & Steven Orla Kimbrough & Stefano Pace & Annapurna Valluri & Zhiqiang Zheng, 2002. "On Adaptive Emergence of Trust Behavior in the Game of Stag Hunt," Group Decision and Negotiation, Springer, vol. 11(6), pages 449-467, November.

    Citations

    Cited by:

    1. Waltman, Ludo & Kaymak, Uzay, 2008. "Q-learning agents in a Cournot oligopoly model," Journal of Economic Dynamics and Control, Elsevier, vol. 32(10), pages 3275-3293, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Sanjeev Goyal & Fernando Vega-Redondo, 2000. "Learning, Network Formation and Coordination," Econometric Society World Congress 2000 Contributed Papers 0113, Econometric Society.
    2. Hofbauer, Josef & Sorger, Gerhard, 1999. "Perfect Foresight and Equilibrium Selection in Symmetric Potential Games," Journal of Economic Theory, Elsevier, vol. 85(1), pages 1-23, March.
    3. Carlos Alós-Ferrer & Georg Kirchsteiger & Markus Walzl, 2010. "On the Evolution of Market Institutions: The Platform Design Paradox," Economic Journal, Royal Economic Society, vol. 120(543), pages 215-243, March.
    4. Oyama, Daisuke & Takahashi, Satoru & Hofbauer, Josef, 2008. "Monotone methods for equilibrium selection under perfect foresight dynamics," Theoretical Economics, Econometric Society, vol. 3(2), June.
    5. Jehiel, Philippe, 1998. "Learning to Play Limited Forecast Equilibria," Games and Economic Behavior, Elsevier, vol. 22(2), pages 274-298, February.
    6. Hehenkamp, Burkhard & Kaarbøe, Oddvar M., 2004. "Equilibrium selection in supermodular games with mean payoff technologies," Working Papers in Economics 08/04, University of Bergen, Department of Economics.
    7. Oechssler, Jorg, 1997. "An Evolutionary Interpretation of Mixed-Strategy Equilibria," Games and Economic Behavior, Elsevier, vol. 21(1-2), pages 203-237, October.
    8. Ennio Bilancini & Leonardo Boncinelli, 2020. "The evolution of conventions under condition-dependent mistakes," Economic Theory, Springer;Society for the Advancement of Economic Theory (SAET), vol. 69(2), pages 497-521, March.
    9. Maruta, Toshimasa, 1997. "On the Relationship between Risk-Dominance and Stochastic Stability," Games and Economic Behavior, Elsevier, vol. 19(2), pages 221-234, May.
    10. Fudenberg, Drew & Imhof, Lorens A., 2006. "Imitation processes with small mutations," Journal of Economic Theory, Elsevier, vol. 131(1), pages 251-262, November.
    11. Hsiao-Chi Chen & Yunshyong Chow & Li-Chau Wu, 2013. "Imitation, local interaction, and coordination," International Journal of Game Theory, Springer;Game Theory Society, vol. 42(4), pages 1041-1057, November.
    12. Tanaka, Yasuhito, 2001. "Evolution to equilibrium in an asymmetric oligopoly with differentiated goods," International Journal of Industrial Organization, Elsevier, vol. 19(9), pages 1423-1440, November.
    13. Khan, Abhimanyu, 2022. "Expected utility versus cumulative prospect theory in an evolutionary model of bargaining," Journal of Economic Dynamics and Control, Elsevier, vol. 137(C).
    14. Ianni, Antonella, 2000. "Learning correlated equilibria in potential games," Discussion Paper Series In Economics And Econometrics 0012, Economics Division, School of Social Sciences, University of Southampton.
    15. Alos-Ferrer, Carlos & Weidenholzer, Simon, 2007. "Partial bandwagon effects and local interactions," Games and Economic Behavior, Elsevier, vol. 61(2), pages 179-197, November.
    16. Kukushkin, Nikolai S., 2015. "Cournot tatonnement and potentials," Journal of Mathematical Economics, Elsevier, vol. 59(C), pages 117-127.
    17. Banerjee, Abhijit & Weibull, Jorgen W., 2000. "Neutrally Stable Outcomes in Cheap-Talk Coordination Games," Games and Economic Behavior, Elsevier, vol. 32(1), pages 1-24, July.
    18. Ana Mauleon & Nils Roehl & Vincent Vannetelbosch, 2014. "Constitutions and Social Networks," Working Papers CIE 74, Paderborn University, CIE Center for International Economics.
    19. Oddvar M. Kaarbøe & Alexander F. Tieman, 1999. "Equilibrium Selection in Games with Macroeconomic Complementarities," Tinbergen Institute Discussion Papers 99-096/1, Tinbergen Institute.
    20. Dawid, Herbert, 2000. "On the emergence of exchange and mediation in a production economy," Journal of Economic Behavior & Organization, Elsevier, vol. 41(1), pages 27-53, January.

    More about this item

    Keywords

    Cooperation; Multi-Agent Q-Learning; Multi-Agent Reinforcement Learning; Nash Equilibrium; Prisoner’s Dilemma;

    JEL classification:

    • C51 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Model Construction and Estimation
    • L15 - Industrial Organization - - Market Structure, Firm Strategy, and Market Performance - - - Information and Product Quality
    • M - Business Administration and Business Economics; Marketing; Accounting; Personnel Economics
    • O32 - Economic Development, Innovation, Technological Change, and Growth - - Innovation; Research and Development; Technological Change; Intellectual Property Rights - - - Management of Technological Innovation and R&D
