Formal Contracts Mitigate Social Dilemmas in Multi-Agent RL

Author

Listed:

Andreas A. Haupt
Phillip J. K. Christoffersen
Mehul Damani
Dylan Hadfield-Menell

Abstract

Multi-agent reinforcement learning (MARL) is a powerful tool for training autonomous agents acting independently in a common environment. However, it can lead to suboptimal behavior when individual incentives and group incentives diverge. Humans are remarkably capable of solving these social dilemmas. It is an open problem in MARL to replicate such cooperative behaviors in selfish agents. In this work, we draw upon the idea of formal contracting from economics to overcome diverging incentives between agents in MARL. We propose an augmentation to a Markov game where agents voluntarily agree to binding transfers of reward under pre-specified conditions. Our contributions are theoretical and empirical. First, we show that this augmentation makes all subgame-perfect equilibria of all fully observable Markov games exhibit socially optimal behavior, given a sufficiently rich space of contracts. Next, we show that for general contract spaces, and even under partial observability, richer contract spaces lead to higher welfare. Hence, contract space design solves an exploration-exploitation tradeoff, sidestepping incentive issues. We complement our theoretical analysis with experiments. Issues of exploration in the contracting augmentation are mitigated using a training methodology inspired by multi-objective reinforcement learning: Multi-Objective Contract Augmentation Learning (MOCA). We test our methodology in static, single-move games, as well as dynamic domains that simulate traffic, pollution management and common pool resource management.
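
As a rough illustration of the contracting idea described in the abstract (not the authors' code, and not MOCA), the sketch below applies a binding reward transfer to a one-shot Prisoner's Dilemma. The payoff numbers, the fine of 2.0, and the helper names `with_contract` and `pure_nash` are all hypothetical choices made for this example.

```python
from itertools import product

# Actions: cooperate (C) or defect (D).
C, D = 0, 1

# Hypothetical Prisoner's Dilemma payoffs (agent 0, agent 1); not taken from the paper.
PAYOFFS = {
    (C, C): (3.0, 3.0),
    (C, D): (0.0, 4.0),
    (D, C): (4.0, 0.0),
    (D, D): (1.0, 1.0),
}


def with_contract(payoffs, fine):
    """Apply an assumed contract: every defector transfers `fine` to the other agent."""
    out = {}
    for (a0, a1), (r0, r1) in payoffs.items():
        t0 = fine if a0 == D else 0.0  # amount agent 0 pays agent 1
        t1 = fine if a1 == D else 0.0  # amount agent 1 pays agent 0
        out[(a0, a1)] = (r0 - t0 + t1, r1 - t1 + t0)
    return out


def pure_nash(payoffs):
    """Return all pure-strategy Nash equilibria of a 2x2 game."""
    equilibria = []
    for a0, a1 in product((C, D), repeat=2):
        r0, r1 = payoffs[(a0, a1)]
        # Each agent must have no profitable unilateral deviation.
        best0 = all(r0 >= payoffs[(b, a1)][0] for b in (C, D))
        best1 = all(r1 >= payoffs[(a0, b)][1] for b in (C, D))
        if best0 and best1:
            equilibria.append((a0, a1))
    return equilibria


if __name__ == "__main__":
    contracted = with_contract(PAYOFFS, fine=2.0)
    print("base game equilibria:      ", pure_nash(PAYOFFS))
    print("contracted game equilibria:", pure_nash(contracted))
```

With these assumed numbers, mutual defection is the only pure equilibrium of the base game, while under the contract it is mutual cooperation. Each agent also earns more there (3 versus 1), so both would plausibly accept the contract voluntarily. Because the transfer only moves reward between agents, the total welfare of every action profile is unchanged.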

Suggested Citation

Andreas A. Haupt & Phillip J. K. Christoffersen & Mehul Damani & Dylan Hadfield-Menell, 2022. "Formal Contracts Mitigate Social Dilemmas in Multi-Agent RL," Papers 2208.10469, arXiv.org, revised Jan 2024.

Handle: RePEc:arx:papers:2208.10469

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Martin, Simon & Rasch, Alexander, 2022. "Collusion by algorithm: The role of unobserved actions," DICE Discussion Papers 382, Heinrich Heine University Düsseldorf, Düsseldorf Institute for Competition Economics (DICE).
Aniko Öry & Ali Hortaçsu & Kevin Williams, 2022. "Dynamic Price Competition: Theory and Evidence from Airline Markets," Cowles Foundation Discussion Papers 2341R1, Cowles Foundation for Research in Economics, Yale University, revised Apr 2023.
Simon Martin & Alexander Rasch, 2022. "Collusion by Algorithm: The Role of Unobserved Actions," CESifo Working Paper Series 9629, CESifo.
Andreas Haupt & Aroon Narayanan, 2022. "Risk Preferences of Learning Algorithms," Papers 2205.04619, arXiv.org, revised Dec 2023.
Battigalli, Pierpaolo & Bonanno, Giacomo, 1997. "The Logic of Belief Persistence," Economics and Philosophy, Cambridge University Press, vol. 13(1), pages 39-59, April.
- Giacomo Bonanno & Pierpaolo Battigalli, 2004. "The Logic Of Belief Persistency," Working Papers 206, University of California, Davis, Department of Economics.
Szabó, György & Borsos, István & Szombati, Edit, 2019. "Games, graphs and Kirchhoff laws," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 521(C), pages 416-423.
Shi, Yi & Deng, Yawen & Wang, Guoan & Xu, Jiuping, 2020. "Stackelberg equilibrium-based eco-economic approach for sustainable development of kitchen waste disposal with subsidy policy: A case study from China," Energy, Elsevier, vol. 196(C).
Marc Le Menestrel, 2003. "A one-shot Prisoners’ Dilemma with procedural utility," Economics Working Papers 819, Department of Economics and Business, Universitat Pompeu Fabra.
Cheng‐Kuang Wu & Yi‐Ming Chen & Dachrahn Wu & Ching‐Lin Chi, 2020. "A Game Theory Approach for Assessment of Risk and Deployment of Police Patrols in Response to Criminal Activity in San Francisco," Risk Analysis, John Wiley & Sons, vol. 40(3), pages 534-549, March.
Inkoo Cho & Noah Williams, 2024. "Collusive Outcomes Without Collusion," Papers 2403.07177, arXiv.org.
Nasimeh Heydaribeni & Achilleas Anastasopoulos, 2019. "Linear Equilibria for Dynamic LQG Games with Asymmetric Information and Dependent Types," Papers 1909.04834, arXiv.org.
Müller, Christoph, 2020. "Robust implementation in weakly perfect Bayesian strategies," Journal of Economic Theory, Elsevier, vol. 189(C).
Hitoshi Matsushima, 2019. "Implementation without expected utility: ex-post verifiability," Social Choice and Welfare, Springer;The Society for Social Choice and Welfare, vol. 53(4), pages 575-585, December.
- Hitoshi Matsushima, 2018. "Implementation without Expected Utility: Ex-Post Verifiability," CARF F-Series CARF-F-443, Center for Advanced Research in Finance, Faculty of Economics, The University of Tokyo.
Dasgupta Utteeyo, 2011. "Are Entry Threats Always Credible?," The B.E. Journal of Economic Analysis & Policy, De Gruyter, vol. 11(1), pages 1-41, December.
Baran Han, 2018. "The role and welfare rationale of secondary sanctions: A theory and a case study of the US sanctions targeting Iran," Conflict Management and Peace Science, Peace Science Society (International), vol. 35(5), pages 474-502, September.
Carlos Pimienta & Jianfei Shen, 2014. "On the equivalence between (quasi-)perfect and sequential equilibria," International Journal of Game Theory, Springer;Game Theory Society, vol. 43(2), pages 395-402, May.
- Carlos Pimienta & Jianfei Shen, 2011. "On the Equivalence between (Quasi)-perfect and sequential equilibria," Discussion Papers 2012-01, School of Economics, The University of New South Wales.
Asheim, Geir & Søvik, Ylva, 2003. "The semantics of preference-based belief operators," Memorandum 05/2003, Oslo University, Department of Economics.
Wang, Yafeng & Graham, Brett, 2009. "Generalized Maximum Entropy estimation of discrete sequential move games of perfect information," MPRA Paper 21331, University Library of Munich, Germany.
Tobias Harks & Martin Hoefer & Anja Schedel & Manuel Surek, 2021. "Efficient Black-Box Reductions for Separable Cost Sharing," Mathematics of Operations Research, INFORMS, vol. 46(1), pages 134-158, February.
Karbowski, Adam, 2011. "O kilku modelach samolubnego karania w ekonomii behawioralnej [Evolution of altruism in the light of behavioral economics]," MPRA Paper 69604, University Library of Munich, Germany.

More about this item

NEP fields

This paper has been announced in the following NEP Reports:

NEP-CTA-2022-09-19 (Contract Theory and Applications)
NEP-GTH-2022-09-19 (Game Theory)
