A General Framework for Learning Mean-Field Games

Authors

  • Xin Guo

    (Industrial Engineering and Operations Research Department, University of California–Berkeley, Berkeley, California 94720; Amazon, Seattle, Washington 98109)

  • Anran Hu

    (Industrial Engineering and Operations Research Department, University of California–Berkeley, Berkeley, California 94720)

  • Renyuan Xu

    (Daniel J. Epstein Department of Industrial Systems & Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, California 90089; Mathematical Institute, University of Oxford, Oxford OX2 6GG, United Kingdom)

  • Junzi Zhang

    (Amazon, Seattle, Washington 98109; Institute for Computational & Mathematical Engineering, Stanford University, California 94305)

Abstract

This paper presents a general mean-field game (GMFG) framework for simultaneous learning and decision making in stochastic games with a large population. It first establishes the existence of a unique Nash equilibrium to this GMFG, and it demonstrates that naively combining reinforcement learning with the fixed-point approach in classical mean-field games yields unstable algorithms. It then proposes value-based and policy-based reinforcement learning algorithms (GMF-V and GMF-P, respectively) with smoothed policies, and analyzes their convergence properties and computational complexities. Experiments on an equilibrium product pricing problem demonstrate that two specific instantiations, GMF-V with Q-learning (GMF-V-Q) and GMF-P with trust region policy optimization (GMF-P-TRPO), are both efficient and robust in the GMFG setting. Moreover, their performance is superior in convergence speed, accuracy, and stability when compared with existing algorithms for multiagent reinforcement learning in the N-player setting.

Suggested Citation

  • Xin Guo & Anran Hu & Renyuan Xu & Junzi Zhang, 2023. "A General Framework for Learning Mean-Field Games," Mathematics of Operations Research, INFORMS, vol. 48(2), pages 656-686, May.
  • Handle: RePEc:inm:ormoor:v:48:y:2023:i:2:p:656-686
    DOI: 10.1287/moor.2022.1274

    Download full text from publisher

    File URL: http://dx.doi.org/10.1287/moor.2022.1274
    Download Restriction: no



    Citations


    Cited by:

    1. Uğur Aydin & Naci Saldi, 2026. "Robustness and Approximation of Discrete-Time Mean-Field Games Under Discounted Cost Criterion," Mathematics of Operations Research, INFORMS, vol. 51(1), pages 185-217, January.
    2. Li Li & Xiquan Jiang & Dianchao Lin, 2025. "On an Interlocking Flexible Car Use Restriction Policy: Theory, Learning and Experiment," Transportation Science, INFORMS, vol. 59(5), pages 883-908, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Samuel Daudin, 2022. "Optimal Control of Diffusion Processes with Terminal Constraint in Law," Journal of Optimization Theory and Applications, Springer, vol. 195(1), pages 1-41, October.
    2. Fenner, Trevor & Levene, Mark & Loizou, George, 2010. "Predicting the long tail of book sales: Unearthing the power-law exponent," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 389(12), pages 2416-2421.
    3. Arvind Shrivats & Dena Firoozi & Sebastian Jaimungal, 2020. "A Mean-Field Game Approach to Equilibrium Pricing in Solar Renewable Energy Certificate Markets," Papers 2003.04938, arXiv.org, revised Aug 2021.
    4. Gerhold, Stefan & Gülüm, I. Cetin, 2019. "Peacocks nearby: Approximating sequences of measures," Stochastic Processes and their Applications, Elsevier, vol. 129(7), pages 2406-2436.
    5. Xuejun Zhao & Ruihao Zhu & William B. Haskell, 2022. "Learning to Price Supply Chain Contracts against a Learning Retailer," Papers 2211.04586, arXiv.org.
    6. Xuejun Zhao & William B. Haskell & Guodong Yu, 2024. "Supply Chain Contracts in the Small Data Regime," Manufacturing & Service Operations Management, INFORMS, vol. 26(4), pages 1387-1401, July.
    7. Min Dai & Yuchao Dong & Yanwei Jia & Xun Yu Zhou, 2026. "Merton's Problem with Recursive Perturbed Utility," Papers 2602.13544, arXiv.org.
    8. Sebastian Jaimungal, 2022. "Reinforcement learning and stochastic optimisation," Finance and Stochastics, Springer, vol. 26(1), pages 103-129, January.
    9. Puppo, L. & Pedroni, N. & Maio, F. Di & Bersano, A. & Bertani, C. & Zio, E., 2021. "A Framework based on Finite Mixture Models and Adaptive Kriging for Characterizing Non-Smooth and Multimodal Failure Regions in a Nuclear Passive Safety System," Reliability Engineering and System Safety, Elsevier, vol. 216(C).
    10. Marie Ernst & Yvik Swan, 2022. "Distances Between Distributions Via Stein’s Method," Journal of Theoretical Probability, Springer, vol. 35(2), pages 949-987, June.
    11. Crimaldi, Irene & Dai Pra, Paolo & Louis, Pierre-Yves & Minelli, Ida G., 2019. "Synchronization and functional central limit theorems for interacting reinforced random walks," Stochastic Processes and their Applications, Elsevier, vol. 129(1), pages 70-101.
    12. Kaitong Hu & Zhenjie Ren & Junjian Yang, 2019. "Principal-agent problem with multiple principals," Working Papers hal-02088486, HAL.
    13. Masaaki Fujii & Akihiko Takahashi, 2021. "A Mean Field Game Approach to Equilibrium Pricing with Market Clearing Condition," CIRJE F-Series CIRJE-F-1177, CIRJE, Faculty of Economics, University of Tokyo.
    14. Bezemek, Z.W. & Spiliopoulos, K., 2023. "Large deviations for interacting multiscale particle systems," Stochastic Processes and their Applications, Elsevier, vol. 155(C), pages 27-108.
    15. Leandro Nascimento, 2022. "Bounded arbitrage and nearly rational behavior," Papers 2212.02680, arXiv.org, revised Apr 2025.
    16. Giacomo Aletti & Caterina May & Piercesare Secchi, 2012. "A Functional Equation Whose Unknown is $\mathcal{P}([0,1])$ Valued," Journal of Theoretical Probability, Springer, vol. 25(4), pages 1207-1232, December.
    17. Patrick Marsh, 2019. "The role of information in nonstationary regression," Discussion Papers 19/04, University of Nottingham, Granger Centre for Time Series Econometrics.
    18. Omar Besbes & Will Ma & Omar Mouchtaki, 2025. "Beyond IID: Data-Driven Decision Making in Heterogeneous Environments," Management Science, INFORMS, vol. 71(12), pages 10538-10555, December.
    19. White, Staci A. & Herbei, Radu, 2015. "A Monte Carlo approach to quantifying model error in Bayesian parameter estimation," Computational Statistics & Data Analysis, Elsevier, vol. 83(C), pages 168-181.
    20. Mononen, Lasse, 2025. "On Preference for Simplicity and Probability Weighting," Center for Mathematical Economics Working Papers 748, Center for Mathematical Economics, Bielefeld University.
