
ABIDES-MARL: A Multi-Agent Reinforcement Learning Environment for Endogenous Price Formation and Execution in a Limit Order Book

Author

Listed:
  • Patrick Cheridito
  • Jean-Loup Dupret
  • Zhexin Wu

Abstract

We present ABIDES-MARL, a framework that combines a new multi-agent reinforcement learning (MARL) methodology with a new realistic limit-order-book (LOB) simulation system to study equilibrium behavior in complex financial market games. The system extends ABIDES-Gym by decoupling state collection from kernel interruption, enabling synchronized learning and decision-making for multiple adaptive agents while maintaining compatibility with standard RL libraries. It preserves key market features such as price-time priority and discrete tick sizes. Methodologically, we use MARL to approximate equilibrium-like behavior in multi-period trading games with a finite number of heterogeneous agents (an informed trader, a liquidity trader, noise traders, and competing market makers), all with individual price impacts. This setting bridges optimal execution and market microstructure by embedding the liquidity trader's optimization problem within a strategic trading environment. We validate the approach by solving an extended Kyle model within the simulation system, recovering the gradual price discovery phenomenon. We then extend the analysis to a liquidity trader's problem where market liquidity arises endogenously and show that, at equilibrium, execution strategies shape market-maker behavior and price dynamics. ABIDES-MARL provides a reproducible foundation for analyzing equilibrium and strategic adaptation in realistic markets and contributes toward building economically interpretable agentic AI systems for finance.
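
For readers unfamiliar with the Kyle benchmark mentioned in the abstract, the following is a minimal sketch of the textbook one-period Kyle (1985) linear equilibrium. It is not the extended multi-period model solved in the paper and does not use ABIDES-MARL. The function names (`kyle_single_period`, `simulate_kyle`) and all parameter values are assumptions chosen for illustration. In this benchmark the informed trader submits x = beta * (v - p0) with beta = sigma_u / sigma_v, and the market maker prices the aggregate order flow y = x + u as p = p0 + lambda * y with lambda = sigma_v / (2 * sigma_u).

```python
import random


def kyle_single_period(sigma_v, sigma_u):
    """Closed-form linear equilibrium of the one-period Kyle (1985) model.

    sigma_v: standard deviation of the asset value v around the prior p0.
    sigma_u: standard deviation of noise-trader order flow u.
    Returns (beta, lam): insider intensity and price-impact coefficient.
    """
    beta = sigma_u / sigma_v
    lam = sigma_v / (2.0 * sigma_u)
    return beta, lam


def simulate_kyle(p0, sigma_v, sigma_u, n_draws=50_000, seed=0):
    """Monte Carlo check of the equilibrium (illustrative helper).

    Returns the through-origin regression slope of (v - p0) on total order
    flow y, and the mean squared pricing error E[(v - p)^2].
    """
    rng = random.Random(seed)
    beta, lam = kyle_single_period(sigma_v, sigma_u)
    sum_yy = sum_yd = sum_err2 = 0.0
    for _ in range(n_draws):
        v = rng.gauss(p0, sigma_v)      # true value, known to the insider
        u = rng.gauss(0.0, sigma_u)     # noise-trader order
        y = beta * (v - p0) + u         # aggregate order flow seen by the market maker
        p = p0 + lam * y                # market maker's linear pricing rule
        sum_yy += y * y
        sum_yd += y * (v - p0)
        sum_err2 += (v - p) ** 2
    return sum_yd / sum_yy, sum_err2 / n_draws


if __name__ == "__main__":
    beta, lam = kyle_single_period(sigma_v=2.0, sigma_u=4.0)
    slope, resid_var = simulate_kyle(p0=100.0, sigma_v=2.0, sigma_u=4.0)
    print(f"beta={beta:.3f} lambda={lam:.3f}")
    print(f"empirical slope={slope:.3f} residual variance={resid_var:.3f}")
```

In this one-period benchmark the price reveals half of the insider's information: the residual variance of v given the price is sigma_v^2 / 2. The empirical slope should be close to lambda, which is the consistency condition that the market maker's pricing rule equals the conditional expectation of v given order flow. Multi-period versions spread this revelation over several rounds, which is the gradual price discovery the abstract refers to.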

Suggested Citation

  • Patrick Cheridito & Jean-Loup Dupret & Zhexin Wu, 2025. "ABIDES-MARL: A Multi-Agent Reinforcement Learning Environment for Endogenous Price Formation and Execution in a Limit Order Book," Papers 2511.02016, arXiv.org.
  • Handle: RePEc:arx:papers:2511.02016

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2511.02016
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Bokai Cao & Saizhuo Wang & Xinyi Lin & Xiaojun Wu & Haohan Zhang & Lionel M. Ni & Jian Guo, 2025. "From Deep Learning to LLMs: A survey of AI in Quantitative Investment," Papers 2503.21422, arXiv.org.
    2. Obizhaeva, Anna A. & Wang, Jiang, 2013. "Optimal trading strategy and supply/demand dynamics," Journal of Financial Markets, Elsevier, vol. 16(1), pages 1-32.
    3. Hongyang Yang & Boyu Zhang & Neng Wang & Cheng Guo & Xiaoli Zhang & Likun Lin & Junlin Wang & Tianyu Zhou & Mao Guan & Runjia Zhang & Christina Dan Wang, 2024. "FinRobot: An Open-Source AI Agent Platform for Financial Applications using Large Language Models," Papers 2405.14767, arXiv.org, revised May 2024.
    4. Gianbiagio Curato & Jim Gatheral & Fabrizio Lillo, 2017. "Optimal execution with non-linear transient market impact," Quantitative Finance, Taylor & Francis Journals, vol. 17(1), pages 41-54, January.
    5. Thibault Jaisson, 2021. "Deep differentiable reinforcement learning and optimal trading," Papers 2112.02944, arXiv.org, revised Apr 2022.
    6. Ciamac C. Moallemi & Muye Wang, 2022. "A reinforcement learning approach to optimal execution," Quantitative Finance, Taylor & Francis Journals, vol. 22(6), pages 1051-1069, June.
    7. Holden, Craig W & Subrahmanyam, Avanidhar, 1992. "Long-Lived Private Information and Imperfect Competition," Journal of Finance, American Finance Association, vol. 47(1), pages 247-270, March.
    8. Ho, Thomas & Stoll, Hans R., 1981. "Optimal dealer pricing under transactions and return uncertainty," Journal of Financial Economics, Elsevier, vol. 9(1), pages 47-73, March.
    9. Glosten, Lawrence R. & Milgrom, Paul R., 1985. "Bid, ask and transaction prices in a specialist market with heterogeneously informed traders," Journal of Financial Economics, Elsevier, vol. 14(1), pages 71-100, March.
    10. Kyle, Albert S, 1985. "Continuous Auctions and Insider Trading," Econometrica, Econometric Society, vol. 53(6), pages 1315-1335, November.
    11. Bertsimas, Dimitris & Lo, Andrew W., 1998. "Optimal control of execution costs," Journal of Financial Markets, Elsevier, vol. 1(1), pages 1-50, April.
    12. Jean-Loup Dupret & Donatien Hainaut, 2025. "Optimal liquidation under indirect price impact with propagator," Quantitative Finance, Taylor & Francis Journals, vol. 25(3), pages 359-381, March.
    13. Albert S. Kyle, 1989. "Informed Speculation with Imperfect Competition," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 56(3), pages 317-355.
    14. Foster, F Douglas & Viswanathan, S, 1993. "The Effect of Public Information and Competition on Trading Volume and Price Volatility," The Review of Financial Studies, Society for Financial Studies, vol. 6(1), pages 23-56.
    15. Fengpei Li & Vitalii Ihnatiuk & Yu Chen & Jiahe Lin & Ryan J. Kinnear & Anderson Schneider & Yuriy Nevmyvaka & Henry Lam, 2024. "Do price trajectory data increase the efficiency of market impact estimation?," Quantitative Finance, Taylor & Francis Journals, vol. 24(5), pages 545-568, May.
    16. Yangyang Yu & Haohang Li & Zhi Chen & Yuechen Jiang & Yang Li & Denghui Zhang & Rong Liu & Jordan W. Suchow & Khaldoun Khashanah, 2023. "FinMem: A Performance-Enhanced LLM Trading Agent with Layered Memory and Character Design," Papers 2311.13743, arXiv.org, revised Dec 2023.
    17. Thibault Jaisson, 2022. "Deep differentiable reinforcement learning and optimal trading," Quantitative Finance, Taylor & Francis Journals, vol. 22(8), pages 1429-1443, August.
    18. Foster, F Douglas & Viswanathan, S, 1996. "Strategic Trading When Agents Forecast the Forecasts of Others," Journal of Finance, American Finance Association, vol. 51(4), pages 1437-1478, September.
    19. Engle, Robert F, 1982. "Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation," Econometrica, Econometric Society, vol. 50(4), pages 987-1007, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Vayanos, Dimitri & Wang, Jiang, 2013. "Market Liquidity—Theory and Empirical Evidence," Handbook of the Economics of Finance, in: G.M. Constantinides & M. Harris & R. M. Stulz (ed.), Handbook of the Economics of Finance, volume 2, chapter 0, pages 1289-1361, Elsevier.
    2. Dimitri Vayanos & Jiang Wang, 2012. "Market Liquidity -- Theory and Empirical Evidence," NBER Working Papers 18251, National Bureau of Economic Research, Inc.
    3. LOVO, Stefano M. & CALCAGNO, R., 2001. "Market efficiency and Price Formation when Dealers are Asymmetrically Informed," HEC Research Papers Series 737, HEC Paris.
    4. Ledenyov, Dimitri O. & Ledenyov, Viktor O., 2015. "Wave function method to forecast foreign currencies exchange rates at ultra high frequency electronic trading in foreign currencies exchange markets," MPRA Paper 67470, University Library of Munich, Germany.
    5. van Kervel, Vincent & Kwan, Amy & Westerholm, P. Joakim, 2023. "Order splitting and interacting with a counterparty," Journal of Financial Markets, Elsevier, vol. 66(C).
    6. Jos Van Bommel & Jay Dahya & Zhihong Shi, 2010. "An empirical investigation of the speed of information aggregation: a study of IPOs," International Journal of Banking, Accounting and Finance, Inderscience Enterprises Ltd, vol. 2(1), pages 47-79.
    7. Lof, Matthijs & van Bommel, Jos, 2023. "Asymmetric information and the distribution of trading volume," Journal of Corporate Finance, Elsevier, vol. 82(C).
    8. Pascual, Roberto & Escribano, Álvaro & Tapia, Mikel, 1999. "How does liquidity behave? A multidimensional analysis of NYSE stocks," DEE - Working Papers. Business Economics. WB 6433, Universidad Carlos III de Madrid. Departamento de Economía de la Empresa.
    9. Muendler, Marc-Andreas, 2008. "Risk-neutral investors do not acquire information," Finance Research Letters, Elsevier, vol. 5(3), pages 156-161, September.
    10. King, Michael R. & Osler, Carol L. & Rime, Dagfinn, 2013. "The market microstructure approach to foreign exchange: Looking back and looking forward," Journal of International Money and Finance, Elsevier, vol. 38(C), pages 95-119.
    11. Choi, Jin Hyuk & Larsen, Kasper & Seppi, Duane J., 2019. "Information and trading targets in a dynamic market equilibrium," Journal of Financial Economics, Elsevier, vol. 132(3), pages 22-49.
    12. Olivier Guéant, 2016. "The Financial Mathematics of Market Liquidity: From Optimal Execution to Market Making," Post-Print hal-01393136, HAL.
    13. Sastry, Ravi & Thompson, Rex, 2019. "Strategic trading with risk aversion and information flow," Journal of Financial Markets, Elsevier, vol. 44(C), pages 1-16.
    14. Lof, Matthijs & Bommel, Jos van, 2018. "Asymmetric information and the distribution of trading volume," Research Discussion Papers 1, Bank of Finland.
    15. Kashyap, Ravi, 2020. "David vs Goliath (You against the Markets), A dynamic programming approach to separate the impact and timing of trading costs," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 545(C).
    16. Marcello Monga, 2024. "Automated Market Making and Decentralized Finance," Papers 2407.16885, arXiv.org.
    17. Carole Comerton-Forde & Michael A. O'Brien & P. Joakim Westerholm, 2007. "An Empirical Analysis of Strategic Behaviour Models," Australian Journal of Management, Australian School of Business, vol. 32(2), pages 181-203, December.
    18. Jin Hyuk Choi & Kasper Larsen & Duane J. Seppi, 2015. "Information and Trading Targets in a Dynamic Market Equilibrium," Papers 1502.02083, arXiv.org, revised Sep 2015.
    19. Ackert, Lucy F. & Church, Bryan K. & Zhang, Ping, 2018. "Informed traders’ performance and the information environment: Evidence from experimental asset markets," Accounting, Organizations and Society, Elsevier, vol. 70(C), pages 1-15.
    20. Bart Taub, 2018. "Inconspicuousness and obfuscation: how large shareholders dynamically manipulate output and information for trading purposes," Annals of Finance, Springer, vol. 14(4), pages 429-464, November.
