ABIDES-MARL: A Multi-Agent Reinforcement Learning Environment for Endogenous Price Formation and Execution in a Limit Order Book

Author

Listed:

Patrick Cheridito
Jean-Loup Dupret
Zhexin Wu

Abstract

We present ABIDES-MARL, a framework that combines a new multi-agent reinforcement learning (MARL) methodology with a new realistic limit-order-book (LOB) simulation system to study equilibrium behavior in complex financial market games. The system extends ABIDES-Gym by decoupling state collection from kernel interruption, enabling synchronized learning and decision-making for multiple adaptive agents while maintaining compatibility with standard RL libraries. It preserves key market features such as price-time priority and discrete tick sizes. Methodologically, we use MARL to approximate equilibrium-like behavior in multi-period trading games with a finite number of heterogeneous agents (an informed trader, a liquidity trader, noise traders, and competing market makers), all with individual price impacts. This setting bridges optimal execution and market microstructure by embedding the liquidity trader's optimization problem within a strategic trading environment. We validate the approach by solving an extended Kyle model within the simulation system, recovering the gradual price discovery phenomenon. We then extend the analysis to a liquidity trader's problem where market liquidity arises endogenously and show that, at equilibrium, execution strategies shape market-maker behavior and price dynamics. ABIDES-MARL provides a reproducible foundation for analyzing equilibrium and strategic adaptation in realistic markets and contributes toward building economically interpretable agentic AI systems for finance.
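
For orientation, the validation step in the abstract refers to an extended Kyle model. The sketch below is not the paper's extended model and does not use ABIDES-MARL code; it only illustrates the textbook single-period Kyle (1985) linear equilibrium, assuming a normally distributed asset value and noise order flow. The function names `kyle_single_period` and `simulate`, and all parameter values, are hypothetical and chosen for illustration. It checks by simulation that a single trading round reveals half of the insider's information, which is the building block behind gradual price discovery over several rounds.

```python
import random


def kyle_single_period(sigma_v, sigma_u):
    """Closed-form linear equilibrium of the one-period Kyle (1985) model.

    Returns (beta, lam): the insider submits x = beta * (v - p0) and the
    market maker sets the price p = p0 + lam * (x + u).
    """
    beta = sigma_u / sigma_v          # insider's trading intensity
    lam = sigma_v / (2.0 * sigma_u)   # Kyle's lambda (price impact)
    return beta, lam


def simulate(sigma_v=2.0, sigma_u=1.0, p0=100.0, n=100_000, seed=0):
    """Monte Carlo estimate of E[(v - p)^2] / E[(v - p0)^2].

    In the one-period equilibrium this ratio is 1/2: the price reveals
    half of the insider's private information. Parameter values are
    illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    beta, lam = kyle_single_period(sigma_v, sigma_u)
    prior_sq_err = 0.0
    post_sq_err = 0.0
    for _ in range(n):
        v = rng.gauss(p0, sigma_v)    # true asset value, known to the insider
        u = rng.gauss(0.0, sigma_u)   # noise traders' order flow
        x = beta * (v - p0)           # insider's market order
        p = p0 + lam * (x + u)        # price set from total order flow
        prior_sq_err += (v - p0) ** 2
        post_sq_err += (v - p) ** 2
    return post_sq_err / prior_sq_err


if __name__ == "__main__":
    beta, lam = kyle_single_period(2.0, 1.0)
    print("beta =", beta, "lambda =", lam)
    print("share of variance left after one round:", simulate())
```

In a multi-period version the insider trades less aggressively in each round, so information enters prices step by step rather than all at once; the MARL agents in the paper are described as recovering this pattern inside the simulated order book.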

Suggested Citation

Patrick Cheridito & Jean-Loup Dupret & Zhexin Wu, 2025. "ABIDES-MARL: A Multi-Agent Reinforcement Learning Environment for Endogenous Price Formation and Execution in a Limit Order Book," Papers 2511.02016, arXiv.org.

Handle: RePEc:arx:papers:2511.02016

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Vayanos, Dimitri & Wang, Jiang, 2013. "Market Liquidity - Theory and Empirical Evidence," Handbook of the Economics of Finance, in: G.M. Constantinides & M. Harris & R. M. Stulz (ed.), Handbook of the Economics of Finance, volume 2, chapter 0, pages 1289-1361, Elsevier.
LOVO, Stefano M. & CALCAGNO, R., 2001. "Market efficiency and Price Formation when Dealers are Asymmetrically Informed," HEC Research Papers Series 737, HEC Paris.
Ledenyov, Dimitri O. & Ledenyov, Viktor O., 2015. "Wave function method to forecast foreign currencies exchange rates at ultra high frequency electronic trading in foreign currencies exchange markets," MPRA Paper 67470, University Library of Munich, Germany.
van Kervel, Vincent & Kwan, Amy & Westerholm, P. Joakim, 2023. "Order splitting and interacting with a counterparty," Journal of Financial Markets, Elsevier, vol. 66(C).
Jos Van Bommel & Jay Dahya & Zhihong Shi, 2010. "An empirical investigation of the speed of information aggregation: a study of IPOs," International Journal of Banking, Accounting and Finance, Inderscience Enterprises Ltd, vol. 2(1), pages 47-79.
Vayanos, Dimitri & Wang, Jiang, 2012. "Market liquidity - theory and empirical evidence," LSE Research Online Documents on Economics 119044, London School of Economics and Political Science, LSE Library.
- Dimitri Vayanos & Jiang Wang, 2012. "Market Liquidity -- Theory and Empirical Evidence," NBER Working Papers 18251, National Bureau of Economic Research, Inc.
- Dimitri Vayanos & Jiang Wang, 2012. "Market Liquidity - Theory and Empirical Evidence," FMG Discussion Papers dp709, Financial Markets Group.
Lof, Matthijs & van Bommel, Jos, 2023. "Asymmetric information and the distribution of trading volume," Journal of Corporate Finance, Elsevier, vol. 82(C).
- Lof, Matthijs & Bommel, Jos van, 2018. "Asymmetric information and the distribution of trading volume," Bank of Finland Research Discussion Papers 1/2018, Bank of Finland.
Pascual, Roberto & Escribano, Álvaro & Tapia, Mikel, 1999. "How does liquidity behave? A multidimensional analysis of NYSE stocks," DEE - Working Papers. Business Economics. WB 6433, Universidad Carlos III de Madrid. Departamento de Economía de la Empresa.
King, Michael R. & Osler, Carol L. & Rime, Dagfinn, 2013. "The market microstructure approach to foreign exchange: Looking back and looking forward," Journal of International Money and Finance, Elsevier, vol. 38(C), pages 95-119.
- Michael King & Carol Osler & Dagfinn Rime, 2012. "The Market Microstructure Approach to Foreign Exchange: Looking Back and Looking Forward," Working Papers 54, Brandeis University, Department of Economics and International Business School.
- Michael R. King & Carol Osler & Dagfinn Rime, 2013. "The market microstructure approach to foreign exchange - Looking back and looking forward," Working Paper 2013/12, Norges Bank.
Muendler, Marc-Andreas, 2008. "Risk-neutral investors do not acquire information," Finance Research Letters, Elsevier, vol. 5(3), pages 156-161, September.
- Muendler, Marc-Andreas, 2005. "Risk Neutral Investors Do Not Acquire Information," University of California at San Diego, Economics Working Paper Series qt8fg5g853, Department of Economics, UC San Diego.
Choi, Jin Hyuk & Larsen, Kasper & Seppi, Duane J., 2019. "Information and trading targets in a dynamic market equilibrium," Journal of Financial Economics, Elsevier, vol. 132(3), pages 22-49.
Olivier Guéant, 2016. "The Financial Mathematics of Market Liquidity: From Optimal Execution to Market Making," Post-Print hal-01393136, HAL.
Sastry, Ravi & Thompson, Rex, 2019. "Strategic trading with risk aversion and information flow," Journal of Financial Markets, Elsevier, vol. 44(C), pages 1-16.
Lof, Matthijs & Bommel, Jos van, 2018. "Asymmetric information and the distribution of trading volume," Research Discussion Papers 1, Bank of Finland.
- Lof, Matthijs & Bommel, Jos van, 2018. "Asymmetric information and the distribution of trading volume," Research Discussion Papers 1/2018, Bank of Finland.
Kashyap, Ravi, 2020. "David vs Goliath (You against the Markets), A dynamic programming approach to separate the impact and timing of trading costs," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 545(C).
Marcello Monga, 2024. "Automated Market Making and Decentralized Finance," Papers 2407.16885, arXiv.org.
Carole Comerton-Forde & Michael A. O'Brien & P. Joakim Westerholm, 2007. "An Empirical Analysis of Strategic Behaviour Models," Australian Journal of Management, Australian School of Business, vol. 32(2), pages 181-203, December.
Jin Hyuk Choi & Kasper Larsen & Duane J. Seppi, 2015. "Information and Trading Targets in a Dynamic Market Equilibrium," Papers 1502.02083, arXiv.org, revised Sep 2015.
Ackert, Lucy F. & Church, Bryan K. & Zhang, Ping, 2018. "Informed traders’ performance and the information environment: Evidence from experimental asset markets," Accounting, Organizations and Society, Elsevier, vol. 70(C), pages 1-15.
Bart Taub, 2018. "Inconspicuousness and obfuscation: how large shareholders dynamically manipulate output and information for trading purposes," Annals of Finance, Springer, vol. 14(4), pages 429-464, November.

More about this item

NEP fields

This paper has been announced in the following NEP Reports:

NEP-CMP-2025-11-10 (Computational Economics)
NEP-MST-2025-11-10 (Market Microstructure)
