Reinforcement Learning in Queue-Reactive Models: Application to Optimal Execution

Author

Listed:
  • Tomas Espana
  • Yadh Hafsi
  • Fabrizio Lillo
  • Edoardo Vittori

Abstract

We investigate the use of Reinforcement Learning for the optimal execution of meta-orders, where the objective is to execute large orders incrementally while minimizing implementation shortfall and market impact over an extended period of time. Departing from traditional parametric approaches to price dynamics and impact modeling, we adopt a model-free, data-driven framework. Since policy optimization requires counterfactual feedback that historical data cannot provide, we employ the Queue-Reactive Model to generate realistic and tractable limit order book simulations that capture transient price impact as well as nonlinear and dynamic order flow responses. Methodologically, we train a Double Deep Q-Network agent on a state space comprising time, inventory, price, and depth variables, and evaluate its performance against established benchmarks. Numerical simulation results show that the agent learns a policy that is both strategic and tactical, adapting effectively to order book conditions and outperforming standard approaches across multiple training configurations. These findings provide strong evidence that model-free Reinforcement Learning can yield adaptive and robust solutions to the optimal execution problem.
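
For readers unfamiliar with the two quantities named in the abstract, the following is a minimal sketch, not the authors' implementation. It shows the standard Double DQN bootstrap target (the online network picks the next action, the target network scores it) and a simple implementation shortfall for a sell program. The function names, the discount factor `gamma`, the array layout, and the sell-side sign convention are all assumptions made for illustration; the abstract does not specify the reward, network architecture, or hyperparameters.

```python
import numpy as np


def double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Standard Double DQN targets for a batch of transitions.

    q_online_next and q_target_next have shape (batch, n_actions) and hold
    the next-state Q-values of the online and target networks (hypothetical
    inputs; any function approximator could produce them).
    """
    rewards = np.asarray(rewards, dtype=float)
    dones = np.asarray(dones, dtype=float)
    q_online_next = np.asarray(q_online_next, dtype=float)
    q_target_next = np.asarray(q_target_next, dtype=float)

    # Action selection uses the online network ...
    best_actions = np.argmax(q_online_next, axis=1)
    # ... while action evaluation uses the target network.
    evaluated = q_target_next[np.arange(len(best_actions)), best_actions]

    # No bootstrapping from terminal states (dones == 1).
    return rewards + gamma * (1.0 - dones) * evaluated


def implementation_shortfall(arrival_price, fill_prices, fill_sizes):
    """Shortfall of a sell program relative to the arrival price.

    Assumed convention: positive values mean the execution raised less cash
    than selling the whole quantity at the arrival price would have.
    """
    fill_prices = np.asarray(fill_prices, dtype=float)
    fill_sizes = np.asarray(fill_sizes, dtype=float)
    benchmark_cash = arrival_price * fill_sizes.sum()
    realized_cash = float(np.dot(fill_prices, fill_sizes))
    return benchmark_cash - realized_cash


if __name__ == "__main__":
    targets = double_dqn_targets(
        rewards=[1.0, 2.0],
        dones=[0, 1],
        q_online_next=[[1.0, 3.0], [5.0, 2.0]],
        q_target_next=[[10.0, 20.0], [30.0, 40.0]],
        gamma=0.5,
    )
    print(targets)
    print(implementation_shortfall(100.0, [99.0, 98.0], [10, 10]))
```

Decoupling action selection from action evaluation is what distinguishes Double DQN from plain DQN and is intended to reduce overestimation of Q-values; how the paper maps time, inventory, price, and depth into the state vector is not covered by this sketch.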

Suggested Citation

  • Tomas Espana & Yadh Hafsi & Fabrizio Lillo & Edoardo Vittori, 2025. "Reinforcement Learning in Queue-Reactive Models: Application to Optimal Execution," Papers 2511.15262, arXiv.org.
  • Handle: RePEc:arx:papers:2511.15262

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2511.15262
    File Function: Latest version
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jean-Philippe Bouchaud & Julien Kockelkoren & Marc Potters, 2006. "Random walks, liquidity molasses and critical response in financial markets," Quantitative Finance, Taylor & Francis Journals, vol. 6(2), pages 115-123.
    2. Eyal Neuman & Alexander Schied, 2018. "Protecting Pegged Currency Markets from Speculative Investors," Papers 1801.07784, arXiv.org, revised Feb 2021.
    3. Xiaoyue Li & John M. Mulvey, 2023. "Optimal Portfolio Execution in a Regime-switching Market with Non-linear Impact Costs: Combining Dynamic Program and Neural Network," Papers 2306.08809, arXiv.org.
    4. Olivier Guéant & Charles-Albert Lehalle, 2015. "General Intensity Shapes In Optimal Liquidation," Mathematical Finance, Wiley Blackwell, vol. 25(3), pages 457-495, July.
    5. Claudio Bellani & Damiano Brigo, 2021. "Mechanics of good trade execution in the framework of linear temporary market impact," Quantitative Finance, Taylor & Francis Journals, vol. 21(1), pages 143-163, January.
    6. Fengpei Li & Vitalii Ihnatiuk & Ryan Kinnear & Anderson Schneider & Yuriy Nevmyvaka, 2022. "Do price trajectory data increase the efficiency of market impact estimation?," Papers 2205.13423, arXiv.org, revised Mar 2023.
    7. Nicolas Huth & Frédéric Abergel, 2012. "The times change: multivariate subordination, empirical facts," Post-Print hal-00620841, HAL.
    8. Christopher Lorenz & Alexander Schied, 2013. "Drift dependence of optimal trade execution strategies under transient price impact," Finance and Stochastics, Springer, vol. 17(4), pages 743-770, October.
    9. Karol Wawrzyniak & Wojciech Wiślicki, 2013. "Grand canonical minority game as a sign predictor," Papers 1309.3399, arXiv.org.
    10. Yan Dolinsky & Doron Greenstein, 2024. "A Note on Optimal Liquidation with Linear Price Impact," Papers 2402.14100, arXiv.org, revised Aug 2024.
    11. Aurélien Alfonsi & Antje Fruth & Alexander Schied, 2007. "Optimal execution strategies in limit order books with general shape functions," Papers 0708.1756, arXiv.org, revised Feb 2010.
    12. Rémy Chicheportiche & Jean-Philippe Bouchaud, 2012. "The fine-structure of volatility feedback I: multi-scale self-reflexivity," Papers 1206.2153, arXiv.org, revised Sep 2013.
    13. Cesari, Riccardo & Marzo, Massimiliano & Zagaglia, Paolo, 2012. "Effective Trade Execution," MPRA Paper 39619, University Library of Munich, Germany.
    14. Svitlana Vyetrenko & David Byrd & Nick Petosa & Mahmoud Mahfouz & Danial Dervovic & Manuela Veloso & Tucker Hybinette Balch, 2019. "Get Real: Realism Metrics for Robust Limit Order Book Market Simulations," Papers 1912.04941, arXiv.org.
    15. Guanxing Fu & Ulrich Horst & Xiaonyu Xia, 2022. "Portfolio liquidation games with self‐exciting order flow," Mathematical Finance, Wiley Blackwell, vol. 32(4), pages 1020-1065, October.
    16. Francesco Cordoni & Fabrizio Lillo, 2022. "Transient impact from the Nash equilibrium of a permanent market impact game," Papers 2205.00494, arXiv.org, revised Mar 2023.
    17. Aurélien Alfonsi & Alexander Schied, 2010. "Optimal trade execution and absence of price manipulations in limit order book models," Post-Print hal-00397652, HAL.
    18. Damiano Brigo & Giuseppe Di Graziano, 2013. "Optimal execution comparison across risks and dynamics, with solutions for displaced diffusions," Papers 1304.2942, arXiv.org, revised May 2014.
    19. Olivier Guedj & Jean-Philippe Bouchaud, 2004. "Experts' earning forecasts: bias, herding and gossamer information," Science & Finance (CFM) working paper archive 500062, Science & Finance, Capital Fund Management.
    20. Adam Blazejewski & Richard Coggins, 2004. "A piecewise linear model for trade sign inference," Finance 0412012, University Library of Munich, Germany.
