Printed from https://ideas.repec.org/a/bla/mathfi/v33y2023i3p437-503.html

Recent advances in reinforcement learning in finance

Author

Listed:
  • Ben Hambly
  • Renyuan Xu
  • Huining Yang

Abstract

The rapid changes in the finance industry due to the increasing amount of data have revolutionized the techniques for data processing and data analysis and brought new theoretical and computational challenges. In contrast to classical stochastic control theory and other analytical approaches for solving financial decision‐making problems that heavily rely on model assumptions, new developments from reinforcement learning (RL) are able to make full use of the large amount of financial data with fewer model assumptions and to improve decisions in complex financial environments. This survey paper aims to review the recent developments and use of RL approaches in finance. We give an introduction to Markov decision processes, which is the setting for many of the commonly used RL approaches. Various algorithms are then introduced with a focus on value‐ and policy‐based methods that do not require any model assumptions. Connections are made with neural networks to extend the framework to encompass deep RL algorithms. We then discuss in detail the application of these RL algorithms in a variety of decision‐making problems in finance, including optimal execution, portfolio optimization, option pricing and hedging, market making, smart order routing, and robo‐advising. Our survey concludes by pointing out a few possible future directions for research.

Suggested Citation

  • Ben Hambly & Renyuan Xu & Huining Yang, 2023. "Recent advances in reinforcement learning in finance," Mathematical Finance, Wiley Blackwell, vol. 33(3), pages 437-503, July.
  • Handle: RePEc:bla:mathfi:v:33:y:2023:i:3:p:437-503
    DOI: 10.1111/mafi.12382

    Download full text from publisher

    File URL: https://doi.org/10.1111/mafi.12382
    Download Restriction: no


    References listed on IDEAS

    1. Longstaff, Francis A & Schwartz, Eduardo S, 2001. "Valuing American Options by Simulation: A Simple Least-Squares Approach," The Review of Financial Studies, Society for Financial Studies, vol. 14(1), pages 113-147.
    2. Olivier Guéant & Charles-Albert Lehalle & Joaquin Fernandez Tapia, 2011. "Dealing with the Inventory Risk. A solution to the market making problem," Papers 1105.3115, arXiv.org, revised Aug 2012.
    3. Thomas Spooner & Rahul Savani, 2020. "Robust Market Making via Adversarial Reinforcement Learning," Papers 2003.01820, arXiv.org, revised Jul 2020.
    4. T. Tony Ke & Zuo-Jun Max Shen & J. Miguel Villas-Boas, 2016. "Search for Information on Multiple Products," Management Science, INFORMS, vol. 62(12), pages 3576-3603, December.
    5. R. H. Strotz, 1955. "Myopia and Inconsistency in Dynamic Utility Maximization," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 23(3), pages 165-180.
    6. Olivier Guéant & Charles-Albert Lehalle & Joaquin Fernandez Tapia, 2011. "Optimal Portfolio Liquidation with Limit Orders," Papers 1106.3279, arXiv.org, revised Jul 2012.
    7. Leland, Hayne E, 1985. "Option Pricing and Replication with Transactions Costs," Journal of Finance, American Finance Association, vol. 40(5), pages 1283-1301, December.
    8. Alexandre Carbonneau & Frédéric Godin, 2021. "Deep Equal Risk Pricing of Financial Derivatives with Multiple Hedging Instruments," Papers 2102.12694, arXiv.org.
    9. Merton, Robert C. & Samuelson, Paul A., 1974. "Fallacy of the log-normal approximation to optimal portfolio decision-making over many periods," Journal of Financial Economics, Elsevier, vol. 1(1), pages 67-94, May.
    10. Bastien Baldacci & Iuliia Manziuk & Thibaut Mastrolia & Mathieu Rosenbaum, 2019. "Market making and incentives design in the presence of a dark pool: a deep reinforcement learning approach," Papers 1912.01129, arXiv.org.
    11. Robert C. Merton, 2005. "Theory of rational option pricing," World Scientific Book Chapters, in: Sudipto Bhattacharya & George M Constantinides (ed.), Theory Of Valuation, chapter 8, pages 229-288, World Scientific Publishing Co. Pte. Ltd.
    12. Anirban Chakraborti & Ioane Muni Toke & Marco Patriarca & Frederic Abergel, 2011. "Econophysics review: I. Empirical facts," Quantitative Finance, Taylor & Francis Journals, vol. 11(7), pages 991-1012.
    13. Mark Broadie & Jerome B. Detemple, 2004. "ANNIVERSARY ARTICLE: Option Pricing: Valuation Models and Applications," Management Science, INFORMS, vol. 50(9), pages 1145-1177, September.
    14. Wenhang Bao & Xiao-yang Liu, 2019. "Multi-Agent Deep Reinforcement Learning for Liquidation Strategy Analysis," Papers 1906.11046, arXiv.org.
    15. Heston, Steven L, 1993. "A Closed-Form Solution for Options with Stochastic Volatility with Applications to Bond and Currency Options," The Review of Financial Studies, Society for Financial Studies, vol. 6(2), pages 327-343.
    16. Figlewski, Stephen, 1989. "Options Arbitrage in Imperfect Markets," Journal of Finance, American Finance Association, vol. 44(5), pages 1289-1311, December.
    17. Rama Cont & Arseniy Kukanov, 2017. "Optimal order placement in limit order markets," Quantitative Finance, Taylor & Francis Journals, vol. 17(1), pages 21-39, January.
    18. Olivier Guéant & Iuliia Manziuk, 2019. "Deep Reinforcement Learning for Market Making in Corporate Bonds: Beating the Curse of Dimensionality," Applied Mathematical Finance, Taylor & Francis Journals, vol. 26(5), pages 387-452, September.
    19. Bastien Baldacci & Iuliia Manziuk, 2020. "Adaptive trading strategies across liquidity pools," Papers 2008.07807, arXiv.org.
    20. Igor Halperin, 2019. "The QLBS Q-Learner goes NuQLear: fitted Q iteration, inverse RL, and option portfolios," Quantitative Finance, Taylor & Francis Journals, vol. 19(9), pages 1543-1553, September.
    21. Terry Lingze Meng & Matloob Khushi, 2019. "Reinforcement Learning in Financial Markets," Data, MDPI, vol. 4(3), pages 1-17, July.
    22. Zhengyao Jiang & Dixing Xu & Jinjun Liang, 2017. "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem," Papers 1706.10059, arXiv.org, revised Jul 2017.
    23. Zhipeng Liang & Hao Chen & Junhao Zhu & Kangkang Jiang & Yanran Li, 2018. "Adversarial Deep Reinforcement Learning in Portfolio Management," Papers 1808.09940, arXiv.org, revised Nov 2018.
    24. Yanwei Jia & Xun Yu Zhou, 2021. "Policy Evaluation and Temporal-Difference Learning in Continuous Time and Space: A Martingale Approach," Papers 2108.06655, arXiv.org, revised Feb 2022.
    25. Jay Cao & Jacky Chen & John Hull & Zissis Poulos, 2021. "Deep Hedging of Derivatives Using Reinforcement Learning," Papers 2103.16409, arXiv.org.
    26. Olivier Guéant & Iuliia Manziuk, 2019. "Deep Reinforcement Learning for Market Making in Corporate Bonds: Beating the Curse of Dimensionality," Post-Print hal-03252505, HAL.
    27. Obizhaeva, Anna A. & Wang, Jiang, 2013. "Optimal trading strategy and supply/demand dynamics," Journal of Financial Markets, Elsevier, vol. 16(1), pages 1-32.
    28. Magnus Wiese & Robert Knobloch & Ralf Korn & Peter Kretschmer, 2020. "Quant GANs: deep generation of financial time series," Quantitative Finance, Taylor & Francis Journals, vol. 20(9), pages 1419-1440, September.
    29. Ben Hambly & Renyuan Xu & Huining Yang, 2020. "Policy Gradient Methods for the Noisy Linear Quadratic Regulator over a Finite Horizon," Papers 2011.10300, arXiv.org, revised Jun 2021.
    30. R. Cont, 2001. "Empirical properties of asset returns: stylized facts and statistical issues," Quantitative Finance, Taylor & Francis Journals, vol. 1(2), pages 223-236.
    31. Olivier Guéant & Iuliia Manziuk, 2019. "Deep reinforcement learning for market making in corporate bonds: beating the curse of dimensionality," Papers 1910.13205, arXiv.org.
    32. Anirban Chakraborti & Ioane Muni Toke & Marco Patriarca & Frédéric Abergel, 2011. "Econophysics review: I. Empirical facts," Post-Print hal-00621058, HAL.
    33. Black, Fischer & Scholes, Myron S, 1973. "The Pricing of Options and Corporate Liabilities," Journal of Political Economy, University of Chicago Press, vol. 81(3), pages 637-654, May-June.
    34. Matthew Dixon & Igor Halperin, 2020. "G-Learner and GIRL: Goal Based Wealth Management with Reinforcement Learning," Papers 2002.10990, arXiv.org.
    35. Alexandre Carbonneau & Frédéric Godin, 2021. "Equal risk pricing of derivatives with deep hedging," Quantitative Finance, Taylor & Francis Journals, vol. 21(4), pages 593-608, April.
    36. Olivier Guéant & Iuliia Manziuk, 2019. "Deep Reinforcement Learning for Market Making in Corporate Bonds: Beating the Curse of Dimensionality," Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) hal-03252505, HAL.
    37. Haoran Wang & Shi Yu, 2021. "Robo-Advising: Enhancing Investment with Inverse Optimization and Deep Reinforcement Learning," Papers 2105.09264, arXiv.org.
    38. Sumitra Ganesh & Nelson Vadori & Mengda Xu & Hua Zheng & Prashant Reddy & Manuela Veloso, 2019. "Reinforcement Learning for Market Making in a Multi-agent Dealer Market," Papers 1911.05892, arXiv.org.
    39. MOSSIN, Jan, 1968. "Optimal multiperiod portfolio policies," LIDAM Reprints CORE 19, Université catholique de Louvain, Center for Operations Research and Econometrics (CORE).
    40. Amir Mosavi & Pedram Ghamisi & Yaser Faghan & Puhong Duan, 2020. "Comprehensive Review of Deep Reinforcement Learning Methods and Applications in Economics," Papers 2004.01509, arXiv.org.
    41. Shuo Sun & Rundong Wang & Bo An, 2021. "Reinforcement Learning for Quantitative Trading," Papers 2109.13851, arXiv.org.
    42. Mosavi, Amir & Faghan, Yaser & Ghamisi, Pedram & Duan, Puhong & Ardabili, Sina Faizollahzadeh & Hassan, Salwana & Band, Shahab S., 2020. "Comprehensive Review of Deep Reinforcement Learning Methods and Applications in Economics," OSF Preprints jrc58, Center for Open Science.
    43. Susanne Klöppel & Martin Schweizer, 2007. "Dynamic Indifference Valuation Via Convex Risk Measures," Mathematical Finance, Wiley Blackwell, vol. 17(4), pages 599-627, October.
    44. Thomas Spooner & John Fearnley & Rahul Savani & Andreas Koukorinis, 2018. "Market Making via Reinforcement Learning," Papers 1804.04216, arXiv.org.
    45. Cox, John C. & Ross, Stephen A. & Rubinstein, Mark, 1979. "Option pricing: A simplified approach," Journal of Financial Economics, Elsevier, vol. 7(3), pages 229-263, September.
    46. Amirhosein Mosavi & Yaser Faghan & Pedram Ghamisi & Puhong Duan & Sina Faizollahzadeh Ardabili & Ely Salwana & Shahab S. Band, 2020. "Comprehensive Review of Deep Reinforcement Learning Methods and Applications in Economics," Mathematics, MDPI, vol. 8(10), pages 1-42, September.
    47. Longstaff, Francis A & Schwartz, Eduardo S, 2001. "Valuing American Options by Simulation: A Simple Least-Squares Approach," University of California at Los Angeles, Anderson Graduate School of Management qt43n1k4jb, Anderson Graduate School of Management, UCLA.

    Citations

    Citations are extracted by the CitEc Project.


    Cited by:

    1. Woosung Koh & Insu Choi & Yuntae Jang & Gimin Kang & Woo Chang Kim, 2023. "Curriculum Learning and Imitation Learning for Model-free Control on Financial Time-series," Papers 2311.13326, arXiv.org, revised Jan 2024.
    2. Xianhua Peng & Chenyin Gong & Xue Dong He, 2023. "Reinforcement Learning for Financial Index Tracking," Papers 2308.02820, arXiv.org.
    3. Xiangyu Cui & Xun Li & Yun Shi & Si Zhao, 2023. "Discrete-Time Mean-Variance Strategy Based on Reinforcement Learning," Papers 2312.15385, arXiv.org.
    4. Reilly Pickard & Yuri Lawryshyn, 2023. "Deep Reinforcement Learning for Dynamic Stock Option Hedging: A Review," Mathematics, MDPI, vol. 11(24), pages 1-19, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ben Hambly & Renyuan Xu & Huining Yang, 2021. "Recent Advances in Reinforcement Learning in Finance," Papers 2112.04553, arXiv.org, revised Feb 2023.
    2. Bruno Gašperov & Stjepan Begušić & Petra Posedel Šimović & Zvonko Kostanjčar, 2021. "Reinforcement Learning Approaches to Optimal Market Making," Mathematics, MDPI, vol. 9(21), pages 1-22, October.
    3. Shuo Sun & Rundong Wang & Bo An, 2021. "Reinforcement Learning for Quantitative Trading," Papers 2109.13851, arXiv.org.
    4. Zoran Stoiljkovic, 2023. "Applying Reinforcement Learning to Option Pricing and Hedging," Papers 2310.04336, arXiv.org.
    5. Bruno Gašperov & Zvonko Kostanjčar, 2022. "Deep Reinforcement Learning for Market Making Under a Hawkes Process-Based Limit Order Book Model," Papers 2207.09951, arXiv.org.
    6. Pankaj Kumar, 2021. "Deep Hawkes Process for High-Frequency Market Making," Papers 2109.15110, arXiv.org.
    7. Xiaoyu Tan & Zili Zhang & Xuejun Zhao & Shuyi Wang, 2022. "DeepPricing: pricing convertible bonds based on financial time-series generative adversarial networks," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 8(1), pages 1-38, December.
    8. Alexandre Carbonneau & Frédéric Godin, 2021. "Deep equal risk pricing of financial derivatives with non-translation invariant risk measures," Papers 2107.11340, arXiv.org.
    9. Lim, Terence & Lo, Andrew W. & Merton, Robert C. & Scholes, Myron S., 2006. "The Derivatives Sourcebook," Foundations and Trends® in Finance, now publishers, vol. 1(5–6), pages 365-572, April.
    10. Bastien Baldacci & Jerome Benveniste & Gordon Ritter, 2020. "Optimal trading without optimal control," Papers 2012.12945, arXiv.org.
    11. Duffie, Darrell, 2003. "Intertemporal asset pricing theory," Handbook of the Economics of Finance, in: G.M. Constantinides & M. Harris & R. M. Stulz (ed.), Handbook of the Economics of Finance, edition 1, volume 1, chapter 11, pages 639-742, Elsevier.
    12. Olivier Guéant, 2016. "The Financial Mathematics of Market Liquidity: From Optimal Execution to Market Making," Post-Print hal-01393136, HAL.
    13. Bastien Baldacci & Iuliia Manziuk, 2020. "Adaptive trading strategies across liquidity pools," Papers 2008.07807, arXiv.org.
    14. Mark Broadie & Jerome B. Detemple, 2004. "ANNIVERSARY ARTICLE: Option Pricing: Valuation Models and Applications," Management Science, INFORMS, vol. 50(9), pages 1145-1177, September.
    15. Duy Nguyen, 2018. "A hybrid Markov chain-tree valuation framework for stochastic volatility jump diffusion models," International Journal of Financial Engineering (IJFE), World Scientific Publishing Co. Pte. Ltd., vol. 5(04), pages 1-30, December.
    16. Li, Chenxu & Ye, Yongxin, 2019. "Pricing and Exercising American Options: an Asymptotic Expansion Approach," Journal of Economic Dynamics and Control, Elsevier, vol. 107(C), pages 1-1.
    17. Chen, Ding & Härkönen, Hannu J. & Newton, David P., 2014. "Advancing the universality of quadrature methods to any underlying process for option pricing," Journal of Financial Economics, Elsevier, vol. 114(3), pages 600-612.
    18. Katarzyna Toporek, 2012. "Simple is better. Empirical comparison of American option valuation methods," Ekonomia journal, Faculty of Economic Sciences, University of Warsaw, vol. 29.
    19. Francesco Rotondi, 2019. "American Options on High Dividend Securities: A Numerical Investigation," Risks, MDPI, vol. 7(2), pages 1-20, May.
    20. Zafar Ahmad & Reilly Browne & Rezaul Chowdhury & Rathish Das & Yushen Huang & Yimin Zhu, 2023. "Fast American Option Pricing using Nonlinear Stencils," Papers 2303.02317, arXiv.org, revised Oct 2023.
