
Recent Advances in Reinforcement Learning in Finance

Author

Listed:
  • Ben Hambly
  • Renyuan Xu
  • Huining Yang

Abstract

The rapid changes in the finance industry due to the increasing amount of data have revolutionized the techniques for data processing and data analysis and brought new theoretical and computational challenges. In contrast to classical stochastic control theory and other analytical approaches for solving financial decision-making problems, which rely heavily on model assumptions, new developments from reinforcement learning (RL) are able to make full use of the large amount of financial data with fewer model assumptions and to improve decisions in complex financial environments. This survey paper aims to review the recent developments and use of RL approaches in finance. We give an introduction to Markov decision processes, which are the setting for many of the commonly used RL approaches. Various algorithms are then introduced, with a focus on value-based and policy-based methods that do not require any model assumptions. Connections are made with neural networks to extend the framework to encompass deep RL algorithms. Our survey concludes by discussing the application of these RL algorithms to a variety of decision-making problems in finance, including optimal execution, portfolio optimization, option pricing and hedging, market making, smart order routing, and robo-advising.
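
As a rough illustration of what a model-free, value-based method on a Markov decision process can look like, the sketch below runs tabular Q-learning on a toy optimal-execution problem. It is not taken from the paper: the horizon `T`, the inventory `Q0`, the price `PRICE`, the linear temporary impact `IMPACT`, and all function names are hypothetical choices made only for this example. The agent never reads the reward formula; it learns only from sampled transitions.

```python
import random

# Hypothetical toy problem (not from the paper): sell Q0 shares over T steps.
T, Q0 = 3, 3
PRICE, IMPACT = 10.0, 1.0  # assumed price and temporary impact per share sold

def actions(t, q):
    """Shares that may be sold at step t with q left; the last step must liquidate."""
    return [q] if t == T - 1 else list(range(q + 1))

def step(t, q, a):
    """Environment: selling a shares earns a * (PRICE - IMPACT * a)."""
    reward = a * (PRICE - IMPACT * a)
    return reward, t + 1, q - a

def train(episodes=5000, alpha=0.5, eps=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration and no discounting."""
    rng = random.Random(seed)
    Q = {(t, q): {a: 0.0 for a in actions(t, q)}
         for t in range(T) for q in range(Q0 + 1)}
    for _ in range(episodes):
        t, q = 0, Q0
        while t < T:
            acts = actions(t, q)
            if rng.random() < eps:
                a = rng.choice(acts)                       # explore
            else:
                a = max(acts, key=lambda x: Q[(t, q)][x])  # exploit
            r, t2, q2 = step(t, q, a)
            # Bootstrapped target: reward plus best estimated value of the next state.
            future = 0.0 if t2 == T else max(Q[(t2, q2)].values())
            Q[(t, q)][a] += alpha * (r + future - Q[(t, q)][a])
            t, q = t2, q2
    return Q

def greedy_schedule(Q):
    """Follow the learned greedy policy from the initial state."""
    t, q, out = 0, Q0, []
    while t < T:
        a = max(Q[(t, q)], key=Q[(t, q)].get)
        out.append(a)
        t, q = t + 1, q - a
    return out

Q = train()
schedule = greedy_schedule(Q)
print(schedule)
```

In this toy setting total revenue equals 30 minus the sum of squared trade sizes, so the even split of one share per step is the unique maximiser, and the learned greedy schedule should recover it. Realistic execution problems have stochastic prices and far larger state spaces, which is where function approximation and deep RL enter.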

Suggested Citation

  • Ben Hambly & Renyuan Xu & Huining Yang, 2021. "Recent Advances in Reinforcement Learning in Finance," Papers 2112.04553, arXiv.org, revised Feb 2023.
  • Handle: RePEc:arx:papers:2112.04553

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2112.04553
    File Function: Latest version
    Download Restriction: no

    Citations

    Cited by:

    1. Woosung Koh & Insu Choi & Yuntae Jang & Gimin Kang & Woo Chang Kim, 2023. "Curriculum Learning and Imitation Learning for Model-free Control on Financial Time-series," Papers 2311.13326, arXiv.org, revised Jan 2024.
    2. Xianhua Peng & Chenyin Gong & Xue Dong He, 2023. "Reinforcement Learning for Financial Index Tracking," Papers 2308.02820, arXiv.org.
    3. Reilly Pickard & Yuri Lawryshyn, 2023. "Deep Reinforcement Learning for Dynamic Stock Option Hedging: A Review," Mathematics, MDPI, vol. 11(24), pages 1-19, December.
    4. Xiangyu Cui & Xun Li & Yun Shi & Si Zhao, 2023. "Discrete-Time Mean-Variance Strategy Based on Reinforcement Learning," Papers 2312.15385, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ben Hambly & Renyuan Xu & Huining Yang, 2023. "Recent advances in reinforcement learning in finance," Mathematical Finance, Wiley Blackwell, vol. 33(3), pages 437-503, July.
    2. Bruno Gašperov & Stjepan Begušić & Petra Posedel Šimović & Zvonko Kostanjčar, 2021. "Reinforcement Learning Approaches to Optimal Market Making," Mathematics, MDPI, vol. 9(21), pages 1-22, October.
    3. Shuo Sun & Rundong Wang & Bo An, 2021. "Reinforcement Learning for Quantitative Trading," Papers 2109.13851, arXiv.org.
    4. Zoran Stoiljkovic, 2023. "Applying Reinforcement Learning to Option Pricing and Hedging," Papers 2310.04336, arXiv.org.
    5. Alexandre Carbonneau & Frédéric Godin, 2021. "Deep equal risk pricing of financial derivatives with non-translation invariant risk measures," Papers 2107.11340, arXiv.org.
    6. Bastien Baldacci & Jerome Benveniste & Gordon Ritter, 2020. "Optimal trading without optimal control," Papers 2012.12945, arXiv.org.
    7. Olivier Guéant, 2016. "The Financial Mathematics of Market Liquidity: From Optimal Execution to Market Making," Post-Print hal-01393136, HAL.
    8. Bruno Gašperov & Zvonko Kostanjčar, 2022. "Deep Reinforcement Learning for Market Making Under a Hawkes Process-Based Limit Order Book Model," Papers 2207.09951, arXiv.org.
    9. Bastien Baldacci & Iuliia Manziuk, 2020. "Adaptive trading strategies across liquidity pools," Papers 2008.07807, arXiv.org.
    10. Pankaj Kumar, 2021. "Deep Hawkes Process for High-Frequency Market Making," Papers 2109.15110, arXiv.org.
    11. Jiafa He & Cong Zheng & Can Yang, 2023. "Integrating Tick-level Data and Periodical Signal for High-frequency Market Making," Papers 2306.17179, arXiv.org.
    12. Lim, Terence & Lo, Andrew W. & Merton, Robert C. & Scholes, Myron S., 2006. "The Derivatives Sourcebook," Foundations and Trends(R) in Finance, now publishers, vol. 1(5–6), pages 365-572, April.
    13. Duffie, Darrell, 2003. "Intertemporal asset pricing theory," Handbook of the Economics of Finance, in: G.M. Constantinides & M. Harris & R. M. Stulz (ed.), Handbook of the Economics of Finance, edition 1, volume 1, chapter 11, pages 639-742, Elsevier.
    14. Bastien Baldacci & Philippe Bergault & Olivier Guéant, 2019. "Algorithmic market making for options," Papers 1907.12433, arXiv.org, revised Jul 2020.
    15. Xiaoyu Tan & Zili Zhang & Xuejun Zhao & Shuyi Wang, 2022. "DeepPricing: pricing convertible bonds based on financial time-series generative adversarial networks," Financial Innovation, Springer;Southwestern University of Finance and Economics, vol. 8(1), pages 1-38, December.
    16. Mark Broadie & Jerome B. Detemple, 2004. "ANNIVERSARY ARTICLE: Option Pricing: Valuation Models and Applications," Management Science, INFORMS, vol. 50(9), pages 1145-1177, September.
    17. Bastien Baldacci & Philippe Bergault & Dylan Possamai, 2022. "A mean-field game of market-making against strategic traders," Papers 2203.13053, arXiv.org.
    18. Thomas Spooner & Rahul Savani, 2020. "Robust Market Making via Adversarial Reinforcement Learning," Papers 2003.01820, arXiv.org, revised Jul 2020.
    19. Duy Nguyen, 2018. "A hybrid Markov chain-tree valuation framework for stochastic volatility jump diffusion models," International Journal of Financial Engineering (IJFE), World Scientific Publishing Co. Pte. Ltd., vol. 5(04), pages 1-30, December.
    20. Li, Chenxu & Ye, Yongxin, 2019. "Pricing and Exercising American Options: an Asymptotic Expansion Approach," Journal of Economic Dynamics and Control, Elsevier, vol. 107(C), pages 1-1.
