Printed from https://ideas.repec.org/a/bla/mathfi/v30y2020i4p1273-1308.html

Continuous‐time mean–variance portfolio selection: A reinforcement learning framework

Author

Listed:
  • Haoran Wang
  • Xun Yu Zhou

Abstract

We approach the continuous‐time mean–variance portfolio selection with reinforcement learning (RL). The problem is to achieve the best trade‐off between exploration and exploitation, and is formulated as an entropy‐regularized, relaxed stochastic control problem. We prove that the optimal feedback policy for this problem must be Gaussian, with time‐decaying variance. We then prove a policy improvement theorem, based on which we devise an implementable RL algorithm. We find that our algorithm and its variant outperform both traditional and deep neural network based algorithms in our simulation and empirical studies.
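As a rough illustration of what "Gaussian, with time-decaying variance" can mean in practice, the sketch below samples actions from a Gaussian feedback policy whose variance shrinks as time approaches the horizon. The abstract does not state the policy's functional form, so the mean and variance formulas here, along with every name and parameter value (`lam`, `sigma`, `rho`, `w`, `T`), are assumptions chosen for illustration and should not be read as the authors' result or algorithm.

```python
import math
import random


def policy_variance(t, T, lam, sigma, rho):
    """Assumed exploration variance: (lam / (2 sigma^2)) * exp(rho^2 (T - t)).

    It decreases as t approaches the horizon T, so exploration fades over time.
    """
    return lam / (2.0 * sigma ** 2) * math.exp(rho ** 2 * (T - t))


def policy_mean(x, w, sigma, rho):
    """Assumed policy mean: linear feedback on the gap between wealth x and a target w."""
    return -(rho / sigma) * (x - w)


def sample_action(t, x, w, T=1.0, lam=2.0, sigma=0.2, rho=0.5, rng=random):
    """Draw one action from the Gaussian policy at time t and wealth x."""
    mean = policy_mean(x, w, sigma, rho)
    std = math.sqrt(policy_variance(t, T, lam, sigma, rho))
    return rng.gauss(mean, std)


def gaussian_entropy(variance):
    """Differential entropy of a one-dimensional Gaussian with the given variance."""
    return 0.5 * math.log(2.0 * math.pi * math.e * variance)


if __name__ == "__main__":
    random.seed(0)
    for t in (0.0, 0.5, 1.0):
        var = policy_variance(t, T=1.0, lam=2.0, sigma=0.2, rho=0.5)
        action = sample_action(t, x=1.0, w=1.5)
        print(f"t={t:.1f}  variance={var:.3f}  "
              f"entropy={gaussian_entropy(var):.3f}  sample={action:.3f}")
```

In this sketch the temperature-like parameter `lam` scales how much the policy explores: a larger value widens the Gaussian at every time, while the exponential factor makes the spread, and hence the entropy, fall as `t` moves toward `T`.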

Suggested Citation

  • Haoran Wang & Xun Yu Zhou, 2020. "Continuous‐time mean–variance portfolio selection: A reinforcement learning framework," Mathematical Finance, Wiley Blackwell, vol. 30(4), pages 1273-1308, October.
  • Handle: RePEc:bla:mathfi:v:30:y:2020:i:4:p:1273-1308
    DOI: 10.1111/mafi.12281

    Download full text from publisher

    File URL: https://doi.org/10.1111/mafi.12281
    Download Restriction: no


    References listed on IDEAS

    1. R. H. Strotz, 1955. "Myopia and Inconsistency in Dynamic Utility Maximization," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 23(3), pages 165-180.
    2. Hutchinson, James M & Lo, Andrew W & Poggio, Tomaso, 1994. "A Nonparametric Approach to Pricing and Hedging Derivative Securities via Learning Networks," Journal of Finance, American Finance Association, vol. 49(3), pages 851-889, July.
    3. Haoran Wang, 2019. "Large scale continuous-time mean-variance portfolio allocation via reinforcement learning," Papers 1907.11718, arXiv.org, revised Aug 2019.
    4. Duan Li & Wan‐Lung Ng, 2000. "Optimal Dynamic Portfolio Selection: Multiperiod Mean‐Variance Formulation," Mathematical Finance, Wiley Blackwell, vol. 10(3), pages 387-406, July.
    5. Mannor, Shie & Tsitsiklis, John N., 2013. "Algorithmic aspects of mean–variance optimization in Markov decision processes," European Journal of Operational Research, Elsevier, vol. 231(3), pages 645-653.
    6. David Silver & Aja Huang & Chris J. Maddison & Arthur Guez & Laurent Sifre & George van den Driessche & Julian Schrittwieser & Ioannis Antonoglou & Veda Panneershelvam & Marc Lanctot & Sander Dieleman, 2016. "Mastering the game of Go with deep neural networks and tree search," Nature, Nature, vol. 529(7587), pages 484-489, January.

    Citations

    Cited by:

    1. Xiaofei Shi & Daran Xu & Zhanhao Zhang, 2021. "Deep Learning Algorithms for Hedging with Frictions," Papers 2111.01931, arXiv.org, revised Dec 2022.
    2. Xiaofei Shi & Daran Xu & Zhanhao Zhang, 2023. "Deep learning algorithms for hedging with frictions," Digital Finance, Springer, vol. 5(1), pages 113-147, March.
    3. Min Dai & Yuchao Dong & Yanwei Jia & Xun Yu Zhou, 2023. "Learning Merton's Strategies in an Incomplete Market: Recursive Entropy Regularization and Biased Gaussian Exploration," Papers 2312.11797, arXiv.org.
    4. Xia Han & Ruodu Wang & Xun Yu Zhou, 2022. "Choquet regularization for reinforcement learning," Papers 2208.08497, arXiv.org.
    5. Xiangyu Cui & Xun Li & Yun Shi & Si Zhao, 2023. "Discrete-Time Mean-Variance Strategy Based on Reinforcement Learning," Papers 2312.15385, arXiv.org.
    6. Wing Fung Chong & Haoen Cui & Yuxuan Li, 2021. "Pseudo-Model-Free Hedging for Variable Annuities via Deep Reinforcement Learning," Papers 2107.03340, arXiv.org, revised Oct 2022.
    7. De Gennaro Aquino, Luca & Sornette, Didier & Strub, Moris S., 2023. "Portfolio selection with exploration of new investment assets," European Journal of Operational Research, Elsevier, vol. 310(2), pages 773-792.
    8. Magni, Carlo Alberto & Marchioni, Andrea & Baschieri, Davide, 2023. "The Attribution Matrix and the joint use of Finite Change Sensitivity Index and Residual Income for value-based performance measurement," European Journal of Operational Research, Elsevier, vol. 306(2), pages 872-892.
    9. Min Dai & Hanqing Jin & Xi Yang, 2024. "Data-driven Option Pricing," Papers 2401.11158, arXiv.org.
    10. Carlo Alberto Magni & Andrea Marchioni, 2022. "Performance attribution, time-weighted rate of return, and clean finite change sensitivity index," Journal of Asset Management, Palgrave Macmillan, vol. 23(1), pages 62-72, February.
    11. Sang Hu & Zihan Zhou, 2024. "Exploratory Dividend Optimization with Entropy Regularization," JRFM, MDPI, vol. 17(1), pages 1-23, January.
    12. Alexandre Carbonneau & Frédéric Godin, 2021. "Deep equal risk pricing of financial derivatives with non-translation invariant risk measures," Papers 2107.11340, arXiv.org.
    13. Dong-Mei Zhu & Jia-Wen Gu & Feng-Hui Yu & Tak-Kuen Siu & Wai-Ki Ching, 2021. "Optimal pairs trading with dynamic mean-variance objective," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 94(1), pages 145-168, August.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Haoran Wang & Shi Yu, 2021. "Robo-Advising: Enhancing Investment with Inverse Optimization and Deep Reinforcement Learning," Papers 2105.09264, arXiv.org.
    2. Haoran Wang & Xun Yu Zhou, 2019. "Continuous-Time Mean-Variance Portfolio Selection: A Reinforcement Learning Framework," Papers 1904.11392, arXiv.org, revised May 2019.
    3. Xiangyu Cui & Xun Li & Yun Shi & Si Zhao, 2023. "Discrete-Time Mean-Variance Strategy Based on Reinforcement Learning," Papers 2312.15385, arXiv.org.
    4. Xiang Meng, 2019. "Dynamic Mean-Variance Portfolio Optimisation," Papers 1907.03093, arXiv.org.
    5. Zhou Fang, 2023. "Continuous-Time Path-Dependent Exploratory Mean-Variance Portfolio Construction," Papers 2303.02298, arXiv.org.
    6. Xiangyu Cui & Xun Li & Duan Li & Yun Shi, 2014. "Time Consistent Behavior Portfolio Policy for Dynamic Mean-Variance Formulation," Papers 1408.6070, arXiv.org, revised Aug 2015.
    7. Zhang, Caibin & Liang, Zhibin, 2022. "Optimal time-consistent reinsurance and investment strategies for a jump–diffusion financial market without cash," The North American Journal of Economics and Finance, Elsevier, vol. 59(C).
    8. Li Xia, 2020. "Risk‐Sensitive Markov Decision Processes with Combined Metrics of Mean and Variance," Production and Operations Management, Production and Operations Management Society, vol. 29(12), pages 2808-2827, December.
    9. Y. Zhang & Z. Jin & J. Wei & G. Yin, 2022. "Mean-variance portfolio selection with dynamic attention behavior in a hidden Markov model," Papers 2205.08743, arXiv.org.
    10. Amirhosein Mosavi & Yaser Faghan & Pedram Ghamisi & Puhong Duan & Sina Faizollahzadeh Ardabili & Ely Salwana & Shahab S. Band, 2020. "Comprehensive Review of Deep Reinforcement Learning Methods and Applications in Economics," Mathematics, MDPI, vol. 8(10), pages 1-42, September.
    11. Zhiping Chen & Liyuan Wang & Ping Chen & Haixiang Yao, 2019. "Continuous-Time Mean–Variance Optimization For Defined Contribution Pension Funds With Regime-Switching," International Journal of Theoretical and Applied Finance (IJTAF), World Scientific Publishing Co. Pte. Ltd., vol. 22(06), pages 1-33, September.
    12. Ben Hambly & Renyuan Xu & Huining Yang, 2021. "Recent Advances in Reinforcement Learning in Finance," Papers 2112.04553, arXiv.org, revised Feb 2023.
    13. Ying Hu & Hanqing Jin & Xun Yu Zhou, 2020. "Consistent Investment of Sophisticated Rank-Dependent Utility Agents in Continuous Time," Working Papers hal-02624308, HAL.
    14. Li, Yongwu & Li, Zhongfei, 2013. "Optimal time-consistent investment and reinsurance strategies for mean–variance insurers with state dependent risk aversion," Insurance: Mathematics and Economics, Elsevier, vol. 53(1), pages 86-97.
    15. Bodo Herzog & Sufyan Osamah, 2019. "Reverse Engineering of Option Pricing: An AI Application," IJFS, MDPI, vol. 7(4), pages 1-12, November.
    16. Zilan Liu & Yijun Wang & Ya Huang & Jieming Zhou, 2022. "Optimal Time-Consistent Investment and Premium Control Strategies for Insurers with Constraint under the Heston Model," Mathematics, MDPI, vol. 10(7), pages 1-22, March.
    17. Felix Fießinger & Mitja Stadje, 2023. "Time-Consistent Asset Allocation for Risk Measures in a Lévy Market," Papers 2305.09471, arXiv.org, revised Jun 2023.
    18. Bingyan Han & Chi Seng Pun & Hoi Ying Wong, 2021. "Robust state-dependent mean–variance portfolio selection: a closed-loop approach," Finance and Stochastics, Springer, vol. 25(3), pages 529-561, July.
    19. Ying Hu & Hanqing Jin & Xun Yu Zhou, 2020. "Consistent Investment of Sophisticated Rank-Dependent Utility Agents in Continuous Time," Papers 2006.01979, arXiv.org.
    20. Bian, Lihua & Li, Zhongfei & Yao, Haixiang, 2018. "Pre-commitment and equilibrium investment strategies for the DC pension plan with regime switching and a return of premiums clause," Insurance: Mathematics and Economics, Elsevier, vol. 81(C), pages 78-94.
