Printed from https://ideas.repec.org/a/bla/mathfi/v30y2020i4p1273-1308.html

Continuous‐time mean–variance portfolio selection: A reinforcement learning framework

Author

Listed:
  • Haoran Wang
  • Xun Yu Zhou

Abstract

We approach the continuous‐time mean–variance portfolio selection with reinforcement learning (RL). The problem is to achieve the best trade‐off between exploration and exploitation, and is formulated as an entropy‐regularized, relaxed stochastic control problem. We prove that the optimal feedback policy for this problem must be Gaussian, with time‐decaying variance. We then prove a policy improvement theorem, based on which we devise an implementable RL algorithm. We find that our algorithm and its variant outperform both traditional and deep neural network based algorithms in our simulation and empirical studies.
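
The abstract's central structural claim, a Gaussian feedback policy whose variance decays over time, can be illustrated with a minimal sketch. The functional forms below (a mean linear in the gap between current wealth and a target, and a variance that shrinks exponentially as the horizon approaches) are illustrative assumptions and are not taken from this listing; the names `temperature`, `sigma`, `rho`, `target`, and both function names are hypothetical.

```python
import math
import random


def exploration_variance(t, horizon, temperature, sigma, rho):
    """Variance of the assumed Gaussian policy at time t; shrinks as t nears the horizon."""
    return temperature / (2.0 * sigma ** 2) * math.exp(rho ** 2 * (horizon - t))


def sample_allocation(t, wealth, target, horizon, temperature, sigma, rho, rng):
    """Draw a risky-asset allocation from the assumed Gaussian feedback policy."""
    # Assumed mean: linear feedback on the gap between wealth and the target.
    mean = -(rho / sigma) * (wealth - target)
    std = math.sqrt(exploration_variance(t, horizon, temperature, sigma, rho))
    return rng.gauss(mean, std)


rng = random.Random(0)
params = dict(horizon=1.0, temperature=2.0, sigma=0.2, rho=0.5)

early = exploration_variance(0.0, **params)  # variance at the start of the horizon
late = exploration_variance(1.0, **params)   # variance at the end of the horizon
draw = sample_allocation(0.5, wealth=1.0, target=1.4, rng=rng, **params)
```

In this sketch the `temperature` parameter plays the role of the entropy-regularization weight: a larger value widens the policy (more exploration), while setting it to zero collapses the policy onto its mean (pure exploitation).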

Suggested Citation

  • Haoran Wang & Xun Yu Zhou, 2020. "Continuous‐time mean–variance portfolio selection: A reinforcement learning framework," Mathematical Finance, Wiley Blackwell, vol. 30(4), pages 1273-1308, October.
  • Handle: RePEc:bla:mathfi:v:30:y:2020:i:4:p:1273-1308
    DOI: 10.1111/mafi.12281

    Download full text from publisher

    File URL: https://doi.org/10.1111/mafi.12281
    Download Restriction: no


    References listed on IDEAS

    1. R. H. Strotz, 1955. "Myopia and Inconsistency in Dynamic Utility Maximization," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 23(3), pages 165-180.
    2. Hutchinson, James M & Lo, Andrew W & Poggio, Tomaso, 1994. "A Nonparametric Approach to Pricing and Hedging Derivative Securities via Learning Networks," Journal of Finance, American Finance Association, vol. 49(3), pages 851-889, July.
    3. Duan Li & Wan‐Lung Ng, 2000. "Optimal Dynamic Portfolio Selection: Multiperiod Mean‐Variance Formulation," Mathematical Finance, Wiley Blackwell, vol. 10(3), pages 387-406, July.
    4. Mannor, Shie & Tsitsiklis, John N., 2013. "Algorithmic aspects of mean–variance optimization in Markov decision processes," European Journal of Operational Research, Elsevier, vol. 231(3), pages 645-653.
    5. Haoran Wang, 2019. "Large scale continuous-time mean-variance portfolio allocation via reinforcement learning," Papers 1907.11718, arXiv.org, revised Aug 2019.
    6. David Silver & Aja Huang & Chris J. Maddison & Arthur Guez & Laurent Sifre & George van den Driessche & Julian Schrittwieser & Ioannis Antonoglou & Veda Panneershelvam & Marc Lanctot & Sander Dieleman, 2016. "Mastering the game of Go with deep neural networks and tree search," Nature, Nature, vol. 529(7587), pages 484-489, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Haoran Wang & Xun Yu Zhou, 2019. "Continuous-Time Mean-Variance Portfolio Selection: A Reinforcement Learning Framework," Papers 1904.11392, arXiv.org, revised May 2019.
    2. Haoran Wang & Shi Yu, 2021. "Robo-Advising: Enhancing Investment with Inverse Optimization and Deep Reinforcement Learning," Papers 2105.09264, arXiv.org.
    3. Xiangyu Cui & Xun Li & Yun Shi & Si Zhao, 2023. "Discrete-Time Mean-Variance Strategy Based on Reinforcement Learning," Papers 2312.15385, arXiv.org.
    4. Xiang Meng, 2019. "Dynamic Mean-Variance Portfolio Optimisation," Papers 1907.03093, arXiv.org.
    5. Zhou Fang, 2023. "Continuous-Time Path-Dependent Exploratory Mean-Variance Portfolio Construction," Papers 2303.02298, arXiv.org.
    6. Xiangyu Cui & Xun Li & Duan Li & Yun Shi, 2014. "Time Consistent Behavior Portfolio Policy for Dynamic Mean-Variance Formulation," Papers 1408.6070, arXiv.org, revised Aug 2015.
    7. Zhang, Caibin & Liang, Zhibin, 2022. "Optimal time-consistent reinsurance and investment strategies for a jump–diffusion financial market without cash," The North American Journal of Economics and Finance, Elsevier, vol. 59(C).
    8. Bian, Lihua & Li, Zhongfei & Yao, Haixiang, 2018. "Pre-commitment and equilibrium investment strategies for the DC pension plan with regime switching and a return of premiums clause," Insurance: Mathematics and Economics, Elsevier, vol. 81(C), pages 78-94.
    9. Caibin Zhang & Zhibin Liang & Kam Chuen Yuen, 2019. "Optimal dynamic reinsurance with common shock dependence and state-dependent risk aversion," International Journal of Financial Engineering (IJFE), World Scientific Publishing Co. Pte. Ltd., vol. 6(01), pages 1-45, March.
    10. Li Xia, 2020. "Risk‐Sensitive Markov Decision Processes with Combined Metrics of Mean and Variance," Production and Operations Management, Production and Operations Management Society, vol. 29(12), pages 2808-2827, December.
    11. Dong-Mei Zhu & Jia-Wen Gu & Feng-Hui Yu & Tak-Kuen Siu & Wai-Ki Ching, 2021. "Optimal pairs trading with dynamic mean-variance objective," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 94(1), pages 145-168, August.
    12. Y. Zhang & Z. Jin & J. Wei & G. Yin, 2022. "Mean-variance portfolio selection with dynamic attention behavior in a hidden Markov model," Papers 2205.08743, arXiv.org.
    13. Haoran Wang, 2019. "Large scale continuous-time mean-variance portfolio allocation via reinforcement learning," Papers 1907.11718, arXiv.org, revised Aug 2019.
    14. Amirhosein Mosavi & Yaser Faghan & Pedram Ghamisi & Puhong Duan & Sina Faizollahzadeh Ardabili & Ely Salwana & Shahab S. Band, 2020. "Comprehensive Review of Deep Reinforcement Learning Methods and Applications in Economics," Mathematics, MDPI, vol. 8(10), pages 1-42, September.
    15. Zhiping Chen & Liyuan Wang & Ping Chen & Haixiang Yao, 2019. "Continuous-Time Mean–Variance Optimization For Defined Contribution Pension Funds With Regime-Switching," International Journal of Theoretical and Applied Finance (IJTAF), World Scientific Publishing Co. Pte. Ltd., vol. 22(06), pages 1-33, September.
    16. Ben Hambly & Renyuan Xu & Huining Yang, 2021. "Recent Advances in Reinforcement Learning in Finance," Papers 2112.04553, arXiv.org, revised Feb 2023.
    17. Ying Hu & Hanqing Jin & Xun Yu Zhou, 2020. "Consistent Investment of Sophisticated Rank-Dependent Utility Agents in Continuous Time," Working Papers hal-02624308, HAL.
    18. Li, Yongwu & Li, Zhongfei, 2013. "Optimal time-consistent investment and reinsurance strategies for mean–variance insurers with state dependent risk aversion," Insurance: Mathematics and Economics, Elsevier, vol. 53(1), pages 86-97.
    19. Tomas Björk & Agatha Murgoci & Xun Yu Zhou, 2014. "Mean–Variance Portfolio Optimization With State-Dependent Risk Aversion," Mathematical Finance, Wiley Blackwell, vol. 24(1), pages 1-24, January.
    20. Huy Chau & Duy Nguyen & Thai Nguyen, 2024. "Continuous-time optimal investment with portfolio constraints: a reinforcement learning approach," Papers 2412.10692, arXiv.org.
