IDEAS home Printed from https://ideas.repec.org/p/arx/papers/2312.15385.html
   My bibliography  Save this paper

Discrete-Time Mean-Variance Strategy Based on Reinforcement Learning

Author

Listed:
  • Xiangyu Cui
  • Xun Li
  • Yun Shi
  • Si Zhao

Abstract

This paper studies a discrete-time mean-variance model based on reinforcement learning. Compared with its continuous-time counterpart in \cite{zhou2020mv}, the discrete-time model makes more general assumptions about the asset's return distribution. Using entropy to measure the cost of exploration, we derive the optimal investment strategy, whose density function is also Gaussian type. Additionally, we design the corresponding reinforcement learning algorithm. Both simulation experiments and empirical analysis indicate that our discrete-time model exhibits better applicability when analyzing real-world data than the continuous-time model.

Suggested Citation

  • Xiangyu Cui & Xun Li & Yun Shi & Si Zhao, 2023. "Discrete-Time Mean-Variance Strategy Based on Reinforcement Learning," Papers 2312.15385, arXiv.org.
  • Handle: RePEc:arx:papers:2312.15385
    as

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2312.15385
    File Function: Latest version
    Download Restriction: no
    ---><---

    References listed on IDEAS

    as
    1. Panayiotis Theodossiou, 1998. "Financial Data and the Skewed Generalized T Distribution," Management Science, INFORMS, vol. 44(12-Part-1), pages 1650-1661, December.
    2. Cui, Xiangyu & Gao, Jianjun & Li, Xun & Li, Duan, 2014. "Optimal multi-period mean–variance policy under no-shorting constraint," European Journal of Operational Research, Elsevier, vol. 234(2), pages 459-468.
    3. Dietmar Maringer & Tikesh Ramtohul, 2012. "Regime-switching recurrent reinforcement learning for investment decision making," Computational Management Science, Springer, vol. 9(1), pages 89-107, February.
    4. Zhengyao Jiang & Dixing Xu & Jinjun Liang, 2017. "A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem," Papers 1706.10059, arXiv.org, revised Jul 2017.
    5. Yanwei Jia & Xun Yu Zhou, 2021. "Policy Gradient and Actor-Critic Learning in Continuous Time and Space: Theory and Algorithms," Papers 2111.11232, arXiv.org, revised Jul 2022.
    6. Zhipeng Liang & Hao Chen & Junhao Zhu & Kangkang Jiang & Yanran Li, 2018. "Adversarial Deep Reinforcement Learning in Portfolio Management," Papers 1808.09940, arXiv.org, revised Nov 2018.
    7. Duan Li & Wan‐Lung Ng, 2000. "Optimal Dynamic Portfolio Selection: Multiperiod Mean‐Variance Formulation," Mathematical Finance, Wiley Blackwell, vol. 10(3), pages 387-406, July.
    8. Mannor, Shie & Tsitsiklis, John N., 2013. "Algorithmic aspects of mean–variance optimization in Markov decision processes," European Journal of Operational Research, Elsevier, vol. 231(3), pages 645-653.
    9. Yoshiharu Sato, 2019. "Model-Free Reinforcement Learning for Financial Portfolios: A Brief Survey," Papers 1904.04973, arXiv.org, revised May 2019.
    10. Ben Hambly & Renyuan Xu & Huining Yang, 2021. "Recent Advances in Reinforcement Learning in Finance," Papers 2112.04553, arXiv.org, revised Feb 2023.
    11. Ben Hambly & Renyuan Xu & Huining Yang, 2023. "Recent advances in reinforcement learning in finance," Mathematical Finance, Wiley Blackwell, vol. 33(3), pages 437-503, July.
    12. Haoran Wang & Xun Yu Zhou, 2020. "Continuous‐time mean–variance portfolio selection: A reinforcement learning framework," Mathematical Finance, Wiley Blackwell, vol. 30(4), pages 1273-1308, October.
    13. Haoran Wang, 2019. "Large scale continuous-time mean-variance portfolio allocation via reinforcement learning," Papers 1907.11718, arXiv.org, revised Aug 2019.
    14. Yanwei Jia & Xun Yu Zhou, 2021. "Policy Evaluation and Temporal-Difference Learning in Continuous Time and Space: A Martingale Approach," Papers 2108.06655, arXiv.org, revised Feb 2022.
    15. Yanwei Jia & Xun Yu Zhou, 2022. "q-Learning in Continuous Time," Papers 2207.00713, arXiv.org, revised Apr 2023.
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Zhou Fang, 2023. "Continuous-Time Path-Dependent Exploratory Mean-Variance Portfolio Construction," Papers 2303.02298, arXiv.org.
    2. Ben Hambly & Renyuan Xu & Huining Yang, 2021. "Recent Advances in Reinforcement Learning in Finance," Papers 2112.04553, arXiv.org, revised Feb 2023.
    3. Haoran Wang & Xun Yu Zhou, 2020. "Continuous‐time mean–variance portfolio selection: A reinforcement learning framework," Mathematical Finance, Wiley Blackwell, vol. 30(4), pages 1273-1308, October.
    4. Haoran Wang & Shi Yu, 2021. "Robo-Advising: Enhancing Investment with Inverse Optimization and Deep Reinforcement Learning," Papers 2105.09264, arXiv.org.
    5. Ben Hambly & Renyuan Xu & Huining Yang, 2023. "Recent advances in reinforcement learning in finance," Mathematical Finance, Wiley Blackwell, vol. 33(3), pages 437-503, July.
    6. Gang Huang & Xiaohua Zhou & Qingyang Song, 2020. "Deep reinforcement learning for portfolio management," Papers 2012.13773, arXiv.org, revised Apr 2022.
    7. De Gennaro Aquino, Luca & Sornette, Didier & Strub, Moris S., 2023. "Portfolio selection with exploration of new investment assets," European Journal of Operational Research, Elsevier, vol. 310(2), pages 773-792.
    8. Reilly Pickard & Yuri Lawryshyn, 2023. "Deep Reinforcement Learning for Dynamic Stock Option Hedging: A Review," Mathematics, MDPI, vol. 11(24), pages 1-19, December.
    9. Xiangyu Cui & Xun Li & Duan Li & Yun Shi, 2014. "Time Consistent Behavior Portfolio Policy for Dynamic Mean-Variance Formulation," Papers 1408.6070, arXiv.org, revised Aug 2015.
    10. Xin Huang & Duan Li & Daniel Zhuoyu Long, 2020. "Scenario-decomposition Solution Framework for Nonseparable Stochastic Control Problems," Papers 2010.08985, arXiv.org.
    11. Amir Mosavi & Pedram Ghamisi & Yaser Faghan & Puhong Duan, 2020. "Comprehensive Review of Deep Reinforcement Learning Methods and Applications in Economics," Papers 2004.01509, arXiv.org.
    12. Alexandre Carbonneau & Fr'ed'eric Godin, 2021. "Deep equal risk pricing of financial derivatives with non-translation invariant risk measures," Papers 2107.11340, arXiv.org.
    13. Xiangyu Cui & Duan Li & Xun Li, 2014. "Mean-Variance Policy for Discrete-time Cone Constrained Markets: The Consistency in Efficiency and Minimum-Variance Signed Supermartingale Measure," Papers 1403.0718, arXiv.org.
    14. Xiangyu Cui & Jianjun Gao & Yun Shi, 2021. "Multi-period mean–variance portfolio optimization with management fees," Operational Research, Springer, vol. 21(2), pages 1333-1354, June.
    15. Hanwen Zhang & Duy-Minh Dang, 2023. "A monotone numerical integration method for mean-variance portfolio optimization under jump-diffusion models," Papers 2309.05977, arXiv.org.
    16. Mei-Li Shen & Cheng-Feng Lee & Hsiou-Hsiang Liu & Po-Yin Chang & Cheng-Hong Yang, 2021. "An Effective Hybrid Approach for Forecasting Currency Exchange Rates," Sustainability, MDPI, vol. 13(5), pages 1-29, March.
    17. Stefania Corsaro & Valentina De Simone & Zelda Marino & Salvatore Scognamiglio, 2022. "l 1 -Regularization in Portfolio Selection with Machine Learning," Mathematics, MDPI, vol. 10(4), pages 1-15, February.
    18. Pun, Chi Seng & Wong, Hoi Ying, 2019. "A linear programming model for selection of sparse high-dimensional multiperiod portfolios," European Journal of Operational Research, Elsevier, vol. 273(2), pages 754-771.
    19. Mengying Zhu & Xiaolin Zheng & Yan Wang & Yuyuan Li & Qianqiao Liang, 2019. "Adaptive Portfolio by Solving Multi-armed Bandit via Thompson Sampling," Papers 1911.05309, arXiv.org, revised Nov 2019.
    20. Amirhosein Mosavi & Yaser Faghan & Pedram Ghamisi & Puhong Duan & Sina Faizollahzadeh Ardabili & Ely Salwana & Shahab S. Band, 2020. "Comprehensive Review of Deep Reinforcement Learning Methods and Applications in Economics," Mathematics, MDPI, vol. 8(10), pages 1-42, September.

    More about this item

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:arx:papers:2312.15385. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: arXiv administrators (email available below). General contact details of provider: http://arxiv.org/ .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.