Printed from https://ideas.repec.org/a/eee/dyncon/v158y2024ics0165188923001938.html

Reinforcement learning for continuous-time mean-variance portfolio selection in a regime-switching market

Author

Listed:
  • Wu, Bo
  • Li, Lingfei

Abstract

We propose a reinforcement learning (RL) approach to solve the continuous-time mean-variance portfolio selection problem in a regime-switching market, where the market regime is unobservable. To encourage exploration for learning, we formulate an exploratory stochastic control problem with an entropy-regularized mean-variance objective. We obtain semi-analytical representations of the optimal value function and optimal policy, which involve unknown solutions to two linear parabolic partial differential equations (PDEs). We utilize these representations to parametrize the value function and policy for learning with the unknown solutions to the PDEs approximated based on polynomials. We develop an actor-critic RL algorithm to learn the optimal policy through interactions with the market environment. The algorithm carries out filtering to obtain the belief probability of the market regime and performs policy evaluation and policy gradient updates alternately. Empirical results demonstrate the advantages of our RL algorithm in relatively long-term investment problems over the classical control approach and an RL algorithm developed for the continuous-time mean-variance problem without considering regime switches.
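
The filtering step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the filter, so everything below is an assumption made for illustration: a two-regime market whose per-period return is Gaussian with regime-dependent drift mu[0] or mu[1] and a common volatility sigma, hypothetical switching rates q12 and q21, and a discrete-time Bayes update standing in for a continuous-time (Wonham-type) filter. It is not the authors' algorithm.

```python
import math


def belief_update(p, r, dt, mu, sigma, q12, q21):
    """One Bayes step for p = P(regime 1 | returns observed so far).

    Illustrative assumptions (not taken from the paper): two regimes,
    return r over a step dt distributed N(mu[i] * dt, sigma**2 * dt) in
    regime i, and switching rates q12 (1 -> 2) and q21 (2 -> 1).
    """
    # Predict: first-order approximation of the regime chain over dt.
    p_pred = p * (1.0 - q12 * dt) + (1.0 - p) * q21 * dt
    # Correct: Gaussian log-likelihood of the return under each regime.
    var = sigma * sigma * dt
    ll1 = -((r - mu[0] * dt) ** 2) / (2.0 * var)
    ll2 = -((r - mu[1] * dt) ** 2) / (2.0 * var)
    # Subtract the larger log-likelihood before exponentiating so the
    # weights cannot both underflow to zero.
    m = max(ll1, ll2)
    w1 = p_pred * math.exp(ll1 - m)
    w2 = (1.0 - p_pred) * math.exp(ll2 - m)
    return w1 / (w1 + w2)


def filter_path(returns, p0, dt, mu, sigma, q12, q21):
    """Run the update along a sequence of returns; return all beliefs."""
    beliefs = [p0]
    for r in returns:
        beliefs.append(belief_update(beliefs[-1], r, dt, mu, sigma, q12, q21))
    return beliefs


if __name__ == "__main__":
    # Hypothetical daily data: three up days followed by three down days.
    path = filter_path([0.02, 0.02, 0.02, -0.02, -0.02, -0.02],
                       p0=0.5, dt=1.0 / 252, mu=(0.2, -0.1),
                       sigma=0.2, q12=1.0, q21=1.0)
    print([round(b, 3) for b in path])
```

In an actor-critic loop of the kind the abstract describes, a belief like this would be part of the state fed to the parametrized value function and policy. How the paper does so is not stated in the abstract.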

Suggested Citation

  • Wu, Bo & Li, Lingfei, 2024. "Reinforcement learning for continuous-time mean-variance portfolio selection in a regime-switching market," Journal of Economic Dynamics and Control, Elsevier, vol. 158(C).
  • Handle: RePEc:eee:dyncon:v:158:y:2024:i:c:s0165188923001938
    DOI: 10.1016/j.jedc.2023.104787

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0165188923001938
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.jedc.2023.104787?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    As the access to this document is restricted, you may want to search for a different version of it.

    References listed on IDEAS

    1. Andrew Ang & Allan Timmermann, 2012. "Regime Changes and Financial Markets," Annual Review of Financial Economics, Annual Reviews, vol. 4(1), pages 313-337, October.
    2. Jun Tu, 2010. "Is Regime Switching in Stock Returns Important in Portfolio Decisions?," Management Science, INFORMS, vol. 56(7), pages 1198-1215, July.
    3. Min Dai & Zhou Yang & Qing Zhang & Qiji Jim Zhu, 2016. "Optimal Trend Following Trading Rules," Mathematics of Operations Research, INFORMS, vol. 41(2), pages 626-642, May.
    4. Michaud, Richard O. & Michaud, Robert O., 2008. "Efficient Asset Management: A Practical Guide to Stock Portfolio Optimization and Asset Allocation," OUP Catalogue, Oxford University Press, edition 2, number 9780195331912, December.
    5. Guidolin, Massimo & Timmermann, Allan, 2007. "Asset allocation under multivariate regime switching," Journal of Economic Dynamics and Control, Elsevier, vol. 31(11), pages 3503-3544, November.
    6. Elliott, Robert J. & Siu, Tak Kuen & Badescu, Alex, 2010. "On mean-variance portfolio selection under a hidden Markovian regime-switching model," Economic Modelling, Elsevier, vol. 27(3), pages 678-686, May.
    7. Dietmar Maringer & Tikesh Ramtohul, 2012. "Regime-switching recurrent reinforcement learning for investment decision making," Computational Management Science, Springer, vol. 9(1), pages 89-107, February.
    8. Jörn Sass & Ulrich Haussmann, 2004. "Optimizing the terminal wealth under partial information: The drift process as a continuous time Markov chain," Finance and Stochastics, Springer, vol. 8(4), pages 553-577, November.
    9. Massimo Guidolin & Allan Timmermann, 2008. "International asset allocation under regime switching, skew, and kurtosis preferences," The Review of Financial Studies, Society for Financial Studies, vol. 21(2), pages 889-935, April.
    10. Yanwei Jia & Xun Yu Zhou, 2021. "Policy Gradient and Actor-Critic Learning in Continuous Time and Space: Theory and Algorithms," Papers 2111.11232, arXiv.org, revised Jul 2022.
    11. Yanwei Jia & Xun Yu Zhou, 2021. "Policy Evaluation and Temporal-Difference Learning in Continuous Time and Space: A Martingale Approach," Papers 2108.06655, arXiv.org, revised Feb 2022.
    12. Sun, Yeneng, 2006. "The exact law of large numbers via Fubini extension and characterization of insurable risks," Journal of Economic Theory, Elsevier, vol. 126(1), pages 31-69, January.
    13. Harry Markowitz, 1952. "Portfolio Selection," Journal of Finance, American Finance Association, vol. 7(1), pages 77-91, March.
    14. Duan Li & Wan‐Lung Ng, 2000. "Optimal Dynamic Portfolio Selection: Multiperiod Mean‐Variance Formulation," Mathematical Finance, Wiley Blackwell, vol. 10(3), pages 387-406, July.
    15. Ben Hambly & Renyuan Xu & Huining Yang, 2021. "Recent Advances in Reinforcement Learning in Finance," Papers 2112.04553, arXiv.org, revised Feb 2023.
    16. Sebastian Jaimungal, 2022. "Reinforcement learning and stochastic optimisation," Finance and Stochastics, Springer, vol. 26(1), pages 103-129, January.
    17. Ben Hambly & Renyuan Xu & Huining Yang, 2023. "Recent advances in reinforcement learning in finance," Mathematical Finance, Wiley Blackwell, vol. 33(3), pages 437-503, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Xiangyu Cui & Xun Li & Yun Shi & Si Zhao, 2023. "Discrete-Time Mean-Variance Strategy Based on Reinforcement Learning," Papers 2312.15385, arXiv.org.
    2. Su, Xiaoshan & Bai, Manying & Han, Yingwei, 2021. "Robust portfolio selection with regime switching and asymmetric dependence," Economic Modelling, Elsevier, vol. 99(C).
    3. Platanakis, Emmanouil & Sakkas, Athanasios & Sutcliffe, Charles, 2019. "Harmful diversification: Evidence from alternative investments," The British Accounting Review, Elsevier, vol. 51(1), pages 1-23.
    4. Bernardi, Mauro & Catania, Leopoldo, 2018. "Portfolio optimisation under flexible dynamic dependence modelling," Journal of Empirical Finance, Elsevier, vol. 48(C), pages 1-18.
    5. Hematizadeh, Roksana & Tajaddini, Reza & Hallahan, Terrence, 2022. "Dynamic asset allocation strategy using a state-dependent Markov model: Applications to international equity markets," Journal of International Money and Finance, Elsevier, vol. 128(C).
    6. Campani, Carlos Heitor & Garcia, René & Lewin, Marcelo, 2021. "Optimal portfolio strategies in the presence of regimes in asset returns," Journal of Banking & Finance, Elsevier, vol. 123(C).
    7. Levy, Moshe & Kaplanski, Guy, 2015. "Portfolio selection in a two-regime world," European Journal of Operational Research, Elsevier, vol. 242(2), pages 514-524.
    8. Massimo Guidolin & Federica Ria, 2011. "Regime shifts in mean-variance efficient frontiers: Some international evidence," Journal of Asset Management, Palgrave Macmillan, vol. 12(5), pages 322-349, November.
    9. Marcelo Lewin & Carlos Heitor Campani, 2023. "Constrained portfolio strategies in a regime-switching economy," Financial Markets and Portfolio Management, Springer;Swiss Society for Financial Market Research, vol. 37(1), pages 27-59, March.
    10. Zhou Fang, 2023. "Continuous-Time Path-Dependent Exploratory Mean-Variance Portfolio Construction," Papers 2303.02298, arXiv.org.
    11. Zhang, Caibin & Liang, Zhibin, 2022. "Optimal time-consistent reinsurance and investment strategies for a jump–diffusion financial market without cash," The North American Journal of Economics and Finance, Elsevier, vol. 59(C).
    12. Guidolin, Massimo & Hyde, Stuart, 2012. "Can VAR models capture regime shifts in asset returns? A long-horizon strategic asset allocation perspective," Journal of Banking & Finance, Elsevier, vol. 36(3), pages 695-716.
    13. Chen, Ping & Yam, S.C.P., 2013. "Optimal proportional reinsurance and investment with regime-switching for mean–variance insurers," Insurance: Mathematics and Economics, Elsevier, vol. 53(3), pages 871-883.
    14. Penaranda, Francisco, 2007. "Portfolio choice beyond the traditional approach," LSE Research Online Documents on Economics 24481, London School of Economics and Political Science, LSE Library.
    15. Guidolin, Massimo & Liu, Hening, 2016. "Ambiguity Aversion and Underdiversification," Journal of Financial and Quantitative Analysis, Cambridge University Press, vol. 51(4), pages 1297-1323, August.
    16. Yao, Haixiang & Li, Zhongfei & Chen, Shumin, 2014. "Continuous-time mean–variance portfolio selection with only risky assets," Economic Modelling, Elsevier, vol. 36(C), pages 244-251.
    17. Yanwei Jia & Xun Yu Zhou, 2022. "q-Learning in Continuous Time," Papers 2207.00713, arXiv.org, revised Apr 2023.
    18. Yao, Haixiang & Li, Danping & Wu, Huiling, 2022. "Dynamic trading with uncertain exit time and transaction costs in a general Markov market," International Review of Financial Analysis, Elsevier, vol. 84(C).
    19. Anna Battauz & Alessandro Sbuelz, 2018. "Non-myopic portfolio choice with unpredictable returns: The jump-to-default case," European Financial Management, European Financial Management Association, vol. 24(2), pages 192-208, March.
    20. Kenwin Maung, 2021. "Estimating high-dimensional Markov-switching VARs," Papers 2107.12552, arXiv.org.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:eee:dyncon:v:158:y:2024:i:c:s0165188923001938. See general information about how to correct material in RePEc.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Catherine Liu (email available below). General contact details of provider: http://www.elsevier.com/locate/jedc .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.