Reinforcement learning for continuous-time mean-variance portfolio selection in a regime-switching market

Author

Wu, Bo
Li, Lingfei

Abstract

We propose a reinforcement learning (RL) approach to solve the continuous-time mean-variance portfolio selection problem in a regime-switching market, where the market regime is unobservable. To encourage exploration for learning, we formulate an exploratory stochastic control problem with an entropy-regularized mean-variance objective. We obtain semi-analytical representations of the optimal value function and optimal policy, which involve unknown solutions to two linear parabolic partial differential equations (PDEs). We utilize these representations to parametrize the value function and policy for learning with the unknown solutions to the PDEs approximated based on polynomials. We develop an actor-critic RL algorithm to learn the optimal policy through interactions with the market environment. The algorithm carries out filtering to obtain the belief probability of the market regime and performs policy evaluation and policy gradient updates alternately. Empirical results demonstrate the advantages of our RL algorithm in relatively long-term investment problems over the classical control approach and an RL algorithm developed for the continuous-time mean-variance problem without considering regime switches.
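The filtering step mentioned in the abstract can be illustrated with a minimal sketch of an Euler-discretized Wonham filter for a two-regime market. This is not the authors' implementation: the two-state restriction, the function names `wonham_step` and `filter_path`, the clipping constant `eps`, and the assumption that the regime drifts `mu1`, `mu2`, the volatility `sigma`, and the switching intensities `lam1`, `lam2` are known are all assumptions made here for illustration.

```python
def wonham_step(p, ret, dt, mu1, mu2, sigma, lam1, lam2, eps=1e-6):
    """One Euler step of a two-state Wonham filter (illustrative sketch).

    p    : current belief probability that the market is in regime 1
    ret  : observed simple return of the risky asset over the step
    dt   : step length in years
    mu1, mu2   : assumed drifts in regime 1 and regime 2
    sigma      : assumed common volatility
    lam1, lam2 : assumed intensities of leaving regime 1 and regime 2
    """
    # Filtered drift under the current belief.
    mu_hat = p * mu1 + (1.0 - p) * mu2
    # Innovation: the part of the return not explained by the filtered drift.
    innovation = (ret - mu_hat * dt) / sigma
    # Mean reversion of the belief induced by the regime-switching intensities.
    drift = (lam2 * (1.0 - p) - lam1 * p) * dt
    # Sensitivity of the belief to the innovation.
    gain = p * (1.0 - p) * (mu1 - mu2) / sigma
    p_new = p + drift + gain * innovation
    # The Euler scheme can leave (0, 1), so clip the result (an assumption here).
    return min(max(p_new, eps), 1.0 - eps)


def filter_path(returns, dt, mu1, mu2, sigma, lam1, lam2, p0=0.5):
    """Run the filter over a sequence of returns, starting from belief p0."""
    beliefs = [p0]
    for ret in returns:
        beliefs.append(
            wonham_step(beliefs[-1], ret, dt, mu1, mu2, sigma, lam1, lam2)
        )
    return beliefs


if __name__ == "__main__":
    # Hypothetical daily returns; regime 1 is assumed to be the high-drift regime.
    daily_returns = [0.01, 0.012, -0.02, -0.015, 0.003]
    path = filter_path(daily_returns, dt=1 / 252, mu1=0.2, mu2=-0.1,
                       sigma=0.2, lam1=1.0, lam2=1.0)
    print([round(b, 4) for b in path])
```

In an actor-critic loop of the kind the abstract describes, a belief such as the one returned by `filter_path` would presumably serve as an extra state variable alongside wealth and time when evaluating and updating the policy.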

Suggested Citation

Wu, Bo & Li, Lingfei, 2024. "Reinforcement learning for continuous-time mean-variance portfolio selection in a regime-switching market," Journal of Economic Dynamics and Control, Elsevier, vol. 158(C).

Handle: RePEc:eee:dyncon:v:158:y:2024:i:c:s0165188923001938
DOI: 10.1016/j.jedc.2023.104787

References listed on IDEAS

Andrew Ang & Allan Timmermann, 2012. "Regime Changes and Financial Markets," Annual Review of Financial Economics, Annual Reviews, vol. 4(1), pages 313-337, October.
- Andrew Ang & Allan Timmermann, 2011. "Regime Changes and Financial Markets," NBER Working Papers 17182, National Bureau of Economic Research, Inc.
- Timmermann, Allan & Ang, Andrew, 2011. "Regime Changes and Financial Markets," CEPR Discussion Papers 8480, C.E.P.R. Discussion Papers.
Jun Tu, 2010. "Is Regime Switching in Stock Returns Important in Portfolio Decisions?," Management Science, INFORMS, vol. 56(7), pages 1198-1215, July.
Min Dai & Zhou Yang & Qing Zhang & Qiji Jim Zhu, 2016. "Optimal Trend Following Trading Rules," Mathematics of Operations Research, INFORMS, vol. 41(2), pages 626-642, May.
Michaud, Richard O. & Michaud, Robert O., 2008. "Efficient Asset Management: A Practical Guide to Stock Portfolio Optimization and Asset Allocation," OUP Catalogue, Oxford University Press, edition 2, number 9780195331912, December.
Guidolin, Massimo & Timmermann, Allan, 2007. "Asset allocation under multivariate regime switching," Journal of Economic Dynamics and Control, Elsevier, vol. 31(11), pages 3503-3544, November.
- Massimo Guidolin & Allan Timmermann, 2006. "Asset allocation under multivariate regime switching," Working Papers 2005-002, Federal Reserve Bank of St. Louis.
Elliott, Robert J. & Siu, Tak Kuen & Badescu, Alex, 2010. "On mean-variance portfolio selection under a hidden Markovian regime-switching model," Economic Modelling, Elsevier, vol. 27(3), pages 678-686, May.
Dietmar Maringer & Tikesh Ramtohul, 2012. "Regime-switching recurrent reinforcement learning for investment decision making," Computational Management Science, Springer, vol. 9(1), pages 89-107, February.
Jörn Sass & Ulrich Haussmann, 2004. "Optimizing the terminal wealth under partial information: The drift process as a continuous time Markov chain," Finance and Stochastics, Springer, vol. 8(4), pages 553-577, November.
Massimo Guidolin & Allan Timmermann, 2008. "International asset allocation under regime switching, skew, and kurtosis preferences," The Review of Financial Studies, Society for Financial Studies, vol. 21(2), pages 889-935, April.
- Massimo Guidolin & Allan Timmermann, 2006. "International asset allocation under regime switching, skew and kurtosis preferences," Working Papers 2005-034, Federal Reserve Bank of St. Louis.
Yanwei Jia & Xun Yu Zhou, 2021. "Policy Gradient and Actor-Critic Learning in Continuous Time and Space: Theory and Algorithms," Papers 2111.11232, arXiv.org, revised Jul 2022.
Yanwei Jia & Xun Yu Zhou, 2021. "Policy Evaluation and Temporal-Difference Learning in Continuous Time and Space: A Martingale Approach," Papers 2108.06655, arXiv.org, revised Feb 2022.
Sun, Yeneng, 2006. "The exact law of large numbers via Fubini extension and characterization of insurable risks," Journal of Economic Theory, Elsevier, vol. 126(1), pages 31-69, January.
Harry Markowitz, 1952. "Portfolio Selection," Journal of Finance, American Finance Association, vol. 7(1), pages 77-91, March.
Duan Li & Wan‐Lung Ng, 2000. "Optimal Dynamic Portfolio Selection: Multiperiod Mean‐Variance Formulation," Mathematical Finance, Wiley Blackwell, vol. 10(3), pages 387-406, July.
Ben Hambly & Renyuan Xu & Huining Yang, 2021. "Recent Advances in Reinforcement Learning in Finance," Papers 2112.04553, arXiv.org, revised Feb 2023.
Sebastian Jaimungal, 2022. "Reinforcement learning and stochastic optimisation," Finance and Stochastics, Springer, vol. 26(1), pages 103-129, January.
Ben Hambly & Renyuan Xu & Huining Yang, 2023. "Recent advances in reinforcement learning in finance," Mathematical Finance, Wiley Blackwell, vol. 33(3), pages 437-503, July.

Citations

Cited by:

Chen Ziyi & Gu Jia-wen, 2025. "Exploratory Utility Maximization Problem with Tsallis Entropy," Papers 2502.01269, arXiv.org.
Yuling Max Chen & Bin Li & David Saunders, 2025. "Exploratory Mean-Variance Portfolio Optimization with Regime-Switching Market Dynamics," Papers 2501.16659, arXiv.org.
Xuefeng Gao & Lingfei Li & Xun Yu Zhou, 2024. "Reinforcement Learning for Jump-Diffusions, with Financial Applications," Papers 2405.16449, arXiv.org, revised Jan 2025.
Min Dai & Yu Sun & Zuo Quan Xu & Xun Yu Zhou, 2024. "Learning to Optimally Stop Diffusion Processes, with Financial Applications," Papers 2408.09242, arXiv.org, revised Sep 2024.
Yong-Jun Liu, 2025. "Multi-period fuzzy portfolio selection model with preference-regret criterion," Fuzzy Optimization and Decision Making, Springer, vol. 24(1), pages 1-27, March.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Xiangyu Cui & Xun Li & Yun Shi & Si Zhao, 2023. "Discrete-Time Mean-Variance Strategy Based on Reinforcement Learning," Papers 2312.15385, arXiv.org.
Su, Xiaoshan & Bai, Manying & Han, Yingwei, 2021. "Robust portfolio selection with regime switching and asymmetric dependence," Economic Modelling, Elsevier, vol. 99(C).
Platanakis, Emmanouil & Sakkas, Athanasios & Sutcliffe, Charles, 2019. "Harmful diversification: Evidence from alternative investments," The British Accounting Review, Elsevier, vol. 51(1), pages 1-23.
- Emmanouil Platanakis & Athanasios Sakkas & Charles Sutcliffe, 2017. "Harmful Diversification: Evidence from Alternative Investments," ICMA Centre Discussion Papers in Finance icma-dp2017-09, Henley Business School, University of Reading.
Bernardi, Mauro & Catania, Leopoldo, 2018. "Portfolio optimisation under flexible dynamic dependence modelling," Journal of Empirical Finance, Elsevier, vol. 48(C), pages 1-18.
Levy, Moshe & Kaplanski, Guy, 2015. "Portfolio selection in a two-regime world," European Journal of Operational Research, Elsevier, vol. 242(2), pages 514-524.
Massimo Guidolin & Federica Ria, 2011. "Regime shifts in mean-variance efficient frontiers: Some international evidence," Journal of Asset Management, Palgrave Macmillan, vol. 12(5), pages 322-349, November.
- Massimo Guidolin & Federica Ria, 2010. "Regime shifts in mean-variance efficient frontiers: some international evidence," Working Papers 2010-040, Federal Reserve Bank of St. Louis.
Marcelo Lewin & Carlos Heitor Campani, 2023. "Constrained portfolio strategies in a regime-switching economy," Financial Markets and Portfolio Management, Springer;Swiss Society for Financial Market Research, vol. 37(1), pages 27-59, March.
Hematizadeh, Roksana & Tajaddini, Reza & Hallahan, Terrence, 2022. "Dynamic asset allocation strategy using a state-dependent Markov model: Applications to international equity markets," Journal of International Money and Finance, Elsevier, vol. 128(C).
Campani, Carlos Heitor & Garcia, René & Lewin, Marcelo, 2021. "Optimal portfolio strategies in the presence of regimes in asset returns," Journal of Banking & Finance, Elsevier, vol. 123(C).
Zhou Fang, 2023. "Continuous-Time Path-Dependent Exploratory Mean-Variance Portfolio Construction," Papers 2303.02298, arXiv.org.
Bouyaddou, Youssef & Jebabli, Ikram, 2025. "Integration of investor behavioral perspective and climate change in reinforcement learning for portfolio optimization," Research in International Business and Finance, Elsevier, vol. 73(PB).
Zhang, Caibin & Liang, Zhibin, 2022. "Optimal time-consistent reinsurance and investment strategies for a jump–diffusion financial market without cash," The North American Journal of Economics and Finance, Elsevier, vol. 59(C).
Guidolin, Massimo & Hyde, Stuart, 2012. "Can VAR models capture regime shifts in asset returns? A long-horizon strategic asset allocation perspective," Journal of Banking & Finance, Elsevier, vol. 36(3), pages 695-716.
- Massimo Guidolin & Stuart Hyde, 2010. "Can VAR models capture regime shifts in asset returns? a long-horizon strategic asset allocation perspective," Working Papers 2010-002, Federal Reserve Bank of St. Louis.
- Massimo Guidolin & Stuart Hyde, 2011. "Can VAR Models Capture Regime Shifts in Asset Returns? A Long-Horizon Strategic Asset Allocation Perspective," Working Papers 414, IGIER (Innocenzo Gasparini Institute for Economic Research), Bocconi University.
Chen, Ping & Yam, S.C.P., 2013. "Optimal proportional reinsurance and investment with regime-switching for mean–variance insurers," Insurance: Mathematics and Economics, Elsevier, vol. 53(3), pages 871-883.
Chen, Zhiping & Li, Gang & Zhao, Yonggan, 2014. "Time-consistent investment policies in Markovian markets: A case of mean–variance analysis," Journal of Economic Dynamics and Control, Elsevier, vol. 40(C), pages 293-316.
Penaranda, Francisco, 2007. "Portfolio choice beyond the traditional approach," LSE Research Online Documents on Economics 24481, London School of Economics and Political Science, LSE Library.
Guidolin, Massimo & Liu, Hening, 2016. "Ambiguity Aversion and Underdiversification," Journal of Financial and Quantitative Analysis, Cambridge University Press, vol. 51(4), pages 1297-1323, August.
- Massimo Guidolin & Hening Liu, 2013. "Ambiguity Aversion and Under-diversification," Working Papers 483, IGIER (Innocenzo Gasparini Institute for Economic Research), Bocconi University.
Huy Chau & Duy Nguyen & Thai Nguyen, 2024. "Continuous-time optimal investment with portfolio constraints: a reinforcement learning approach," Papers 2412.10692, arXiv.org.
Jiang, Yifu & Olmo, Jose & Atwi, Majed, 2025. "High-dimensional multi-period portfolio allocation using deep reinforcement learning," International Review of Economics & Finance, Elsevier, vol. 98(C).
Yao, Haixiang & Li, Zhongfei & Chen, Shumin, 2014. "Continuous-time mean–variance portfolio selection with only risky assets," Economic Modelling, Elsevier, vol. 36(C), pages 244-251.

More about this item

Keywords

Reinforcement learning; Actor-critic; Mean-variance; Portfolio selection; Partial information; Regime-switching; Wonham's filter
