
Factor-based deep reinforcement learning for asset allocation: Comparative analysis of static and dynamic beta reward designs

Author

Listed:
  • Nak Hyun Jung
  • Taeyeon Oh

Abstract

Traditional asset allocation rules, while effective in stable phases, tend to erode once markets enter volatile regimes or undergo structural breaks. Research in deep reinforcement learning (DRL) has usually emphasized raw-return rewards, leaving aside the role of factor exposures (β) that shape both risk-adjusted payoffs and adaptive responses.

This paper advances a Factor-based Deep Reinforcement Learning for Asset Allocation (FDRL) framework in which β sensitivities (estimated via rolling regressions on momentum, volatility, deviation, and volume signals) inform both the state representation and the reward design. Five reward variants are examined (Sharpe, Sortino, Static-β, Dynamic-β, Momentum-β) using PPO, SAC, and TD3 across equities, cryptocurrencies, macroeconomic instruments, and mixed portfolios.

Empirically, β-based rewards generate heterogeneous but interpretable patterns. In equities, Dynamic-β improves annualized returns from roughly 20% (Sharpe baseline) to 23–24%, with Sharpe rising from 1.04 to about 1.27 across windows. In cryptocurrencies, Dynamic-/Momentum-β achieve 38–43% annual returns but remain highly regime-sensitive, with drawdowns often exceeding –35%. In macro instruments, Static-β delivers the most stable behaviour, maintaining volatilities near 8–9% and limiting drawdowns to roughly –18%. In mixed-asset portfolios, Momentum-β under TD3 produces the strongest gains (cumulative returns above 70–80%), exceeding equal-weight baselines whose CAGR remains near 19–22% with Sharpe ratios around 1.25.

All findings were validated through beta-window sensitivity checks (30/60/90/120 days), regime-conditional analysis, and multiple robustness tests including HAC, Wilcoxon, jackknife Sharpe, moving-block bootstrap, and false-discovery-rate adjustments. These diagnostics confirm that the main performance patterns are not driven by window choice or serial dependence.

Four contributions follow. First, a reward structure operationalizing time-varying β. Second, systematic benchmarking of factor-sensitive objectives. Third, evidence on asymmetric outcomes across asset classes. Finally, a framework that reconciles responsiveness with interpretability and risk discipline in allocation.
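
As a rough illustration of two ingredients the abstract names, the sketch below estimates rolling-window β sensitivities by ordinary least squares and computes Sharpe- and Sortino-style reward signals over a window of portfolio returns. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`rolling_betas`, `sharpe_reward`, `sortino_reward`), the inclusion of an intercept, the 60-day default window (one of the 30/60/90/120-day settings mentioned), and the zero target return in the Sortino denominator are all assumptions made here. The abstract does not say how β enters the Static-β, Dynamic-β, or Momentum-β rewards, so those variants are not reproduced.

```python
import numpy as np


def rolling_betas(returns, factors, window=60):
    """Trailing-window OLS betas of one asset's returns on K factor signals.

    returns: array of shape (T,); factors: array of shape (T, K).
    Row t of the result uses observations t-window+1 .. t (no look-ahead);
    rows before the first full window are NaN.
    """
    T, K = factors.shape
    betas = np.full((T, K), np.nan)
    for t in range(window - 1, T):
        X = factors[t - window + 1 : t + 1]
        y = returns[t - window + 1 : t + 1]
        # Assumption: include an intercept so betas are slopes only.
        X1 = np.column_stack([np.ones(window), X])
        coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
        betas[t] = coef[1:]  # drop the intercept
    return betas


def sharpe_reward(window_returns, eps=1e-8):
    """Mean return divided by sample standard deviation (not annualized)."""
    return window_returns.mean() / (window_returns.std(ddof=1) + eps)


def sortino_reward(window_returns, eps=1e-8):
    """Mean return divided by downside deviation, with target return 0."""
    downside = np.minimum(window_returns, 0.0)
    downside_dev = np.sqrt(np.mean(downside ** 2))
    return window_returns.mean() / (downside_dev + eps)


if __name__ == "__main__":
    # Synthetic data only: four made-up factor signals and a noisy linear asset.
    rng = np.random.default_rng(0)
    F = rng.normal(size=(250, 4))
    r = 0.0005 + F @ np.array([0.5, -0.2, 0.1, 0.3]) + 0.01 * rng.normal(size=250)
    B = rolling_betas(r, F, window=60)
    print("latest beta estimate:", np.round(B[-1], 3))
    print("Sharpe-style reward:", sharpe_reward(r[-60:]))
    print("Sortino-style reward:", sortino_reward(r[-60:]))
```

In a DRL setup of the kind described, the latest row of the β matrix could be appended to the agent's observation vector and one of the reward functions evaluated on the realized portfolio returns at each step. That wiring is likewise an assumption rather than something the abstract specifies.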

Suggested Citation

  • Nak Hyun Jung & Taeyeon Oh, 2025. "Factor-based deep reinforcement learning for asset allocation: Comparative analysis of static and dynamic beta reward designs," PLOS ONE, Public Library of Science, vol. 20(12), pages 1-26, December.
  • Handle: RePEc:plo:pone00:0332779
    DOI: 10.1371/journal.pone.0332779

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0332779
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0332779&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0332779?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Fama, Eugene F. & French, Kenneth R., 1993. "Common risk factors in the returns on stocks and bonds," Journal of Financial Economics, Elsevier, vol. 33(1), pages 3-56, February.
    2. Ben Hambly & Renyuan Xu & Huining Yang, 2021. "Recent Advances in Reinforcement Learning in Finance," Papers 2112.04553, arXiv.org, revised Feb 2023.
    3. Ben Hambly & Renyuan Xu & Huining Yang, 2023. "Recent advances in reinforcement learning in finance," Mathematical Finance, Wiley Blackwell, vol. 33(3), pages 437-503, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Minshuo Chen & Renyuan Xu & Yumin Xu & Ruixun Zhang, 2025. "Diffusion Factor Models: Generating High-Dimensional Returns with Factor Structure," Papers 2504.06566, arXiv.org, revised Jan 2026.
    2. Bouyaddou, Youssef & Jebabli, Ikram, 2025. "Integration of investor behavioral perspective and climate change in reinforcement learning for portfolio optimization," Research in International Business and Finance, Elsevier, vol. 73(PB).
    3. François, Pascal & Gauthier, Geneviève & Godin, Frédéric & Mendoza, Carlos Octavio Pérez, 2025. "Is the difference between deep hedging and delta hedging a statistical arbitrage?," Finance Research Letters, Elsevier, vol. 73(C).
    4. Alejandra de-la-Rica-Escudero & Eduardo C Garrido-Merchán & María Coronado-Vaca, 2025. "Explainable post hoc portfolio management financial policy of a Deep Reinforcement Learning agent," PLOS ONE, Public Library of Science, vol. 20(1), pages 1-19, January.
    5. Wu, Bo & Li, Lingfei, 2024. "Reinforcement learning for continuous-time mean-variance portfolio selection in a regime-switching market," Journal of Economic Dynamics and Control, Elsevier, vol. 158(C).
    6. Konrad Mueller & Amira Akkari & Lukas Gonon & Ben Wood, 2024. "Fast Deep Hedging with Second-Order Optimization," Papers 2410.22568, arXiv.org.
    7. Nicole Bäuerle & Anna Jaśkiewicz, 2024. "Markov decision processes with risk-sensitive criteria: an overview," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 99(1), pages 141-178, April.
    8. Haoren Zhu & Pengfei Zhao & Wilfred Siu Hung NG & Dik Lun Lee, 2024. "Financial Assets Dependency Prediction Utilizing Spatiotemporal Patterns," Papers 2406.11886, arXiv.org.
    9. Jaskaran Singh Walia & Aarush Sinha & Srinitish Srinivasan & Srihari Unnikrishnan, 2025. "Predicting Liquidity-Aware Bond Yields using Causal GANs and Deep Reinforcement Learning with LLM Evaluation," Papers 2502.17011, arXiv.org.
    10. Jiang, Yifu & Olmo, Jose & Atwi, Majed, 2025. "High-dimensional multi-period portfolio allocation using deep reinforcement learning," International Review of Economics & Finance, Elsevier, vol. 98(C).
    11. Rongwei Liu & Jin Zheng & John Cartlidge, 2025. "Deep Reinforcement Learning for Optimal Asset Allocation Using DDPG with TiDE," Papers 2508.20103, arXiv.org.
    12. Guojun Xiong & Zhiyang Deng & Keyi Wang & Yupeng Cao & Haohang Li & Yangyang Yu & Xueqing Peng & Mingquan Lin & Kaleb E Smith & Xiao-Yang Liu & Jimin Huang & Sophia Ananiadou & Qianqian Xie, 2025. "FLAG-Trader: Fusion LLM-Agent with Gradient-based Reinforcement Learning for Financial Trading," Papers 2502.11433, arXiv.org, revised Feb 2025.
    13. Daniil Karzanov & Rubén Garzón & Mikhail Terekhov & Caglar Gulcehre & Thomas Raffinot & Marcin Detyniecki, 2025. "Regret-Optimized Portfolio Enhancement through Deep Reinforcement Learning and Future Looking Rewards," Papers 2502.02619, arXiv.org.
    14. Yuanfei Cui & Fengtong Yao, 2024. "RETRACTED ARTICLE: Integrating Deep Learning and Reinforcement Learning for Enhanced Financial Risk Forecasting in Supply Chain Management," Journal of the Knowledge Economy, Springer;Portland International Center for Management of Engineering and Technology (PICMET), vol. 15(4), pages 20091-20110, December.
    15. Xiangyu Cui & Xun Li & Yun Shi & Si Zhao, 2023. "Discrete-Time Mean-Variance Strategy Based on Reinforcement Learning," Papers 2312.15385, arXiv.org.
    16. Ahmad Aghapour & Erhan Bayraktar & Fengyi Yuan, 2025. "Solving dynamic portfolio selection problems via score-based diffusion models," Papers 2507.09916, arXiv.org, revised Aug 2025.
    17. Shanyu Han & Yang Liu & Xiang Yu, 2025. "Risk-sensitive Reinforcement Learning Based on Convex Scoring Functions," Papers 2505.04553, arXiv.org, revised May 2025.
    18. Horikawa, Hiroaki & Nakagawa, Kei, 2024. "Relationship between deep hedging and delta hedging: Leveraging a statistical arbitrage strategy," Finance Research Letters, Elsevier, vol. 62(PA).
    19. Yu, Hongxiang & Wang, Ziqi & Weng, Yudong & Wang, Liying, 2024. "The impact of guarantee network on the risk of corporate stock price crash: Discussing the moderating effect of internal control quality," International Review of Economics & Finance, Elsevier, vol. 96(PC).
    20. Fuwei Jiang & Jie Kang & Ruzheng Tian & Qingdong Xu, 2025. "Black‐Scholes Meet Imitation Learning: Evidence From Deep Hedging in China," Journal of Futures Markets, John Wiley & Sons, Ltd., vol. 45(8), pages 1071-1087, August.
