Printed from https://ideas.repec.org/a/plo/pone00/0315528.html

Explainable post hoc portfolio management financial policy of a Deep Reinforcement Learning agent

Author

Listed:
  • Alejandra de-la-Rica-Escudero
  • Eduardo C Garrido-Merchán
  • María Coronado-Vaca

Abstract

Financial portfolio management investment policies computed quantitatively by modern portfolio theory techniques, such as the Markowitz model, rely on a set of assumptions that are not supported by data in high-volatility markets such as the technology sector or cryptocurrencies. Hence, quantitative researchers are looking for alternative models to tackle this problem. Concretely, portfolio management (PM) is a problem that has recently been addressed successfully by Deep Reinforcement Learning (DRL) approaches. In particular, DRL algorithms train an agent by estimating the distribution of the expected reward of every action the agent can perform in any financial state, using a simulator, also called a gymnasium. However, these methods rely on deep neural network models to represent that distribution; although such networks are universal approximators capable of representing the distribution over time, their behaviour is governed by a set of parameters that are not interpretable, so the models cannot explain it. Critically, financial investors require policy predictions to be interpretable, so that they can assess whether the policy follows reasonable behaviour; plain DRL agents are therefore not suited to being audited against a particular investment policy or to explaining their actions. In this work, driven by the motivation of making DRL explainable, we developed a novel Explainable DRL (XDRL) approach for PM, integrating the Proximal Policy Optimization (PPO) DRL algorithm with the model-agnostic explainable machine learning techniques of feature importance, SHAP, and LIME, to enhance transparency at prediction time. By executing our methodology, we can interpret the actions of the agent at prediction time, to assess whether they follow the requisites of an investment policy or to assess the risk of following the agent's suggestions. We illustrate this empirically by successfully identifying key features influencing investment decisions, which demonstrates the ability to explain the agent's actions at prediction time.
We propose the first explainable post hoc PM financial policy of a DRL agent.
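The abstract describes attaching model-agnostic post hoc attribution to a trained DRL policy so that each portfolio action can be inspected at prediction time. A minimal sketch of that idea follows, under loud assumptions: the `toy_policy` below is a hypothetical linear-softmax stand-in for a trained PPO policy (the paper uses actual PPO with SHAP and LIME), and permutation importance is used here as a simple model-agnostic substitute for those attribution methods.

```python
import numpy as np

def toy_policy(state):
    # Hypothetical stand-in for a trained PPO policy head: maps a
    # 3-feature market state to weights over 2 portfolio actions.
    # The weight matrix is invented for illustration only.
    W = np.array([[ 2.0, 0.1, -0.1],
                  [-2.0, 0.1,  0.2]])
    logits = W @ state
    e = np.exp(logits - logits.max())          # stable softmax
    return e / e.sum()

def permutation_importance(policy, states, n_repeats=20, seed=0):
    # Model-agnostic post hoc attribution: permute one feature at a
    # time across states and measure how much the policy's output
    # (the suggested allocation) moves, on average.
    rng = np.random.default_rng(seed)
    base = np.array([policy(s) for s in states])
    importance = np.zeros(states.shape[1])
    for j in range(states.shape[1]):
        for _ in range(n_repeats):
            perm = states.copy()
            rng.shuffle(perm[:, j])            # break feature j's link to outputs
            out = np.array([policy(s) for s in perm])
            importance[j] += np.abs(out - base).mean()
    return importance / n_repeats

# Synthetic "financial states"; in the paper these would be market features.
states = np.random.default_rng(1).normal(size=(64, 3))
imp = permutation_importance(toy_policy, states)
```

With the invented weights above, feature 0 drives the difference between the two action logits, so it dominates the attribution, while feature 1 shifts both logits equally and therefore has no effect on the softmax output; a SHAP or LIME explainer applied to the same policy would expose the same structure, per prediction rather than on average.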

Suggested Citation

  • Alejandra de-la-Rica-Escudero & Eduardo C Garrido-Merchán & María Coronado-Vaca, 2025. "Explainable post hoc portfolio management financial policy of a Deep Reinforcement Learning agent," PLOS ONE, Public Library of Science, vol. 20(1), pages 1-19, January.
  • Handle: RePEc:plo:pone00:0315528
    DOI: 10.1371/journal.pone.0315528

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0315528
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0315528&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0315528?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
    ---><---

    References listed on IDEAS

    1. Xinyi Li & Yinchuan Li & Yuancheng Zhan & Xiao-Yang Liu, 2019. "Optimistic Bull or Pessimistic Bear: Adaptive Deep Reinforcement Learning for Stock Portfolio Allocation," Papers 1907.01503, arXiv.org.
    2. D. Sykes Wilford, 2012. "True Markowitz or assumptions we break and why it matters," Review of Financial Economics, John Wiley & Sons, vol. 21(3), pages 93-101, September.
    3. Zheng Hao & Haowei Zhang & Yipu Zhang, 2023. "Stock Portfolio Management by Using Fuzzy Ensemble Deep Reinforcement Learning Algorithm," JRFM, MDPI, vol. 16(3), pages 1-14, March.
    4. Wilford, D. Sykes, 2012. "True Markowitz or assumptions we break and why it matters," Review of Financial Economics, Elsevier, vol. 21(3), pages 93-101.
    5. Ben Hambly & Renyuan Xu & Huining Yang, 2021. "Recent Advances in Reinforcement Learning in Finance," Papers 2112.04553, arXiv.org, revised Feb 2023.
    6. Akhter Mohiuddin Rather & V. N. Sastry & Arun Agarwal, 2017. "Stock market prediction and Portfolio selection models: a survey," OPSEARCH, Springer;Operational Research Society of India, vol. 54(3), pages 558-579, September.
    7. Jonathan Sadighian, 2019. "Deep Reinforcement Learning in Cryptocurrency Market Making," Papers 1911.08647, arXiv.org.
    8. Ben Hambly & Renyuan Xu & Huining Yang, 2023. "Recent advances in reinforcement learning in finance," Mathematical Finance, Wiley Blackwell, vol. 33(3), pages 437-503, July.
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Alejandra de la Rica Escudero & Eduardo C. Garrido-Merchan & Maria Coronado-Vaca, 2024. "Explainable Post hoc Portfolio Management Financial Policy of a Deep Reinforcement Learning agent," Papers 2407.14486, arXiv.org.
    2. Bouyaddou, Youssef & Jebabli, Ikram, 2025. "Integration of investor behavioral perspective and climate change in reinforcement learning for portfolio optimization," Research in International Business and Finance, Elsevier, vol. 73(PB).
    3. François, Pascal & Gauthier, Geneviève & Godin, Frédéric & Mendoza, Carlos Octavio Pérez, 2025. "Is the difference between deep hedging and delta hedging a statistical arbitrage?," Finance Research Letters, Elsevier, vol. 73(C).
    4. Wu, Bo & Li, Lingfei, 2024. "Reinforcement learning for continuous-time mean-variance portfolio selection in a regime-switching market," Journal of Economic Dynamics and Control, Elsevier, vol. 158(C).
    5. Konrad Mueller & Amira Akkari & Lukas Gonon & Ben Wood, 2024. "Fast Deep Hedging with Second-Order Optimization," Papers 2410.22568, arXiv.org.
    6. Nicole Bäuerle & Anna Jaśkiewicz, 2024. "Markov decision processes with risk-sensitive criteria: an overview," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 99(1), pages 141-178, April.
    7. Haoren Zhu & Pengfei Zhao & Wilfred Siu Hung NG & Dik Lun Lee, 2024. "Financial Assets Dependency Prediction Utilizing Spatiotemporal Patterns," Papers 2406.11886, arXiv.org.
    8. Jaskaran Singh Walia & Aarush Sinha & Srinitish Srinivasan & Srihari Unnikrishnan, 2025. "Predicting Liquidity-Aware Bond Yields using Causal GANs and Deep Reinforcement Learning with LLM Evaluation," Papers 2502.17011, arXiv.org.
    9. Jiang, Yifu & Olmo, Jose & Atwi, Majed, 2025. "High-dimensional multi-period portfolio allocation using deep reinforcement learning," International Review of Economics & Finance, Elsevier, vol. 98(C).
    10. Guojun Xiong & Zhiyang Deng & Keyi Wang & Yupeng Cao & Haohang Li & Yangyang Yu & Xueqing Peng & Mingquan Lin & Kaleb E Smith & Xiao-Yang Liu & Jimin Huang & Sophia Ananiadou & Qianqian Xie, 2025. "FLAG-Trader: Fusion LLM-Agent with Gradient-based Reinforcement Learning for Financial Trading," Papers 2502.11433, arXiv.org, revised Feb 2025.
    11. Daniil Karzanov & Rub'en Garz'on & Mikhail Terekhov & Caglar Gulcehre & Thomas Raffinot & Marcin Detyniecki, 2025. "Regret-Optimized Portfolio Enhancement through Deep Reinforcement Learning and Future Looking Rewards," Papers 2502.02619, arXiv.org.
    12. Yuanfei Cui & Fengtong Yao, 2024. "Integrating Deep Learning and Reinforcement Learning for Enhanced Financial Risk Forecasting in Supply Chain Management," Journal of the Knowledge Economy, Springer;Portland International Center for Management of Engineering and Technology (PICMET), vol. 15(4), pages 20091-20110, December.
    13. Xiangyu Cui & Xun Li & Yun Shi & Si Zhao, 2023. "Discrete-Time Mean-Variance Strategy Based on Reinforcement Learning," Papers 2312.15385, arXiv.org.
    14. Shanyu Han & Yang Liu & Xiang Yu, 2025. "Risk-sensitive Reinforcement Learning Based on Convex Scoring Functions," Papers 2505.04553, arXiv.org, revised May 2025.
    15. Horikawa, Hiroaki & Nakagawa, Kei, 2024. "Relationship between deep hedging and delta hedging: Leveraging a statistical arbitrage strategy," Finance Research Letters, Elsevier, vol. 62(PA).
    16. Yu, Hongxiang & Wang, Ziqi & Weng, Yudong & Wang, Liying, 2024. "The impact of guarantee network on the risk of corporate stock price crash: Discussing the moderating effect of internal control quality," International Review of Economics & Finance, Elsevier, vol. 96(PC).
    17. Yuheng Zheng & Zihan Ding, 2024. "Reinforcement Learning in High-frequency Market Making," Papers 2407.21025, arXiv.org, revised Aug 2024.
    18. Kun Yang & Nikhil Krishnan & Sanjeev R. Kulkarni, 2025. "Financial Data Analysis with Robust Federated Logistic Regression," Papers 2504.20250, arXiv.org.
    19. Minshuo Chen & Renyuan Xu & Yumin Xu & Ruixun Zhang, 2025. "Diffusion Factor Models: Generating High-Dimensional Returns with Factor Structure," Papers 2504.06566, arXiv.org, revised May 2025.
    20. Woosung Koh & Insu Choi & Yuntae Jang & Gimin Kang & Woo Chang Kim, 2023. "Curriculum Learning and Imitation Learning for Model-free Control on Financial Time-series," Papers 2311.13326, arXiv.org, revised Jan 2024.

    More about this item

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:plo:pone00:0315528. See general information about how to correct material in RePEc.

If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form.

If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: plosone (email available below). General contact details of provider: https://journals.plos.org/plosone/ .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.