
Myopic Optimality: why reinforcement learning portfolio management strategies lose money

Author

  • Yuming Ma

Abstract

Myopic optimization (MO) outperforms reinforcement learning (RL) in portfolio management: RL yields lower or negative returns, higher variance, larger costs, higher CVaR, lower profitability, and greater model risk. We model execution and liquidation frictions under mark-to-market accounting. Using Malliavin calculus (the Clark-Ocone and Bismut-Elworthy-Li (BEL) formulas), we derive policy gradients and the shadow price of risk, unifying the Hamilton-Jacobi-Bellman (HJB) and Karush-Kuhn-Tucker (KKT) conditions. This yields dual-gap and convergence results: geometric convergence for MO versus floors for RL. We quantify phantom profit in RL via a Malliavin analysis of policy-gradient contamination, and we define a control-affects-dynamics (CAD) premium for RL, which is plausibly positive.
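The abstract compares strategies on mean return, variance, and CVaR. As a minimal sketch of how such a comparison could be computed from a series of per-period returns, the snippet below uses only the Python standard library. The function names `empirical_cvar` and `summarize`, the tail-mean estimator, and the default level `alpha=0.95` are illustrative assumptions, not the paper's definitions or method.

```python
import math
import statistics


def empirical_cvar(losses, alpha=0.95):
    """Mean of the worst (1 - alpha) share of losses (empirical tail mean).

    Hypothetical helper for illustration; the paper may define CVaR differently.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    ordered = sorted(losses, reverse=True)  # largest losses first
    # round() guards against floating-point noise in (1 - alpha) * n
    tail_size = max(1, math.ceil(round((1.0 - alpha) * len(ordered), 9)))
    tail = ordered[:tail_size]
    return sum(tail) / tail_size


def summarize(returns, alpha=0.95):
    """Mean, population variance, and CVaR of losses for a return series."""
    losses = [-r for r in returns]  # a loss is a negative return
    return {
        "mean": statistics.mean(returns),
        "variance": statistics.pvariance(returns),
        "cvar": empirical_cvar(losses, alpha),
    }
```

Calling `summarize` on the return series of two strategies, for instance one myopic and one RL-based, would give the three statistics side by side. The inputs would have to come from the reader's own backtests; no data from the paper is reproduced here.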

Suggested Citation

  • Yuming Ma, 2025. "Myopic Optimality: why reinforcement learning portfolio management strategies lose money," Papers 2509.12764, arXiv.org.
  • Handle: RePEc:arx:papers:2509.12764

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2509.12764
    File Function: Latest version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Florent Gallien & Serge Kassibrakis & Semyon Malamud, 2018. "Hedge or Rebalance: Optimal Risk Management with Transaction Costs," Risks, MDPI, vol. 6(4), pages 1-14, October.
    2. Jan Kallsen & Johannes Muhle-Karbe, 2013. "The General Structure of Optimal Investment and Consumption with Small Transaction Costs," Papers 1303.3148, arXiv.org, revised May 2015.
    3. Xinfu Chen & Min Dai & Wei Jiang & Cong Qin, 2022. "Asymptotic analysis of long‐term investment with two illiquid and correlated assets," Mathematical Finance, Wiley Blackwell, vol. 32(4), pages 1133-1169, October.
    4. Johannes Muhle-Karbe & Max Reppen & H. Mete Soner, 2016. "A Primer on Portfolio Choice with Small Transaction Costs," Papers 1612.01302, arXiv.org, revised May 2017.
    5. Yingting Miao & Qiang Zhang, 2023. "Optimal Investment and Consumption Strategies with General and Linear Transaction Costs under CRRA Utility," Papers 2304.07672, arXiv.org.
    6. Bjork, Tomas, 2009. "Arbitrage Theory in Continuous Time," OUP Catalogue, Oxford University Press, edition 3, number 9780199574742.
    7. Marcos Escobar-Anel & Michel Kschonnek & Rudi Zagst, 2022. "Portfolio optimization: not necessarily concave utility and constraints on wealth and allocation," Mathematical Methods of Operations Research, Springer; Gesellschaft für Operations Research (GOR); Nederlands Genootschap voor Besliskunde (NGB), vol. 95(1), pages 101-140, February.
    8. Jean-Pierre Fouque & Ruimeng Hu & Ronnie Sircar, 2021. "Sub- and Super-solution Approach to Accuracy Analysis of Portfolio Optimization Asymptotics in Multiscale Stochastic Factor Market," Papers 2106.11510, arXiv.org, revised Oct 2021.
    9. Baojun Bian & Xinfu Chen & Min Dai & Shuaijie Qian, 2021. "Penalty method for portfolio selection with capital gains tax," Mathematical Finance, Wiley Blackwell, vol. 31(3), pages 1013-1055, July.
    10. Dai, Min & Wang, Hefei & Yang, Zhou, 2012. "Leverage management in a bull–bear switching market," Journal of Economic Dynamics and Control, Elsevier, vol. 36(10), pages 1585-1599.
    11. Zuo Quan Xu & Fahuai Yi, 2014. "An Optimal Consumption-Investment Model with Constraint on Consumption," Papers 1404.7698, arXiv.org.
    12. Jules Arzel & Noureddine Lehdili, 2026. "Bridging Stochastic Control and Deep Hedging: Structural Priors for No-Transaction Band Networks," Papers 2603.29994, arXiv.org.
    13. Davi Valladão & Thuener Silva & Marcus Poggi, 2019. "Time-consistent risk-constrained dynamic portfolio optimization with transactional costs and time-dependent returns," Annals of Operations Research, Springer, vol. 282(1), pages 379-405, November.
    14. Girlich, Hans-Joachim, 2003. "Transaction costs in finance and inventory research," International Journal of Production Economics, Elsevier, vol. 81(1), pages 341-350, January.
    15. Michael Monoyios, 2004. "Performance of utility-based strategies for hedging basis risk," Quantitative Finance, Taylor & Francis Journals, vol. 4(3), pages 245-255.
    16. Mark Broadie & Weiwei Shen, 2016. "High-Dimensional Portfolio Optimization With Transaction Costs," International Journal of Theoretical and Applied Finance (IJTAF), World Scientific Publishing Co. Pte. Ltd., vol. 19(04), pages 1-49, June.
    17. Valeri Zakamouline, 2005. "A unified approach to portfolio optimization with linear transaction costs," Mathematical Methods of Operations Research, Springer; Gesellschaft für Operations Research (GOR); Nederlands Genootschap voor Besliskunde (NGB), vol. 62(2), pages 319-343, November.
    18. Collin-Dufresne, Pierre & Daniel, Kent & Sağlam, Mehmet, 2020. "Liquidity regimes and optimal dynamic asset allocation," Journal of Financial Economics, Elsevier, vol. 136(2), pages 379-406.
    19. Chellathurai, Thamayanthi & Draviam, Thangaraj, 2007. "Dynamic portfolio selection with fixed and/or proportional transaction costs using non-singular stochastic optimal control theory," Journal of Economic Dynamics and Control, Elsevier, vol. 31(7), pages 2168-2195, July.
    20. Albert Altarovici & Max Reppen & H. Mete Soner, 2016. "Optimal Consumption and Investment with Fixed and Proportional Transaction Costs," Papers 1610.03958, arXiv.org.
