Printed from https://ideas.repec.org/p/arx/papers/2509.12764.html

Myopic Optimality: why reinforcement learning portfolio management strategies lose money

Author

Listed:
  • Yuming Ma

Abstract

Myopic optimization (MO) outperforms reinforcement learning (RL) in portfolio management: RL yields lower or negative returns, higher variance, larger costs, heavier tail risk (CVaR), lower profitability, and greater model risk. We model execution and liquidation frictions under mark-to-market accounting. Using Malliavin calculus (the Clark-Ocone and Bismut-Elworthy-Li (BEL) formulas), we derive policy gradients and the shadow price of risk, unifying the HJB and KKT conditions. This yields duality-gap and convergence results: geometric convergence for MO versus error floors for RL. We quantify phantom profit in RL through a Malliavin policy-gradient contamination analysis and define a control-affects-dynamics (CAD) premium for RL, which is plausibly positive.
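
One of the risk measures the abstract compares is CVaR. This page does not give the paper's exact definition or confidence level, so the following is only a minimal sketch of the standard empirical estimator: the average loss in the worst (1 - alpha) tail of a return sample. The function name `empirical_cvar`, the default `alpha`, and the sample returns are assumptions made for illustration, not taken from the paper.

```python
import numpy as np

def empirical_cvar(returns, alpha=0.95):
    """Empirical CVaR: mean of the losses at or beyond the alpha-quantile (VaR).

    Illustrative estimator only; the paper's own definition may differ.
    """
    losses = -np.asarray(returns, dtype=float)   # a loss is a negative return
    var = np.quantile(losses, alpha)             # Value-at-Risk at level alpha
    tail = losses[losses >= var]                 # losses in the worst tail
    return tail.mean()

# Hypothetical per-period returns, for illustration only.
sample_returns = [-0.10, -0.05, 0.0, 0.02, 0.03]
cvar_80 = empirical_cvar(sample_returns, alpha=0.8)
```

A larger value means a heavier loss tail, so under this convention the abstract's claim would correspond to RL strategies producing a larger `empirical_cvar` than MO on comparable return samples.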

Suggested Citation

  • Yuming Ma, 2025. "Myopic Optimality: why reinforcement learning portfolio management strategies lose money," Papers 2509.12764, arXiv.org.
  • Handle: RePEc:arx:papers:2509.12764

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2509.12764
    File Function: Latest version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jan Kallsen & Johannes Muhle-Karbe, 2013. "The General Structure of Optimal Investment and Consumption with Small Transaction Costs," Papers 1303.3148, arXiv.org, revised May 2015.
    2. Xinfu Chen & Min Dai & Wei Jiang & Cong Qin, 2022. "Asymptotic analysis of long‐term investment with two illiquid and correlated assets," Mathematical Finance, Wiley Blackwell, vol. 32(4), pages 1133-1169, October.
    3. Michael Monoyios, 2004. "Performance of utility-based strategies for hedging basis risk," Quantitative Finance, Taylor & Francis Journals, vol. 4(3), pages 245-255.
    4. Florent Gallien & Serge Kassibrakis & Semyon Malamud, 2018. "Hedge or Rebalance: Optimal Risk Management with Transaction Costs," Risks, MDPI, vol. 6(4), pages 1-14, October.
    5. Yingting Miao & Qiang Zhang, 2023. "Optimal Investment and Consumption Strategies with General and Linear Transaction Costs under CRRA Utility," Papers 2304.07672, arXiv.org.
    6. An, Jongbong & Jeon, Junkee & Kim, Takwon, 2025. "Optimal portfolio and retirement decisions with costly job switching options," Applied Mathematics and Computation, Elsevier, vol. 491(C).
    7. Auffret, Philippe, 2001. "An alternative unifying measure of welfare gains from risk-sharing," Policy Research Working Paper Series 2676, The World Bank.
    8. Chen, An & Hieber, Peter & Sureth, Caren, 2022. "Pay for tax certainty? Advance tax rulings for risky investment under multi-dimensional tax uncertainty," arqus Discussion Papers in Quantitative Tax Research 273, arqus - Arbeitskreis Quantitative Steuerlehre.
    9. Andreas Fagereng & Luigi Guiso & Davide Malacrino & Luigi Pistaferri, 2020. "Heterogeneity and Persistence in Returns to Wealth," Econometrica, Econometric Society, vol. 88(1), pages 115-170, January.
    10. John H. Cochrane, 1999. "New facts in finance," Economic Perspectives, Federal Reserve Bank of Chicago, vol. 23(Q III), pages 36-58.
    11. John Y. Campbell & Luis M. Viceira & Joshua S. White, 2003. "Foreign Currency for Long-Term Investors," Economic Journal, Royal Economic Society, vol. 113(486), pages 1-25, March.
    12. Stephen Satchell & Susan Thorp, 2007. "Scenario Analysis with Recursive Utility: Dynamic Consumption Plans for Charitable Endowments," Research Paper Series 209, Quantitative Finance Research Centre, University of Technology, Sydney.
    13. Hong‐Chih Huang, 2010. "Optimal Multiperiod Asset Allocation: Matching Assets to Liabilities in a Discrete Model," Journal of Risk & Insurance, The American Risk and Insurance Association, vol. 77(2), pages 451-472, June.
    14. Orszag, J. Michael & Yang, Hong, 1995. "Portfolio choice with Knightian uncertainty," Journal of Economic Dynamics and Control, Elsevier, vol. 19(5-7), pages 873-900.
    15. Bjork, Tomas, 2009. "Arbitrage Theory in Continuous Time," OUP Catalogue, Oxford University Press, edition 3, number 9780199574742.
    16. E. Nasakkala & J. Keppo, 2008. "Hydropower with Financial Information," Applied Mathematical Finance, Taylor & Francis Journals, vol. 15(5-6), pages 503-529.
    17. Letendre, Marc-Andre & Smith, Gregor W., 2001. "Precautionary saving and portfolio allocation: DP by GMM," Journal of Monetary Economics, Elsevier, vol. 48(1), pages 197-215, August.
    18. Jorge Braga de Macedo & Jeffrey Goldstein & David Meerschwam, 1984. "International Portfolio Diversification: Short-Term Financial Assets and Gold," NBER Chapters, in: Exchange Rate Theory and Practice, pages 199-238, National Bureau of Economic Research, Inc.
    19. Pliska, Stanley R. & Ye, Jinchun, 2007. "Optimal life insurance purchase and consumption/investment under uncertain lifetime," Journal of Banking & Finance, Elsevier, vol. 31(5), pages 1307-1319, May.
    20. Castañeda, Pablo & Devoto, Benjamín, 2016. "On the structural estimation of an optimal portfolio rule," Finance Research Letters, Elsevier, vol. 16(C), pages 290-300.
