Printed from https://ideas.repec.org/p/arx/papers/2512.12420.html

Deep Hedging with Reinforcement Learning: A Practical Framework for Option Risk Management

Author

Listed:
  • Travon Lucius
  • Christian Koch Jr
  • Jacob Starling
  • Julia Zhu
  • Miguel Urena
  • Carrie Hu

Abstract

We present a reinforcement-learning (RL) framework for dynamic hedging of equity index option exposures under realistic transaction costs and position limits. We hedge a normalized option-implied equity exposure (one unit of underlying delta, offset via SPY) by trading the underlying index ETF, using the option surface and macro variables only as state information and not as a direct pricing engine. Building on the "deep hedging" paradigm of Buehler et al. (2019), we design a leak-free environment, a cost-aware reward function, and a lightweight stochastic actor-critic agent trained on daily end-of-day panel data constructed from SPX/SPY implied volatility term structure, skew, realized volatility, and macro rate context. On a fixed train/validation/test split, the learned policy improves risk-adjusted performance versus no-hedge, momentum, and volatility-targeting baselines (higher point-estimate Sharpe). Only the GAE (generalized advantage estimation) policy's test-sample Sharpe is statistically distinguishable from zero, and because confidence intervals overlap with those of a long-SPY benchmark, we stop short of claiming formal dominance. Turnover remains controlled and the policy is robust to doubled transaction costs. The modular codebase, comprising a data pipeline, simulator, and training scripts, is engineered for extensibility to multi-asset overlays, alternative objectives (e.g., drawdown or CVaR), and intraday data. From a portfolio management perspective, the learned overlay is designed to sit on top of an existing SPX or SPY allocation, improving the portfolio's mean-variance trade-off with controlled turnover and drawdowns. We discuss practical implications for portfolio overlays and outline avenues for future work.
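
To make the phrases "cost-aware reward" and "position limits" concrete, the following is a minimal sketch of one possible per-step reward for a hedging overlay. It is not the authors' reward function, which the abstract does not specify. The names `step_reward` and `rollout`, the proportional cost model, and the parameters `cost_bps`, `pos_limit`, and `exposure` are all assumptions made for illustration. The hedge position set at the close of day t earns the return from t to t+1, which is one simple way to keep the simulation free of look-ahead.

```python
import numpy as np


def step_reward(prev_pos, action, next_ret, cost_bps=1.0, pos_limit=1.0, exposure=1.0):
    """Illustrative one-step reward (hypothetical, not the paper's definition).

    prev_pos : hedge position held before trading, in units of the exposure
    action   : desired new hedge position (negative = short the ETF)
    next_ret : simple return of the ETF over the NEXT period
    """
    # Enforce the position limit by clipping the requested hedge.
    pos = float(np.clip(action, -pos_limit, pos_limit))
    # Proportional transaction cost on the amount traded (assumed cost model).
    cost = cost_bps * 1e-4 * abs(pos - prev_pos)
    # P&L of the fixed exposure plus the hedge over the next period.
    pnl = (exposure + pos) * next_ret
    return pnl - cost, pos


def rollout(actions, returns, **kwargs):
    """Apply step_reward along a path, starting from a flat hedge."""
    pos, rewards = 0.0, []
    for action, next_ret in zip(actions, returns):
        reward, pos = step_reward(pos, action, next_ret, **kwargs)
        rewards.append(reward)
    return np.array(rewards)


# Usage: a static half hedge on a toy return path, at base and doubled costs.
toy_returns = [0.01, -0.02, 0.005]
half_hedge = [-0.5, -0.5, -0.5]
base = rollout(half_hedge, toy_returns, cost_bps=1.0)
doubled = rollout(half_hedge, toy_returns, cost_bps=2.0)
print(base.sum(), doubled.sum())
```

Because the hedge is only traded once in this toy path, doubling `cost_bps` changes the total reward only through the first step, which illustrates why a low-turnover policy is less sensitive to the cost assumption.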

Suggested Citation

  • Travon Lucius & Christian Koch Jr & Jacob Starling & Julia Zhu & Miguel Urena & Carrie Hu, 2025. "Deep Hedging with Reinforcement Learning: A Practical Framework for Option Risk Management," Papers 2512.12420, arXiv.org.
  • Handle: RePEc:arx:papers:2512.12420

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2512.12420
    File Function: Latest version
    Download Restriction: no


