
Law-Strength Frontiers and a No-Free-Lunch Result for Law-Seeking Reinforcement Learning on Volatility Law Manifolds

Author

Listed:
  • Jian'an Zhang

Abstract

We study reinforcement learning (RL) on volatility surfaces through the lens of Scientific AI. We ask whether axiomatic no-arbitrage laws, imposed as soft penalties on a learned world model, can reliably align high-capacity RL agents, or mainly create Goodhart-style incentives to exploit model errors. From classical static no-arbitrage conditions we build a finite-dimensional convex volatility law manifold of admissible total-variance surfaces, together with a metric law-penalty functional and a Graceful Failure Index (GFI) that normalizes law degradation under shocks. A synthetic generator produces law-consistent trajectories, while a recurrent neural world model trained without law regularization exhibits structured off-manifold errors. On this testbed we define a Goodhart decomposition \(r = r^{\mathcal{M}} + r^\perp\), where \(r^\perp\) is ghost arbitrage from off-manifold prediction error. We prove a ghost-arbitrage incentive theorem for PPO-type agents, a law-strength trade-off theorem showing that stronger penalties eventually worsen P&L, and a no-free-lunch theorem: under a law-consistent world model and law-aligned strategy class, unconstrained law-seeking RL cannot Pareto-dominate structural baselines on P&L, penalties, and GFI. In experiments on an SPX/VIX-like world model, simple structural strategies form the empirical law-strength frontier, while all law-seeking RL variants underperform and move into high-penalty, high-GFI regions. Volatility thus provides a concrete case where reward shaping with verifiable penalties is insufficient for robust law alignment.
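
The decomposition \(r = r^{\mathcal{M}} + r^\perp\) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the only law enforced is the calendar condition (total variance non-decreasing in maturity), the "projection" is a cumulative-maximum repair rather than an exact metric projection, the reward is linear in the predicted change of total variance, and the names `calendar_penalty`, `repair_calendar`, and `split_reward` are hypothetical.

```python
import numpy as np


def calendar_penalty(w):
    """Sum of squared calendar violations on a total-variance grid.

    w has shape (n_maturities, n_strikes), with maturities increasing
    along axis 0. Total variance should not decrease with maturity.
    """
    w = np.asarray(w, dtype=float)
    drop = w[:-1, :] - w[1:, :]  # positive where the law is violated
    return float(np.sum(np.maximum(drop, 0.0) ** 2))


def repair_calendar(w):
    """Map a grid to a calendar-consistent one via a running maximum.

    Assumption: a simple monotone repair, not the exact L2 projection.
    """
    return np.maximum.accumulate(np.asarray(w, dtype=float), axis=0)


def split_reward(position, w_pred, w_prev):
    """Split a toy linear reward into on-manifold and off-manifold parts.

    Returns (r, r_on, r_perp) with r = r_on + r_perp by linearity.
    """
    position = np.asarray(position, dtype=float)
    w_pred = np.asarray(w_pred, dtype=float)
    w_prev = np.asarray(w_prev, dtype=float)
    w_on = repair_calendar(w_pred)
    r_on = float(np.sum(position * (w_on - w_prev)))    # law-consistent part
    r_perp = float(np.sum(position * (w_pred - w_on)))  # "ghost" part
    return r_on + r_perp, r_on, r_perp


# Hypothetical 2x2 grid with one calendar violation in the first strike.
w_prev = np.zeros((2, 2))
w_pred = np.array([[0.04, 0.05],
                   [0.03, 0.06]])
position = np.ones((2, 2))

penalty = calendar_penalty(w_pred)
r, r_on, r_perp = split_reward(position, w_pred, w_prev)
```

In this toy setting `r_perp` is nonzero only where the predicted surface breaks the calendar condition, which is the sense in which an agent rewarded on `r` alone could profit from world-model error rather than from law-consistent dynamics.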

Suggested Citation

  • Jian'an Zhang, 2025. "Law-Strength Frontiers and a No-Free-Lunch Result for Law-Seeking Reinforcement Learning on Volatility Law Manifolds," Papers 2511.17304, arXiv.org.
  • Handle: RePEc:arx:papers:2511.17304

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2511.17304
    File Function: Latest version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jian'an Zhang, 2025. "Risk-Sensitive Option Market Making with Arbitrage-Free eSSVI Surfaces: A Constrained RL and Stochastic Control Bridge," Papers 2510.04569, arXiv.org.
    2. Bastien Baldacci, 2020. "High-frequency dynamics of the implied volatility surface," Papers 2012.10875, arXiv.org.
    3. Jian'an Zhang, 2025. "Tail-Safe Stochastic-Control SPX-VIX Hedging: A White-Box Bridge Between AI Sensitivities and Arbitrage-Free Market Dynamics," Papers 2510.15937, arXiv.org.
    4. Michael R. Tehranchi, 2020. "A Black–Scholes inequality: applications and generalisations," Finance and Stochastics, Springer, vol. 24(1), pages 1-38, January.
    5. Itkin, Andrey, 2015. "To sigmoid-based functional description of the volatility smile," The North American Journal of Economics and Finance, Elsevier, vol. 31(C), pages 264-291.
    6. Jian'an Zhang, 2025. "A Risk-Neutral Neural Operator for Arbitrage-Free SPX-VIX Term Structures," Papers 2511.06451, arXiv.org.
    7. Stefano De Marco, 2020. "On the harmonic mean representation of the implied volatility," Papers 2007.03585, arXiv.org.
    8. Boswijk, H. Peter & Laeven, Roger J.A. & Vladimirov, Evgenii, 2024. "Estimating option pricing models using a characteristic function-based linear state space representation," Journal of Econometrics, Elsevier, vol. 244(1).
    9. Huang, Ruchen & He, Hongwen & Gao, Miaojue, 2023. "Training-efficient and cost-optimal energy management for fuel cell hybrid electric bus based on a novel distributed deep reinforcement learning framework," Applied Energy, Elsevier, vol. 346(C).
    10. Sergey Badikov & Mark H. A. Davis & Antoine Jacquier, 2018. "Perturbation analysis of sub/super hedging problems," Papers 1806.03543, arXiv.org, revised May 2021.
    11. Virmani, Vineet, 2014. "Model Risk in Pricing Path-dependent Derivatives: An Illustration," IIMA Working Papers WP2014-03-22, Indian Institute of Management Ahmedabad, Research and Publication Department.
    12. Boute, Robert N. & Gijsbrechts, Joren & van Jaarsveld, Willem & Vanvuchelen, Nathalie, 2022. "Deep reinforcement learning for inventory control: A roadmap," European Journal of Operational Research, Elsevier, vol. 298(2), pages 401-412.
    13. Ying Jiao & Chunhua Ma & Simone Scotti & Chao Zhou, 2018. "The Alpha-Heston Stochastic Volatility Model," Papers 1812.01914, arXiv.org.
    14. Emmanuel Gnabeyeu & Omar Karkar & Imad Idboufous, 2024. "Solving The Dynamic Volatility Fitting Problem: A Deep Reinforcement Learning Approach," Papers 2410.11789, arXiv.org.
    15. Jing Wang & Shuaiqiang Liu & Cornelis Vuik, 2025. "Controllable Generation of Implied Volatility Surfaces with Variational Autoencoders," Papers 2509.01743, arXiv.org.
    16. Hyun-Gyoon Kim & Hyeongmi Kim & Jeonggyu Huh, 2025. "Considering Appropriate Input Features of Neural Network to Calibrate Option Pricing Models," Computational Economics, Springer;Society for Computational Economics, vol. 66(1), pages 77-104, July.
    17. Stella C. Dong & James R. Finlay, 2025. "Dynamic Reinsurance Treaty Bidding via Multi-Agent Reinforcement Learning," Papers 2506.13113, arXiv.org.
    18. A. Gulisashvili, 2009. "Asymptotic Formulas with Error Estimates for Call Pricing Functions and the Implied Volatility at Extreme Strikes," Papers 0906.0394, arXiv.org.
    19. Ying Jiao & Chunhua Ma & Simone Scotti & Chao Zhou, 2021. "The Alpha‐Heston stochastic volatility model," Mathematical Finance, Wiley Blackwell, vol. 31(3), pages 943-978, July.
    20. Mohammed Ahnouch & Lotfi Elaachak & Erwan Le Saout, 2025. "Domain Knowledge Preservation in Financial Machine Learning: Evidence from Autocallable Note Pricing," Risks, MDPI, vol. 13(7), pages 1-15, July.
