Solving optimal stopping problems with Deep Q-Learning

Authors

John Ery
Loris Michel

Abstract

We propose a reinforcement learning (RL) approach to model optimal exercise strategies for option-type products. We pursue the RL avenue in order to learn the optimal action-value function of the underlying stopping problem. In addition to retrieving the optimal Q-function at any time step, one can also price the contract at inception. We first discuss the standard setting with one exercise right, and later extend this framework to the case of multiple stopping opportunities in the presence of constraints. We propose to approximate the Q-function with a deep neural network, which does not require the specification of basis functions as in the least-squares Monte Carlo framework and is scalable to higher dimensions. We derive a lower bound on the option price obtained from the trained neural network and an upper bound from the dual formulation of the stopping problem, which can also be expressed in terms of the Q-function. Our methodology is illustrated with examples covering the pricing of swing options.
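
The action-value view described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: it replaces the deep neural network with an exact lookup table on a Cox-Ross-Rubinstein binomial tree, so that the recursion is easy to inspect. At each node the Q-function has two entries, the payoff from stopping and the discounted expected value of continuing and acting optimally afterwards, and the price at inception is the larger of the two at the root. The function name `q_table_put` and all parameter values (a put with spot and strike 100, 5% rate, 20% volatility, one year, 50 exercise dates) are assumptions chosen for illustration.

```python
import math


def q_table_put(s0=100.0, strike=100.0, rate=0.05, sigma=0.2,
                maturity=1.0, steps=50):
    """Tabular Q-function of a Bermudan put on a CRR binomial tree.

    Returns q with q[t][j] = (Q_stop, Q_continue), where t is the time
    step and j the number of up-moves taken so far.
    Illustrative stand-in for a learned Q-network, not the paper's model.
    """
    dt = maturity / steps
    up = math.exp(sigma * math.sqrt(dt))
    down = 1.0 / up
    disc = math.exp(-rate * dt)                      # one-step discount factor
    p_up = (math.exp(rate * dt) - down) / (up - down)  # risk-neutral up probability

    def payoff(t, j):
        # Exercise value of the put at node (t, j).
        return max(strike - s0 * up ** j * down ** (t - j), 0.0)

    q = [None] * (steps + 1)
    # At maturity there is nothing left to wait for: continuing is worth 0.
    q[steps] = [(payoff(steps, j), 0.0) for j in range(steps + 1)]

    # Backward recursion: Q_continue is the discounted expectation of the
    # best action value at the two successor nodes.
    for t in range(steps - 1, -1, -1):
        row = []
        for j in range(t + 1):
            best_up = max(q[t + 1][j + 1])
            best_down = max(q[t + 1][j])
            q_continue = disc * (p_up * best_up + (1.0 - p_up) * best_down)
            row.append((payoff(t, j), q_continue))
        q[t] = row
    return q


q = q_table_put()
price = max(q[0][0])  # value at inception = best action value at the root
print(f"price at inception: {price:.4f}")
```

Under these assumptions, the greedy rule "stop at the first node where Q_stop is at least Q_continue" recovers an exercise strategy from the table. In the approach the abstract describes, a neural network would take the place of the table, which is what removes the need for a tree or hand-picked basis functions in higher dimensions.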

Suggested Citation

John Ery & Loris Michel, 2021. "Solving optimal stopping problems with Deep Q-Learning," Papers 2101.09682, arXiv.org.

Handle: RePEc:arx:papers:2101.09682

References listed on IDEAS

Longstaff, Francis A & Schwartz, Eduardo S, 2001. "Valuing American Options by Simulation: A Simple Least-Squares Approach," The Review of Financial Studies, Society for Financial Studies, vol. 14(1), pages 113-147.
Christian Bender, 2011. "Dual pricing of multi-exercise options under volume constraints," Finance and Stochastics, Springer, vol. 15(1), pages 1-26, January.
John Schoenmakers, 2012. "A pure martingale dual for multiple stopping," Finance and Stochastics, Springer, vol. 16(2), pages 319-334, April.
Christian Bender & John Schoenmakers & Jianing Zhang, 2015. "Dual Representations For General Multiple Stopping Problems," Mathematical Finance, Wiley Blackwell, vol. 25(2), pages 339-370, April.
Sebastian Becker & Patrick Cheridito & Arnulf Jentzen, 2020. "Pricing and Hedging American-Style Options with Deep Learning," JRFM, MDPI, vol. 13(7), pages 1-12, July.
Sebastian Becker & Patrick Cheridito & Arnulf Jentzen, 2019. "Pricing and hedging American-style options with deep learning," Papers 1912.11060, arXiv.org, revised Jul 2020.
Roberto Daluiso & Emanuele Nastasi & Andrea Pallavicini & Giulio Sartorelli, 2020. "Pricing commodity swing options," Papers 2001.08906, arXiv.org.
N. Meinshausen & B. M. Hambly, 2004. "Monte Carlo Methods For The Valuation Of Multiple‐Exercise Options," Mathematical Finance, Wiley Blackwell, vol. 14(4), pages 557-583, October.
Olivier Bardou & Sandrine Bouthemy & Gilles Pages, 2009. "Optimal Quantization for the Pricing of Swing Options," Applied Mathematical Finance, Taylor & Francis Journals, vol. 16(2), pages 183-217.
Longstaff, Francis A & Schwartz, Eduardo S, 2001. "Valuing American Options by Simulation: A Simple Least-Squares Approach," University of California at Los Angeles, Anderson Graduate School of Management qt43n1k4jb, Anderson Graduate School of Management, UCLA.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Nicolas Essis-Breton & Patrice Gaillardetz, 2020. "Fast Lower and Upper Estimates for the Price of Constrained Multiple Exercise American Options by Single Pass Lookahead Search and Nearest-Neighbor Martingale," Papers 2002.11258, arXiv.org.
Dong, Wenfeng & Kang, Boda, 2019. "Analysis of a multiple year gas sales agreement with make-up, carry-forward and indexation," Energy Economics, Elsevier, vol. 79(C), pages 76-96.
Hainaut, Donatien & Akbaraly, Adnane, 2023. "Risk management with Local Least Squares Monte-Carlo," LIDAM Discussion Papers ISBA 2023003, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
Calypso Herrera & Florian Krach & Pierre Ruyssen & Josef Teichmann, 2021. "Optimal Stopping via Randomized Neural Networks," Papers 2104.13669, arXiv.org, revised Dec 2023.
Ivan Guo & Nicolas Langrené & Jiahao Wu, 2023. "Simultaneous upper and lower bounds of American option prices with hedging via neural networks," Papers 2302.12439, arXiv.org, revised Apr 2024.
Juri Hinz & Jeremy Yee, 2017. "An Algorithmic Approach to Optimal Asset Liquidation Problems," Asia-Pacific Financial Markets, Springer;Japanese Association of Financial Economics and Engineering, vol. 24(2), pages 109-129, June.
J. Lars Kirkby & Shi-Jie Deng, 2019. "Swing Option Pricing By Dynamic Programming With B-Spline Density Projection," International Journal of Theoretical and Applied Finance (IJTAF), World Scientific Publishing Co. Pte. Ltd., vol. 22(08), pages 1-53, December.
Jain, Shashi & Roelofs, Ferry & Oosterlee, Cornelis W., 2013. "Valuing modular nuclear power plants in finite time decision horizon," Energy Economics, Elsevier, vol. 36(C), pages 625-636.
Lukas Gonon, 2022. "Deep neural network expressivity for optimal stopping problems," Papers 2210.10443, arXiv.org.
Chinonso Nwankwo & Nneka Umeorah & Tony Ware & Weizhong Dai, 2022. "Deep learning and American options via free boundary framework," Papers 2211.11803, arXiv.org, revised Dec 2022.
A. Max Reppen & H. Mete Soner & Valentin Tissot-Daguette, 2022. "Neural Optimal Stopping Boundary," Papers 2205.04595, arXiv.org, revised May 2023.
John Schoenmakers, 2012. "A pure martingale dual for multiple stopping," Finance and Stochastics, Springer, vol. 16(2), pages 319-334, April.
R. Mark Reesor & T. James Marshall, 2020. "Forest of Stochastic Trees: A Method for Valuing Multiple Exercise Options," JRFM, MDPI, vol. 13(5), pages 1-31, May.
Secomandi, Nicola & Seppi, Duane J., 2014. "Real Options and Merchant Operations of Energy and Other Commodities," Foundations and Trends(R) in Technology, Information and Operations Management, now publishers, vol. 6(3-4), pages 161-331, July.
Anne Laure Bronstein & Gilles Pagès & Jacques Portès, 2013. "Multi-asset American Options and Parallel Quantization," Methodology and Computing in Applied Probability, Springer, vol. 15(3), pages 547-561, September.
Gilles Pagès & Benedikt Wilbertz, 2011. "GPGPUs in computational finance: Massive parallel computing for American style options," Papers 1101.3228, arXiv.org.
Yi Yang & Jianan Wang & Youhua Chen & Zhiyuan Chen & Yanchu Liu, 2020. "Optimal procurement strategies for contractual assembly systems with fluctuating procurement price," Annals of Operations Research, Springer, vol. 291(1), pages 1027-1059, August.
Pflug, Georg C. & Broussev, Nikola, 2009. "Electricity swing options: Behavioral models and pricing," European Journal of Operational Research, Elsevier, vol. 197(3), pages 1041-1050, September.
Christian Bender & Nikolai Dokuchaev, 2013. "A First-Order BSPDE for Swing Option Pricing," Papers 1305.3988, arXiv.org.
Aleksandrov, Nikolay & Espinoza, Raphael & Gyurkó, Lajos, 2013. "Optimal oil production and the world supply of oil," Journal of Economic Dynamics and Control, Elsevier, vol. 37(7), pages 1248-1263.
- Nikolay Aleksandrov & Raphael Espinoza & Lajos Gyurko, 2012. "Optimal Oil Production and the World Supply of Oil," OxCarre Working Papers 092, Oxford Centre for the Analysis of Resource Rich Economies, University of Oxford.

More about this item

NEP fields

This paper has been announced in the following NEP Reports:

NEP-BIG-2021-02-15 (Big Data)
NEP-CMP-2021-02-15 (Computational Economics)
