Printed from https://ideas.repec.org/p/arx/papers/2408.09335.html

Exploratory Optimal Stopping: A Singular Control Formulation

Author

Listed:
  • Jodi Dianetti
  • Giorgio Ferrari
  • Renyuan Xu

Abstract

This paper explores continuous-time and state-space optimal stopping problems from a reinforcement learning perspective. We begin by formulating the stopping problem using randomized stopping times, where the decision maker's control is represented by the probability of stopping within a given time: specifically, a bounded, non-decreasing, càdlàg control process. To encourage exploration and facilitate learning, we introduce a regularized version of the problem by penalizing it with the cumulative residual entropy of the randomized stopping time. The regularized problem takes the form of an (n+1)-dimensional degenerate singular stochastic control problem with finite fuel. We address this through the dynamic programming principle, which enables us to identify the unique optimal exploratory strategy. For the specific case of a real option problem, we derive a semi-explicit solution to the regularized problem, allowing us to assess the impact of entropy regularization and analyze the vanishing entropy limit. Finally, we propose a reinforcement learning algorithm based on policy iteration. We show both policy improvement and policy convergence results for our proposed algorithm.
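
The entropy penalty mentioned in the abstract can be illustrated with a small numerical sketch. It assumes the standard definition of cumulative residual entropy for a nonnegative random time with survival function S(t), namely the integral of -S(t) log S(t) over t, and it takes S(t) = 1 - xi(t), where xi(t) is the probability of having stopped by time t. The function names, the time grid, the absence of discounting, and the exponential test path are illustrative assumptions and are not taken from the paper.

```python
import math


def cumulative_residual_entropy(xi, dt):
    """Left Riemann-sum approximation of -integral (1 - xi_t) * log(1 - xi_t) dt.

    xi is a sequence of stopping probabilities on a uniform grid with step dt
    (hypothetical discretization; the convention 0 * log 0 = 0 is used).
    """
    total = 0.0
    for p in xi:
        if not 0.0 <= p <= 1.0:
            raise ValueError("stopping probabilities must lie in [0, 1]")
        survival = 1.0 - p
        if survival > 0.0:
            total -= survival * math.log(survival) * dt
    return total


def exponential_stopping_path(rate, horizon, dt):
    """Illustrative control path xi_t = 1 - exp(-rate * t) on [0, horizon)."""
    n = int(round(horizon / dt))
    return [1.0 - math.exp(-rate * k * dt) for k in range(n)]


# Randomized stopping with an exponential clock: the penalty is strictly positive.
xi = exponential_stopping_path(rate=2.0, horizon=20.0, dt=1e-3)
cre = cumulative_residual_entropy(xi, dt=1e-3)

# A non-randomized rule (xi jumps from 0 to 1) carries zero entropy.
cre_deterministic = cumulative_residual_entropy([0.0, 0.0, 1.0, 1.0], dt=0.1)
```

For the exponential path the integral can be computed by hand and equals 1/rate, so `cre` should come out close to 0.5 here, while a control that jumps straight from 0 to 1 gives zero. This is the sense in which such a penalty rewards genuinely randomized stopping rules.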

Suggested Citation

  • Jodi Dianetti & Giorgio Ferrari & Renyuan Xu, 2024. "Exploratory Optimal Stopping: A Singular Control Formulation," Papers 2408.09335, arXiv.org, revised Oct 2024.
  • Handle: RePEc:arx:papers:2408.09335

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2408.09335
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Touzi, N. & Vieille, N., 1999. "Continuous-Time Dynkin Games with Mixed Strategies," Papiers d'Economie Mathématique et Applications 1999.112, Université Panthéon-Sorbonne (Paris 1).
    2. Tiziano De Angelis & Salvatore Federico & Giorgio Ferrari, 2014. "Optimal Boundary Surface for Irreversible Investment with Stochastic Costs," Papers 1406.4297, arXiv.org, revised Jan 2017.
    3. Tiziano De Angelis & Salvatore Federico & Giorgio Ferrari, 2017. "Optimal Boundary Surface for Irreversible Investment with Stochastic Costs," Mathematics of Operations Research, INFORMS, vol. 42(4), pages 1135-1161, November.
    4. Boetius, Frederik & Kohlmann, Michael, 1998. "Connections between optimal stopping and singular stochastic control," Stochastic Processes and their Applications, Elsevier, vol. 77(2), pages 253-281, September.
    5. Ben Hambly & Renyuan Xu & Huining Yang, 2020. "Policy Gradient Methods for the Noisy Linear Quadratic Regulator over a Finite Horizon," Papers 2011.10300, arXiv.org, revised Jun 2021.
    6. Eyal Neuman & Wolfgang Stockinger & Yufei Zhang, 2023. "An Offline Learning Approach to Propagator Models," Papers 2309.02994, arXiv.org.
    7. Sunil Kumar & Kumar Muthuraman, 2004. "A Numerical Method for Solving Singular Stochastic Control Problems," Operations Research, INFORMS, vol. 52(4), pages 563-582, August.
    8. Dianetti, Jodi & Ferrari, Giorgio, 2023. "Multidimensional singular control and related Skorokhod problem: Sufficient conditions for the characterization of optimal controls," Stochastic Processes and their Applications, Elsevier, vol. 162(C), pages 547-592.
    9. Tiziano De Angelis & Giorgio Ferrari & John Moriarty, 2019. "A Solvable Two-Dimensional Degenerate Singular Stochastic Control Problem with Nonconvex Costs," Mathematics of Operations Research, INFORMS, vol. 44(2), pages 512-531, May.
    10. Y.M. Kabanov, 1999. "Hedging and liquidation under transaction costs in currency markets," Finance and Stochastics, Springer, vol. 3(2), pages 237-248.
    11. Giorgio Ferrari, 2012. "On an integral equation for the free-boundary of stochastic, irreversible investment problems," Papers 1211.0412, arXiv.org, revised Jan 2015.
    12. Yanwei Jia & Xun Yu Zhou, 2021. "Policy Gradient and Actor-Critic Learning in Continuous Time and Space: Theory and Algorithms," Papers 2111.11232, arXiv.org, revised Jul 2022.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Giorgio Ferrari & Hanwu Li & Frank Riedel, 2020. "A Knightian Irreversible Investment Problem," Papers 2003.14359, arXiv.org, revised Apr 2020.
    2. Andrea Bovo & Tiziano De Angelis & Jan Palczewski, 2023. "Stopper vs. singular-controller games with degenerate diffusions," Papers 2312.00613, arXiv.org, revised Jul 2024.
    3. Andrea Bovo & Tiziano De Angelis & Jan Palczewski, 2023. "Zero-sum stopper vs. singular-controller games with constrained control directions," Papers 2306.05113, arXiv.org, revised Feb 2024.
    4. de Angelis, Tiziano & Ferrari, Giorgio, 2014. "A Stochastic Reversible Investment Problem on a Finite-Time Horizon: Free Boundary Analysis," Center for Mathematical Economics Working Papers 477, Center for Mathematical Economics, Bielefeld University.
    5. Aïd, René & Basei, Matteo & Ferrari, Giorgio, 2023. "A Stationary Mean-Field Equilibrium Model of Irreversible Investment in a Two-Regime Economy," Center for Mathematical Economics Working Papers 679, Center for Mathematical Economics, Bielefeld University.
    6. René Aïd & Matteo Basei & Giorgio Ferrari, 2023. "A Stationary Mean-Field Equilibrium Model of Irreversible Investment in a Two-Regime Economy," Papers 2305.00541, arXiv.org.
    7. Dianetti, Jodi & Ferrari, Giorgio, 2021. "Multidimensional Singular Control and Related Skorokhod Problem: Sufficient Conditions for the Characterization of Optimal Controls," Center for Mathematical Economics Working Papers 645, Center for Mathematical Economics, Bielefeld University.
    8. Dianetti, Jodi & Ferrari, Giorgio, 2023. "Multidimensional singular control and related Skorokhod problem: Sufficient conditions for the characterization of optimal controls," Stochastic Processes and their Applications, Elsevier, vol. 162(C), pages 547-592.
    9. Salvatore Federico & Mauro Rosestolato & Elisa Tacconi, 2018. "Irreversible investment with fixed adjustment costs: a stochastic impulse control approach," Papers 1801.04491, arXiv.org, revised Feb 2019.
    10. Junkee Jeon & Geonwoo Kim, 2020. "An Integral Equation Approach to the Irreversible Investment Problem with a Finite Horizon," Mathematics, MDPI, vol. 8(11), pages 1-10, November.
    11. Kexin Chen & Kyunghyun Park & Hoi Ying Wong, 2024. "Robust dividend policy: Equivalence of Epstein-Zin and Maenhout preferences," Papers 2406.12305, arXiv.org.
    12. Ferrari, Giorgio & Riedel, Frank & Steg, Jan-Henrik, 2016. "Continuous-Time Public Good Contribution under Uncertainty," Center for Mathematical Economics Working Papers 485, Center for Mathematical Economics, Bielefeld University.
    13. Felix Dammann & Giorgio Ferrari, 2023. "Optimal execution with multiplicative price impact and incomplete information on the return," Finance and Stochastics, Springer, vol. 27(3), pages 713-768, July.
    14. Dammann, Felix & Ferrari, Giorgio, 2022. "Optimal Execution with Multiplicative Price Impact and Incomplete Information on the Return," Center for Mathematical Economics Working Papers 663, Center for Mathematical Economics, Bielefeld University.
    15. De Angelis, Tiziano & Ferrari, Giorgio, 2014. "A stochastic partially reversible investment problem on a finite time-horizon: Free-boundary analysis," Stochastic Processes and their Applications, Elsevier, vol. 124(12), pages 4080-4119.
    16. Dianetti, Jodi, 2023. "Linear-Quadratic-Singular Stochastic Differential Games and Applications," Center for Mathematical Economics Working Papers 678, Center for Mathematical Economics, Bielefeld University.
    17. Christensen, Sören & Crocce, Fabián & Mordecki, Ernesto & Salminen, Paavo, 2019. "On optimal stopping of multidimensional diffusions," Stochastic Processes and their Applications, Elsevier, vol. 129(7), pages 2561-2581.
    18. Dianetti, Jodi & Ferrari, Giorgio, 2019. "Nonzero-Sum Submodular Monotone-Follower Games. Existence and Approximation of Nash Equilibria," Center for Mathematical Economics Working Papers 605, Center for Mathematical Economics, Bielefeld University.
    19. Giorgia Callegaro & Claudia Ceci & Giorgio Ferrari, 2019. "Optimal Reduction of Public Debt under Partial Observation of the Economic Growth," Papers 1901.08356, arXiv.org, revised Jan 2019.
    20. Peter Bank & Yan Dolinsky, 2018. "Continuous-time Duality for Super-replication with Transient Price Impact," Papers 1808.09807, arXiv.org, revised May 2019.
