Printed from https://ideas.repec.org/a/eee/spapps/v190y2025ics0304414925002091.html

Mirror descent for stochastic control problems with measure-valued controls

Author

Listed:
  • Kerimkulov, Bekzhan
  • Šiška, David
  • Szpruch, Łukasz
  • Zhang, Yufei

Abstract

This paper studies the convergence of the mirror descent algorithm for finite horizon stochastic control problems with measure-valued control processes. The control objective involves a convex regularisation function, denoted as h, with regularisation strength determined by the weight τ≥0. The setting covers regularised relaxed control problems. Under suitable conditions, we establish the relative smoothness and convexity of the control objective with respect to the Bregman divergence of h, and prove linear convergence of the algorithm for τ=0 and exponential convergence for τ>0. The results apply to common regularisers including relative entropy, χ2-divergence, and entropic Wasserstein costs. This validates recent reinforcement learning heuristics that adding regularisation accelerates the convergence of gradient methods. The proof exploits careful regularity estimates of backward stochastic differential equations in the bounded mean oscillation norm.
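The mirror descent scheme described in the abstract can be illustrated with a minimal finite-dimensional sketch: entropic mirror descent (exponentiated gradient) on the probability simplex, minimising a linear cost plus an entropy regulariser of weight τ. Everything below — the function name, cost vector, step size, and iteration count — is an illustrative assumption, not taken from the paper, whose measure-valued, BSDE-based setting is far more general.

```python
import numpy as np

def mirror_descent(c, tau=0.1, step=0.5, iters=200):
    """Entropic mirror descent minimising f(p) = <c, p> + tau * sum(p log p)
    over the probability simplex. Purely illustrative; not the paper's setting."""
    n = len(c)
    p = np.full(n, 1.0 / n)                  # uniform initialisation
    for _ in range(iters):
        grad = c + tau * (np.log(p) + 1.0)   # gradient of the regularised objective
        logits = np.log(p) - step * grad     # mirror (exponentiated-gradient) step
        logits -= logits.max()               # numerical stabilisation
        p = np.exp(logits)
        p /= p.sum()                         # Bregman projection back onto the simplex
    return p

c = np.array([1.0, 0.3, 0.7])
p = mirror_descent(c, tau=0.1)               # converges towards softmax(-c / tau)
```

For τ > 0 the log-space iteration contracts by a factor (1 − step·τ) per step towards the Gibbs distribution proportional to exp(−c/τ), echoing in this toy setting the exponential convergence the authors prove for regularised objectives; setting τ = 0 recovers plain multiplicative-weights descent with its slower rate.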

Suggested Citation

  • Kerimkulov, Bekzhan & Šiška, David & Szpruch, Łukasz & Zhang, Yufei, 2025. "Mirror descent for stochastic control problems with measure-valued controls," Stochastic Processes and their Applications, Elsevier, vol. 190(C).
  • Handle: RePEc:eee:spapps:v:190:y:2025:i:c:s0304414925002091
    DOI: 10.1016/j.spa.2025.104765

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0304414925002091
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.spa.2025.104765?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    As access to this document is restricted, you may want to look for a different version of it.

    References listed on IDEAS

    1. Haihao Lu & Robert M. Freund & Yurii Nesterov, 2018. "Relatively smooth convex optimization by first-order methods, and applications," LIDAM Reprints CORE 2965, Université catholique de Louvain, Center for Operations Research and Econometrics (CORE).
    2. Yanwei Jia & Xun Yu Zhou, 2021. "Policy Gradient and Actor-Critic Learning in Continuous Time and Space: Theory and Algorithms," Papers 2111.11232, arXiv.org, revised Jul 2022.
    3. Heinz H. Bauschke & Jérôme Bolte & Marc Teboulle, 2017. "A Descent Lemma Beyond Lipschitz Gradient Continuity: First-Order Methods Revisited and Applications," Mathematics of Operations Research, INFORMS, vol. 42(2), pages 330-348, May.
    4. Justin Sirignano & Konstantinos Spiliopoulos, 2017. "DGM: A deep learning algorithm for solving partial differential equations," Papers 1708.07469, arXiv.org, revised Sep 2018.
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Zamani, Moslem & Abbaszadehpeivasti, Hadi & de Klerk, Etienne, 2024. "The exact worst-case convergence rate of the alternating direction method of multipliers," Other publications TiSEM f30ae9e6-ed19-423f-bd1e-0, Tilburg University, School of Economics and Management.
    2. Masoud Ahookhosh, 2019. "Accelerated first-order methods for large-scale convex optimization: nearly optimal complexity under strong convexity," Mathematical Methods of Operations Research, Springer;Gesellschaft für Operations Research (GOR);Nederlands Genootschap voor Besliskunde (NGB), vol. 89(3), pages 319-353, June.
    3. Masoud Ahookhosh & Le Thi Khanh Hien & Nicolas Gillis & Panagiotis Patrinos, 2021. "A Block Inertial Bregman Proximal Algorithm for Nonsmooth Nonconvex Problems with Application to Symmetric Nonnegative Matrix Tri-Factorization," Journal of Optimization Theory and Applications, Springer, vol. 190(1), pages 234-258, July.
    4. Shota Takahashi & Akiko Takeda, 2025. "Approximate Bregman proximal gradient algorithm for relatively smooth nonconvex optimization," Computational Optimization and Applications, Springer, vol. 90(1), pages 227-256, January.
    5. Emanuel Laude & Peter Ochs & Daniel Cremers, 2020. "Bregman Proximal Mappings and Bregman–Moreau Envelopes Under Relative Prox-Regularity," Journal of Optimization Theory and Applications, Springer, vol. 184(3), pages 724-761, March.
    6. Yin Liu & Sam Davanloo Tajbakhsh, 2023. "Stochastic Composition Optimization of Functions Without Lipschitz Continuous Gradient," Journal of Optimization Theory and Applications, Springer, vol. 198(1), pages 239-289, July.
    7. Pourya Behmandpoor & Puya Latafat & Andreas Themelis & Marc Moonen & Panagiotis Patrinos, 2024. "SPIRAL: a superlinearly convergent incremental proximal algorithm for nonconvex finite sum minimization," Computational Optimization and Applications, Springer, vol. 88(1), pages 71-106, May.
    8. Radu-Alexandru Dragomir & Alexandre d’Aspremont & Jérôme Bolte, 2021. "Quartic First-Order Methods for Low-Rank Minimization," Journal of Optimization Theory and Applications, Springer, vol. 189(2), pages 341-363, May.
    9. Vincenzo Bonifaci, 2021. "A Laplacian approach to ℓ1-norm minimization," Computational Optimization and Applications, Springer, vol. 79(2), pages 441-469, June.
    10. Masoud Ahookhosh & Le Thi Khanh Hien & Nicolas Gillis & Panagiotis Patrinos, 2021. "Multi-block Bregman proximal alternating linearized minimization and its application to orthogonal nonnegative matrix factorization," Computational Optimization and Applications, Springer, vol. 79(3), pages 681-715, July.
    11. Kuangyu Ding & Kim-Chuan Toh, 2025. "Stochastic Bregman Subgradient Methods for Nonsmooth Nonconvex Optimization Problems," Journal of Optimization Theory and Applications, Springer, vol. 206(3), pages 1-36, September.
    12. Alkousa, Mohammad & Stonyakin, Fedor & Gasnikov, Alexander & Abdo, Asmaa & Alcheikh, Mohammad, 2024. "Higher degree inexact model for optimization problems," Chaos, Solitons & Fractals, Elsevier, vol. 186(C).
    13. Abbaszadehpeivasti, Hadi, 2024. "Performance analysis of optimization methods for machine learning," Other publications TiSEM 3050a62d-1a1f-494e-99ef-7, Tilburg University, School of Economics and Management.
    14. Zehui Liu & Qingsong Wang & Chunfeng Cui & Yong Xia, 2025. "Inertial accelerated stochastic mirror descent for large-scale generalized tensor CP decomposition," Computational Optimization and Applications, Springer, vol. 91(1), pages 201-233, May.
    15. Ziyuan Wang & Andreas Themelis & Hongjia Ou & Xianfu Wang, 2024. "A Mirror Inertial Forward–Reflected–Backward Splitting: Convergence Analysis Beyond Convexity and Lipschitz Smoothness," Journal of Optimization Theory and Applications, Springer, vol. 203(2), pages 1127-1159, November.
    16. Xin Jiang & Lieven Vandenberghe, 2023. "Bregman Three-Operator Splitting Methods," Journal of Optimization Theory and Applications, Springer, vol. 196(3), pages 936-972, March.
    17. Junyan Ye & Hoi Ying Wong & Kyunghyun Park, 2025. "Robust Exploratory Stopping under Ambiguity in Reinforcement Learning," Papers 2510.10260, arXiv.org, revised Apr 2026.
    18. Filip Hanzely & Peter Richtárik & Lin Xiao, 2021. "Accelerated Bregman proximal gradient methods for relatively smooth convex optimization," Computational Optimization and Applications, Springer, vol. 79(2), pages 405-440, June.
    19. Zhou Fang, 2023. "Continuous-Time Path-Dependent Exploratory Mean-Variance Portfolio Construction," Papers 2303.02298, arXiv.org.
    20. Matthew K. Tam & Daniel J. Uteda, 2023. "Bregman-Golden Ratio Algorithms for Variational Inequalities," Journal of Optimization Theory and Applications, Springer, vol. 199(3), pages 993-1021, December.


    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:eee:spapps:v:190:y:2025:i:c:s0304414925002091. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to register here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form.

    If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic, or download information, contact: Catherine Liu (email available below). General contact details of provider: http://www.elsevier.com/wps/find/journaldescription.cws_home/505572/description#description .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.