
Stochastic Gradient Hamiltonian Monte Carlo for non-convex learning

Author

Listed:
  • Chau, Huy N.
  • Rásonyi, Miklós

Abstract

Stochastic Gradient Hamiltonian Monte Carlo (SGHMC) is a momentum version of stochastic gradient descent with properly injected Gaussian noise, used to find a global minimum. In this paper, a non-asymptotic convergence analysis of SGHMC is given in the context of non-convex optimization, where subsampling techniques are used over an i.i.d. dataset for gradient updates. In contrast to Raginsky et al. (2017) and Gao et al. (2021), our results are sharper in terms of step size and variance, and are independent of the number of iterations.
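
For orientation, the recursion described in the abstract (momentum, a subsampled gradient, and injected Gaussian noise) can be sketched as below. This is a minimal illustration and not the authors' code: it assumes a simple Euler-type discretization of underdamped Langevin dynamics, and the names `sghmc`, `grad_fn`, `friction`, `beta` and the toy gradient `double_well_grad` are hypothetical choices made for this example.

```python
import numpy as np


def sghmc(grad_fn, data, theta0, step=1e-2, friction=1.0, beta=1e3,
          batch_size=32, n_iter=2000, seed=0):
    """Sketch of an SGHMC-style recursion (assumed Euler-type scheme).

    grad_fn(theta, batch) must return a stochastic gradient estimate
    with the same shape as theta. All parameter names are illustrative.
    """
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)   # position (model parameters)
    v = np.zeros_like(theta)                # momentum / velocity
    # Standard deviation of the Gaussian noise injected at each step.
    noise_scale = np.sqrt(2.0 * friction * step / beta)
    n = len(data)
    for _ in range(n_iter):
        # Subsample a minibatch (with replacement) from the dataset.
        idx = rng.integers(0, n, size=batch_size)
        g = grad_fn(theta, data[idx])
        # Velocity update: friction, minibatch gradient, injected noise.
        v_new = (v - step * (friction * v + g)
                 + noise_scale * rng.standard_normal(theta.shape))
        # Position update uses the velocity from before this step.
        theta = theta + step * v
        v = v_new
    return theta


def double_well_grad(theta, batch):
    """Gradient of a hypothetical non-convex loss
    u(theta, x) = theta^4 / 4 - theta^2 / 2 - 0.1 * x * theta,
    averaged over the minibatch."""
    return theta**3 - theta - 0.1 * batch.mean(axis=0)
```

In this sketch a larger `beta` (inverse temperature) means less injected noise, so the iterates concentrate more tightly around minimizers, while `step` and `friction` control the discretization and damping. A call such as `sghmc(double_well_grad, data, np.array([-1.5]))`, with `data` an array of shape `(n, 1)`, runs the chain on the toy double-well loss.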

Suggested Citation

  • Chau, Huy N. & Rásonyi, Miklós, 2022. "Stochastic Gradient Hamiltonian Monte Carlo for non-convex learning," Stochastic Processes and their Applications, Elsevier, vol. 149(C), pages 341-368.
  • Handle: RePEc:eee:spapps:v:149:y:2022:i:c:p:341-368
    DOI: 10.1016/j.spa.2022.04.001

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0304414922000825
    Download Restriction: Full text for ScienceDirect subscribers only


    References listed on IDEAS

    1. Dalalyan, Arnak S. & Karagulyan, Avetik, 2019. "User-friendly guarantees for the Langevin Monte Carlo with inaccurate gradient," Stochastic Processes and their Applications, Elsevier, vol. 129(12), pages 5278-5311.
    2. Arnak S. Dalalyan, 2017. "Theoretical guarantees for approximate sampling from smooth and log-concave densities," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(3), pages 651-676, June.
    3. Arnak Dalalyan, 2017. "Further and stronger analogy between sampling and optimization: Langevin Monte Carlo and gradient descent," Working Papers 2017-21, Center for Research in Economics and Statistics.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Dalalyan, Arnak S. & Karagulyan, Avetik, 2019. "User-friendly guarantees for the Langevin Monte Carlo with inaccurate gradient," Stochastic Processes and their Applications, Elsevier, vol. 129(12), pages 5278-5311.
    2. Yang, Jun & Roberts, Gareth O. & Rosenthal, Jeffrey S., 2020. "Optimal scaling of random-walk metropolis algorithms on general target distributions," Stochastic Processes and their Applications, Elsevier, vol. 130(10), pages 6094-6132.
    3. Crespo, Marelys & Gadat, Sébastien & Gendre, Xavier, 2023. "Stochastic Langevin Monte Carlo for (weakly) log-concave posterior distributions," TSE Working Papers 23-1398, Toulouse School of Economics (TSE).
    4. Vincent Lemaire & Gilles Pagès & Christian Yeo, 2023. "Swing contract pricing: with and without Neural Networks," Papers 2306.03822, arXiv.org, revised Mar 2024.
    5. Tengyuan Liang & Weijie J. Su, 2019. "Statistical inference for the population landscape via moment‐adjusted stochastic gradients," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 81(2), pages 431-456, April.
    6. Florian Maire & Nial Friel & Pierre ALQUIER, 2017. "Informed Sub-Sampling MCMC: Approximate Bayesian Inference for Large Datasets," Working Papers 2017-40, Center for Research in Economics and Statistics.
    7. Menz, Georg & Schlichting, André & Tang, Wenpin & Wu, Tianqi, 2022. "Ergodicity of the infinite swapping algorithm at low temperature," Stochastic Processes and their Applications, Elsevier, vol. 151(C), pages 519-552.
    8. Arnak Dalalyan, 2017. "Further and stronger analogy between sampling and optimization: Langevin Monte Carlo and gradient descent," Working Papers 2017-21, Center for Research in Economics and Statistics.
    9. Ruben Loaiza-Maya & Didier Nibbering & Dan Zhu, 2023. "Hybrid unadjusted Langevin methods for high-dimensional latent variable models," Papers 2306.14445, arXiv.org.
    10. Denis Belomestny & Leonid Iosipoi, 2019. "Fourier transform MCMC, heavy tailed distributions and geometric ergodicity," Papers 1909.00698, arXiv.org, revised Dec 2019.
    11. Villeneuve, Stéphane & Bolte, Jérôme & Miclo, Laurent, 2022. "Swarm gradient dynamics for global optimization: the mean-field limit case," TSE Working Papers 22-1302, Toulouse School of Economics (TSE).
    12. Belomestny, Denis & Iosipoi, Leonid, 2021. "Fourier transform MCMC, heavy-tailed distributions, and geometric ergodicity," Mathematics and Computers in Simulation (MATCOM), Elsevier, vol. 181(C), pages 351-363.
    13. Samuel Livingstone & Giacomo Zanella, 2022. "The Barker proposal: Combining robustness and efficiency in gradient‐based MCMC," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(2), pages 496-523, April.
    14. Murray Pollock & Paul Fearnhead & Adam M. Johansen & Gareth O. Roberts, 2020. "Quasi‐stationary Monte Carlo and the ScaLE algorithm," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 82(5), pages 1167-1221, December.
    15. M. Barkhagen & S. García & J. Gondzio & J. Kalcsics & J. Kroeske & S. Sabanis & A. Staal, 2023. "Optimising portfolio diversification and dimensionality," Journal of Global Optimization, Springer, vol. 85(1), pages 185-234, January.
    16. Brosse, Nicolas & Durmus, Alain & Moulines, Éric & Sabanis, Sotirios, 2019. "The tamed unadjusted Langevin algorithm," Stochastic Processes and their Applications, Elsevier, vol. 129(10), pages 3638-3663.
    17. Sotirios Sabanis & Ying Zhang, 2020. "A fully data-driven approach to minimizing CVaR for portfolio of assets via SGLD with discontinuous updating," Papers 2007.01672, arXiv.org.
    18. Tung Duy Luu & Jalal Fadili & Christophe Chesneau, 2021. "Sampling from Non-smooth Distributions Through Langevin Diffusion," Methodology and Computing in Applied Probability, Springer, vol. 23(4), pages 1173-1201, December.
    19. Gadat, Sébastien & Panloup, Fabien & Pellegrini, C., 2020. "On the cost of Bayesian posterior mean strategy for log-concave models," TSE Working Papers 20-1155, Toulouse School of Economics (TSE), revised Feb 2022.
