Printed from https://ideas.repec.org/p/hhs/rbnkwp/0372.html

Hamiltonian Monte Carlo with Energy Conserving Subsampling

Author

Listed:
  • Dang, Khue-Dung

    (School of Economics, UNSW Business School, University of New South Wales, ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS))

  • Quiroz, Matias

    (School of Economics, UNSW Business School, University of New South Wales, ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS), Research Division, Sveriges Riksbank)

  • Kohn, Robert

    (School of Economics, UNSW Business School, University of New South Wales, ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS))

  • Tran, Minh-Ngoc

    (ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS), Discipline of Business Analytics, University of Sydney)

  • Villani, Mattias

    (ARC Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS), Division of Statistics and Machine Learning, Linköping University, Department of Statistics, Stockholm University)

Abstract

Hamiltonian Monte Carlo (HMC) samples efficiently from high-dimensional posterior distributions with proposed parameter draws obtained by iterating on a discretized version of the Hamiltonian dynamics. The iterations make HMC computationally costly, especially in problems with large datasets, since it is necessary to compute posterior densities and their derivatives with respect to the parameters. Naively computing the Hamiltonian dynamics on a subset of the data causes HMC to lose its key ability to generate distant parameter proposals with high acceptance probability. The key insight in our article is that efficient subsampling HMC for the parameters is possible if both the dynamics and the acceptance probability are computed from the same data subsample in each complete HMC iteration. We show that this can be done in a principled way in an HMC-within-Gibbs framework where the subsample is updated using a pseudo-marginal MH step and the parameters are then updated using an HMC step, based on the current subsample. We show that our subsampling methods are fast and compare favorably to two popular sampling algorithms that utilize gradient estimates from data subsampling. We also explore the current limitations of subsampling HMC algorithms by varying the quality of the variance-reducing control variates used in the estimators of the posterior density and its gradients.
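
The two-step scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the scalar logistic regression model, the N(0, 10^2) prior, the second-order Taylor control variates around a full-data Newton estimate, and all names (`loglik_terms`, `newton_mle`, `make_estimator`, `hmc_ecs_sketch`) are assumptions made for illustration, and refinements in the paper, such as any bias correction of the likelihood estimator, are omitted. The sketch only shows the structure: the subsample `u` is refreshed by an MH step with the parameter held fixed, and the HMC step then uses that same `u` for both the leapfrog gradients and the accept/reject decision.

```python
import math
import random

PRIOR_VAR = 100.0  # assumed N(0, 10^2) prior on the scalar parameter


def loglik_terms(theta, x, y):
    """Log-likelihood, gradient and Hessian of one logistic observation."""
    s = 1.0 / (1.0 + math.exp(-x * theta))
    loglik = y * x * theta - math.log1p(math.exp(x * theta))
    return loglik, x * (y - s), -x * x * s * (1.0 - s)


def newton_mle(xs, ys, iters=15):
    """Full-data Newton iterations; used only as the control-variate expansion point."""
    theta = 0.0
    for _ in range(iters):
        g = h = 0.0
        for x, y in zip(xs, ys):
            _, gi, hi = loglik_terms(theta, x, y)
            g, h = g + gi, h + hi
        theta -= g / h
    return theta


def make_estimator(xs, ys, theta_star):
    """Build a subsample estimator of the log posterior and its gradient."""
    terms = [loglik_terms(theta_star, x, y) for x, y in zip(xs, ys)]
    a = sum(t[0] for t in terms)  # full-data sums of the Taylor coefficients
    b = sum(t[1] for t in terms)
    c = sum(t[2] for t in terms)
    n = len(xs)

    def estimate(theta, u):
        d = theta - theta_star
        val = a + b * d + 0.5 * c * d * d - theta * theta / (2.0 * PRIOR_VAR)
        grad = b + c * d - theta / PRIOR_VAR
        w = n / len(u)
        for i in u:  # correct the Taylor surrogate using the subsample only
            l, g, _ = loglik_terms(theta, xs[i], ys[i])
            l0, g0, h0 = terms[i]
            val += w * (l - (l0 + g0 * d + 0.5 * h0 * d * d))
            grad += w * (g - (g0 + h0 * d))
        return val, grad

    return estimate


def hmc_ecs_sketch(estimate, n, m, theta0, n_iter, eps, n_leap, rng):
    """HMC-within-Gibbs: MH update of the subsample, then HMC given the subsample."""
    theta = theta0
    u = [rng.randrange(n) for _ in range(m)]
    draws = []
    for _ in range(n_iter):
        # Step 1: propose fresh indices from their prior; accept on the estimate ratio.
        u_new = [rng.randrange(n) for _ in range(m)]
        if math.log(1.0 - rng.random()) < estimate(theta, u_new)[0] - estimate(theta, u)[0]:
            u = u_new
        # Step 2: leapfrog dynamics and acceptance both use the same subsample u.
        p0 = rng.gauss(0.0, 1.0)
        val0, grad = estimate(theta, u)
        th, p = theta, p0 + 0.5 * eps * grad
        for step in range(n_leap):
            th += eps * p
            val, grad = estimate(th, u)
            p += (eps if step < n_leap - 1 else 0.5 * eps) * grad
        if math.log(1.0 - rng.random()) < (val - 0.5 * p * p) - (val0 - 0.5 * p0 * p0):
            theta = th
        draws.append(theta)
    return draws


# Simulated data with true parameter 1 (illustrative settings, not from the paper).
rng = random.Random(0)
n, m = 2000, 50
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
ys = [1 if rng.random() < 1.0 / (1.0 + math.exp(-x)) else 0 for x in xs]
theta_star = newton_mle(xs, ys)
draws = hmc_ecs_sketch(make_estimator(xs, ys, theta_star), n, m, theta_star, 300, 0.02, 5, rng)
print("posterior mean estimate:", sum(draws) / len(draws), "expansion point:", theta_star)
```

Because the energy difference in Step 2 is evaluated with the same subsample that drove the dynamics, the discretization error is the only source of rejection in that step, which is the property the abstract refers to. Without the control variates the subsample estimate would be far noisier in this toy setting, and both steps would be expected to accept much less often.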

Suggested Citation

  • Dang, Khue-Dung & Quiroz, Matias & Kohn, Robert & Tran, Minh-Ngoc & Villani, Mattias, 2019. "Hamiltonian Monte Carlo with Energy Conserving Subsampling," Working Paper Series 372, Sveriges Riksbank (Central Bank of Sweden).
  • Handle: RePEc:hhs:rbnkwp:0372

    Download full text from publisher

    File URL: https://www.riksbank.se/globalassets/media/rapporter/working-papers/2019/wp372.pdf
    File Function: Full text
    Download Restriction: no

    References listed on IDEAS

    1. David M. Blei & Alp Kucukelbir & Jon D. McAuliffe, 2017. "Variational Inference: A Review for Statisticians," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 112(518), pages 859-877, April.
    2. P. Baldi & P. Sadowski & D. Whiteson, 2014. "Searching for exotic particles in high-energy physics with deep learning," Nature Communications, Nature, vol. 5(1), pages 1-9, September.
    3. Matias Quiroz & Mattias Villani & Robert Kohn & Minh-Ngoc Tran & Khue-Dung Dang, 2018. "Subsampling MCMC - an Introduction for the Survey Statistician," Sankhya A: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 80(1), pages 33-69, December.
    4. Giordani, Paolo & Jacobson, Tor & Schedvin, Erik von & Villani, Mattias, 2014. "Taking the Twists into Account: Predicting Firm Bankruptcy Risk with Splines of Financial Ratios," Journal of Financial and Quantitative Analysis, Cambridge University Press, vol. 49(4), pages 1071-1099, August.
    5. Håvard Rue & Sara Martino & Nicolas Chopin, 2009. "Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 71(2), pages 319-392, April.
    6. G. O. Roberts & O. Stramer, 2002. "Langevin Diffusions and Metropolis-Hastings Algorithms," Methodology and Computing in Applied Probability, Springer, vol. 4(4), pages 337-357, December.
    7. Christopher Nemeth & Chris Sherlock & Paul Fearnhead, 2016. "Particle Metropolis-adjusted Langevin algorithms," Biometrika, Biometrika Trust, vol. 103(3), pages 701-717.
    8. Gareth O. Roberts & Jeffrey S. Rosenthal, 1998. "Optimal scaling of discrete approximations to Langevin diffusions," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 60(1), pages 255-268.
    9. Pitt, Michael K. & Silva, Ralph dos Santos & Giordani, Paolo & Kohn, Robert, 2012. "On some properties of Markov chain Monte Carlo simulation methods based on the particle filter," Journal of Econometrics, Elsevier, vol. 171(2), pages 134-151.
    10. Mark Girolami & Ben Calderhead, 2011. "Riemann manifold Langevin and Hamiltonian Monte Carlo methods," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 73(2), pages 123-214, March.
    11. George Deligiannidis & Arnaud Doucet & Michael K. Pitt, 2018. "The correlated pseudomarginal method," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 80(5), pages 839-870, November.
    12. A. Doucet & M. K. Pitt & G. Deligiannidis & R. Kohn, 2015. "Efficient implementation of Markov chain Monte Carlo when using an unbiased likelihood estimator," Biometrika, Biometrika Trust, vol. 102(2), pages 295-313.

    Citations


    Cited by:

    1. Szymon Sacher & Laura Battaglia & Stephen Hansen, 2021. "Hamiltonian Monte Carlo for Regression with High-Dimensional Categorical Data," Papers 2107.08112, arXiv.org, revised Feb 2024.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Gael M. Martin & David T. Frazier & Christian P. Robert, 2020. "Computing Bayes: Bayesian Computation from 1763 to the 21st Century," Monash Econometrics and Business Statistics Working Papers 14/20, Monash University, Department of Econometrics and Business Statistics.
    2. Gael M. Martin & David T. Frazier & Christian P. Robert, 2022. "Computing Bayes: From Then `Til Now," Monash Econometrics and Business Statistics Working Papers 14/22, Monash University, Department of Econometrics and Business Statistics.
    3. Gael M. Martin & David T. Frazier & Ruben Loaiza-Maya & Florian Huber & Gary Koop & John Maheu & Didier Nibbering & Anastasios Panagiotelis, 2023. "Bayesian Forecasting in the 21st Century: A Modern Review," Monash Econometrics and Business Statistics Working Papers 1/23, Monash University, Department of Econometrics and Business Statistics.
    4. Matias Quiroz & Mattias Villani & Robert Kohn & Minh-Ngoc Tran & Khue-Dung Dang, 2018. "Subsampling MCMC - an Introduction for the Survey Statistician," Sankhya A: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 80(1), pages 33-69, December.
    5. Gael M. Martin & David T. Frazier & Worapree Maneesoonthorn & Ruben Loaiza-Maya & Florian Huber & Gary Koop & John Maheu & Didier Nibbering & Anastasios Panagiotelis, 2022. "Bayesian Forecasting in Economics and Finance: A Modern Review," Papers 2212.03471, arXiv.org, revised Jul 2023.
    6. Gael M. Martin & David T. Frazier & Christian P. Robert, 2021. "Approximating Bayes in the 21st Century," Monash Econometrics and Business Statistics Working Papers 24/21, Monash University, Department of Econometrics and Business Statistics.
    7. Matti Vihola & Jouni Helske & Jordan Franks, 2020. "Importance sampling type estimators based on approximate marginal Markov chain Monte Carlo," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 47(4), pages 1339-1376, December.
    8. Matias Quiroz & Robert Kohn & Mattias Villani & Minh-Ngoc Tran, 2019. "Speeding Up MCMC by Efficient Data Subsampling," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 114(526), pages 831-843, April.
    9. Ruben Loaiza-Maya & Didier Nibbering & Dan Zhu, 2023. "Hybrid unadjusted Langevin methods for high-dimensional latent variable models," Papers 2306.14445, arXiv.org.
    10. Golightly, Andrew & Bradley, Emma & Lowe, Tom & Gillespie, Colin S., 2019. "Correlated pseudo-marginal schemes for time-discretised stochastic kinetic models," Computational Statistics & Data Analysis, Elsevier, vol. 136(C), pages 92-107.
    11. Wiqvist, Samuel & Golightly, Andrew & McLean, Ashleigh T. & Picchini, Umberto, 2021. "Efficient inference for stochastic differential equation mixed-effects models using correlated particle pseudo-marginal algorithms," Computational Statistics & Data Analysis, Elsevier, vol. 157(C).
    12. Gunawan, David & Dang, Khue-Dung & Quiroz, Matias & Kohn, Robert & Tran, Minh-Ngoc, 2019. "Subsampling Sequential Monte Carlo for Static Bayesian Models," Working Paper Series 371, Sveriges Riksbank (Central Bank of Sweden).
    13. Panayotis Michaelides & Mike Tsionas & Panos Xidonas, 2020. "A Bayesian Signals Approach for the Detection of Crises," Journal of Quantitative Economics, Springer;The Indian Econometric Society (TIES), vol. 18(3), pages 551-585, September.
    14. Tsionas, Mike G. & Izzeldin, Marwan & Trapani, Lorenzo, 2022. "Estimation of large dimensional time varying VARs using copulas," European Economic Review, Elsevier, vol. 141(C).
    15. Mike Tsionas & Marwan Izzeldin & Lorenzo Trapani, 2019. "Bayesian estimation of large dimensional time varying VARs using copulas," Papers 1912.12527, arXiv.org.
    16. Bédard, Mylène, 2017. "Hierarchical models: Local proposal variances for RWM-within-Gibbs and MALA-within-Gibbs," Computational Statistics & Data Analysis, Elsevier, vol. 109(C), pages 231-246.
    17. Lux, Thomas, 2020. "Bayesian estimation of agent-based models via adaptive particle Markov chain Monte Carlo," Economics Working Papers 2020-01, Christian-Albrechts-University of Kiel, Department of Economics.
    18. Mamatzakis, Emmanuel C. & Tsionas, Mike G., 2021. "Making inference of British household's happiness efficiency: A Bayesian latent model," European Journal of Operational Research, Elsevier, vol. 294(1), pages 312-326.
    19. Delis, Manthos D. & Tsionas, Mike G., 2018. "Measuring management practices," International Journal of Production Economics, Elsevier, vol. 199(C), pages 65-77.
    20. Dalalyan, Arnak S. & Karagulyan, Avetik, 2019. "User-friendly guarantees for the Langevin Monte Carlo with inaccurate gradient," Stochastic Processes and their Applications, Elsevier, vol. 129(12), pages 5278-5311.

    More about this item

    Keywords

    Large datasets; Bayesian inference; Stochastic gradient;

    JEL classification:

    • C11 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Bayesian Analysis: General
    • C15 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Statistical Simulation Methods: General
    • C55 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Large Data Sets: Modeling and Analysis
