Printed from https://ideas.repec.org/a/eee/csdana/v105y2017icp59-75.html

Fitting large-scale structured additive regression models using Krylov subspace methods

Author

Listed:
  • Schmidt, Paul
  • Mühlau, Mark
  • Schmid, Volker

Abstract

Fitting regression models can be challenging when the vector of regression coefficients is high-dimensional. Especially when large spatial or temporal effects need to be taken into account, the computational limits of ordinary workstations are reached quickly. The analysis of images with several million pixels, where each pixel value can be seen as an observation at a new spatial location, represents such a situation. A Markov chain Monte Carlo (MCMC) framework for the applied statistician is presented that allows models with millions of parameters to be fitted with only low to moderate computational requirements. The method combines a modified sampling scheme with recent advances in iterative methods for sparse linear systems. In this way, a solution is given that eliminates potential computational burdens such as calculating the log-determinant of massive precision matrices and sampling from high-dimensional Gaussian distributions. In an extensive simulation study with models of moderate size, it is shown that this approach gives results that are in perfect agreement with state-of-the-art methods for fitting structured additive regression models. Furthermore, the method is applied to two real-world examples from the field of medical imaging.
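
To make the idea of pairing Gaussian sampling with an iterative sparse solver concrete, here is a minimal sketch. It is not taken from the article and is not necessarily the authors' scheme. It assumes a toy precision matrix Q = tau*I + kappa*D^T D, where D is the first-difference matrix of a one-dimensional random walk prior. The function names (apply_Q, conjugate_gradient, sample_gaussian) and the values of tau and kappa are hypothetical. A draw from N(Q^{-1} b, Q^{-1}) is obtained by perturbing the right-hand side with suitably scaled noise and solving Q x = c with the conjugate gradient method, a standard Krylov subspace solver, so that no Cholesky factor of Q is ever formed.

```python
import math
import random


def apply_Q(x, tau, kappa):
    """Matrix-free product Q x for Q = tau*I + kappa*D^T D (D = first differences)."""
    y = [tau * xi for xi in x]
    for i in range(len(x) - 1):
        d = x[i + 1] - x[i]
        y[i] -= kappa * d
        y[i + 1] += kappa * d
    return y


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A, given only x -> A x."""
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x for the starting point x = 0
    p = list(r)  # first search direction
    rs = sum(ri * ri for ri in r)
    b_norm = math.sqrt(rs) or 1.0
    for _ in range(max_iter):
        if math.sqrt(rs) <= tol * b_norm:
            break
        Ap = matvec(p)
        alpha = rs / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


def sample_gaussian(b, tau, kappa, rng):
    """Draw x ~ N(Q^{-1} b, Q^{-1}) by solving Q x = b + perturbation."""
    n = len(b)
    z1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    z2 = [rng.gauss(0.0, 1.0) for _ in range(n - 1)]
    # c = b + sqrt(tau) z1 + sqrt(kappa) D^T z2, so that Cov(c) = Q
    c = [bi + math.sqrt(tau) * zi for bi, zi in zip(b, z1)]
    for i in range(n - 1):
        c[i] -= math.sqrt(kappa) * z2[i]
        c[i + 1] += math.sqrt(kappa) * z2[i]
    return conjugate_gradient(lambda v: apply_Q(v, tau, kappa), c)


# Toy data: a smooth signal observed at 200 locations (hypothetical values).
rng = random.Random(0)
y = [math.sin(i / 10.0) for i in range(200)]
tau, kappa = 1.0, 25.0
b = [tau * yi for yi in y]
draw = sample_gaussian(b, tau, kappa, rng)
```

Because the perturbed right-hand side c has mean b and covariance Q, the solution x = Q^{-1} c has mean Q^{-1} b and covariance Q^{-1} Q Q^{-1} = Q^{-1}. Each draw costs only a handful of sparse matrix-vector products, which is what makes this style of sampler attractive when Q has millions of rows.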

Suggested Citation

  • Schmidt, Paul & Mühlau, Mark & Schmid, Volker, 2017. "Fitting large-scale structured additive regression models using Krylov subspace methods," Computational Statistics & Data Analysis, Elsevier, vol. 105(C), pages 59-75.
  • Handle: RePEc:eee:csdana:v:105:y:2017:i:c:p:59-75
    DOI: 10.1016/j.csda.2016.07.006

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947316301700
    Download Restriction: Full text for ScienceDirect subscribers only.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Håvard Rue & Sara Martino & Nicolas Chopin, 2009. "Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 71(2), pages 319-392, April.
    2. Håvard Rue & Ingelin Steinsland & Sveinung Erland, 2004. "Approximating hidden Gaussian Markov random fields," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 66(4), pages 877-892, November.
    3. Matthew J. Heaton & Abhirup Datta & Andrew O. Finley & Reinhard Furrer & Joseph Guinness & Rajarshi Guhaniyogi & Florian Gerber & Robert B. Gramacy & Dorit Hammerling & Matthias Katzfuss & Finn Lindgr, 2019. "A Case Study Competition Among Methods for Analyzing Large Spatial Data," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 24(3), pages 398-425, September.
    4. Djeundje, Viani Biatat & Crook, Jonathan, 2019. "Identifying hidden patterns in credit risk survival data using Generalised Additive Models," European Journal of Operational Research, Elsevier, vol. 277(1), pages 366-376.
    5. Yue, Yu Ryan & Rue, Håvard, 2011. "Bayesian inference for additive mixed quantile regression models," Computational Statistics & Data Analysis, Elsevier, vol. 55(1), pages 84-96, January.
    6. Brezger, Andreas & Lang, Stefan, 2006. "Generalized structured additive regression based on Bayesian P-splines," Computational Statistics & Data Analysis, Elsevier, vol. 50(4), pages 967-991, February.
    7. Simon N. Wood & Natalya Pya & Benjamin Säfken, 2016. "Smoothing Parameter and Model Selection for General Smooth Models," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(516), pages 1548-1563, October.
    8. Eidsvik, Jo & Finley, Andrew O. & Banerjee, Sudipto & Rue, Håvard, 2012. "Approximate Bayesian inference for large spatial datasets using predictive process models," Computational Statistics & Data Analysis, Elsevier, vol. 56(6), pages 1362-1380.
    9. Giovanna Jona Lasinio & Gianluca Mastrantonio & Alessio Pollice, 2013. "Discussing the “big n problem”," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 22(1), pages 97-112, March.
    10. Umlauf, Nikolaus & Adler, Daniel & Kneib, Thomas & Lang, Stefan & Zeileis, Achim, 2015. "Structured Additive Regression Models: An R Interface to BayesX," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 63(i21).
    11. Brown, Paul T. & Joshi, Chaitanya & Joe, Stephen & Rue, Håvard, 2021. "A novel method of marginalisation using low discrepancy sequences for integrated nested Laplace approximations," Computational Statistics & Data Analysis, Elsevier, vol. 157(C).
    12. Jamie Roberman & Theophilus I. Emeto & Oyelola A. Adegboye, 2021. "Adverse Birth Outcomes Due to Exposure to Household Air Pollution from Unclean Cooking Fuel among Women of Reproductive Age in Nigeria," IJERPH, MDPI, vol. 18(2), pages 1-15, January.
    13. Seongil Jo & Taeyoung Roh & Taeryon Choi, 2016. "Bayesian spectral analysis models for quantile regression with Dirichlet process mixtures," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 28(1), pages 177-206, March.
    14. Gressani, Oswaldo & Lambert, Philippe, 2021. "Laplace approximations for fast Bayesian inference in generalized additive models based on P-splines," Computational Statistics & Data Analysis, Elsevier, vol. 154(C).
    15. Sameh Abdulah & Yuxiao Li & Jian Cao & Hatem Ltaief & David E. Keyes & Marc G. Genton & Ying Sun, 2023. "Large‐scale environmental data science with ExaGeoStatR," Environmetrics, John Wiley & Sons, Ltd., vol. 34(1), February.
    16. Stefan Lang & Nikolaus Umlauf & Peter Wechselberger & Kenneth Harttgen & Thomas Kneib, 2012. "Multilevel structured additive regression," Working Papers 2012-07, Faculty of Economics and Statistics, Universität Innsbruck.
    17. Jialuo Liu & Tingjin Chu & Jun Zhu & Haonan Wang, 2022. "Large spatial data modeling and analysis: A Krylov subspace approach," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 49(3), pages 1115-1143, September.
    18. Zilber, Daniel & Katzfuss, Matthias, 2021. "Vecchia–Laplace approximations of generalized Gaussian processes for big non-Gaussian spatial data," Computational Statistics & Data Analysis, Elsevier, vol. 153(C).
    19. Simon N. Wood, 2020. "Inference and computation with generalized additive models and their extensions," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 29(2), pages 307-339, June.
    20. Vinicius Mayrink & Dani Gamerman, 2009. "On computational aspects of Bayesian spatial models: influence of the neighboring structure in the efficiency of MCMC algorithms," Computational Statistics, Springer, vol. 24(4), pages 641-669, December.
