
Fitting large-scale structured additive regression models using Krylov subspace methods

Authors

  • Schmidt, Paul
  • Mühlau, Mark
  • Schmid, Volker

Abstract

Fitting regression models can be challenging when the vector of regression coefficients is high-dimensional. Especially when large spatial or temporal effects must be taken into account, the computational limits of ordinary workstations are reached quickly. The analysis of images with several million pixels, where each pixel value can be seen as an observation at a new spatial location, represents such a situation. A Markov chain Monte Carlo (MCMC) framework for the applied statistician is presented that allows models with millions of parameters to be fitted with only low to moderate computational requirements. The method combines a modified sampling scheme with recent advances in iterative methods for sparse linear systems. This yields a solution that removes potential computational bottlenecks such as calculating the log-determinant of massive precision matrices and sampling from high-dimensional Gaussian distributions. An extensive simulation study with models of moderate size shows that this approach gives results that are in perfect agreement with state-of-the-art methods for fitting structured additive regression models. Furthermore, the method is applied to two real-world examples from the field of medical imaging.
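
To illustrate the kind of iterative method for sparse linear systems that the abstract refers to, the following is a minimal sketch of the conjugate gradient algorithm, the classic Krylov subspace method for symmetric positive definite systems. It is not the authors' implementation. The precision matrix Q = I + kappa * R (with R a first-order random-walk structure matrix), the value of kappa, the toy data y, and all function names are assumptions chosen for illustration. The point is that Q is only ever accessed through matrix-vector products, so it never has to be stored or factorised.

```python
import math


def rw1_precision_matvec(x, kappa):
    """Apply Q = I + kappa * R to x without storing Q.

    R is the (tridiagonal) structure matrix of a first-order random walk;
    this prior and the unit noise precision are illustrative assumptions.
    """
    out = list(x)  # identity part of Q
    for i in range(len(x) - 1):
        d = x[i] - x[i + 1]  # contribution of the edge (i, i + 1)
        out[i] += kappa * d
        out[i + 1] -= kappa * d
    return out


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A given only x -> A x."""
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x for the starting point x = 0
    p = list(r)  # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if math.sqrt(rs_old) <= tol:
            break
        Ap = matvec(p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Toy data (hypothetical): a smooth signal observed at 200 locations.
y = [math.sin(0.1 * i) for i in range(200)]
kappa = 5.0

# Solve Q mu = y, i.e. the conditional mean under the assumed model.
mu = conjugate_gradient(lambda v: rw1_precision_matvec(v, kappa), y)

# Check the solution by applying Q once more.
residual = [q - yi for q, yi in zip(rw1_precision_matvec(mu, kappa), y)]
max_residual = max(abs(r) for r in residual)
```

Each iteration costs one product with Q, which for a sparse precision matrix scales linearly with the number of non-zero entries. This is the property that makes Krylov subspace methods attractive for models with millions of parameters.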

Suggested Citation

  • Schmidt, Paul & Mühlau, Mark & Schmid, Volker, 2017. "Fitting large-scale structured additive regression models using Krylov subspace methods," Computational Statistics & Data Analysis, Elsevier, vol. 105(C), pages 59-75.
  • Handle: RePEc:eee:csdana:v:105:y:2017:i:c:p:59-75
    DOI: 10.1016/j.csda.2016.07.006