IDEAS home Printed from https://ideas.repec.org/a/oup/biomet/v100y2013i1p75-89.html

Efficient Gaussian process regression for large datasets

Author

Listed:
  • Anjishnu Banerjee
  • David B. Dunson
  • Surya T. Tokdar

Abstract

Gaussian processes are widely used in nonparametric regression, classification and spatiotemporal modelling, facilitated in part by a rich literature on their theoretical properties. However, one of their practical limitations is expensive computation, typically on the order of n^3 where n is the number of data points, in performing the necessary matrix inversions. For large datasets, storage and processing also lead to computational bottlenecks, and numerical stability of the estimates and predicted values degrades with increasing n. Various methods have been proposed to address these problems, including predictive processes in spatial data analysis and the subset-of-regressors technique in machine learning. The idea underlying these approaches is to use a subset of the data, but this raises questions concerning sensitivity to the choice of subset and limitations in estimating fine-scale structure in regions that are not well covered by the subset. Motivated by the literature on compressive sensing, we propose an alternative approach that involves linear projection of all the data points onto a lower-dimensional subspace. We demonstrate the superiority of this approach from a theoretical perspective and through simulated and real data examples. Copyright 2013, Oxford University Press.
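
The projection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a squared-exponential kernel, a Gaussian random projection matrix `Phi`, fixed hyperparameters, and a small diagonal jitter for numerical stability, and every name in it (`rbf_kernel`, `projected_gp_mean`, `noise_var`, `length_scale`) is hypothetical. The n observations are replaced by m < n random linear combinations, so the only linear system solved is m by m rather than n by n. If the rows of `Phi` were instead rows of the identity matrix, the same formula would reduce to a subset-of-regressors predictor.

```python
import numpy as np


def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)


def projected_gp_mean(X, y, X_new, m, noise_var=0.1, length_scale=1.0, seed=0):
    """Approximate GP predictive mean using an m-dimensional random projection.

    Illustrative assumption: the latent function is approximated by its
    conditional expectation given Phi @ f(X), which yields the low-rank
    covariance C.T @ inv(W) @ C with C = Phi @ K and W = Phi @ K @ Phi.T.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # Random projection: every row mixes all n data points.
    Phi = rng.standard_normal((m, n)) / np.sqrt(n)

    K = rbf_kernel(X, X, length_scale)   # n x n, formed once but never inverted
    C = Phi @ K                          # m x n
    W = C @ Phi.T                        # m x m

    # Woodbury form of the predictive mean: only an m x m system is solved.
    A = noise_var * W + C @ C.T
    A += 1e-8 * np.trace(A) / m * np.eye(m)   # jitter, an assumption of this sketch
    alpha = np.linalg.solve(A, C @ y)

    C_new = Phi @ rbf_kernel(X, X_new, length_scale)   # m x n_new
    return C_new.T @ alpha


if __name__ == "__main__":
    X = np.linspace(0.0, 2.0 * np.pi, 200)[:, None]
    y = np.sin(X[:, 0])
    X_new = np.linspace(0.5, 5.5, 5)[:, None]
    print(projected_gp_mean(X, y, X_new, m=30))
```

This sketch still forms the full n by n kernel matrix and spends O(n^2 m) operations on `Phi @ K`, so it illustrates only the reduction of the linear solve from O(n^3) to O(m^3), not the storage or stability issues mentioned in the abstract.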

Suggested Citation

  • Anjishnu Banerjee & David B. Dunson & Surya T. Tokdar, 2013. "Efficient Gaussian process regression for large datasets," Biometrika, Biometrika Trust, vol. 100(1), pages 75-89.
  • Handle: RePEc:oup:biomet:v:100:y:2013:i:1:p:75-89

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1093/biomet/ass068
    Download Restriction: Access to full text is restricted to subscribers.

    Citations


    Cited by:

    1. Mahdi Hosseinpouri & Majid Jafari Khaledi, 2019. "An area-specific stick breaking process for spatial data," Statistical Papers, Springer, vol. 60(1), pages 199-221, February.
    2. Hong Wang & Guangyu Long & Jianxing Liao & Yan Xu & Yan Lv, 2022. "A new hybrid method for establishing point forecasting, interval forecasting, and probabilistic forecasting of landslide displacement," Natural Hazards: Journal of the International Society for the Prevention and Mitigation of Natural Hazards, Springer;International Society for the Prevention and Mitigation of Natural Hazards, vol. 111(2), pages 1479-1505, March.
    3. Eric Cebekhulu & Adeiza James Onumanyi & Sherrin John Isaac, 2022. "Performance Analysis of Machine Learning Algorithms for Energy Demand–Supply Prediction in Smart Grids," Sustainability, MDPI, vol. 14(5), pages 1-26, February.
    4. Jingjing Yang & Dennis D. Cox & Jong Soo Lee & Peng Ren & Taeryon Choi, 2017. "Efficient Bayesian hierarchical functional data analysis with basis function approximations using Gaussian–Wishart processes," Biometrics, The International Biometric Society, vol. 73(4), pages 1082-1091, December.
    5. Durante, Daniele & Dunson, David B., 2014. "Bayesian dynamic financial networks with time-varying predictors," Statistics & Probability Letters, Elsevier, vol. 93(C), pages 19-26.
    6. Gutiérrez, Luis & Gutiérrez-Peña, Eduardo & Mena, Ramsés H., 2014. "Bayesian nonparametric classification for spectroscopy data," Computational Statistics & Data Analysis, Elsevier, vol. 78(C), pages 56-68.
    7. Matthew W. Wheeler, 2019. "Bayesian additive adaptive basis tensor product models for modeling high dimensional surfaces: an application to high‐throughput toxicity testing," Biometrics, The International Biometric Society, vol. 75(1), pages 193-201, March.
    8. Kelly R. Moran & Matthew W. Wheeler, 2022. "Fast increased fidelity samplers for approximate Bayesian Gaussian process regression," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(4), pages 1198-1228, September.
    9. Bohan Xu & Rayus Kuplicki & Sandip Sen & Martin P Paulus, 2021. "The pitfalls of using Gaussian Process Regression for normative modeling," PLOS ONE, Public Library of Science, vol. 16(9), pages 1-14, September.
    10. Jiang, Chen & Vega, Manuel A. & Todd, Michael D. & Hu, Zhen, 2022. "Model correction and updating of a stochastic degradation model for failure prognostics of miter gates," Reliability Engineering and System Safety, Elsevier, vol. 218(PA).
    11. David A. Buch & James E. Johndrow & David B. Dunson, 2023. "Explaining transmission rate variations and forecasting epidemic spread in multiple regions with a semiparametric mixed effects SIR model," Biometrics, The International Biometric Society, vol. 79(4), pages 2987-2997, December.
