
Generalized additive models for large data sets


  • Simon N. Wood
  • Yannig Goude
  • Simon Shaw


We consider an application in electricity grid load prediction, where generalized additive models are appropriate, but where the data set's size can make their use practically intractable with existing methods. We therefore develop practical generalized additive model fitting methods for large data sets in the case in which the smooth terms in the model are represented by using penalized regression splines. The methods use iterative update schemes to obtain factors of the model matrix while requiring only subblocks of the model matrix to be computed at any one time. We show that efficient smoothing parameter estimation can be carried out in a well-justified manner. The grid load prediction problem requires updates of the model fit, as new data become available, and some means for dealing with residual auto-correlation in grid load. Methods are provided for these problems and parallel implementation is covered. The methods allow estimation of generalized additive models for large data sets by using modest computer hardware, and the grid load prediction problem illustrates the utility of reduced rank spline smoothing methods for dealing with complex modelling problems.
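The idea of obtaining a factor of the model matrix from subblocks can be illustrated with a blockwise QR update. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a Gaussian response, a single fixed smoothing parameter `lam`, and a penalty matrix `S` supplied by the caller. The function names `blockwise_qr` and `penalized_fit` are hypothetical and exist only for this illustration.

```python
import numpy as np


def blockwise_qr(blocks):
    """Accumulate the triangular factor R and the vector f = Q^T y.

    `blocks` yields (X_block, y_block) pairs, so only one subblock of the
    model matrix has to be held in memory at any one time.
    """
    R, f = None, None
    for X_b, y_b in blocks:
        if R is None:
            stacked_X, stacked_y = X_b, y_b
        else:
            # Stack the current factor on top of the new rows and refactor.
            stacked_X = np.vstack([R, X_b])
            stacked_y = np.concatenate([f, y_b])
        Q, R = np.linalg.qr(stacked_X, mode="reduced")
        f = Q.T @ stacked_y
    return R, f


def penalized_fit(R, f, S, lam):
    """Solve (R^T R + lam * S) beta = R^T f.

    Because R^T R equals X^T X and R^T f equals X^T y, this matches the
    penalized least squares fit on the full model matrix X.
    """
    return np.linalg.solve(R.T @ R + lam * S, R.T @ f)
```

With the full data split into row blocks, `R, f = blockwise_qr(blocks)` followed by `penalized_fit(R, f, S, lam)` gives the same coefficients as a fit to the whole model matrix, while never forming that matrix. The same update also covers adding new data later: stack the stored `R` and `f` with the new rows and refactor.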

Suggested Citation

  • Simon N. Wood & Yannig Goude & Simon Shaw, 2015. "Generalized additive models for large data sets," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 64(1), pages 139-155, January.
  • Handle: RePEc:bla:jorssc:v:64:y:2015:i:1:p:139-155


    Cited by:

    1. Cui, Wenquan & Cheng, Haoyang & Sun, Jiajing, 2018. "An RKHS-based approach to double-penalized regression in high-dimensional partially linear models," Journal of Multivariate Analysis, Elsevier, vol. 168(C), pages 201-210.
    2. Schmidt, Paul & Mühlau, Mark & Schmid, Volker, 2017. "Fitting large-scale structured additive regression models using Krylov subspace methods," Computational Statistics & Data Analysis, Elsevier, vol. 105(C), pages 59-75.
    3. Ali M. Mosammam & Jorge Mateu, 2018. "A penalized likelihood method for nonseparable space–time generalized additive models," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 102(3), pages 333-357, July.
    4. Caston Sigauke & Murendeni Maurel Nemukula & Daniel Maposa, 2018. "Probabilistic Hourly Load Forecasting Using Additive Quantile Regression Models," Energies, MDPI, Open Access Journal, vol. 11(9), pages 1-21, August.
    5. Mills, Brian M. & Salaga, Steven, 2018. "A natural experiment for efficient markets: Information quality and influential agents," Journal of Financial Markets, Elsevier, vol. 40(C), pages 23-39.
    6. Djeundje, Viani Biatat & Crook, Jonathan, 2019. "Identifying hidden patterns in credit risk survival data using Generalised Additive Models," European Journal of Operational Research, Elsevier, vol. 277(1), pages 366-376.
    7. Anne-Sophie Krah & Zoran Nikolić & Ralf Korn, 2020. "Machine Learning in Least-Squares Monte Carlo Proxy Modeling of Life Insurance Companies," Risks, MDPI, Open Access Journal, vol. 8(1), pages 1-79, February.
    8. Anne-Sophie Krah & Zoran Nikolić & Ralf Korn, 2019. "Machine Learning in Least-Squares Monte Carlo Proxy Modeling of Life Insurance Companies," Papers 1909.02182,
    9. Zanin, Luca, 2020. "Combining multiple probability predictions in the presence of class imbalance to discriminate between potential bad and good borrowers in the peer-to-peer lending market," Journal of Behavioral and Experimental Finance, Elsevier, vol. 25(C).
    10. Shao, Zhen & Chao, Fu & Yang, Shan-Lin & Zhou, Kai-Le, 2017. "A review of the decomposition methodology for extracting and identifying the fluctuation characteristics in electricity demand forecasting," Renewable and Sustainable Energy Reviews, Elsevier, vol. 75(C), pages 123-136.
    11. Veronica Kostenko & Eduard Ponarin & Musa Shteiwi & Olga Strebkova, 2017. "Historical Legacies and Gender Attitudes in the Middle East," Working Papers 1105, Economic Research Forum, revised 05 2017.
