A note on the validity of cross-validation for evaluating autoregressive time series prediction

Authors

  • Bergmeir, Christoph
  • Hyndman, Rob J.
  • Koo, Bonsoo

Abstract

One of the most widely used standard procedures for model evaluation in classification and regression is K-fold cross-validation (CV). However, in time series forecasting, the inherent serial correlation and potential non-stationarity of the data make its application less than straightforward, and practitioners often replace it with an out-of-sample (OOS) evaluation. It is shown that for purely autoregressive models, standard K-fold CV can be used provided the models considered have uncorrelated errors. Such a setup occurs, for example, when the models nest a more appropriate model. This is very common when Machine Learning methods are used for prediction, where CV can also control for overfitting the data. Theoretical insights supporting these arguments are presented, along with a simulation study and a real-world example. It is shown empirically that K-fold CV performs favourably compared to both OOS evaluation and other time-series-specific techniques such as non-dependent cross-validation.
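
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names (`embed`, `fit_ar`, `kfold_cv_mse`, `oos_mse`, `simulate_ar`) are hypothetical, and a linear AR model fitted by least squares stands in for whichever purely autoregressive learner is being evaluated. The series is first embedded into a matrix of lagged values, after which the rows are treated like ordinary regression cases and split into random folds. The fitted order (3) is chosen to nest the simulated order (2), which is the situation in which the abstract says the errors are uncorrelated.

```python
import numpy as np


def embed(series, p):
    """Lagged design matrix: the row for time t holds (y[t-1], ..., y[t-p]); target is y[t]."""
    y = np.asarray(series, dtype=float)
    X = np.column_stack([y[p - k: len(y) - k] for k in range(1, p + 1)])
    return X, y[p:]


def fit_ar(X, y):
    """Least-squares AR fit with intercept (an assumed stand-in for any AR learner)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_ar(coef, X):
    return coef[0] + X @ coef[1:]


def kfold_cv_mse(X, y, k=5, seed=0):
    """Standard K-fold CV: rows of the embedded matrix are assigned to folds at random."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    errors = []
    for test in np.array_split(idx, k):
        train = np.setdiff1d(idx, test)
        coef = fit_ar(X[train], y[train])
        errors.append(np.mean((y[test] - predict_ar(coef, X[test])) ** 2))
    return float(np.mean(errors))


def oos_mse(X, y, test_frac=0.2):
    """Out-of-sample evaluation: fit on the first part, test on the final block."""
    split = int(len(y) * (1 - test_frac))
    coef = fit_ar(X[:split], y[:split])
    return float(np.mean((y[split:] - predict_ar(coef, X[split:])) ** 2))


def simulate_ar(n, phi, seed=0, burn=200):
    """Simulate a stationary AR(len(phi)) process with standard normal noise."""
    rng = np.random.default_rng(seed)
    p = len(phi)
    y = np.zeros(n + burn)
    eps = rng.standard_normal(n + burn)
    for t in range(p, n + burn):
        # y[t-p:t][::-1] is (y[t-1], ..., y[t-p]), matching the order of phi
        y[t] = np.dot(phi, y[t - p:t][::-1]) + eps[t]
    return y[burn:]


# Simulated AR(2) data, evaluated with an AR(3) model that nests it.
y = simulate_ar(500, [0.5, -0.3], seed=1)
X, target = embed(y, p=3)
print("5-fold CV MSE:", kfold_cv_mse(X, target, k=5))
print("OOS MSE:      ", oos_mse(X, target, test_frac=0.2))
```

In this sketch the K-fold estimate averages over every embedded row, whereas the OOS estimate relies only on the final 20% of the series. That difference in how much data each procedure uses is the practical trade-off the abstract refers to.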

Suggested Citation

  • Bergmeir, Christoph & Hyndman, Rob J. & Koo, Bonsoo, 2018. "A note on the validity of cross-validation for evaluating autoregressive time series prediction," Computational Statistics & Data Analysis, Elsevier, vol. 120(C), pages 70-83.
  • Handle: RePEc:eee:csdana:v:120:y:2018:i:c:p:70-83
    DOI: 10.1016/j.csda.2017.11.003

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947317302384
    Download Restriction: Full text for ScienceDirect subscribers only.

    File URL: https://libkey.io/10.1016/j.csda.2017.11.003?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Christoph Bergmeir & Rob J Hyndman & Bonsoo Koo, 2015. "A Note on the Validity of Cross-Validation for Evaluating Time Series Prediction," Monash Econometrics and Business Statistics Working Papers 10/15, Monash University, Department of Econometrics and Business Statistics.
    2. Pinto, Jeronymo Marcondes & Marçal, Emerson Fernandes, 2019. "Cross-validation based forecasting method: a machine learning approach," Textos para discussão 498, FGV EESP - Escola de Economia de São Paulo, Fundação Getulio Vargas (Brazil).
    3. Mariana Oliveira & Luís Torgo & Vítor Santos Costa, 2021. "Evaluation Procedures for Forecasting with Spatiotemporal Data," Mathematics, MDPI, vol. 9(6), pages 1-27, March.
    4. Bergmeir, Christoph & Costantini, Mauro & Benítez, José M., 2014. "On the usefulness of cross-validation for directional forecast evaluation," Computational Statistics & Data Analysis, Elsevier, vol. 76(C), pages 132-143.
    5. Filip Stanek, 2021. "Optimal Out-of-Sample Forecast Evaluation under Stationarity," CERGE-EI Working Papers wp712, The Center for Economic Research and Graduate Education - Economics Institute, Prague.
    6. Benedikt M. Potscher & Ingmar R. Prucha, 1994. "On the Formulation of Uniform Laws of Large Numbers: A Truncation Approach," NBER Technical Working Papers 0085, National Bureau of Economic Research, Inc.
    7. Exterkate, Peter & Groenen, Patrick J.F. & Heij, Christiaan & van Dijk, Dick, 2016. "Nonlinear forecasting with many predictors using kernel ridge regression," International Journal of Forecasting, Elsevier, vol. 32(3), pages 736-753.
    8. Freyberger, Joachim, 2015. "Asymptotic theory for differentiated products demand models with many markets," Journal of Econometrics, Elsevier, vol. 185(1), pages 162-181.
    9. Scholz, Michael & Nielsen, Jens Perch & Sperlich, Stefan, 2015. "Nonparametric prediction of stock returns based on yearly data: The long-term view," Insurance: Mathematics and Economics, Elsevier, vol. 65(C), pages 143-155.
    10. Eric Beutner & Alexander Heinemann & Stephan Smeekes, 2019. "A General Framework for Prediction in Time Series Models," Papers 1902.01622, arXiv.org.
    11. Huijun Guo & Youming Liu, 2019. "Regression estimation under strong mixing data," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 71(3), pages 553-576, June.
    12. Ha, Tran Vinh & Asada, Takumi & Arimura, Mikiharu, 2019. "Determination of the influence factors on household vehicle ownership patterns in Phnom Penh using statistical and machine learning methods," Journal of Transport Geography, Elsevier, vol. 78(C), pages 70-86.
    13. Kock, Anders Bredahl & Teräsvirta, Timo, 2014. "Forecasting performances of three automated modelling techniques during the economic crisis 2007–2009," International Journal of Forecasting, Elsevier, vol. 30(3), pages 616-631.
    14. Iglesias, Emma M. & Linton, Oliver, 2009. "Estimation of tail thickness parameters from GJR-GARCH models," UC3M Working papers. Economics we094726, Universidad Carlos III de Madrid. Departamento de Economía.
    15. Gary S. Anderson & Alena Audzeyeva, 2019. "A Coherent Framework for Predicting Emerging Market Credit Spreads with Support Vector Regression," Finance and Economics Discussion Series 2019-074, Board of Governors of the Federal Reserve System (U.S.).
    16. Neumann, Michael H., 1997. "Strong approximation of density estimators from weakly dependent observations by density estimators from independent observations," SFB 373 Discussion Papers 1997,86, Humboldt University of Berlin, Interdisciplinary Research Project 373: Quantification and Simulation of Economic Processes.
    17. Sin, Chor-Yiu & White, Halbert, 1996. "Information criteria for selecting possibly misspecified parametric models," Journal of Econometrics, Elsevier, vol. 71(1-2), pages 207-225.
    18. Ryoko Ito, 2016. "Asymptotic Theory for Beta-t-GARCH," Cambridge Working Papers in Economics 1607, Faculty of Economics, University of Cambridge.
    19. Petropoulos, Fotios & Apiletti, Daniele & Assimakopoulos, Vassilios & Babai, Mohamed Zied & Barrow, Devon K. & Ben Taieb, Souhaib & Bergmeir, Christoph & Bessa, Ricardo J. & Bijak, Jakub & Boylan, Joh, 2022. "Forecasting: theory and practice," International Journal of Forecasting, Elsevier, vol. 38(3), pages 705-871.
      • Fotios Petropoulos & Daniele Apiletti & Vassilios Assimakopoulos & Mohamed Zied Babai & Devon K. Barrow & Souhaib Ben Taieb & Christoph Bergmeir & Ricardo J. Bessa & Jakub Bijak & John E. Boylan & Jet, 2020. "Forecasting: theory and practice," Papers 2012.03854, arXiv.org, revised Jan 2022.
    20. Dahl, Christian M. & Levine, Michael, 2006. "Nonparametric estimation of volatility models with serially dependent innovations," Statistics & Probability Letters, Elsevier, vol. 76(18), pages 2007-2016, December.
