Printed from https://ideas.repec.org/a/eee/econom/v187y2015i1p95-112.html

Cross-validation for selecting a model selection procedure

Author

Listed:
  • Zhang, Yongli
  • Yang, Yuhong

Abstract

While there are various model selection methods, an important unanswered question is how to select one of them for the data at hand. The difficulty arises because the targeted behaviors of model selection procedures depend heavily on assumptions about the data generating process that are uncheckable or difficult to check. Fortunately, cross-validation (CV) provides a general tool to solve this problem. In this work, results are provided on how to apply CV to consistently choose the best method, yielding new insights and guidance for a potentially vast range of applications. In addition, we address several seemingly widespread misconceptions about CV.
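
The idea of using CV to choose among model selection procedures can be illustrated with a minimal sketch. It is not the authors' algorithm: the two candidate procedures (AIC and BIC over nested linear models), the half-and-half split, the majority vote over random splits, and the function names `select_by_ic` and `cv_choose_procedure` are all assumptions made here for illustration. Each procedure selects a model on the training part, the selected model is scored on the held-out part, and the procedure that wins most splits is chosen.

```python
import numpy as np


def select_by_ic(X, y, penalty):
    """Return k minimizing n*log(RSS/n) + penalty*k over the first k columns of X."""
    n, p = X.shape
    best_k, best_score = 1, np.inf
    for k in range(1, p + 1):
        beta, *_ = np.linalg.lstsq(X[:, :k], y, rcond=None)
        rss = float(np.sum((y - X[:, :k] @ beta) ** 2))
        # Floor the RSS so a perfect fit does not produce log(0).
        score = n * np.log(max(rss, 1e-12) / n) + penalty * k
        if score < best_score:
            best_k, best_score = k, score
    return best_k


def aic(X, y):
    return select_by_ic(X, y, penalty=2.0)


def bic(X, y):
    return select_by_ic(X, y, penalty=np.log(len(y)))


def cv_choose_procedure(X, y, procedures, n_train, n_splits=50, seed=0):
    """Vote over random splits for the procedure with the lowest validation MSE."""
    rng = np.random.default_rng(seed)
    n = len(y)
    votes = {name: 0 for name in procedures}
    for _ in range(n_splits):
        perm = rng.permutation(n)
        train, valid = perm[:n_train], perm[n_train:]
        errors = {}
        for name, procedure in procedures.items():
            # Model selection and fitting both use the training part only.
            k = procedure(X[train], y[train])
            beta, *_ = np.linalg.lstsq(X[train][:, :k], y[train], rcond=None)
            resid = y[valid] - X[valid][:, :k] @ beta
            errors[name] = float(np.mean(resid ** 2))
        best = min(errors.values())
        winners = [name for name, err in errors.items() if err == best]
        # Exact ties (e.g. both procedures pick the same model) cast no vote.
        if len(winners) == 1:
            votes[winners[0]] += 1
    return max(votes, key=votes.get), votes


# Illustrative synthetic data: only the first two predictors matter.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 6))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(size=200)
winner, votes = cv_choose_procedure(X, y, {"AIC": aic, "BIC": bic}, n_train=100)
print(winner, votes)
```

The split ratio and the number of splits are free parameters in this sketch; which choices lead to consistent selection is the kind of question the article addresses, and the values used here should not be read as its recommendations.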

Suggested Citation

  • Zhang, Yongli & Yang, Yuhong, 2015. "Cross-validation for selecting a model selection procedure," Journal of Econometrics, Elsevier, vol. 187(1), pages 95-112.
  • Handle: RePEc:eee:econom:v:187:y:2015:i:1:p:95-112
    DOI: 10.1016/j.jeconom.2015.02.006

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0304407615000305
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.jeconom.2015.02.006?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. T. Speed & Bin Yu, 1993. "Model selection and prediction: Normal regression," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 45(1), pages 35-54, March.
    2. Yuhong Yang, 2005. "Can the strengths of AIC and BIC be shared? A conflict between model indentification and regression estimation," Biometrika, Biometrika Trust, vol. 92(4), pages 937-950, December.
    3. Tim van Erven & Peter Grünwald & Steven de Rooij, 2012. "Catching up faster by switching sooner: a predictive approach to adaptive estimation with an application to the AIC–BIC dilemma," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 74(3), pages 361-417, June.
    4. Ng, Serena, 2013. "Variable Selection in Predictive Regressions," Handbook of Economic Forecasting, in: G. Elliott & C. Granger & A. Timmermann (ed.), Handbook of Economic Forecasting, edition 1, volume 2, chapter 0, pages 752-789, Elsevier.
    5. van der Vaart, Aad W. & Dudoit, Sandrine & van der Laan, Mark J., 2006. "Oracle inequalities for multi-fold cross validation," Statistics & Risk Modeling, De Gruyter, vol. 24(3), pages 1-21, December.
    6. Andrews, Donald W. K., 1991. "Asymptotic optimality of generalized CL, cross-validation, and generalized cross-validation in regression with heteroskedastic errors," Journal of Econometrics, Elsevier, vol. 47(2-3), pages 359-377, February.
    7. Fan J. & Li R., 2001. "Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1348-1360, December.
    8. Yang Y., 2001. "Adaptive Regression by Mixing," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 574-588, June.
    9. Jianqing Fan & Jinchi Lv & Lei Qi, 2011. "Sparse High-Dimensional Models in Economics," Annual Review of Economics, Annual Reviews, vol. 3(1), pages 291-317, September.
    10. Yang, Yuhong, 2007. "Prediction/Estimation With Simple Linear Models: Is It Really That Simple?," Econometric Theory, Cambridge University Press, vol. 23(1), pages 1-36, February.
    11. Shen X. & Ye J., 2002. "Adaptive Model Selection," Journal of the American Statistical Association, American Statistical Association, vol. 97, pages 210-221, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ng, Serena, 2013. "Variable Selection in Predictive Regressions," Handbook of Economic Forecasting, in: G. Elliott & C. Granger & A. Timmermann (ed.), Handbook of Economic Forecasting, edition 1, volume 2, chapter 0, pages 752-789, Elsevier.
    2. Jie Ding & Vahid Tarokh & Yuhong Yang, 2018. "Model Selection Techniques -- An Overview," Papers 1810.09583, arXiv.org.
    3. Lu, Xun & Su, Liangjun, 2015. "Jackknife model averaging for quantile regressions," Journal of Econometrics, Elsevier, vol. 188(1), pages 40-58.
    4. Sermpinis, Georgios & Tsoukas, Serafeim & Zhang, Ping, 2018. "Modelling market implied ratings using LASSO variable selection techniques," Journal of Empirical Finance, Elsevier, vol. 48(C), pages 19-35.
    5. Wenjing Yang & Yuhong Yang, 2017. "Toward an objective and reproducible model choice via variable selection deviation," Biometrics, The International Biometric Society, vol. 73(1), pages 20-30, March.
    6. Aman Ullah & Huansha Wang, 2013. "Parametric and Nonparametric Frequentist Model Selection and Model Averaging," Econometrics, MDPI, vol. 1(2), pages 1-23, September.
    7. Cheng, Tzu-Chang F. & Ing, Ching-Kang & Yu, Shu-Hui, 2015. "Toward optimal model averaging in regression models with time series errors," Journal of Econometrics, Elsevier, vol. 189(2), pages 321-334.
    8. Xianyi Wu & Xian Zhou, 2019. "On Hodges’ superefficiency and merits of oracle property in model selection," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 71(5), pages 1093-1119, October.
    9. Zhang, Xinyu & Liu, Chu-An, 2023. "Model averaging prediction by K-fold cross-validation," Journal of Econometrics, Elsevier, vol. 235(1), pages 280-301.
    10. William Kengne, 2023. "On consistency for time series model selection," Statistical Inference for Stochastic Processes, Springer, vol. 26(2), pages 437-458, July.
    11. Zemin Zheng & Jie Zhang & Yang Li, 2022. "L0-Regularized Learning for High-Dimensional Additive Hazards Regression," INFORMS Journal on Computing, INFORMS, vol. 34(5), pages 2762-2775, September.
    12. Laurent Ferrara & Anna Simoni, 2023. "When are Google Data Useful to Nowcast GDP? An Approach via Preselection and Shrinkage," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 41(4), pages 1188-1202, October.
    13. Lee, Ji Hyung & Shi, Zhentao & Gao, Zhan, 2022. "On LASSO for predictive regression," Journal of Econometrics, Elsevier, vol. 229(2), pages 322-349.
    14. Fan, Jianqing & Guo, Yongyi & Jiang, Bai, 2022. "Adaptive Huber regression on Markov-dependent data," Stochastic Processes and their Applications, Elsevier, vol. 150(C), pages 802-818.
    15. Ruiqi Liu & Ben Boukai & Zuofeng Shang, 2019. "Statistical Inference on Partially Linear Panel Model under Unobserved Linearity," Papers 1911.08830, arXiv.org.
    16. Qinqin Hu & Lu Lin, 2017. "Conditional sure independence screening by conditional marginal empirical likelihood," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 69(1), pages 63-96, February.
    17. Jun Yu & HaiYing Wang, 2022. "Subdata selection algorithm for linear model discrimination," Statistical Papers, Springer, vol. 63(6), pages 1883-1906, December.
    18. Zheng Tracy Ke & Jianqing Fan & Yichao Wu, 2015. "Homogeneity Pursuit," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(509), pages 175-194, March.
    19. Benjamin Poignard & Manabu Asai, 2023. "High‐dimensional sparse multivariate stochastic volatility models," Journal of Time Series Analysis, Wiley Blackwell, vol. 44(1), pages 4-22, January.
    20. Tao Huang & Jialiang Li, 2018. "Semiparametric model average prediction in panel data analysis," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 30(1), pages 125-144, January.
