IDEAS home Printed from https://ideas.repec.org/p/zur/econwp/283.html

Testing-Based Forward Model Selection

Author

Listed:
  • Damian Kozbur

Abstract

This paper introduces and analyzes a procedure called Testing-Based Forward Model Selection (TBFMS) for linear regression problems. The procedure inductively selects covariates that add predictive power to a working statistical model before estimating a final regression. The criterion for deciding which covariate to include next, and when to stop including covariates, is derived from a profile of traditional statistical hypothesis tests. The paper proves probabilistic bounds on the prediction error and on the number of selected covariates; these bounds depend on the quality of the tests. The bounds are then specialized to the case of heteroskedastic data, with tests derived from Huber-Eicker-White standard errors. TBFMS performance is compared to Lasso and Post-Lasso in simulation studies. TBFMS is then analyzed as a component of larger post-model-selection estimation problems for structural economic parameters. Finally, TBFMS is illustrated in an empirical application estimating determinants of economic growth.
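
The selection loop described in the abstract can be illustrated with a minimal sketch. This is not the paper's exact procedure: the function names, the rule of adding the candidate with the largest robust |t|-statistic, and the fixed threshold are all assumptions made for illustration. The test used here is a t-test built from White (HC0) heteroskedasticity-robust standard errors, in the spirit of the Huber-Eicker-White tests the abstract mentions.

```python
import numpy as np


def robust_t_last(X, y):
    """t-statistic of the LAST column of X in an OLS fit of y on X,
    using White (HC0) heteroskedasticity-robust standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # "Meat" of the sandwich estimator: sum_i e_i^2 x_i x_i'
    meat = X.T @ (X * resid[:, None] ** 2)
    cov = XtX_inv @ meat @ XtX_inv
    return beta[-1] / np.sqrt(cov[-1, -1])


def forward_select(X, y, threshold=4.0):
    """Hypothetical testing-based forward selection (illustrative only).

    At each step, every unselected covariate is tested against the current
    working model. The covariate with the largest |t| is added if it exceeds
    `threshold` (an assumed tuning value); otherwise selection stops.
    Returns the selected column indices and the final OLS coefficients
    (intercept first).
    """
    n, p = X.shape
    selected = []
    while len(selected) < p:
        best_j, best_t = None, threshold
        for j in range(p):
            if j in selected:
                continue
            # Working model: intercept + selected covariates + candidate j
            Xs = np.column_stack([np.ones(n), X[:, selected + [j]]])
            t = abs(robust_t_last(Xs, y))
            if t > best_t:
                best_j, best_t = j, t
        if best_j is None:  # no candidate passes its test: stop
            break
        selected.append(best_j)
    # Final regression on the selected covariates only
    Xs = np.column_stack([np.ones(n), X[:, selected]])
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    return selected, beta


# Simulated heteroskedastic data: only columns 0 and 3 matter.
rng = np.random.default_rng(0)
n, p = 500, 10
X = rng.standard_normal((n, p))
noise = rng.standard_normal(n) * (1.0 + np.abs(X[:, 0]))
y = 2.0 * X[:, 0] - 3.0 * X[:, 3] + noise

selected, beta = forward_select(X, y)
print("selected columns:", selected)
```

A fixed threshold is the simplest stopping rule; in practice the cutoff would need to account for the number of covariates being tested at each step.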

Suggested Citation

  • Damian Kozbur, 2015. "Testing-Based Forward Model Selection," ECON - Working Papers 283, Department of Economics - University of Zurich, revised Apr 2018.
  • Handle: RePEc:zur:econwp:283

    Download full text from publisher

    File URL: https://www.zora.uzh.ch/id/eprint/151160/1/econwp283.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Alexandre Belloni & Victor Chernozhukov & Christian Hansen & Damian Kozbur, 2016. "Inference in High-Dimensional Panel Models With an Application to Gun Control," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 34(4), pages 590-605, October.
    2. Hansen, Christian & Kozbur, Damian, 2014. "Instrumental variables estimation with many weak instruments using regularized JIVE," Journal of Econometrics, Elsevier, vol. 182(2), pages 290-308.
    3. A. Belloni & D. Chen & V. Chernozhukov & C. Hansen, 2012. "Sparse Models and Methods for Optimal Instruments With an Application to Eminent Domain," Econometrica, Econometric Society, vol. 80(6), pages 2369-2429, November.
    4. Bai, Jushan & Ng, Serena, 2008. "Forecasting economic time series using targeted predictors," Journal of Econometrics, Elsevier, vol. 146(2), pages 304-317, October.
    5. Damian Kozbur, 2017. "Testing-Based Forward Model Selection," American Economic Review, American Economic Association, vol. 107(5), pages 266-269, May.
    6. Alexandre Belloni & Victor Chernozhukov & Christian Hansen, 2014. "Inference on Treatment Effects after Selection among High-Dimensional Controls," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 81(2), pages 608-650.
    7. Amemiya, Takeshi, 1974. "The nonlinear two-stage least-squares estimator," Journal of Econometrics, Elsevier, vol. 2(2), pages 105-110, July.
    8. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney K. Newey, 2016. "Double machine learning for treatment and causal parameters," CeMMAP working papers CWP49/16, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    9. Daron Acemoglu & Simon Johnson & James A. Robinson, 2001. "The Colonial Origins of Comparative Development: An Empirical Investigation," American Economic Review, American Economic Association, vol. 91(5), pages 1369-1401, December.
    10. Alexandre Belloni & Victor Chernozhukov & Christian Hansen, 2014. "High-Dimensional Methods and Inference on Structural and Treatment Effects," Journal of Economic Perspectives, American Economic Association, vol. 28(2), pages 29-50, Spring.
    11. White, Halbert, 1980. "A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity," Econometrica, Econometric Society, vol. 48(4), pages 817-838, May.
    12. Knight, Keith, 2008. "Shrinkage Estimation For Nearly Singular Designs," Econometric Theory, Cambridge University Press, vol. 24(2), pages 323-337, April.
    13. Leeb, Hannes & Pötscher, Benedikt M., 2008. "Can One Estimate The Unconditional Distribution Of Post-Model-Selection Estimators?," Econometric Theory, Cambridge University Press, vol. 24(2), pages 338-376, April.
    14. Newey, Whitney & West, Kenneth, 2014. "A simple, positive semi-definite, heteroscedasticity and autocorrelation consistent covariance matrix," Applied Econometrics, Russian Presidential Academy of National Economy and Public Administration (RANEPA), vol. 33(1), pages 125-132.
    15. Victor Chernozhukov & Ivan Fernandez-Val & Christian Hansen, 2013. "Program evaluation with high-dimensional data," CeMMAP working papers CWP57/13, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    16. Jushan Bai & Serena Ng, 2009. "Boosting diffusion indices," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 24(4), pages 607-629.
    17. Damian Kozbur, 2013. "Inference in additively separable models with a high-dimensional set of conditioning variables," ECON - Working Papers 284, Department of Economics - University of Zurich, revised Apr 2018.
    18. Damian Kozbur, 2017. "Sharp convergence rates for forward regression in high-dimensional sparse linear models," ECON - Working Papers 253, Department of Economics - University of Zurich, revised Apr 2018.
    19. Newey, Whitney K, 1990. "Efficient Instrumental Variables Estimation of Nonlinear Models," Econometrica, Econometric Society, vol. 58(4), pages 809-837, July.
    20. Andrews, Donald W K, 1991. "Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimation," Econometrica, Econometric Society, vol. 59(3), pages 817-858, May.
    21. Douglas Staiger & James H. Stock, 1997. "Instrumental Variables Regression with Weak Instruments," Econometrica, Econometric Society, vol. 65(3), pages 557-586, May.
    22. Chao, John C. & Swanson, Norman R. & Hausman, Jerry A. & Newey, Whitney K. & Woutersen, Tiemen, 2012. "Asymptotic Distribution Of JIVE In A Heteroskedastic IV Regression With Many Instruments," Econometric Theory, Cambridge University Press, vol. 28(1), pages 42-86, February.
    23. Denis Chetverikov & . ., 2016. "On cross-validated Lasso," CeMMAP working papers CWP47/16, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    24. Andreas Steinhauer & Tobias Wuergler, 2010. "Leverage and covariance matrix estimation in finite-sample IV regressions," IEW - Working Papers 521, Institute for Empirical Research in Economics - University of Zurich.
    25. Chamberlain, Gary, 1987. "Asymptotic efficiency in estimation with conditional moment restrictions," Journal of Econometrics, Elsevier, vol. 34(3), pages 305-334, March.
    26. MacKinnon, James G. & White, Halbert, 1985. "Some heteroskedasticity-consistent covariance matrix estimators with improved finite sample properties," Journal of Econometrics, Elsevier, vol. 29(3), pages 305-325, September.
    27. Wang, Hansheng, 2009. "Forward Regression for Ultra-High Dimensional Variable Screening," Journal of the American Statistical Association, American Statistical Association, vol. 104(488), pages 1512-1524.
    28. Cun-Hui Zhang & Stephanie S. Zhang, 2014. "Confidence intervals for low dimensional parameters in high dimensional linear models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 76(1), pages 217-242, January.

    Citations



    Cited by:

    1. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    2. Sarra Houidi & Dominique Fourer & François Auger & Houda Ben Attia Sethom & Laurence Miègeville, 2021. "Comparative Evaluation of Non-Intrusive Load Monitoring Methods Using Relevant Features and Transfer Learning," Energies, MDPI, vol. 14(9), pages 1-28, May.
    3. Peter C. B. Phillips & Zhentao Shi, 2021. "Boosting: Why You Can Use The HP Filter," International Economic Review, Department of Economics, University of Pennsylvania and Osaka University Institute of Social and Economic Research Association, vol. 62(2), pages 521-570, May.
    4. Hansen, Christian & Liao, Yuan, 2019. "The Factor-Lasso And K-Step Bootstrap Approach For Inference In High-Dimensional Economic Applications," Econometric Theory, Cambridge University Press, vol. 35(3), pages 465-509, June.
    5. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2016. "Double/Debiased Machine Learning for Treatment and Causal Parameters," Papers 1608.00060, arXiv.org, revised Dec 2017.
    6. Damian Kozbur, 2017. "Testing-Based Forward Model Selection," American Economic Review, American Economic Association, vol. 107(5), pages 266-269, May.
    7. Peter C.B. Phillips & Zhentao Shi, 2019. "Boosting the Hodrick-Prescott Filter," Cowles Foundation Discussion Papers 2192, Cowles Foundation for Research in Economics, Yale University.
    8. Jooyoung Cha & Harold D. Chiang & Yuya Sasaki, 2021. "Inference in high-dimensional regression models without the exact or $L^p$ sparsity," Papers 2108.09520, arXiv.org, revised Dec 2022.
    9. Damian Kozbur, 2017. "Sharp convergence rates for forward regression in high-dimensional sparse linear models," ECON - Working Papers 253, Department of Economics - University of Zurich, revised Apr 2018.
    10. Zhentao Shi & Jingyi Huang, 2019. "Forward-Selected Panel Data Approach for Program Evaluation," Papers 1908.05894, arXiv.org, revised Apr 2021.
    11. Shi, Zhentao & Huang, Jingyi, 2023. "Forward-selected panel data approach for program evaluation," Journal of Econometrics, Elsevier, vol. 234(2), pages 512-535.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Christian Hansen & Damian Kozbur & Sanjog Misra, 2016. "Targeted undersmoothing," ECON - Working Papers 282, Department of Economics - University of Zurich, revised Apr 2018.
    2. Alexandre Belloni & Victor Chernozhukov & Christian Hansen & Damian Kozbur, 2016. "Inference in High-Dimensional Panel Models With an Application to Gun Control," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 34(4), pages 590-605, October.
    3. Hansen, Christian & Liao, Yuan, 2019. "The Factor-Lasso And K-Step Bootstrap Approach For Inference In High-Dimensional Economic Applications," Econometric Theory, Cambridge University Press, vol. 35(3), pages 465-509, June.
    4. Damian Kozbur, 2020. "Analysis of Testing‐Based Forward Model Selection," Econometrica, Econometric Society, vol. 88(5), pages 2147-2173, September.
    5. Alena Skolkova, 2023. "Instrumental Variable Estimation with Many Instruments Using Elastic-Net IV," CERGE-EI Working Papers wp759, The Center for Economic Research and Graduate Education - Economics Institute, Prague.
    6. Damian Kozbur, 2013. "Inference in additively separable models with a high-dimensional set of conditioning variables," ECON - Working Papers 284, Department of Economics - University of Zurich, revised Apr 2018.
    7. Mardi Dungey & Vitali Alexeev & Jing Tian & Alastair R. Hall, 2015. "Econometricians Have Their Moments: GMM at 32," The Economic Record, The Economic Society of Australia, vol. 91, pages 1-24, June.
    8. Alastair R. Hall, 2015. "Econometricians Have Their Moments: GMM at 32," The Economic Record, The Economic Society of Australia, vol. 91(S1), pages 1-24, June.
    9. Shi, Zhentao & Huang, Jingyi, 2023. "Forward-selected panel data approach for program evaluation," Journal of Econometrics, Elsevier, vol. 234(2), pages 512-535.
    10. Damian Kozbur, 2017. "Sharp convergence rates for forward regression in high-dimensional sparse linear models," ECON - Working Papers 253, Department of Economics - University of Zurich, revised Apr 2018.
    11. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-dimensional econometrics and regularized GMM," CeMMAP working papers CWP35/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    12. A. Belloni & D. Chen & V. Chernozhukov & C. Hansen, 2012. "Sparse Models and Methods for Optimal Instruments With an Application to Eminent Domain," Econometrica, Econometric Society, vol. 80(6), pages 2369-2429, November.
    13. Victor Chernozhukov & Christian Hansen & Martin Spindler, 2015. "Post-Selection and Post-Regularization Inference in Linear Models with Many Controls and Instruments," American Economic Review, American Economic Association, vol. 105(5), pages 486-490, May.
    14. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    15. Farrell, Max H., 2015. "Robust inference on average treatment effects with possibly more covariates than observations," Journal of Econometrics, Elsevier, vol. 189(1), pages 1-23.
    16. Hansen, Christian & Kozbur, Damian, 2014. "Instrumental variables estimation with many weak instruments using regularized JIVE," Journal of Econometrics, Elsevier, vol. 182(2), pages 290-308.
    17. Gold, David & Lederer, Johannes & Tao, Jing, 2020. "Inference for high-dimensional instrumental variables regression," Journal of Econometrics, Elsevier, vol. 217(1), pages 79-111.
    18. Alexandre Belloni & Victor Chernozhukov & Kengo Kato, 2019. "Valid Post-Selection Inference in High-Dimensional Approximately Sparse Quantile Regression Models," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 114(526), pages 749-758, April.
    19. Jushan Bai & Sung Hoon Choi & Yuan Liao, 2021. "Feasible generalized least squares for panel data with cross-sectional and serial correlations," Empirical Economics, Springer, vol. 60(1), pages 309-326, January.
    20. Danquah, Michael & Iddrisu, Abdul Malik & Boakye, Ernest Owusu & Owusu, Solomon, 2021. "Do gender wage differences within households influence women's empowerment and welfare? Evidence from Ghana," Journal of Economic Behavior & Organization, Elsevier, vol. 188(C), pages 916-932.

    More about this item

    Keywords

    Model selection; forward regression; sparsity; hypothesis testing;

    JEL classification:

    • C55 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Large Data Sets: Modeling and Analysis

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.