Printed from https://ideas.repec.org/p/zur/econwp/283.html

Testing-Based Forward Model Selection

Author

Listed:
  • Damian Kozbur

Abstract

This paper introduces and analyzes a procedure called Testing-Based Forward Model Selection (TBFMS) for linear regression problems. The procedure inductively selects covariates that add predictive power to a working statistical model before estimating a final regression. The criterion for deciding which covariate to include next, and when to stop including covariates, is derived from a profile of traditional statistical hypothesis tests. The paper proves probabilistic bounds on the prediction error and on the number of selected covariates; these bounds depend on the quality of the tests. The bounds are then specialized to the case of heteroskedastic data, with tests derived from Huber-Eicker-White standard errors. The performance of TBFMS is compared to that of Lasso and Post-Lasso in simulation studies. TBFMS is then analyzed as a component of larger post-model-selection estimation problems for structural economic parameters. Finally, TBFMS is illustrated in an empirical application to estimating the determinants of economic growth.
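
The selection loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's exact algorithm. The function names (`robust_t_stat`, `forward_select`), the rule of adding the candidate with the largest absolute robust t-statistic, the fixed `threshold` of 4.0, and the simulated data are all assumptions made for the example. The paper derives its own test profile and stopping rule, which may differ.

```python
import numpy as np


def robust_t_stat(X, y, j):
    """t-statistic for column j in an OLS fit of y on X, using a
    heteroskedasticity-robust (White, HC0) sandwich covariance."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    bread = np.linalg.pinv(X.T @ X)
    meat = X.T @ (X * (resid ** 2)[:, None])
    cov = bread @ meat @ bread
    return beta[j] / np.sqrt(cov[j, j])


def forward_select(X, y, threshold, max_steps=None):
    """Greedy forward selection: at each step, test every unselected
    covariate when added to the working model, add the one with the
    largest |t| if it exceeds `threshold`, and stop otherwise.
    The threshold is an assumed tuning value, not the paper's choice."""
    n, p = X.shape
    max_steps = p if max_steps is None else max_steps
    selected = []
    while len(selected) < max_steps:
        best_j, best_t = None, threshold
        for j in range(p):
            if j in selected:
                continue
            cols = selected + [j]
            # The candidate is the last column of the trial design matrix.
            t = abs(robust_t_stat(X[:, cols], y, len(cols) - 1))
            if t > best_t:
                best_j, best_t = j, t
        if best_j is None:  # no remaining covariate passes the test
            break
        selected.append(best_j)
    # Final regression on the selected covariates only.
    if selected:
        beta, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
    else:
        beta = np.zeros(0)
    return selected, beta


# Simulated sparse design with heteroskedastic noise (illustrative only).
rng = np.random.default_rng(0)
n, p = 500, 20
X = rng.standard_normal((n, p))
noise = rng.standard_normal(n) * (0.5 + np.abs(X[:, 0]))
y = 2.0 * X[:, 0] - 3.0 * X[:, 3] + noise

selected, beta = forward_select(X, y, threshold=4.0)
print("selected columns:", selected)
print("post-selection coefficients:", beta)
```

In this sketch the final coefficients come from an unpenalized regression on the selected columns, mirroring the "select, then estimate" structure in the abstract. A fixed threshold is used only for simplicity.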

Suggested Citation

  • Damian Kozbur, 2015. "Testing-Based Forward Model Selection," ECON - Working Papers 283, Department of Economics - University of Zurich, revised Apr 2018.
  • Handle: RePEc:zur:econwp:283

    Download full text from publisher

    File URL: http://www.econ.uzh.ch/static/wp/econwp283.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Alexandre Belloni & Victor Chernozhukov & Christian Hansen & Damian Kozbur, 2016. "Inference in High-Dimensional Panel Models With an Application to Gun Control," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 34(4), pages 590-605, October.
    2. Alexandre Belloni & Victor Chernozhukov & Christian Hansen, 2014. "High-Dimensional Methods and Inference on Structural and Treatment Effects," Journal of Economic Perspectives, American Economic Association, vol. 28(2), pages 29-50, Spring.
    3. Hansen, Christian & Kozbur, Damian, 2014. "Instrumental variables estimation with many weak instruments using regularized JIVE," Journal of Econometrics, Elsevier, vol. 182(2), pages 290-308.
    4. Damian Kozbur, 2013. "Inference in additively separable models with a high-dimensional set of conditioning variables," ECON - Working Papers 284, Department of Economics - University of Zurich, revised Apr 2018.
    5. Damian Kozbur, 2017. "Sharp convergence rates for forward regression in high-dimensional sparse linear models," ECON - Working Papers 253, Department of Economics - University of Zurich, revised Apr 2018.
    6. Newey, Whitney K, 1990. "Efficient Instrumental Variables Estimation of Nonlinear Models," Econometrica, Econometric Society, vol. 58(4), pages 809-837, July.
    7. Andrews, Donald W K, 1991. "Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimation," Econometrica, Econometric Society, vol. 59(3), pages 817-858, May.
    8. A. Belloni & D. Chen & V. Chernozhukov & C. Hansen, 2012. "Sparse Models and Methods for Optimal Instruments With an Application to Eminent Domain," Econometrica, Econometric Society, vol. 80(6), pages 2369-2429, November.
    9. Bai, Jushan & Ng, Serena, 2008. "Forecasting economic time series using targeted predictors," Journal of Econometrics, Elsevier, vol. 146(2), pages 304-317, October.
    10. Douglas Staiger & James H. Stock, 1997. "Instrumental Variables Regression with Weak Instruments," Econometrica, Econometric Society, vol. 65(3), pages 557-586, May.
    11. Damian Kozbur, 2017. "Testing-Based Forward Model Selection," American Economic Review, American Economic Association, vol. 107(5), pages 266-269, May.
    12. Alexandre Belloni & Victor Chernozhukov & Christian Hansen, 2014. "Inference on Treatment Effects after Selection among High-Dimensional Controls," Review of Economic Studies, Oxford University Press, vol. 81(2), pages 608-650.
    13. Amemiya, Takeshi, 1974. "The nonlinear two-stage least-squares estimator," Journal of Econometrics, Elsevier, vol. 2(2), pages 105-110, July.
    14. Chao, John C. & Swanson, Norman R. & Hausman, Jerry A. & Newey, Whitney K. & Woutersen, Tiemen, 2012. "Asymptotic Distribution of JIVE in a Heteroskedastic IV Regression with Many Instruments," Econometric Theory, Cambridge University Press, vol. 28(01), pages 42-86, February.
    15. Andreas Steinhauer & Tobias Wuergler, 2010. "Leverage and covariance matrix estimation in finite-sample IV regressions," IEW - Working Papers 521, Institute for Empirical Research in Economics - University of Zurich.
    16. Daron Acemoglu & Simon Johnson & James A. Robinson, 2001. "The Colonial Origins of Comparative Development: An Empirical Investigation," American Economic Review, American Economic Association, vol. 91(5), pages 1369-1401, December.
    17. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney K. Newey, 2016. "Double machine learning for treatment and causal parameters," CeMMAP working papers CWP49/16, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    18. Leeb, Hannes & Pötscher, Benedikt M., 2008. "Can One Estimate The Unconditional Distribution Of Post-Model-Selection Estimators?," Econometric Theory, Cambridge University Press, vol. 24(02), pages 338-376, April.
    19. White, Halbert, 1980. "A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity," Econometrica, Econometric Society, vol. 48(4), pages 817-838, May.
    20. Chamberlain, Gary, 1987. "Asymptotic efficiency in estimation with conditional moment restrictions," Journal of Econometrics, Elsevier, vol. 34(3), pages 305-334, March.
    21. Alexandre Belloni & Victor Chernozhukov & Ivan Fernandez-Val & Christian Hansen, 2013. "Program evaluation with high-dimensional data," CeMMAP working papers CWP77/13, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    22. Knight, Keith, 2008. "Shrinkage Estimation For Nearly Singular Designs," Econometric Theory, Cambridge University Press, vol. 24(02), pages 323-337, April.
    23. Newey, Whitney & West, Kenneth, 2014. "A simple, positive semi-definite, heteroscedasticity and autocorrelation consistent covariance matrix," Applied Econometrics, Publishing House "SINERGIA PRESS", vol. 33(1), pages 125-132.
    24. MacKinnon, James G. & White, Halbert, 1985. "Some heteroskedasticity-consistent covariance matrix estimators with improved finite sample properties," Journal of Econometrics, Elsevier, vol. 29(3), pages 305-325, September.
    25. Jushan Bai & Serena Ng, 2009. "Boosting diffusion indices," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 24(4), pages 607-629.
    26. Wang, Hansheng, 2009. "Forward Regression for Ultra-High Dimensional Variable Screening," Journal of the American Statistical Association, American Statistical Association, vol. 104(488), pages 1512-1524.
    27. Cun-Hui Zhang & Stephanie S. Zhang, 2014. "Confidence intervals for low dimensional parameters in high dimensional linear models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 76(1), pages 217-242, January.

    Citations

    Cited by:

    1. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    2. Hansen, Christian & Liao, Yuan, 2016. "The Factor-Lasso and K-Step Bootstrap Approach for Inference in High-Dimensional Economic Applications," MPRA Paper 75313, University Library of Munich, Germany.
    3. Christian Hansen & Yuan Liao, 2016. "The Factor-Lasso and K-Step Bootstrap Approach for Inference in High-Dimensional Economic Applications," Papers 1611.09420, arXiv.org, revised Dec 2016.
    4. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2016. "Double/Debiased Machine Learning for Treatment and Causal Parameters," Papers 1608.00060, arXiv.org, revised Dec 2017.
    5. Damian Kozbur, 2017. "Testing-Based Forward Model Selection," American Economic Review, American Economic Association, vol. 107(5), pages 266-269, May.
    6. Damian Kozbur, 2017. "Sharp convergence rates for forward regression in high-dimensional sparse linear models," ECON - Working Papers 253, Department of Economics - University of Zurich, revised Apr 2018.
    7. Christian Hansen & Yuan Liao, 2016. "The Factor-Lasso and K-Step Bootstrap Approach for Inference in High-Dimensional Economic Applications," Departmental Working Papers 201610, Rutgers University, Department of Economics.

    More about this item

    Keywords

    Model selection; forward regression; sparsity; hypothesis testing

    JEL classification:

    • C55 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Large Data Sets: Modeling and Analysis
