
Inference in linear regression models with many covariates and heteroskedasticity

Author

Listed:
  • Matias Cattaneo
  • Michael Jansson
  • Whitney K. Newey

Abstract

The linear regression model is widely used in empirical work in Economics, Statistics, and many other disciplines. Researchers often include many covariates in their linear model specification in an attempt to control for confounders. We give inference methods that allow for many covariates and heteroskedasticity. Our results are obtained using high-dimensional approximations, where the number of included covariates is allowed to grow as fast as the sample size. We find that all of the usual versions of Eicker-White heteroskedasticity-consistent standard error estimators for linear models are inconsistent under these asymptotics. We then propose a new heteroskedasticity-consistent standard error formula that is fully automatic and robust to both (conditional) heteroskedasticity of unknown form and the inclusion of possibly many covariates. We apply our findings to three settings: parametric linear models with many covariates, linear panel models with many fixed effects, and semiparametric semi-linear models with many technical regressors. Simulation evidence consistent with our theoretical results is also provided. The proposed methods are also illustrated with an empirical application.
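
For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of the textbook Eicker-White (HC0) standard error for a single coefficient of interest after partialling out a block of control covariates. It is not the new formula proposed in the paper, which the abstract does not state. The function name, the simulated data, the sample size `n`, the number of controls `k`, and the random seed are all illustrative assumptions.

```python
import numpy as np

def hc0_standard_error(y, x, W):
    """Textbook HC0 standard error for the coefficient on x, controlling for W.

    Illustrative sketch only: this is the classical Eicker-White estimator,
    not the corrected formula proposed in the paper.
    """
    # Partial W out of y and x (Frisch-Waugh-Lovell).
    coef_y, *_ = np.linalg.lstsq(W, y, rcond=None)
    coef_x, *_ = np.linalg.lstsq(W, x, rcond=None)
    y_res = y - W @ coef_y
    v = x - W @ coef_x

    # OLS coefficient on x and the corresponding residuals.
    beta = (v @ y_res) / (v @ v)
    u = y_res - beta * v

    # Sandwich variance: (sum v^2)^-1 * sum(v^2 u^2) * (sum v^2)^-1.
    var = np.sum(v**2 * u**2) / (v @ v) ** 2
    return beta, np.sqrt(var)

# Hypothetical simulated design with many controls relative to n.
rng = np.random.default_rng(0)
n, k = 500, 100
W = np.column_stack([np.ones(n), rng.standard_normal((n, k - 1))])
x = rng.standard_normal(n)
errors = rng.standard_normal(n) * (1.0 + np.abs(x))  # heteroskedastic errors
y = 1.0 * x + errors

beta_hat, se_hc0 = hc0_standard_error(y, x, W)
print(f"beta_hat = {beta_hat:.3f}, HC0 standard error = {se_hc0:.3f}")
```

In this sketch the ratio `k / n` is not negligible, which is the regime in which the abstract reports that the usual Eicker-White estimators are inconsistent.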

Suggested Citation

  • Matias Cattaneo & Michael Jansson & Whitney K. Newey, 2017. "Inference in linear regression models with many covariates and heteroskedasticity," CeMMAP working papers 03/17, Institute for Fiscal Studies.
  • Handle: RePEc:azt:cemmap:03/17
    DOI: 10.1920/wp/cem.2017.0317

    Download full text from publisher

    File URL: https://www.cemmap.ac.uk/wp-content/uploads/2020/08/CWP0317.pdf
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Cattaneo, Matias D. & Jansson, Michael & Newey, Whitney K., 2018. "Alternative Asymptotics And The Partially Linear Model With Many Regressors," Econometric Theory, Cambridge University Press, vol. 34(2), pages 277-301, April.
    2. repec:cdl:ucsdec:qt86c7x315 is not listed on IDEAS
    3. repec:cdl:econwp:qt86c7x315 is not listed on IDEAS
    4. Yang Ning & Sida Peng & Jing Tao, 2020. "Doubly Robust Semiparametric Difference-in-Differences Estimators with High-Dimensional Data," Papers 2009.03151, arXiv.org.
    5. Byunghoon Kang, 2018. "Inference in Nonparametric Series Estimation with Specification Searches for the Number of Series Terms," Working Papers 240829404, Lancaster University Management School, Economics Department.
    6. Dong, Chaohua & Gao, Jiti & Linton, Oliver, 2023. "High dimensional semiparametric moment restriction models," Journal of Econometrics, Elsevier, vol. 232(2), pages 320-345.
    7. Qiu, Chen & Otsu, Taisuke, 2022. "Information theoretic approach to high dimensional multiplicative models: stochastic discount factor and treatment effect," LSE Research Online Documents on Economics 110494, London School of Economics and Political Science, LSE Library.
    8. Byunghoon Kang, 2019. "Inference in Nonparametric Series Estimation with Specification Searches for the Number of Series Terms," Papers 1909.12162, arXiv.org, revised Feb 2020.
    9. Jochmans, K., 2019. "Heteroskedasticity-Robust Inference in Linear Regression Models," Cambridge Working Papers in Economics 1957, Faculty of Economics, University of Cambridge.
    10. Byunghoon Kang, 2017. "Inference in Nonparametric Series Estimation with Data-Dependent Undersmoothing," Working Papers 170712442, Lancaster University Management School, Economics Department.
    11. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP54/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    12. Pötscher, Benedikt M. & Preinerstorfer, David, 2023. "How Reliable Are Bootstrap-Based Heteroskedasticity Robust Tests?," Econometric Theory, Cambridge University Press, vol. 39(4), pages 789-847, August.
    13. Kyle Colangelo & Ying-Ying Lee, 2020. "Double Debiased Machine Learning Nonparametric Inference with Continuous Treatments," Papers 2004.03036, arXiv.org, revised Sep 2023.
    14. Romano, Joseph P. & Wolf, Michael, 2017. "Resurrecting weighted least squares," Journal of Econometrics, Elsevier, vol. 197(1), pages 1-19.
    15. Difang Huang & Jiti Gao & Tatsushi Oka, 2025. "Semiparametric single-index estimation for average treatment effects," Econometric Reviews, Taylor & Francis Journals, vol. 44(6), pages 843-885, July.
    16. Achim Ahrens & Christian B. Hansen & Mark E. Schaffer & Thomas Wiemann, 2024. "ddml: Double/debiased machine learning in Stata," Stata Journal, StataCorp LLC, vol. 24(1), pages 3-45, March.
    17. Farrell, Max H., 2015. "Robust inference on average treatment effects with possibly more covariates than observations," Journal of Econometrics, Elsevier, vol. 189(1), pages 1-23.
    18. Pötscher, Benedikt M. & Preinerstorfer, David, 2021. "Valid Heteroskedasticity Robust Testing," MPRA Paper 117855, University Library of Munich, Germany, revised Jul 2023.
    19. Sin, C.Y. (Chor-yiu) & Lee, Cheng-Few, 2021. "Using heteroscedasticity-non-consistent or heteroscedasticity-consistent variances in linear regression," Econometrics and Statistics, Elsevier, vol. 18(C), pages 117-142.
    20. Alexandre Belloni & Victor Chernozhukov & Christian Hansen, 2011. "Inference on Treatment Effects After Selection Amongst High-Dimensional Controls," Papers 1201.0224, arXiv.org, revised May 2012.
    21. Victor Chernozhukov & Juan Carlos Escanciano & Hidehiko Ichimura & Whitney K. Newey & James M. Robins, 2022. "Locally Robust Semiparametric Estimation," Econometrica, Econometric Society, vol. 90(4), pages 1501-1535, July.
    22. Young, Alwyn, 2019. "Channeling Fisher: randomization tests and the statistical insignificance of seemingly significant experimental results," LSE Research Online Documents on Economics 101401, London School of Economics and Political Science, LSE Library.
