
Inference in Linear Regression Models with Many Covariates and Heteroskedasticity

Author

Listed:
  • Matias D. Cattaneo
  • Michael Jansson
  • Whitney K. Newey

Abstract

The linear regression model is widely used in empirical work in Economics, Statistics, and many other disciplines. Researchers often include many covariates in their linear model specification in an attempt to control for confounders. We give inference methods that allow for many covariates and heteroskedasticity. Our results are obtained using high-dimensional approximations, where the number of included covariates is allowed to grow as fast as the sample size. We find that all of the usual versions of Eicker-White heteroskedasticity consistent standard error estimators for linear models are inconsistent under these asymptotics. We then propose a new heteroskedasticity consistent standard error formula that is fully automatic and robust to both (conditional) heteroskedasticity of unknown form and the inclusion of possibly many covariates. We apply our findings to three settings: parametric linear models with many covariates, linear panel models with many fixed effects, and semiparametric semi-linear models with many technical regressors. Simulation evidence consistent with our theoretical results is also provided. The proposed methods are also illustrated with an empirical application.
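
For orientation, the "usual" Eicker-White estimator that the abstract refers to can be sketched as follows. This is a minimal illustration of the classical HC0 sandwich formula only, not the new formula proposed in the paper, which this record does not spell out. The function name `hc0_standard_errors`, the simulated design, and the choice of 50 covariates for 500 observations are all assumptions made for the example.

```python
import numpy as np


def hc0_standard_errors(X, y):
    """OLS coefficients with classical Eicker-White (HC0) standard errors.

    Hypothetical helper for illustration; implements the textbook sandwich
    (X'X)^{-1} X' diag(u_hat^2) X (X'X)^{-1}, not the paper's proposal.
    """
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # "Meat" of the sandwich: sum over i of u_hat_i^2 * x_i x_i'
    meat = X.T @ (X * resid[:, None] ** 2)
    cov = XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(cov))


# Simulated design (assumed): intercept plus 50 covariates, n = 500,
# with error variance depending on the first covariate.
rng = np.random.default_rng(0)
n, k = 500, 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
sigma = 1.0 + np.abs(X[:, 1])
y = 2.0 * X[:, 1] + sigma * rng.normal(size=n)

beta_hat, se_hc0 = hc0_standard_errors(X, y)
```

According to the abstract, standard errors of this kind are inconsistent when the number of covariates grows as fast as the sample size, which is the setting the proposed formula is designed to handle.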

Suggested Citation

  • Matias D. Cattaneo & Michael Jansson & Whitney K. Newey, 2015. "Inference in Linear Regression Models with Many Covariates and Heteroskedasticity," Papers 1507.02493, arXiv.org, revised Jan 2017.
  • Handle: RePEc:arx:papers:1507.02493

    Download full text from publisher

    File URL: http://arxiv.org/pdf/1507.02493
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Stock, James H. & Watson, Mark, 2008. "Heteroskedasticity-Robust Standard Errors for Fixed Effects Panel Data Regression," Scholarly Articles 28461843, Harvard University Department of Economics.
    2. Ulrich K. Müller, 2013. "Risk of Bayesian Inference in Misspecified Models, and the Sandwich Covariance Matrix," Econometrica, Econometric Society, vol. 81(5), pages 1805-1849, September.
    3. Koenker, Roger, 1988. "Asymptotic Theory and Econometric Practice," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 3(2), pages 139-147, April.
    4. A. Belloni & V. Chernozhukov & I. Fernández‐Val & C. Hansen, 2017. "Program Evaluation and Causal Inference With High‐Dimensional Data," Econometrica, Econometric Society, vol. 85, pages 233-298, January.
    5. Alexandre Belloni & Victor Chernozhukov & Christian Hansen, 2014. "Inference on Treatment Effects after Selection among High-Dimensional Controls," Review of Economic Studies, Oxford University Press, vol. 81(2), pages 608-650.
    6. Belloni, Alexandre & Chernozhukov, Victor & Chetverikov, Denis & Kato, Kengo, 2015. "Some new asymptotic theory for least squares series: Pointwise and uniform results," Journal of Econometrics, Elsevier, vol. 186(2), pages 345-366.
    7. White, Halbert, 1980. "A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity," Econometrica, Econometric Society, vol. 48(4), pages 817-838, May.
    8. Chesher, Andrew & Jewitt, Ian, 1987. "The Bias of a Heteroskedasticity Consistent Covariance Matrix Estimator," Econometrica, Econometric Society, vol. 55(5), pages 1217-1222, September.
    9. Joshua Angrist & Jinyong Hahn, 2004. "When to Control for Covariates? Panel Asymptotics for Estimates of Treatment Effects," The Review of Economics and Statistics, MIT Press, vol. 86(1), pages 58-72, February.
    10. Goncalves, Silvia & White, Halbert, 2005. "Bootstrap Standard Error Estimates for Linear Regression," Journal of the American Statistical Association, American Statistical Association, vol. 100, pages 970-979, September.
    11. Kline, Patrick & Santos, Andres, 2012. "Higher order properties of the wild bootstrap under misspecification," Journal of Econometrics, Elsevier, vol. 171(1), pages 54-70.
    12. Donald, S. G. & Newey, W. K., 1994. "Series Estimation of Semilinear Models," Journal of Multivariate Analysis, Elsevier, vol. 50(1), pages 30-40, July.
    13. Newey, Whitney K., 1997. "Convergence rates and asymptotic normality for series estimators," Journal of Econometrics, Elsevier, vol. 79(1), pages 147-168, July.
    14. James H. Stock & Mark W. Watson, 2008. "Heteroskedasticity-Robust Standard Errors for Fixed Effects Panel Data Regression," Econometrica, Econometric Society, vol. 76(1), pages 155-174, January.
    15. MacKinnon, James G. & White, Halbert, 1985. "Some heteroskedasticity-consistent covariance matrix estimators with improved finite sample properties," Journal of Econometrics, Elsevier, vol. 29(3), pages 305-325, September.
    16. Farrell, Max H., 2015. "Robust inference on average treatment effects with possibly more covariates than observations," Journal of Econometrics, Elsevier, vol. 189(1), pages 1-23.
    17. Alberto Abadie & Guido W. Imbens & Fanyin Zheng, 2014. "Inference for Misspecified Models With Fixed Regressors," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 109(508), pages 1601-1614, December.
    18. Shurong Zheng & Dandan Jiang & Zhidong Bai & Xuming He, 2014. "Inference on multiple correlation coefficients with moderately high dimensional data," Biometrika, Biometrika Trust, vol. 101(3), pages 748-754.

    Citations

    Cited by:

    1. Chaohua Dong & Jiti Gao & Oliver Linton, 2017. "High dimensional semiparametric moment restriction models," Monash Econometrics and Business Statistics Working Papers 17/17, Monash University, Department of Econometrics and Business Statistics.
    2. Riccardo D'Adamo, 2018. "Cluster-robust Standard Errors for Linear Regression Models with Many Controls," Papers 1806.07314, arXiv.org.
    3. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-dimensional econometrics and regularized GMM," CeMMAP working papers CWP35/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    4. Chaohua Dong & Jiti Gao & Oliver Linton, 2018. "High dimensional semiparametric moment restriction models," CeMMAP working papers CWP04/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    5. Pei, Zhuan & Pischke, Jörn-Steffen & Schwandt, Hannes, 2017. "Poorly Measured Confounders Are More Useful on the Left Than on the Right," IZA Discussion Papers 10647, Institute for the Study of Labor (IZA).
    6. Patrick Kline & Raffaele Saggio & Mikkel Sølvsten, 2018. "Leave-out estimation of variance components," Papers 1806.01494, arXiv.org.
