
When and How to Deal with Clustered Errors in Regression Models

Authors

  • James G. MacKinnon (Queen's University)
  • Matthew D. Webb (Carleton University)

Abstract

We discuss when and how to deal with possibly clustered errors in linear regression models. Specifically, we discuss situations in which a regression model may plausibly be treated as having error terms that are arbitrarily correlated within known clusters but uncorrelated across them. The methods we discuss include various covariance matrix estimators, possibly combined with various methods of obtaining critical values, several bootstrap procedures, and randomization inference. Special attention is given to models with few treated clusters and clusters that vary in size, where inference may be problematic. Two empirical examples and a simulation experiment illustrate the methods we discuss and the concerns we raise.
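To make the idea of a cluster-robust covariance matrix estimator concrete, here is a minimal NumPy sketch of one common textbook form, the "CV1" sandwich estimator with the usual G/(G-1) × (N-1)/(N-k) small-sample factor. It is an illustration under stated assumptions, not necessarily the exact estimator or notation used in the paper; the function name `cluster_robust_cov` and the simulated data are hypothetical.

```python
import numpy as np


def cluster_robust_cov(X, y, cluster_ids):
    """Return OLS coefficients and a CV1-style cluster-robust covariance matrix.

    Assumes errors may be arbitrarily correlated within clusters but are
    uncorrelated across clusters. Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    cluster_ids = np.asarray(cluster_ids)
    n, k = X.shape

    # OLS estimates and residuals.
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ (X.T @ y)
    resid = y - X @ beta

    # "Meat" of the sandwich: sum over clusters of the outer product of
    # each cluster's score vector X_g' u_g.
    clusters = np.unique(cluster_ids)
    G = len(clusters)
    meat = np.zeros((k, k))
    for g in clusters:
        mask = cluster_ids == g
        score = X[mask].T @ resid[mask]
        meat += np.outer(score, score)

    # Small-sample correction commonly applied with this estimator.
    correction = (G / (G - 1)) * ((n - 1) / (n - k))
    cov = correction * XtX_inv @ meat @ XtX_inv
    return beta, cov


# Simulated example: 12 clusters of 5 observations each (made-up data).
rng = np.random.default_rng(0)
n_obs = 60
X_demo = np.column_stack([np.ones(n_obs), rng.normal(size=n_obs)])
y_demo = X_demo @ np.array([1.0, 2.0]) + rng.normal(size=n_obs)
ids_demo = np.repeat(np.arange(12), 5)

beta_hat, cov_hat = cluster_robust_cov(X_demo, y_demo, ids_demo)
std_errors = np.sqrt(np.diag(cov_hat))
```

When every observation forms its own cluster, this expression reduces to the familiar heteroskedasticity-robust (HC1) estimator. With few clusters, few treated clusters, or clusters of very different sizes, standard errors computed this way may be unreliable, which is the situation the abstract singles out for bootstrap and randomization-inference methods.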

Suggested Citation

  • James G. MacKinnon & Matthew D. Webb, 2019. "When and How to Deal with Clustered Errors in Regression Models," Working Paper 1421, Economics Department, Queen's University.
  • Handle: RePEc:qed:wpaper:1421

    Download full text from publisher

    File URL: https://www.econ.queensu.ca/sites/econ.queensu.ca/files/wpaper/qed_wp_1421.pdf
    File Function: First version 2019
    Download Restriction: no


    Citations

    Cited by:

    1. Bruno Ferman, 2019. "A simple way to assess inference methods," Papers 1912.08772, arXiv.org, revised Dec 2019.

    More about this item

    Keywords

    clustered data; cluster-robust variance estimator; CRVE; wild cluster bootstrap; robust inference;

    JEL classification:

    • C15 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Statistical Simulation Methods: General
    • C21 - Mathematical and Quantitative Methods - - Single Equation Models; Single Variables - - - Cross-Sectional Models; Spatial Models; Treatment Effect Models
    • C23 - Mathematical and Quantitative Methods - - Single Equation Models; Single Variables - - - Models with Panel Data; Spatio-temporal Models
