IDEAS home Printed from https://ideas.repec.org/p/nbr/nberwo/24003.html
   My bibliography  Save this paper

When Should You Adjust Standard Errors for Clustering?

Author

Listed:
  • Alberto Abadie
  • Susan Athey
  • Guido W. Imbens
  • Jeffrey Wooldridge

Abstract

In empirical work in economics it is common to report standard errors that account for clustering of units. Typically, the motivation given for the clustering adjustments is that unobserved components in outcomes for units within clusters are correlated. However, because correlation may occur across more than one dimension, this motivation makes it difficult to justify why researchers use clustering in some dimensions, such as geographic, but not others, such as age cohorts or gender. This motivation also makes it difficult to explain why one should not cluster with data from a randomized experiment. In this paper, we argue that clustering is in essence a design problem, either a sampling design or an experimental design issue. It is a sampling design issue if sampling follows a two stage process where in the first stage, a subset of clusters were sampled randomly from a population of clusters, and in the second stage, units were sampled randomly from the sampled clusters. In this case the clustering adjustment is justified by the fact that there are clusters in the population that we do not see in the sample. Clustering is an experimental design issue if the assignment is correlated within the clusters. We take the view that this second perspective best fits the typical setting in economics where clustering adjustments are used. This perspective allows us to shed new light on three questions: (i) when should one adjust the standard errors for clustering, (ii) when is the conventional adjustment for clustering appropriate, and (iii) when does the conventional adjustment of the standard errors matter.

Suggested Citation

  • Alberto Abadie & Susan Athey & Guido W. Imbens & Jeffrey Wooldridge, 2017. "When Should You Adjust Standard Errors for Clustering?," NBER Working Papers 24003, National Bureau of Economic Research, Inc.
  • Handle: RePEc:nbr:nberwo:24003
    Note: AG CF CH DEV ED EEE HC HE LE LS PE POL
    as

    Download full text from publisher

    File URL: http://www.nber.org/papers/w24003.pdf
    Download Restriction: no
    ---><---

    Other versions of this item:

    References listed on IDEAS

    as
    1. Jeffrey M Wooldridge, 2010. "Econometric Analysis of Cross Section and Panel Data," MIT Press Books, The MIT Press, edition 2, volume 1, number 0262232588, September.
    2. Stephen G. Donald & Kevin Lang, 2007. "Inference with Difference-in-Differences and Other Panel Data," The Review of Economics and Statistics, MIT Press, vol. 89(2), pages 221-233, May.
    3. Marianne Bertrand & Esther Duflo & Sendhil Mullainathan, 2004. "How Much Should We Trust Differences-In-Differences Estimates?," The Quarterly Journal of Economics, Oxford University Press, vol. 119(1), pages 249-275.
    4. Moulton, Brent R., 1986. "Random group effects and the precision of regression estimates," Journal of Econometrics, Elsevier, vol. 32(3), pages 385-397, August.
    5. Kloek, T, 1981. "OLS Estimation in a Model Where a Microvariable Is Explained by Aggregates and Contemporaneous Disturbances Are Equicorrelated," Econometrica, Econometric Society, vol. 49(1), pages 205-207, January.
    6. Rustam Ibragimov & Ulrich K. Müller, 2016. "Inference with Few Heterogeneous Clusters," The Review of Economics and Statistics, MIT Press, vol. 98(1), pages 83-96, March.
    7. Arellano, M, 1987. "Computing Robust Standard Errors for Within-Groups Estimators," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 49(4), pages 431-434, November.
    8. Thomas Barrios & Rebecca Diamond & Guido W. Imbens & Michal Kolesár, 2012. "Clustering, Spatial Correlations, and Randomization Inference," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 107(498), pages 578-591, June.
    9. Jeffrey M. Wooldridge, 2003. "Cluster-Sample Methods in Applied Econometrics," American Economic Review, American Economic Association, vol. 93(2), pages 133-138, May.
    10. Conley, T. G., 1999. "GMM estimation with cross sectional dependence," Journal of Econometrics, Elsevier, vol. 92(1), pages 1-45, September.
    11. Imbens,Guido W. & Rubin,Donald B., 2015. "Causal Inference for Statistics, Social, and Biomedical Sciences," Cambridge Books, Cambridge University Press, number 9780521885881, January.
    12. Ibragimov, Rustam & Müller, Ulrich K., 2010. "t-Statistic Based Correlation and Heterogeneity Robust Inference," Journal of Business & Economic Statistics, American Statistical Association, vol. 28(4), pages 453-468.
    13. Moulton, Brent R, 1990. "An Illustration of a Pitfall in Estimating the Effects of Aggregate Variables on Micro Unit," The Review of Economics and Statistics, MIT Press, vol. 72(2), pages 334-338, May.
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Hansen, Bruce E. & Lee, Seojeong, 2019. "Asymptotic theory for clustered samples," Journal of Econometrics, Elsevier, vol. 210(2), pages 268-290.
    2. A. Colin Cameron & Douglas L. Miller, 2010. "Robust Inference with Clustered Data," Working Papers 318, University of California, Davis, Department of Economics.
    3. James G. MacKinnon & Matthew D. Webb, 2020. "When and How to Deal with Clustered Errors in Regression Models," Working Paper 1421, Economics Department, Queen's University.
    4. A. Colin Cameron & Douglas L. Miller, 2010. "Robust Inference with Clustered Data," Working Papers 106, University of California, Davis, Department of Economics.
    5. A. Colin Cameron & Jonah B. Gelbach & Douglas L. Miller, 2008. "Bootstrap-Based Improvements for Inference with Clustered Errors," The Review of Economics and Statistics, MIT Press, vol. 90(3), pages 414-427, August.
    6. Hagemann, Andreas, 2019. "Placebo inference on treatment effects when the number of clusters is small," Journal of Econometrics, Elsevier, vol. 213(1), pages 190-209.
    7. Ivan A. Canay & Andres Santos & Azeem M. Shaikh, 2018. "The wild bootstrap with a "small" number of "large" clusters," CeMMAP working papers CWP27/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    8. A. Colin Cameron & Jonah B. Gelbach & Douglas L. Miller, 2008. "Bootstrap-Based Improvements for Inference with Clustered Errors," The Review of Economics and Statistics, MIT Press, vol. 90(3), pages 414-427, August.
    9. Rok Spruk, 2019. "The rise and fall of Argentina," Latin American Economic Review, Springer;Centro de Investigaciòn y Docencia Económica (CIDE), vol. 28(1), pages 1-40, December.
    10. Michael Pollmann, 2020. "Causal Inference for Spatial Treatments," Papers 2011.00373, arXiv.org.
    11. Andreas Hagemann, 2019. "Permutation inference with a finite number of heterogeneous clusters," Papers 1907.01049, arXiv.org.
    12. Mitchell A. Petersen, 2009. "Estimating Standard Errors in Finance Panel Data Sets: Comparing Approaches," Review of Financial Studies, Society for Financial Studies, vol. 22(1), pages 435-480, January.
    13. Vikström, Johan, 2009. "Cluster sample inference using sensitivity analysis: the case with few groups," Working Paper Series 2009:15, IFAU - Institute for Evaluation of Labour Market and Education Policy.
    14. Miroslav Verbič & Rok Spruk, 2019. "Political economy of pension reforms: an empirical investigation," European Journal of Law and Economics, Springer, vol. 47(2), pages 171-232, April.
    15. Kim, Min Seong & Sun, Yixiao, 2013. "Heteroskedasticity and spatiotemporal dependence robust inference for linear panel models with fixed effects," Journal of Econometrics, Elsevier, vol. 177(1), pages 85-108.
    16. Cameron, A. Colin & Gelbach, Jonah B. & Miller, Douglas L., 2011. "Robust Inference With Multiway Clustering," Journal of Business & Economic Statistics, American Statistical Association, vol. 29(2), pages 238-249.
    17. Vogelsang, Timothy J., 2012. "Heteroskedasticity, autocorrelation, and spatial correlation robust inference in linear panel models with fixed-effects," Journal of Econometrics, Elsevier, vol. 166(2), pages 303-319.
    18. Sun, Yu & Yan, Karen X., 2019. "Inference on Difference-in-Differences average treatment effects: A fixed-b approach," Journal of Econometrics, Elsevier, vol. 211(2), pages 560-588.
    19. Bester, C. Alan & Conley, Timothy G. & Hansen, Christian B., 2011. "Inference with dependent data using cluster covariance estimators," Journal of Econometrics, Elsevier, vol. 165(2), pages 137-151.
    20. Jungmo Yoon & Antonio F. Galvao, 2020. "Cluster robust covariance matrix estimation in panel quantile regression with individual fixed effects," Quantitative Economics, Econometric Society, vol. 11(2), pages 579-608, May.

    More about this item

    JEL classification:

    • C21 - Mathematical and Quantitative Methods - - Single Equation Models; Single Variables - - - Cross-Sectional Models; Spatial Models; Treatment Effect Models

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:nbr:nberwo:24003. See general information about how to correct material in RePEc.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: (). General contact details of provider: http://edirc.repec.org/data/nberrus.html .

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service hosted by the Research Division of the Federal Reserve Bank of St. Louis . RePEc uses bibliographic data supplied by the respective publishers.