When Should You Adjust Standard Errors for Clustering?

Author

Listed:

Alberto Abadie
Susan Athey
Guido Imbens
Jeffrey Wooldridge

Abstract

In empirical work it is common to estimate parameters of models and report associated standard errors that account for "clustering" of units, where clusters are defined by factors such as geography. Clustering adjustments are typically motivated by the concern that unobserved components of outcomes for units within clusters are correlated. However, this motivation does not provide guidance about questions such as: (i) Why should we adjust standard errors for clustering in some situations but not others? How can we justify the common practice of clustering in observational studies but not randomized experiments, or clustering by state but not by gender? (ii) Why is conventional clustering a potentially conservative "all-or-nothing" adjustment, and are there alternative methods that respond to data and are less conservative? (iii) In what settings does the choice of whether and how to cluster make a difference? We address these questions using a framework of sampling and design inference. We argue that clustering can be needed to address sampling issues if sampling follows a two-stage process where, in the first stage, a subset of clusters is sampled from a population of clusters, and in the second stage, units are sampled from the sampled clusters. Then, clustered standard errors account for the existence of clusters in the population that we do not see in the sample. Clustering can be needed to account for design issues if treatment assignment is correlated with membership in a cluster. We propose new variance estimators to deal with intermediate settings where conventional cluster standard errors are unnecessarily conservative and robust standard errors are too small.
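
The contrast the abstract draws between robust and conventional cluster standard errors can be illustrated with a minimal sketch. It does not implement the new variance estimators proposed in the paper. It only compares the standard heteroskedasticity-robust (HC0) and Liang-Zeger cluster-robust (CR0) standard errors for a simple regression slope, without finite-sample corrections. The function names `slope_with_ses` and `simulate`, and the simulated design (treatment assigned at the cluster level plus a shared cluster shock), are assumptions made for illustration and do not come from the paper.

```python
import math
import random
from collections import defaultdict


def slope_with_ses(y, x, cluster):
    """OLS slope of y on x (with intercept) plus two standard errors.

    Returns (slope, robust_se, cluster_se): robust_se is the HC0
    heteroskedasticity-robust estimate, cluster_se is the Liang-Zeger
    CR0 estimate. No finite-sample corrections are applied.
    """
    n = len(y)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for yi, xi in zip(y, x)]

    # Contribution of each unit to the sampling error of the slope.
    score = [d * e for d, e in zip(dx, resid)]

    # HC0: units are treated as independent, so squared scores are summed.
    robust_var = sum(s * s for s in score) / sxx ** 2

    # CR0: scores are summed within each cluster before squaring, which
    # keeps the within-cluster cross products.
    totals = defaultdict(float)
    for s, g in zip(score, cluster):
        totals[g] += s
    cluster_var = sum(t * t for t in totals.values()) / sxx ** 2

    return slope, math.sqrt(robust_var), math.sqrt(cluster_var)


def simulate(n_clusters=50, units_per_cluster=20, seed=0):
    """Hypothetical data: treatment varies only across clusters."""
    rng = random.Random(seed)
    y, x, cluster = [], [], []
    for g in range(n_clusters):
        treated = g % 2                # assignment is constant within a cluster
        shock = rng.gauss(0.0, 1.0)    # unobserved component shared by the cluster
        for _ in range(units_per_cluster):
            y.append(1.0 * treated + shock + rng.gauss(0.0, 1.0))
            x.append(float(treated))
            cluster.append(g)
    return y, x, cluster


if __name__ == "__main__":
    y, x, cluster = simulate()
    slope, robust_se, cluster_se = slope_with_ses(y, x, cluster)
    print(f"slope={slope:.3f} robust_se={robust_se:.3f} cluster_se={cluster_se:.3f}")
```

In this assumed design the cluster standard error comes out larger than the robust one, because treatment is perfectly correlated with cluster membership and outcomes share a cluster-level component. If every unit is placed in its own cluster, the two estimates coincide.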

Suggested Citation

Alberto Abadie & Susan Athey & Guido Imbens & Jeffrey Wooldridge, 2017. "When Should You Adjust Standard Errors for Clustering?," Papers 1710.02926, arXiv.org, revised Sep 2022.

Handle: RePEc:arx:papers:1710.02926

Other versions of this item:

Alberto Abadie & Susan Athey & Guido W Imbens & Jeffrey M Wooldridge, 2023. "When Should You Adjust Standard Errors for Clustering?," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 138(1), pages 1-35.

Alberto Abadie & Susan Athey & Guido W. Imbens & Jeffrey Wooldridge, 2017. "When Should You Adjust Standard Errors for Clustering?," NBER Working Papers 24003, National Bureau of Economic Research, Inc.
Abadie, Alberto & Athey, Susan & Imbens, Guido W. & Wooldridge, Jeffrey, 2017. "When Should You Adjust Standard Errors for Clustering?," Research Papers repec:ecl:stabus:3596, Stanford University, Graduate School of Business.

References listed on IDEAS

James H. Stock & Mark W. Watson, 2008. "Heteroskedasticity-Robust Standard Errors for Fixed Effects Panel Data Regression," Econometrica, Econometric Society, vol. 76(1), pages 155-174, January.
- James H. Stock & Mark W. Watson, 2006. "Heteroskedasticity-Robust Standard Errors for Fixed Effects Panel Data Regression," NBER Technical Working Papers 0323, National Bureau of Economic Research, Inc.
- Stock, James H. & Watson, Mark, 2008. "Heteroskedasticity-Robust Standard Errors for Fixed Effects Panel Data Regression," Scholarly Articles 28461843, Harvard University Department of Economics.
Rustam Ibragimov & Ulrich K. Müller, 2016. "Inference with Few Heterogeneous Clusters," The Review of Economics and Statistics, MIT Press, vol. 98(1), pages 83-96, March.
Matthew Gentzkow & Jesse M. Shapiro, 2008. "Preschool Television Viewing and Adolescent Test Scores: Historical Evidence from the Coleman Study," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 123(1), pages 279-323.
Arellano, M, 1987. "Computing Robust Standard Errors for Within-Groups Estimators," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 49(4), pages 431-434, November.
Jeffrey M Wooldridge, 2010. "Econometric Analysis of Cross Section and Panel Data," MIT Press Books, The MIT Press, edition 2, volume 1, number 0262232588, December.
- Jeffrey M. Wooldridge, 2001. "Econometric Analysis of Cross Section and Panel Data," MIT Press Books, The MIT Press, edition 1, volume 1, number 0262232197, December.
Alberto Abadie & Susan Athey & Guido W. Imbens & Jeffrey M. Wooldridge, 2017. "Sampling-based vs. Design-based Uncertainty in Regression Analysis," Papers 1706.01778, arXiv.org, revised Jun 2019.
- Abadie, Alberto & Athey, Susan & Imbens, Guido W. & Wooldridge, Jeffrey M., 2017. "Sampling-Based vs. Design-Based Uncertainty in Regression Analysis," Research Papers 3349, Stanford University, Graduate School of Business.
Jessica Cohen & Pascaline Dupas, 2010. "Free Distribution or Cost-Sharing? Evidence from a Randomized Malaria Prevention Experiment," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 125(1), pages 1-45.
Thomas Barrios & Rebecca Diamond & Guido W. Imbens & Michal Kolesár, 2012. "Clustering, Spatial Correlations, and Randomization Inference," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 107(498), pages 578-591, June.
- Thomas Barrios & Rebecca Diamond & Guido W. Imbens & Michal Kolesar, 2010. "Clustering, Spatial Correlations and Randomization Inference," NBER Working Papers 15760, National Bureau of Economic Research, Inc.
Jeffrey M. Wooldridge, 2003. "Cluster-Sample Methods in Applied Econometrics," American Economic Review, American Economic Association, vol. 93(2), pages 133-138, May.
Stephen G. Donald & Kevin Lang, 2007. "Inference with Difference-in-Differences and Other Panel Data," The Review of Economics and Statistics, MIT Press, vol. 89(2), pages 221-233, May.
Hansen, Christian B., 2007. "Generalized least squares inference in panel and multilevel models with serial correlation and fixed effects," Journal of Econometrics, Elsevier, vol. 140(2), pages 670-694, October.
Marianne Bertrand & Esther Duflo & Sendhil Mullainathan, 2004. "How Much Should We Trust Differences-In-Differences Estimates?," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 119(1), pages 249-275.
- Marianne Bertrand & Esther Duflo & Sendhil Mullainathan, 2002. "How Much Should We Trust Differences-in-Differences Estimates?," NBER Working Papers 8841, National Bureau of Economic Research, Inc.
Conley, T. G., 1999. "GMM estimation with cross sectional dependence," Journal of Econometrics, Elsevier, vol. 92(1), pages 1-45, September.
Imbens, Guido W. & Rubin, Donald B., 2015. "Causal Inference for Statistics, Social, and Biomedical Sciences," Cambridge Books, Cambridge University Press, number 9780521885881.
Ibragimov, Rustam & Müller, Ulrich K., 2010. "t-Statistic Based Correlation and Heterogeneity Robust Inference," Journal of Business & Economic Statistics, American Statistical Association, vol. 28(4), pages 453-468.
Moulton, Brent R., 1986. "Random group effects and the precision of regression estimates," Journal of Econometrics, Elsevier, vol. 32(3), pages 385-397, August.
Moulton, Brent R, 1990. "An Illustration of a Pitfall in Estimating the Effects of Aggregate Variables on Micro Unit," The Review of Economics and Statistics, MIT Press, vol. 72(2), pages 334-338, May.
Kloek, T, 1981. "OLS Estimation in a Model Where a Microvariable Is Explained by Aggregates and Contemporaneous Disturbances Are Equicorrelated," Econometrica, Econometric Society, vol. 49(1), pages 205-207, January.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

A. Colin Cameron & Douglas L. Miller, 2010. "Robust Inference with Clustered Data," Working Papers 318, University of California, Davis, Department of Economics.
- Colin Cameron, 2011. "Robust inference with clustered data," Mexican Stata Users' Group Meetings 2011 07, Stata Users Group.
- A. Colin Cameron & Douglas L. Miller, 2010. "Robust Inference with Clustered Data," Working Papers 316, University of California, Davis, Department of Economics.
James G. MacKinnon & Matthew D. Webb, 2020. "When and How to Deal with Clustered Errors in Regression Models," Working Paper 1421, Economics Department, Queen's University.
A. Colin Cameron & Douglas L. Miller, 2010. "Robust Inference with Clustered Data," Working Papers 106, University of California, Davis, Department of Economics.
- Colin Cameron, 2011. "Robust inference with clustered data," Mexican Stata Users' Group Meetings 2011 07, Stata Users Group.
- A. Colin Cameron & Douglas L. Miller, 2010. "Robust Inference with Clustered Data," Working Papers 107, University of California, Davis, Department of Economics.
Hansen, Bruce E. & Lee, Seojeong, 2019. "Asymptotic theory for clustered samples," Journal of Econometrics, Elsevier, vol. 210(2), pages 268-290.
- Bruce E. Hansen & Seojeong Jay Lee, 2017. "Asymptotic Theory for Clustered Samples," Discussion Papers 2017-18, School of Economics, The University of New South Wales.
- Bruce E. Hansen & Seojeong Lee, 2019. "Asymptotic Theory for Clustered Samples," Papers 1902.01497, arXiv.org.
Michael Pollmann, 2020. "Causal Inference for Spatial Treatments," Papers 2011.00373, arXiv.org, revised Jan 2023.
A. Colin Cameron & Jonah B. Gelbach & Douglas L. Miller, 2008. "Bootstrap-Based Improvements for Inference with Clustered Errors," The Review of Economics and Statistics, MIT Press, vol. 90(3), pages 414-427, August.
- Jonah B. Gelbach & Doug Miller & A. Colin Cameron, 2006. "Bootstrap-Based Improvements for Inference with Clustered Errors," Working Papers 621, University of California, Davis, Department of Economics.
- A. Colin Cameron & Jonah B. Gelbach & Douglas L. Miller, 2007. "Bootstrap-Based Improvements for Inference with Clustered Errors," NBER Technical Working Papers 0344, National Bureau of Economic Research, Inc.
Matthew D. Webb, 2023. "Reworking wild bootstrap‐based inference for clustered errors," Canadian Journal of Economics/Revue canadienne d'économique, John Wiley & Sons, vol. 56(3), pages 839-858, August.
- Matthew D. Webb, 2014. "Reworking Wild Bootstrap Based Inference For Clustered Errors," Working Paper 1315, Economics Department, Queen's University.
Athey, Susan & Imbens, Guido W., 2022. "Design-based analysis in Difference-In-Differences settings with staggered adoption," Journal of Econometrics, Elsevier, vol. 226(1), pages 62-79.
- Susan Athey & Guido Imbens, 2018. "Design-based Analysis in Difference-In-Differences Settings with Staggered Adoption," Papers 1808.05293, arXiv.org, revised Sep 2018.
- Athey, Susan & Imbens, Guido W., 2018. "Design-based Analysis in Difference-In-Differences Settings with Staggered Adoption," Research Papers 3712, Stanford University, Graduate School of Business.
- Susan Athey & Guido W. Imbens, 2018. "Design-based Analysis in Difference-In-Differences Settings with Staggered Adoption," NBER Working Papers 24963, National Bureau of Economic Research, Inc.
A. Colin Cameron & Jonah B. Gelbach & Douglas L. Miller, 2008. "Bootstrap-Based Improvements for Inference with Clustered Errors," The Review of Economics and Statistics, MIT Press, vol. 90(3), pages 414-427, August.
- Jonah B. Gelbach & Doug Miller & A. Colin Cameron, 2006. "Bootstrap-Based Improvements for Inference with Clustered Errors," Working Papers 128, University of California, Davis, Department of Economics.
- A. Colin Cameron & Jonah B. Gelbach & Douglas L. Miller, 2007. "Bootstrap-Based Improvements for Inference with Clustered Errors," NBER Technical Working Papers 0344, National Bureau of Economic Research, Inc.
Hwang, Jungbin, 2021. "Simple and trustworthy cluster-robust GMM inference," Journal of Econometrics, Elsevier, vol. 222(2), pages 993-1023.
A. Colin Cameron & Douglas L. Miller, 2015. "A Practitioner's Guide to Cluster-Robust Inference," Journal of Human Resources, University of Wisconsin Press, vol. 50(2), pages 317-372.
Hagemann, Andreas, 2019. "Placebo inference on treatment effects when the number of clusters is small," Journal of Econometrics, Elsevier, vol. 213(1), pages 190-209.
MacKinnon, James G. & Nielsen, Morten Ørregaard & Webb, Matthew D., 2023. "Cluster-robust inference: A guide to empirical practice," Journal of Econometrics, Elsevier, vol. 232(2), pages 272-299.
- Matthew D. Webb & James MacKinnon & Morten Nielsen, 2021. "Cluster–robust inference: A guide to empirical practice," Economics Virtual Symposium 2021 6, Stata Users Group.
- James G. MacKinnon & Morten Ørregaard Nielsen & Matthew D. Webb, 2022. "Cluster-Robust Inference: A Guide to Empirical Practice," Working Paper 1456, Economics Department, Queen's University.
- James MacKinnon & Morten Ørregaard Nielsen, 2022. "Cluster-Robust Inference: A Guide to Empirical Practice," CREATES Research Papers 2022-08, Department of Economics and Business Economics, Aarhus University.
- James G. MacKinnon & Morten Ørregaard Nielsen & Matthew D. Webb, 2022. "Cluster-Robust Inference: A Guide to Empirical Practice," Papers 2205.03285, arXiv.org.
Bester, C. Alan & Conley, Timothy G. & Hansen, Christian B., 2011. "Inference with dependent data using cluster covariance estimators," Journal of Econometrics, Elsevier, vol. 165(2), pages 137-151.
Jungmo Yoon & Antonio F. Galvao, 2020. "Cluster robust covariance matrix estimation in panel quantile regression with individual fixed effects," Quantitative Economics, Econometric Society, vol. 11(2), pages 579-608, May.
Vikström, Johan, 2009. "Cluster sample inference using sensitivity analysis: the case with few groups," Working Paper Series 2009:15, IFAU - Institute for Evaluation of Labour Market and Education Policy.
James G. MacKinnon & Matthew D. Webb, 2017. "Wild Bootstrap Inference for Wildly Different Cluster Sizes," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 32(2), pages 233-254, March.
- James G. MacKinnon & Matthew D. Webb, 2015. "Wild Bootstrap Inference For Wildly Different Cluster Sizes," Working Paper 1314, Economics Department, Queen's University.
Rok Spruk, 2019. "The rise and fall of Argentina," Latin American Economic Review, Springer; Centro de Investigación y Docencia Económica (CIDE), vol. 28(1), pages 1-40, December.
- Spruk, Rok, 2018. "The Rise and Fall of Argentina," Working Papers 07520, George Mason University, Mercatus Center.
Ivan A. Canay & Andres Santos & Azeem M. Shaikh, 2018. "The wild bootstrap with a "small" number of "large" clusters," CeMMAP working papers CWP27/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
- Ivan A. Canay & Andres Santos & Azeem M. Shaikh, 2019. "The Wild Bootstrap with a Small Number of Large Clusters," CeMMAP working papers CWP40/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
Andreas Hagemann, 2019. "Permutation inference with a finite number of heterogeneous clusters," Papers 1907.01049, arXiv.org, revised Feb 2023.

More about this item

JEL classification:

C14 - Mathematical and Quantitative Methods > Econometric and Statistical Methods and Methodology: General > Semiparametric and Nonparametric Methods: General
C21 - Mathematical and Quantitative Methods > Single Equation Models; Single Variables > Cross-Sectional Models; Spatial Models; Treatment Effect Models
C52 - Mathematical and Quantitative Methods > Econometric Modeling > Model Evaluation, Validation, and Selection

NEP fields

This paper has been announced in the following NEP Reports:

NEP-ECM-2017-11-05 (Econometrics)
NEP-EXP-2017-11-05 (Experimental Economics)
