
Leverage, influence, and the jackknife in clustered regression models: Reliable inference using summclust

Authors

  • James G. MacKinnon (Queen’s University)
  • Morten Ørregaard Nielsen (Aarhus University)
  • Matthew D. Webb (Carleton University)

Abstract

We introduce a new command, summclust, that summarizes the cluster structure of the dataset for linear regression models with clustered disturbances. The key unit of observation for such a model is the cluster. We therefore propose cluster-level measures of leverage, partial leverage, and influence and show how to compute them quickly in most cases. The measures of leverage and partial leverage can be used as diagnostic tools to identify datasets and regression designs in which cluster–robust inference is likely to be challenging. The measures of influence can provide valuable information about how the results depend on the data in the various clusters. We also show how to calculate two jackknife variance matrix estimators efficiently as a by-product of our other computations. These estimators, which are already available in Stata, are generally more conservative than conventional variance matrix estimators. The summclust command computes all the quantities that we discuss.
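
As a rough illustration of the quantities the abstract describes, the following Python sketch computes a cluster-level leverage measure, the delete-one-cluster coefficient estimates that underlie influence measures, and one cluster-jackknife variance matrix. It is not the summclust implementation. The function name `cluster_diagnostics`, the trace-based definition of leverage, and the choice to center the jackknife at the full-sample estimate are all assumptions made for this example.

```python
import numpy as np

def cluster_diagnostics(X, y, cluster_ids):
    """Hypothetical helper (not part of summclust): cluster leverage,
    delete-one-cluster OLS estimates, and a cluster-jackknife variance."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    cluster_ids = np.asarray(cluster_ids)
    labels = np.unique(cluster_ids)
    G = len(labels)

    # Full-sample cross-products and OLS estimate.
    XtX = X.T @ X
    Xty = X.T @ y
    XtX_inv = np.linalg.inv(XtX)
    beta_hat = XtX_inv @ Xty

    leverage = np.empty(G)
    beta_del = np.empty((G, X.shape[1]))
    for i, g in enumerate(labels):
        mask = cluster_ids == g
        Xg, yg = X[mask], y[mask]
        XgtXg = Xg.T @ Xg
        # Assumed leverage definition: trace of X_g'X_g (X'X)^{-1}.
        # These values sum to the number of regressors.
        leverage[i] = np.trace(XgtXg @ XtX_inv)
        # Estimate with cluster g omitted, obtained by subtracting the
        # cluster's cross-products. This fails if a regressor varies
        # only within cluster g (the reduced matrix is then singular).
        beta_del[i] = np.linalg.solve(XtX - XgtXg, Xty - Xg.T @ yg)

    # Assumed jackknife form: deviations from the full-sample estimate,
    # scaled by (G - 1) / G.
    dev = beta_del - beta_hat
    V_jack = (G - 1) / G * dev.T @ dev
    return beta_hat, leverage, beta_del, V_jack
```

Working from per-cluster cross-products means each cluster is visited only once, which is in the spirit of the abstract's remark that the jackknife estimators arise as a by-product of the other computations. A cluster whose leverage is far above the average of k/G, or whose omission moves a coefficient substantially, is the kind of case these diagnostics are meant to flag.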

Suggested Citation

  • James G. MacKinnon & Morten Ørregaard Nielsen & Matthew D. Webb, 2023. "Leverage, influence, and the jackknife in clustered regression models: Reliable inference using summclust," Stata Journal, StataCorp LP, vol. 23(4), pages 942-982, December.
  • Handle: RePEc:tsj:stataj:v:23:y:2023:i:4:p:942-982
    DOI: 10.1177/1536867X231212433
    Note: to access software from within Stata, net describe http://www.stata-journal.com/software/sj23-4/st0733/

    Download full text from publisher

    File URL: http://www.stata-journal.com/article.html?article=st0733
    File Function: link to article purchase
    Download Restriction: no




    Cited by:

    1. MacKinnon, James G. & Nielsen, Morten Ørregaard & Webb, Matthew D., 2023. "Testing for the appropriate level of clustering in linear regression models," Journal of Econometrics, Elsevier, vol. 235(2), pages 2027-2056.
    2. MacKinnon, James G. & Nielsen, Morten Ørregaard & Webb, Matthew D., 2023. "Cluster-robust inference: A guide to empirical practice," Journal of Econometrics, Elsevier, vol. 232(2), pages 272-299.
    3. Daniel Auer & Michaela Slotwinski & Achim Ahrens & Dominik Hangartner & Selina Kurer & Stefanie Kurt & Alois Stutzer, 2024. "Social Assistance and Refugee Crime," CESifo Working Paper Series 11051, CESifo.
    4. James G. MacKinnon & Morten Ørregaard Nielsen & Matthew D. Webb, 2023. "Fast and reliable jackknife and bootstrap methods for cluster‐robust inference," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 38(5), pages 671-694, August.
    5. Wang, Wenjie & Zhang, Yichong, 2024. "Wild bootstrap inference for instrumental variables regressions with weak and few clusters," Journal of Econometrics, Elsevier, vol. 241(1).
    6. Pettersson-Lidbom, Per, 2022. "Exit, Voice and Political Change: Evidence from Swedish Mass Migration to the United States. A Comment on Karadja and Prawitz (Journal of Political Economy, 2019)," Journal of Comments and Replications in Economics (JCRE), ZBW - Leibniz Information Centre for Economics, vol. 1(2022-3), pages 1-13.
    7. MacKinnon, James G., 2023. "Using large samples in econometrics," Journal of Econometrics, Elsevier, vol. 235(2), pages 922-926.
    8. Johannes W. Ligtenberg, 2023. "Inference in IV models with clustered dependence, many instruments and weak identification," Papers 2306.08559, arXiv.org, revised Mar 2024.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. MacKinnon, James G. & Nielsen, Morten Ørregaard & Webb, Matthew D., 2023. "Cluster-robust inference: A guide to empirical practice," Journal of Econometrics, Elsevier, vol. 232(2), pages 272-299.
    2. MacKinnon, James G. & Nielsen, Morten Ørregaard & Webb, Matthew D., 2023. "Testing for the appropriate level of clustering in linear regression models," Journal of Econometrics, Elsevier, vol. 235(2), pages 2027-2056.
    3. James G. MacKinnon & Morten Ørregaard Nielsen & Matthew D. Webb, 2023. "Fast and reliable jackknife and bootstrap methods for cluster‐robust inference," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 38(5), pages 671-694, August.
    4. James G. MacKinnon & Morten Ørregaard Nielsen & Matthew D. Webb, 2021. "Wild Bootstrap and Asymptotic Inference With Multiway Clustering," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 39(2), pages 505-519, March.
    5. James G. MacKinnon & Matthew D. Webb, 2020. "When and How to Deal with Clustered Errors in Regression Models," Working Paper 1421, Economics Department, Queen's University.
    6. Djogbenou, Antoine A. & MacKinnon, James G. & Nielsen, Morten Ørregaard, 2019. "Asymptotic theory and wild bootstrap inference with clustered errors," Journal of Econometrics, Elsevier, vol. 212(2), pages 393-412.
    7. James G. MacKinnon & Morten Ørregaard Nielsen & Matthew D. Webb, 2024. "Cluster-robust jackknife and bootstrap inference for binary response models," Papers 2406.00650, arXiv.org.
    8. Matthew D. Webb, 2023. "Reworking wild bootstrap‐based inference for clustered errors," Canadian Journal of Economics/Revue canadienne d'économique, John Wiley & Sons, vol. 56(3), pages 839-858, August.
    9. James G. MacKinnon, 2019. "How cluster‐robust inference is changing applied econometrics," Canadian Journal of Economics/Revue canadienne d'économique, John Wiley & Sons, vol. 52(3), pages 851-881, August.
    10. MacKinnon, James G., 2023. "Fast cluster bootstrap methods for linear regression models," Econometrics and Statistics, Elsevier, vol. 26(C), pages 52-71.
    11. Hansen, Bruce E. & Lee, Seojeong, 2019. "Asymptotic theory for clustered samples," Journal of Econometrics, Elsevier, vol. 210(2), pages 268-290.
    12. James G. MacKinnon & Matthew D. Webb, 2017. "Pitfalls When Estimating Treatment Effects Using Clustered Data," Working Paper 1387, Economics Department, Queen's University.
    13. Wenjie Wang & Yichong Zhang, 2021. "Wild Bootstrap for Instrumental Variables Regressions with Weak and Few Clusters," Papers 2108.13707, arXiv.org, revised Jan 2024.
    14. Wang, Wenjie & Zhang, Yichong, 2024. "Wild bootstrap inference for instrumental variables regressions with weak and few clusters," Journal of Econometrics, Elsevier, vol. 241(1).
    15. Antoine A. Djogbenou & James G. MacKinnon & Morten Ø. Nielsen, 2017. "Validity Of Wild Bootstrap Inference With Clustered Errors," Working Paper 1383, Economics Department, Queen's University.
    16. Tom Boot & Gianmaria Niccodemi & Tom Wansbeek, 2023. "Unbiased estimation of the OLS covariance matrix when the errors are clustered," Empirical Economics, Springer, vol. 64(6), pages 2511-2533, June.
    17. MacKinnon, James G., 2023. "Using large samples in econometrics," Journal of Econometrics, Elsevier, vol. 235(2), pages 922-926.
    18. MacKinnon, James G. & Webb, Matthew D., 2020. "Randomization inference for difference-in-differences with few treated clusters," Journal of Econometrics, Elsevier, vol. 218(2), pages 435-450.
    19. James G. MacKinnon & Matthew D. Webb & Morten Ø. Nielsen, 2017. "Bootstrap And Asymptotic Inference With Multiway Clustering," Working Paper 1386, Economics Department, Queen's University.
    20. Wang, Wenjie, 2021. "Wild Bootstrap for Instrumental Variables Regression with Weak Instruments and Few Clusters," MPRA Paper 106227, University Library of Munich, Germany.

    More about this item

    Keywords

    summclust; clustered data; cluster–robust variance estimator; CRVE; grouped data; high-leverage clusters; influential clusters; jackknife; partial leverage; robust inference

    JEL classification:

    • C10 - Mathematical and Quantitative Methods > Econometric and Statistical Methods and Methodology: General > General
    • C12 - Mathematical and Quantitative Methods > Econometric and Statistical Methods and Methodology: General > Hypothesis Testing: General
    • C21 - Mathematical and Quantitative Methods > Single Equation Models; Single Variables > Cross-Sectional Models; Spatial Models; Treatment Effect Models
    • C23 - Mathematical and Quantitative Methods > Single Equation Models; Single Variables > Models with Panel Data; Spatio-temporal Models
    • C87 - Mathematical and Quantitative Methods > Data Collection and Data Estimation Methodology; Computer Programs > Econometric Software
