
Analyzing Grouped Administrative Data for RCTs Using Design-Based Methods

Author

  • Peter Z. Schochet (Mathematica Policy Research, Inc.)

Abstract

This article discusses estimation of average treatment effects for randomized controlled trials (RCTs) using grouped administrative data to help improve data access. The focus is on design-based estimators, derived using the building blocks of experiments, that are conducive to grouped data for a wide range of RCT designs, including clustered and blocked designs, and models with weights and covariates. Because of the linearity of the regression model underlying RCTs, the asymptotic properties of design-based estimators using group-level averages—formed randomly or by covariates for nonclustered designs and as cluster-level averages for clustered designs—match those using individual data. Furthermore, design effects from aggregation are tolerable with moderate numbers of groups and few covariates, suggesting little information is lost in these cases. Ecological inference methods for subgroup analyses, however, yield large design effects. Several empirical examples using real-world education RCT data demonstrate the theory.
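The linearity point in the abstract can be illustrated in its simplest form. The sketch below is not the article's estimator: it assumes a nonclustered design with no covariates, simulated outcomes, and arbitrarily formed groups, and all function and variable names are hypothetical. Under those assumptions, a difference in means computed from size-weighted (group, arm) averages reproduces the individual-level difference in means, because an arm's overall mean is a linear combination of its cell means. It says nothing about variance estimation or design effects.

```python
import random


def diff_in_means(y, t):
    """Individual-level difference in mean outcomes, treatment minus control."""
    treated = [yi for yi, ti in zip(y, t) if ti == 1]
    control = [yi for yi, ti in zip(y, t) if ti == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def group_averages(y, t, g):
    """Collapse records to (group, arm) cells holding (mean outcome, cell size)."""
    sums = {}
    for yi, ti, gi in zip(y, t, g):
        s, n = sums.get((gi, ti), (0.0, 0))
        sums[(gi, ti)] = (s + yi, n + 1)
    return {key: (s / n, n) for key, (s, n) in sums.items()}


def grouped_diff_in_means(cells):
    """Difference in means using only cell means, weighted by cell size."""
    def arm_mean(arm):
        num = sum(m * n for (_, ti), (m, n) in cells.items() if ti == arm)
        den = sum(n for (_, ti), (_, n) in cells.items() if ti == arm)
        return num / den
    return arm_mean(1) - arm_mean(0)


# Simulated data (an assumption for illustration, not data from the article).
rng = random.Random(0)
n_units = 200
t = [1] * 100 + [0] * 100
rng.shuffle(t)                                  # random assignment to arms
y = [rng.gauss(0.3 * ti, 1.0) for ti in t]      # outcome with a true effect of 0.3
g = [rng.randrange(10) for _ in range(n_units)]  # 10 arbitrarily formed groups

cells = group_averages(y, t, g)
individual = diff_in_means(y, t)
grouped = grouped_diff_in_means(cells)
print(individual, grouped)  # the two estimates agree up to floating-point rounding
```

Only the cell means and cell sizes are needed for the grouped calculation, which is the sense in which a data holder could release aggregates rather than individual records in this simple case.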

Suggested Citation

  • Peter Z. Schochet, 2020. "Analyzing Grouped Administrative Data for RCTs Using Design-Based Methods," Journal of Educational and Behavioral Statistics, vol. 45(1), pages 32-57, February.
  • Handle: RePEc:sae:jedbes:v:45:y:2020:i:1:p:32-57
    DOI: 10.3102/1076998619855350

    Download full text from publisher

    File URL: https://journals.sagepub.com/doi/10.3102/1076998619855350
    Download Restriction: no


    Citations


    Cited by:

    1. Fangzhou Su & Peng Ding, 2021. "Model‐assisted analyses of cluster‐randomized experiments," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 83(5), pages 994-1015, November.
    2. Peter Z. Schochet, 2021. "Statistical Power for Estimating Treatment Effects Using Difference-in-Differences and Comparative Interrupted Time Series Designs with Variation in Treatment Timing," Papers 2102.06770, arXiv.org, revised Oct 2021.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Peter Z. Schochet, "undated". "Statistical Theory for the RCT-YES Software: Design-Based Causal Inference for RCTs," Mathematica Policy Research Reports a0c005c003c242308a92c02dc, Mathematica Policy Research.
    2. Peter Z. Schochet, 2010. "The Late Pretest Problem in Randomized Control Trials of Education Interventions," Journal of Educational and Behavioral Statistics, vol. 35(4), pages 379-406, August.
    3. Peter Z. Schochet, 2018. "Design-Based Estimators for Average Treatment Effects for Multi-Armed RCTs," Journal of Educational and Behavioral Statistics, vol. 43(5), pages 568-593, October.
    4. Peter Z. Schochet, "undated". "The Late Pretest Problem in Randomized Control Trials of Education Interventions," Mathematica Policy Research Reports fb514df5dbb84a5dbea79865c, Mathematica Policy Research.
    5. Tim Kautz & Peter Z. Schochet & Charles Tilley, "undated". "Comparing Impact Findings from Design-Based and Model-Based Methods: An Empirical Investigation," Mathematica Policy Research Reports b7656ddce20f4007b71836e99, Mathematica Policy Research.
    6. repec:mpr:mprres:6286 is not listed on IDEAS
    7. repec:mpr:mprres:6094 is not listed on IDEAS
    8. Peter Z. Schochet & Hanley Chiang, "undated". "Technical Methods Report: Estimation and Identification of the Complier Average Causal Effect Parameter in Education RCTs," Mathematica Policy Research Reports 947d1823e3ff42208532a763d, Mathematica Policy Research.
    9. Peter Z. Schochet, "undated". "Technical Methods Report: Statistical Power for Regression Discontinuity Designs in Education Evaluations," Mathematica Policy Research Reports 61fb6c057561451a8a6074508, Mathematica Policy Research.
    10. Peter Z. Schochet, 2013. "Estimators for Clustered Education RCTs Using the Neyman Model for Causal Inference," Journal of Educational and Behavioral Statistics, vol. 38(3), pages 219-238, June.
    11. repec:mpr:mprres:6372 is not listed on IDEAS
    12. Peter Z. Schochet, 2021. "Statistical Power for Estimating Treatment Effects Using Difference-in-Differences and Comparative Interrupted Time Series Designs with Variation in Treatment Timing," Papers 2102.06770, arXiv.org, revised Oct 2021.
    13. Kenneth Fortson & Natalya Verbitsky-Savitz & Emma Kopa & Philip Gleason, 2012. "Using an Experimental Evaluation of Charter Schools to Test Whether Nonexperimental Comparison Group Methods Can Replicate Experimental Impact Estimates," Mathematica Policy Research Reports 27f871b5b7b94f3a80278a593, Mathematica Policy Research.
    14. Lu, Jiannan, 2016. "On randomization-based and regression-based inferences for 2K factorial designs," Statistics & Probability Letters, Elsevier, vol. 112(C), pages 72-78.
    15. John Deke, 2016. "Design and Analysis Considerations for Cluster Randomized Controlled Trials That Have a Small Number of Clusters," Evaluation Review, vol. 40(5), pages 444-486, October.
    16. Donald P. Green & Winston Lin & Claudia Gerber, 2018. "Optimal Allocation of Interviews to Baseline and Endline Surveys in Place-Based Randomized Trials and Quasi-Experiments," Evaluation Review, vol. 42(4), pages 391-422, August.
    17. Ding, Peng, 2021. "The Frisch–Waugh–Lovell theorem for standard errors," Statistics & Probability Letters, Elsevier, vol. 168(C).
    18. Haoge Chang, 2023. "Design-based Estimation Theory for Complex Experiments," Papers 2311.06891, arXiv.org.
    19. Joel A. Middleton, 2021. "Unifying Design-based Inference: On Bounding and Estimating the Variance of any Linear Estimator in any Experimental Design," Papers 2109.09220, arXiv.org.
    20. repec:mpr:mprres:7443 is not listed on IDEAS
    21. repec:mpr:mprres:8128 is not listed on IDEAS
    22. Susan Athey & Guido Imbens, 2016. "The Econometrics of Randomized Experiments," Papers 1607.00698, arXiv.org.
    23. Fangzhou Su & Peng Ding, 2021. "Model‐assisted analyses of cluster‐randomized experiments," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 83(5), pages 994-1015, November.
    24. Peng Ding, 2020. "The Frisch--Waugh--Lovell Theorem for Standard Errors," Papers 2009.06621, arXiv.org.
    25. Clément de Chaisemartin & Jaime Ramirez-Cuellar, 2024. "At What Level Should One Cluster Standard Errors in Paired and Small-Strata Experiments?," American Economic Journal: Applied Economics, American Economic Association, vol. 16(1), pages 193-212, January.
