
A Guide to Impact Evaluation under Sample Selection and Missing Data: Teacher's Aides and Adolescent Mental Health

Authors

  • Simon Calmar Andersen
  • Louise Beuchert
  • Phillip Heiler
  • Helena Skyt Nielsen

Abstract

This paper is concerned with identification, estimation, and specification testing in causal evaluation problems when data is selective and/or missing. We leverage recent advances in the literature on graphical methods to provide a unifying framework for guiding empirical practice. The approach integrates and connects to prominent identification and testing strategies in the literature on missing data, causal machine learning, panel data analysis, and more. We demonstrate its utility in the context of identification and specification testing in sample selection models and field experiments with attrition. We provide a novel analysis of a large-scale cluster-randomized controlled teacher's aide trial in Danish schools at grade 6. Even with detailed administrative data, the handling of missing data crucially affects broader conclusions about effects on mental health. Results suggest that teaching assistants provide an effective way of improving internalizing behavior for large parts of the student population.

Suggested Citation

  • Simon Calmar Andersen & Louise Beuchert & Phillip Heiler & Helena Skyt Nielsen, 2023. "A Guide to Impact Evaluation under Sample Selection and Missing Data: Teacher's Aides and Adolescent Mental Health," Papers 2308.04963, arXiv.org.
  • Handle: RePEc:arx:papers:2308.04963

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2308.04963
    File Function: Latest version
    Download Restriction: no
