Printed from https://ideas.repec.org/a/spr/psycho/v87y2022i1d10.1007_s11336-021-09805-x.html

Robust Machine Learning for Treatment Effects in Multilevel Observational Studies Under Cluster-level Unmeasured Confounding

Author

Listed:
  • Youmi Suk

    (University of Virginia)

  • Hyunseung Kang

    (University of Wisconsin-Madison)

Abstract

Recently, machine learning (ML) methods have been used in causal inference to estimate treatment effects while reducing concerns about model misspecification. However, many ML methods require that all confounders be measured in order to estimate treatment effects consistently. In this paper, we propose a family of ML methods that estimate treatment effects in the presence of cluster-level unmeasured confounders, a type of unmeasured confounder that is shared within each cluster and is common in multilevel observational studies. We show through simulation studies that our proposed methods are robust to biases from unmeasured cluster-level confounders in a variety of multilevel observational studies. Using our methods, we also examine the effect of taking an algebra course on math achievement scores in the Early Childhood Longitudinal Study, a multilevel observational educational study. The proposed methods are available in the CURobustML R package.

Suggested Citation

  • Youmi Suk & Hyunseung Kang, 2022. "Robust Machine Learning for Treatment Effects in Multilevel Observational Studies Under Cluster-level Unmeasured Confounding," Psychometrika, Springer;The Psychometric Society, vol. 87(1), pages 310-343, March.
  • Handle: RePEc:spr:psycho:v:87:y:2022:i:1:d:10.1007_s11336-021-09805-x
    DOI: 10.1007/s11336-021-09805-x

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11336-021-09805-x
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s11336-021-09805-x?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    References listed on IDEAS

    1. Arpino, Bruno & Mealli, Fabrizia, 2011. "The specification of the propensity score in multilevel observational studies," Computational Statistics & Data Analysis, Elsevier, vol. 55(4), pages 1770-1780, April.
    2. Hansen, Lars Peter, 1982. "Large Sample Properties of Generalized Method of Moments Estimators," Econometrica, Econometric Society, vol. 50(4), pages 1029-1054, July.
    3. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    4. van Buuren, Stef & Groothuis-Oudshoorn, Karin, 2011. "mice: Multivariate Imputation by Chained Equations in R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 45(i03).
    5. Bates, Douglas & Mächler, Martin & Bolker, Ben & Walker, Steve, 2015. "Fitting Linear Mixed-Effects Models Using lme4," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 67(i01).
    6. Jee-Seon Kim & Edward Frees, 2007. "Multilevel Modeling with Correlated Effects," Psychometrika, Springer;The Psychometric Society, vol. 72(4), pages 505-533, December.
    7. Hong, Guanglei & Raudenbush, Stephen W., 2006. "Evaluating Kindergarten Retention Policy: A Case Study of Causal Inference for Multilevel Observational Data," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 901-910, September.
    8. Jeffrey M Wooldridge, 2010. "Econometric Analysis of Cross Section and Panel Data," MIT Press Books, The MIT Press, edition 2, volume 1, number 0262232588, December.
    9. Stefan Wager & Susan Athey, 2018. "Estimation and Inference of Heterogeneous Treatment Effects using Random Forests," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(523), pages 1228-1242, July.
    10. Shadish, William R. & Clark, M. H. & Steiner, Peter M., 2008. "Can Nonrandomized Experiments Yield Accurate Answers? A Randomized Experiment Comparing Random and Nonrandom Assignments," Journal of the American Statistical Association, American Statistical Association, vol. 103(484), pages 1334-1344.
    11. Gruber, Susan & Laan, Mark van der, 2012. "tmle: An R Package for Targeted Maximum Likelihood Estimation," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 51(i13).
    12. Heejung Bang & James M. Robins, 2005. "Doubly Robust Estimation in Missing Data and Causal Inference Models," Biometrics, The International Biometric Society, vol. 61(4), pages 962-973, December.
    13. Dmitry Arkhangelsky & Guido Imbens, 2018. "The Role of the Propensity Score in Fixed Effect Models," NBER Working Papers 24814, National Bureau of Economic Research, Inc.
    14. Jordan H. Rickles, 2013. "Examining Heterogeneity in the Effect of Taking Algebra in Eighth Grade," The Journal of Educational Research, Taylor & Francis Journals, vol. 106(4), pages 251-268, July.
    15. Peng Ding & Avi Feller & Luke Miratrix, 2019. "Decomposing Treatment Effect Variation," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 114(525), pages 304-317, January.
    16. Jee-Seon Kim & Edward Frees, 2006. "Omitted Variables in Multilevel Models," Psychometrika, Springer;The Psychometric Society, vol. 71(4), pages 659-690, December.
    17. Henderson, Daniel J. & Carroll, Raymond J. & Li, Qi, 2008. "Nonparametric estimation and testing of fixed effects panel data models," Journal of Econometrics, Elsevier, vol. 144(1), pages 257-275, May.
    18. Glynn, Adam N. & Quinn, Kevin M., 2010. "An Introduction to the Augmented Inverse Propensity Weighted Estimator," Political Analysis, Cambridge University Press, vol. 18(1), pages 36-56, January.
    19. Lin, Zhongjian & Li, Qi & Sun, Yiguo, 2014. "A consistent nonparametric test of parametric regression functional form in fixed effects panel data models," Journal of Econometrics, Elsevier, vol. 178(P1), pages 167-179.
    20. Kosuke Imai & In Song Kim, 2019. "When Should We Use Unit Fixed Effects Regression Models for Causal Inference with Longitudinal Data?," American Journal of Political Science, John Wiley & Sons, vol. 63(2), pages 467-490, April.

    Citations

    Cited by:

    1. Youmi Suk, 2024. "A Within-Group Approach to Ensemble Machine Learning Methods for Causal Inference in Multilevel Studies," Journal of Educational and Behavioral Statistics, vol. 49(1), pages 61-91, February.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Youmi Suk, 2024. "A Within-Group Approach to Ensemble Machine Learning Methods for Causal Inference in Multilevel Studies," Journal of Educational and Behavioral Statistics, vol. 49(1), pages 61-91, February.
    2. Mary Ying-Fang Wang & Paul Tuss & Lihong Qi, 2019. "Augmented Weighted Estimators Dealing with Practical Positivity Violation to Causal inferences in a Random Coefficient Model," Psychometrika, Springer;The Psychometric Society, vol. 84(2), pages 447-467, June.
    3. Davide Viviano & Jelena Bradic, 2019. "Synthetic learner: model-free inference on treatments over time," Papers 1904.01490, arXiv.org, revised Aug 2022.
    4. Jushan Bai & Sung Hoon Choi & Yuan Liao, 2021. "Feasible generalized least squares for panel data with cross-sectional and serial correlations," Empirical Economics, Springer, vol. 60(1), pages 309-326, January.
    5. Dmitry Arkhangelsky & Guido Imbens, 2023. "Causal Models for Longitudinal and Panel Data: A Survey," Papers 2311.15458, arXiv.org, revised Mar 2024.
    6. Dmitry Arkhangelsky & Guido W. Imbens & Lihua Lei & Xiaoman Luo, 2021. "Design-Robust Two-Way-Fixed-Effects Regression For Panel Data," Papers 2107.13737, arXiv.org, revised Mar 2024.
    7. Lechner, Michael & Okasa, Gabriel, 2019. "Random Forest Estimation of the Ordered Choice Model," Economics Working Paper Series 1908, University of St. Gallen, School of Economics and Political Science.
    8. Aleksey Oshchepkov & Anna Shirokanova, 2020. "Multilevel Modeling For Economists: Why, When And How," HSE Working papers WP BRP 233/EC/2020, National Research University Higher School of Economics.
    9. Kerda Varaku & Robin Sickles, 2023. "Public subsidies and innovation: a doubly robust machine learning approach leveraging deep neural networks," Empirical Economics, Springer, vol. 64(6), pages 3121-3165, June.
    10. Mark Kattenberg & Bas Scheer & Jurre Thiel, 2023. "Causal forests with fixed effects for treatment effect heterogeneity in difference-in-differences," CPB Discussion Paper 452, CPB Netherlands Bureau for Economic Policy Analysis.
    11. Augustine Denteh & Helge Liebert, 2022. "Who Increases Emergency Department Use? New Insights from the Oregon Health Insurance Experiment," Working Papers 2201, Tulane University, Department of Economics.
    12. Matias D Cattaneo & Michael Jansson & Xinwei Ma, 2019. "Two-Step Estimation and Inference with Possibly Many Included Covariates," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 86(3), pages 1095-1122.
    13. Jiaming Mao & Jingzhi Xu, 2020. "Ensemble Learning with Statistical and Structural Models," Papers 2006.05308, arXiv.org.
    14. Khashayar Khosravi & Greg Lewis & Vasilis Syrgkanis, 2019. "Non-Parametric Inference Adaptive to Intrinsic Dimension," Papers 1901.03719, arXiv.org, revised Jun 2019.
    15. Jorge Manzi & Ernesto San Martín & Sébastien Van Bellegem, 2014. "School System Evaluation by Value Added Analysis Under Endogeneity," Psychometrika, Springer;The Psychometric Society, vol. 79(1), pages 130-153, January.
    16. Isaac Meza & Rahul Singh, 2021. "Nested Nonparametric Instrumental Variable Regression: Long Term, Mediated, and Time Varying Treatment Effects," Papers 2112.14249, arXiv.org, revised Mar 2024.
    17. Paul Clarke & Annalivia Polselli, 2023. "Double Machine Learning for Static Panel Models with Fixed Effects," Papers 2312.08174, arXiv.org, revised Dec 2023.
    18. Jiabei Yang & Issa J. Dahabreh & Jon A. Steingrimsson, 2022. "Causal interaction trees: Finding subgroups with heterogeneous treatment effects in observational data," Biometrics, The International Biometric Society, vol. 78(2), pages 624-635, June.
    19. Xiang Zhou, 2022. "Semiparametric estimation for causal mediation analysis with multiple causally ordered mediators," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(3), pages 794-821, July.
    20. Ruoxuan Xiong & Allison Koenecke & Michael Powell & Zhu Shen & Joshua T. Vogelstein & Susan Athey, 2021. "Federated Causal Inference in Heterogeneous Observational Data," Papers 2107.11732, arXiv.org, revised Apr 2023.
