
Variable Selection for Confounder Control, Flexible Modeling and Collaborative Targeted Minimum Loss-Based Estimation in Causal Inference

Author

Listed:
  • Schnitzer Mireille E.

    (Faculté de pharmacie, Université de Montréal, Pavillon Jean-Coutu, 2940 ch de la Polytechnique, P.O. Box 6128, Station Centre-ville, Montreal, Quebec, Canada)

  • Lok Judith J.

    (Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA)

  • Gruber Susan

    (Reagan-Udall Foundation for the FDA, Washington, DC, USA)

Abstract

This paper investigates whether flexible propensity score modeling (nonparametric or machine learning approaches) can appropriately be integrated into semiparametric models for the estimation of a causal quantity, such as the mean outcome under treatment. We begin with an overview of some of the issues involved in knowledge-based and statistical variable selection in causal inference and the potential pitfalls of automated selection based on the fit of the propensity score. Using a simple example, we directly show the consequences of adjusting for pure causes of the exposure when using inverse probability of treatment weighting (IPTW). Such variables are likely to be selected when using a naive approach to model selection for the propensity score. We describe how collaborative targeted minimum loss-based estimation (C-TMLE; van der Laan and Gruber, 2010 [27]) capitalizes on the collaborative double robustness property of semiparametric efficient estimators to select covariates for the propensity score based on the error in the conditional outcome model. Finally, we compare several approaches to automated variable selection in low- and high-dimensional settings through a simulation study. From this simulation study, we conclude that using IPTW with flexible prediction for the propensity score can result in inferior estimation, while targeted minimum loss-based estimation (TMLE) and C-TMLE may benefit from flexible prediction and remain robust to the presence of variables that are highly correlated with treatment. However, in our study, standard influence function-based methods for the variance underestimated the standard errors, resulting in poor coverage under certain data-generating scenarios.
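
The point about pure causes of the exposure can be illustrated with a small simulation. The sketch below is not the paper's own example or simulation design: the data-generating model, coefficients, sample size, and function names are all assumptions chosen for illustration. It draws a binary confounder W and a binary variable Z that affects only treatment, then compares IPTW estimates of the mean outcome under treatment when the propensity score is estimated from W alone versus from W and Z together.

```python
import numpy as np


def expit(x):
    return 1.0 / (1.0 + np.exp(-x))


def simulate(n, rng):
    """Hypothetical data-generating model (an assumption, not from the paper)."""
    w = rng.binomial(1, 0.5, n)            # confounder: affects A and Y
    z = rng.binomial(1, 0.5, n)            # pure cause of exposure: affects A only
    a = rng.binomial(1, expit(-2.0 + 1.0 * w + 3.0 * z))
    y = 1.0 + 2.0 * a + 1.0 * w + rng.normal(0.0, 1.0, n)
    return w, z, a, y


def cell_propensity(a, strata):
    """Saturated propensity estimate: treated fraction within each stratum."""
    g = np.empty(len(a), dtype=float)
    for s in np.unique(strata):
        mask = strata == s
        g[mask] = a[mask].mean()
    return g


def iptw_mean_treated(a, y, g):
    """IPTW estimate of E[Y^1]: average of A * Y / g over all n units."""
    treated = a == 1                       # g > 0 wherever a treated unit exists
    return np.sum(y[treated] / g[treated]) / len(a)


def run(n_rep=2000, n=500, seed=0):
    rng = np.random.default_rng(seed)
    est_w, est_wz = [], []
    for _ in range(n_rep):
        w, z, a, y = simulate(n, rng)
        est_w.append(iptw_mean_treated(a, y, cell_propensity(a, w)))
        est_wz.append(iptw_mean_treated(a, y, cell_propensity(a, 2 * w + z)))
    return np.array(est_w), np.array(est_wz)


if __name__ == "__main__":
    TRUE_MEAN = 3.5                        # E[Y^1] = 1 + 2 + E[W] under this model
    est_w, est_wz = run()
    for label, est in (("W only", est_w), ("W and Z", est_wz)):
        print(f"{label:8s} bias={est.mean() - TRUE_MEAN:+.4f} sd={est.std(ddof=1):.4f}")
```

Under these assumptions both estimators should be approximately unbiased, because Z is not a confounder, but the version that also adjusts for Z should show a visibly larger standard deviation: Z pushes some estimated propensities close to zero, which inflates the weights without removing any bias.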

Suggested Citation

  • Schnitzer Mireille E. & Lok Judith J. & Gruber Susan, 2016. "Variable Selection for Confounder Control, Flexible Modeling and Collaborative Targeted Minimum Loss-Based Estimation in Causal Inference," The International Journal of Biostatistics, De Gruyter, vol. 12(1), pages 97-115, May.
  • Handle: RePEc:bpj:ijbist:v:12:y:2016:i:1:p:97-115:n:9
    DOI: 10.1515/ijb-2015-0017

    References listed on IDEAS

    1. Brookhart, M. Alan & van der Laan, Mark J., 2006. "A semiparametric model selection criterion with applications to the marginal structural model," Computational Statistics & Data Analysis, Elsevier, vol. 50(2), pages 475-498, January.
    2. Andrea Rotnitzky & Lingling Li & Xiaochun Li, 2010. "A note on overadjustment in inverse probability weighted estimation," Biometrika, Biometrika Trust, vol. 97(4), pages 997-1001.
    3. Porter Kristin E. & Gruber Susan & van der Laan Mark J. & Sekhon Jasjeet S., 2011. "The Relative Performance of Targeted Maximum Likelihood Estimators," The International Journal of Biostatistics, De Gruyter, vol. 7(1), pages 1-34, August.
    4. Ciprian M. Crainiceanu & Francesca Dominici & Giovanni Parmigiani, 2008. "Adjustment uncertainty in effect estimation," Biometrika, Biometrika Trust, vol. 95(3), pages 635-651.
    5. Jinyong Hahn, 1998. "On the Role of the Propensity Score in Efficient Semiparametric Estimation of Average Treatment Effects," Econometrica, Econometric Society, vol. 66(2), pages 315-332, March.

    Citations


    Cited by:

    1. repec:osf:socarx:aeszf_v1 is not listed on IDEAS

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Xun Lu, 2015. "A Covariate Selection Criterion for Estimation of Treatment Effects," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 33(4), pages 506-522, October.
    2. Lefebvre, Geneviève & Atherton, Juli & Talbot, Denis, 2014. "The effect of the prior distribution in the Bayesian Adjustment for Confounding algorithm," Computational Statistics & Data Analysis, Elsevier, vol. 70(C), pages 227-240.
    3. Agboola, Oluwagbenga David & Yu, Han, 2023. "Neighborhood-based cross fitting approach to treatment effects with high-dimensional data," Computational Statistics & Data Analysis, Elsevier, vol. 186(C).
    4. Richard K. Crump & V. Joseph Hotz & Guido W. Imbens & Oscar A. Mitnik, 2008. "Nonparametric Tests for Treatment Effect Heterogeneity," The Review of Economics and Statistics, MIT Press, vol. 90(3), pages 389-405, August.
    5. Noémi Kreif & Richard Grieve & Iván Díaz & David Harrison, 2015. "Evaluation of the Effect of a Continuous Treatment: A Machine Learning Approach with an Application to Treatment for Traumatic Brain Injury," Health Economics, John Wiley & Sons, Ltd., vol. 24(9), pages 1213-1228, September.
    6. Federico A. Bugni & Mengsi Gao & Filip Obradovic & Amilcar Velez, 2024. "Identification and Inference on Treatment Effects under Covariate-Adaptive Randomization and Imperfect Compliance," Papers 2406.08419, arXiv.org, revised Apr 2025.
    7. Richard Blundell & Monica Costa Dias, 2009. "Alternative Approaches to Evaluation in Empirical Microeconomics," Journal of Human Resources, University of Wisconsin Press, vol. 44(3).
    8. Dettmann, E. & Becker, C. & Schmeißer, C., 2011. "Distance functions for matching in small samples," Computational Statistics & Data Analysis, Elsevier, vol. 55(5), pages 1942-1960, May.
    9. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP72/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    10. Chabé-Ferret, Sylvain, 2015. "Analysis of the bias of Matching and Difference-in-Difference under alternative earnings and selection processes," Journal of Econometrics, Elsevier, vol. 185(1), pages 110-123.
    11. Alexander Hijzen & Sébastien Jean & Thierry Mayer, 2011. "The effects at home of initiating production abroad: evidence from matched French firms," Review of World Economics (Weltwirtschaftliches Archiv), Springer;Institut für Weltwirtschaft (Kiel Institute for the World Economy), vol. 147(3), pages 457-483, September.
    12. Kyoo il Kim, 2019. "Efficiency of Average Treatment Effect Estimation When the True Propensity Is Parametric," Econometrics, MDPI, vol. 7(2), pages 1-13, May.
    13. Stitelman Ori M. & De Gruttola Victor & van der Laan Mark J., 2012. "A General Implementation of TMLE for Longitudinal Data Applied to Causal Inference in Survival Analysis," The International Journal of Biostatistics, De Gruyter, vol. 8(1), pages 1-39, September.
    14. Cuong NGUYEN, 2016. "An Introduction to Alternative Methods in Program Impact Evaluation," Journal of Economic and Social Thought, KSP Journals, vol. 3(3), pages 349-375, September.
    15. Shu Yang & Yunshu Zhang, 2023. "Multiply robust matching estimators of average and quantile treatment effects," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 50(1), pages 235-265, March.
    16. Kitagawa, Toru & Muris, Chris, 2016. "Model averaging in semiparametric estimation of treatment effects," Journal of Econometrics, Elsevier, vol. 193(1), pages 271-289.
    17. Hairu Wang & Yukun Liu & Haiying Zhou, 2025. "Score test for unconfoundedness under a logistic treatment assignment model," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 77(4), pages 517-533, August.
    18. Cansino, José M. & Lopez-Melendo, Jaime & Pablo-Romero, María del P. & Sánchez-Braza, Antonio, 2013. "An economic evaluation of public programs for internationalization: The case of the Diagnostic program in Spain," Evaluation and Program Planning, Elsevier, vol. 41(C), pages 38-46.
    19. Wei, Kecheng & Qin, Guoyou & Zhang, Jiajia & Sui, Xuemei, 2022. "Doubly robust estimation in causal inference with missing outcomes: With an application to the Aerobics Center Longitudinal Study," Computational Statistics & Data Analysis, Elsevier, vol. 168(C).
    20. John C. Ham & Xianghong Li & Patricia B. Reagan, 2004. "Propensity Score Matching, a Distance-Based Measure of Migration, and the Wage Growth of Young Men," IEPR Working Papers 05.13, Institute of Economic Policy Research (IEPR).
