Variable Selection for Confounder Control, Flexible Modeling and Collaborative Targeted Minimum Loss-Based Estimation in Causal Inference

Author

Listed:
  • Schnitzer Mireille E.

    (Faculté de pharmacie, Université de Montréal, Pavillon Jean-Coutu, 2940 ch de la Polytechnique, P.O. Box 6128, Station Centre-ville, Montreal, Quebec, Canada)

  • Lok Judith J.

    (Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA)

  • Gruber Susan

    (Reagan-Udall Foundation for the FDA, Washington, DC, USA)

Abstract

This paper investigates whether it is appropriate to integrate flexible propensity score modeling (nonparametric or machine learning approaches) into semiparametric models for the estimation of a causal quantity, such as the mean outcome under treatment. We begin with an overview of some of the issues involved in knowledge-based and statistical variable selection in causal inference, and of the potential pitfalls of automated selection based on the fit of the propensity score. Using a simple example, we directly show the consequences of adjusting for pure causes of the exposure when using inverse probability of treatment weighting (IPTW). Such variables are likely to be selected when using a naive approach to model selection for the propensity score. We describe how collaborative targeted minimum loss-based estimation (C-TMLE; van der Laan and Gruber, 2010) capitalizes on the collaborative double robustness property of semiparametric efficient estimators to select covariates for the propensity score based on the error in the conditional outcome model. Finally, we compare several approaches to automated variable selection in low- and high-dimensional settings through a simulation study. From this study, we conclude that using IPTW with flexible prediction for the propensity score can result in inferior estimation, while targeted minimum loss-based estimation (TMLE) and C-TMLE may benefit from flexible prediction and remain robust to the presence of variables that are highly correlated with treatment. However, in our study, standard influence function-based methods for the variance underestimated the standard errors, resulting in poor coverage under certain data-generating scenarios.
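
The point about pure causes of the exposure can be illustrated with a small simulation. The sketch below is not the example from the paper: the data-generating mechanism, the variable names (`w` for a confounder, `z` for a pure cause of treatment) and all numeric values are assumptions chosen for illustration. It estimates the mean outcome under treatment by IPTW with a saturated (cell-mean) propensity score, once adjusting for `w` only and once for both `w` and `z`. Both versions should be roughly unbiased here, but adding `z` pushes the estimated propensity scores toward 0 and 1 and inflates the variance.

```python
import numpy as np


def iptw_mean_treated(y, a, strata, n_strata):
    """Horvitz-Thompson IPTW estimate of E[Y(1)].

    The propensity score is estimated as the proportion treated within
    each stratum, i.e. a saturated model in the stratum indicator.
    """
    counts = np.bincount(strata, minlength=n_strata)
    treated = np.bincount(strata, weights=a, minlength=n_strata)
    g_hat = (treated / np.maximum(counts, 1))[strata]
    # Untreated units get weight 0; guard against empty treated cells.
    weights = np.divide(a, g_hat, out=np.zeros_like(g_hat), where=g_hat > 0)
    return float(np.mean(weights * y))


def simulate_once(rng, n):
    """One hypothetical dataset: w confounds, z affects treatment only."""
    w = rng.binomial(1, 0.5, size=n)
    z = rng.binomial(1, 0.5, size=n)
    g = 0.1 + 0.2 * w + 0.6 * z          # assumed true propensity score
    a = rng.binomial(1, g)
    y = 1.0 + 2.0 * w + a + rng.normal(size=n)  # true E[Y(1)] = 3
    est_w = iptw_mean_treated(y, a, w, 2)             # adjust for w only
    est_wz = iptw_mean_treated(y, a, 2 * w + z, 4)    # adjust for w and z
    return est_w, est_wz


def compare_adjustment_sets(n_reps=2000, n=500, seed=0):
    """Monte Carlo mean and standard deviation of both IPTW estimators."""
    rng = np.random.default_rng(seed)
    draws = np.array([simulate_once(rng, n) for _ in range(n_reps)])
    return {
        "mean_w": draws[:, 0].mean(),
        "sd_w": draws[:, 0].std(ddof=1),
        "mean_wz": draws[:, 1].mean(),
        "sd_wz": draws[:, 1].std(ddof=1),
    }


results = compare_adjustment_sets()
for name, value in results.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions, the estimator that also adjusts for `z` is expected to show a visibly larger Monte Carlo standard deviation than the one that adjusts for `w` alone, which is the kind of efficiency loss a propensity score selected purely on predictive fit would incur.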

Suggested Citation

  • Schnitzer Mireille E. & Lok Judith J. & Gruber Susan, 2016. "Variable Selection for Confounder Control, Flexible Modeling and Collaborative Targeted Minimum Loss-Based Estimation in Causal Inference," The International Journal of Biostatistics, De Gruyter, vol. 12(1), pages 97-115, May.
  • Handle: RePEc:bpj:ijbist:v:12:y:2016:i:1:p:97-115:n:9
    DOI: 10.1515/ijb-2015-0017

    Download full text from publisher

    File URL: https://doi.org/10.1515/ijb-2015-0017
    Download Restriction: For access to full text, subscription to the journal or payment for the individual article is required.


    References listed on IDEAS

    1. Andrea Rotnitzky & Lingling Li & Xiaochun Li, 2010. "A note on overadjustment in inverse probability weighted estimation," Biometrika, Biometrika Trust, vol. 97(4), pages 997-1001.
    2. Porter Kristin E. & Gruber Susan & van der Laan Mark J. & Sekhon Jasjeet S., 2011. "The Relative Performance of Targeted Maximum Likelihood Estimators," The International Journal of Biostatistics, De Gruyter, vol. 7(1), pages 1-34, August.
    3. Brookhart, M. Alan & van der Laan, Mark J., 2006. "A semiparametric model selection criterion with applications to the marginal structural model," Computational Statistics & Data Analysis, Elsevier, vol. 50(2), pages 475-498, January.
    4. Ciprian M. Crainiceanu & Francesca Dominici & Giovanni Parmigiani, 2008. "Adjustment uncertainty in effect estimation," Biometrika, Biometrika Trust, vol. 95(3), pages 635-651.
    5. Jinyong Hahn, 1998. "On the Role of the Propensity Score in Efficient Semiparametric Estimation of Average Treatment Effects," Econometrica, Econometric Society, vol. 66(2), pages 315-332, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Xun Lu, 2015. "A Covariate Selection Criterion for Estimation of Treatment Effects," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 33(4), pages 506-522, October.
    2. Lefebvre, Geneviève & Atherton, Juli & Talbot, Denis, 2014. "The effect of the prior distribution in the Bayesian Adjustment for Confounding algorithm," Computational Statistics & Data Analysis, Elsevier, vol. 70(C), pages 227-240.
    3. Agboola, Oluwagbenga David & Yu, Han, 2023. "Neighborhood-based cross fitting approach to treatment effects with high-dimensional data," Computational Statistics & Data Analysis, Elsevier, vol. 186(C).
    4. Noémi Kreif & Richard Grieve & Iván Díaz & David Harrison, 2015. "Evaluation of the Effect of a Continuous Treatment: A Machine Learning Approach with an Application to Treatment for Traumatic Brain Injury," Health Economics, John Wiley & Sons, Ltd., vol. 24(9), pages 1213-1228, September.
    5. Dettmann, E. & Becker, C. & Schmeißer, C., 2011. "Distance functions for matching in small samples," Computational Statistics & Data Analysis, Elsevier, vol. 55(5), pages 1942-1960, May.
    6. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP72/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    7. Alexander Hijzen & Sébastien Jean & Thierry Mayer, 2011. "The effects at home of initiating production abroad: evidence from matched French firms," Review of World Economics (Weltwirtschaftliches Archiv), Springer;Institut für Weltwirtschaft (Kiel Institute for the World Economy), vol. 147(3), pages 457-483, September.
    8. Sant’Anna, Pedro H.C. & Zhao, Jun, 2020. "Doubly robust difference-in-differences estimators," Journal of Econometrics, Elsevier, vol. 219(1), pages 101-122.
    9. Kitagawa, Toru & Muris, Chris, 2016. "Model averaging in semiparametric estimation of treatment effects," Journal of Econometrics, Elsevier, vol. 193(1), pages 271-289.
    10. Sung Jae Jun & Sokbae Lee, 2020. "Causal Inference under Outcome-Based Sampling with Monotonicity Assumptions," Papers 2004.08318, arXiv.org, revised Oct 2023.
    11. Samuel D. Lendle & Meenakshi S. Subbaraman & Mark J. van der Laan, 2013. "Identification and Efficient Estimation of the Natural Direct Effect among the Untreated," Biometrics, The International Biometric Society, vol. 69(2), pages 310-317, June.
    12. Alberto Abadie & Guido W. Imbens, 2002. "Simple and Bias-Corrected Matching Estimators for Average Treatment Effects," NBER Technical Working Papers 0283, National Bureau of Economic Research, Inc.
    13. Waverly Wei & Maya Petersen & Mark J van der Laan & Zeyu Zheng & Chong Wu & Jingshen Wang, 2023. "Efficient targeted learning of heterogeneous treatment effects for multiple subgroups," Biometrics, The International Biometric Society, vol. 79(3), pages 1934-1946, September.
    14. Michael C. Knaus, 2021. "A double machine learning approach to estimate the effects of musical practice on student’s skills," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 184(1), pages 282-300, January.
    15. Frölich, Markus & Michaelowa, Katharina, 2004. "Peer effects and textbooks in primary education: Evidence from francophone sub-Saharan Africa," HWWA Discussion Papers 311, Hamburg Institute of International Economics (HWWA).
    16. Jesse Rothstein & Albert Yoon, 2006. "Mismatch in Law School," Working Papers 29, Princeton University, School of Public and International Affairs, Education Research Section.
    17. Richard K. Crump & V. Joseph Hotz & Guido W. Imbens & Oscar A. Mitnik, 2008. "Nonparametric Tests for Treatment Effect Heterogeneity," The Review of Economics and Statistics, MIT Press, vol. 90(3), pages 389-405, August.
    18. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP54/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    19. Chunrong Ai & Oliver Linton & Kaiji Motegi & Zheng Zhang, 2021. "A unified framework for efficient estimation of general treatment models," Quantitative Economics, Econometric Society, vol. 12(3), pages 779-816, July.
    20. Luo, Yu & Graham, Daniel J. & McCoy, Emma J., 2023. "Semiparametric Bayesian doubly robust causal estimation," LSE Research Online Documents on Economics 117944, London School of Economics and Political Science, LSE Library.
