
Data-driven confounder selection via Markov and Bayesian networks

Author

Listed:
  • Jenny Häggström

Abstract

To unbiasedly estimate a causal effect on an outcome, unconfoundedness is often assumed. If there is sufficient knowledge on the underlying causal structure then existing confounder selection criteria can be used to select subsets of the observed pretreatment covariates, X, sufficient for unconfoundedness, if such subsets exist. Here, estimation of these target subsets is considered when the underlying causal structure is unknown. The proposed method is to model the causal structure by a probabilistic graphical model, for example, a Markov or Bayesian network, estimate this graph from observed data and select the target subsets given the estimated graph. The approach is evaluated by simulation both in a high-dimensional setting where unconfoundedness holds given X and in a setting where unconfoundedness only holds given subsets of X. Several common target subsets are investigated and the selected subsets are compared with respect to accuracy in estimating the average causal effect. The proposed method is implemented with existing software that can easily handle high-dimensional data, in terms of large samples and large number of covariates. The results from the simulation study show that, if unconfoundedness holds given X, this approach is very successful in selecting the target subsets, outperforming alternative approaches based on random forests and LASSO, and that the subset estimating the target subset containing all causes of outcome yields smallest MSE in the average causal effect estimation.
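
The last step described in the abstract, selecting target subsets given an estimated graph, can be illustrated with a small sketch. It is not the article's implementation. It assumes the estimated graph is a directed acyclic graph supplied as a hard-coded edge list, uses hypothetical node names (x1 to x4, T for treatment, Y for outcome), and reads "causes of treatment" and "causes of outcome" off the graph as direct parents, which may differ from the article's exact subset definitions. The graph-estimation step itself is not shown.

```python
# Hypothetical estimated DAG, written as (parent, child) edges.
# In practice the graph would come from a structure-learning step;
# here it is hard-coded purely for illustration.
edges = [
    ("x1", "T"),   # x1 affects treatment only
    ("x2", "T"),   # x2 affects both treatment and outcome
    ("x2", "Y"),
    ("x3", "Y"),   # x3 affects outcome only
    ("x4", "x3"),  # x4 is not a direct cause of T or Y
    ("T", "Y"),    # treatment affects outcome
]


def parents(node, edge_list):
    """Return the set of direct parents of `node` in the edge list."""
    return {parent for parent, child in edge_list if child == node}


def select_subsets(edge_list, treatment="T", outcome="Y"):
    """Read candidate covariate subsets off the graph.

    Assumption: direct parents stand in for 'causes' of a node.
    """
    causes_of_treatment = parents(treatment, edge_list) - {outcome}
    causes_of_outcome = parents(outcome, edge_list) - {treatment}
    return {
        "causes_of_treatment": causes_of_treatment,
        "causes_of_outcome": causes_of_outcome,
        "union": causes_of_treatment | causes_of_outcome,
    }


subsets = select_subsets(edges)
for name, members in subsets.items():
    print(name, sorted(members))
```

Each selected subset could then be passed to an estimator of the average causal effect, which is how the abstract describes comparing the subsets.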

Suggested Citation

  • Jenny Häggström, 2018. "Data-driven confounder selection via Markov and Bayesian networks," Biometrics, The International Biometric Society, vol. 74(2), pages 389-398, June.
  • Handle: RePEc:bla:biomet:v:74:y:2018:i:2:p:389-398
    DOI: 10.1111/biom.12788

    Download full text from publisher

    File URL: https://doi.org/10.1111/biom.12788
    Download Restriction: no


    References listed on IDEAS

    1. Persson, Emma & Häggström, Jenny & Waernbaum, Ingeborg & de Luna, Xavier, 2017. "Data-driven algorithms for dimension reduction in causal inference," Computational Statistics & Data Analysis, Elsevier, vol. 105(C), pages 280-292.
    2. Tyler J. VanderWeele & Ilya Shpitser, 2011. "A New Criterion for Confounder Selection," Biometrics, The International Biometric Society, vol. 67(4), pages 1406-1413, December.
    3. Scutari, Marco, 2010. "Learning Bayesian Networks with the bnlearn R Package," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 35(i03).
    4. van der Laan, Mark J. & Rubin, Daniel, 2006. "Targeted Maximum Likelihood Learning," The International Journal of Biostatistics, De Gruyter, vol. 2(1), pages 1-40, December.
    5. Xavier De Luna & Ingeborg Waernbaum & Thomas S. Richardson, 2011. "Covariate selection for the nonparametric estimation of an average treatment effect," Biometrika, Biometrika Trust, vol. 98(4), pages 861-875.
    6. Gruber, Susan & van der Laan, Mark, 2012. "tmle: An R Package for Targeted Maximum Likelihood Estimation," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 51(i13).
    7. Kapelner, Adam & Bleich, Justin, 2016. "bartMachine: Machine Learning with Bayesian Additive Regression Trees," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 70(i04).
    8. Alberto Abadie & Guido W. Imbens, 2006. "Large Sample Properties of Matching Estimators for Average Treatment Effects," Econometrica, Econometric Society, vol. 74(1), pages 235-267, January.
    9. Sekhon, Jasjeet S., 2011. "Multivariate and Propensity Score Matching Software with Automated Balance Optimization: The Matching package for R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 42(i07).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Bryan Keller, 2020. "Variable Selection for Causal Effect Estimation: Nonparametric Conditional Independence Testing With Random Forests," Journal of Educational and Behavioral Statistics, vol. 45(2), pages 119-142, April.
    2. Agboola, Oluwagbenga David & Yu, Han, 2023. "Neighborhood-based cross fitting approach to treatment effects with high-dimensional data," Computational Statistics & Data Analysis, Elsevier, vol. 186(C).
    3. Joseph Antonelli & Matthew Cefalu & Nathan Palmer & Denis Agniel, 2018. "Doubly robust matching estimators for high dimensional confounding adjustment," Biometrics, The International Biometric Society, vol. 74(4), pages 1171-1179, December.
    4. Persson, Emma & Häggström, Jenny & Waernbaum, Ingeborg & de Luna, Xavier, 2017. "Data-driven algorithms for dimension reduction in causal inference," Computational Statistics & Data Analysis, Elsevier, vol. 105(C), pages 280-292.
    5. Uehleke, Reinhard & Petrick, Martin & Hüttel, Silke, 2022. "Evaluations of agri-environmental schemes based on observational farm data: The importance of covariate selection," Land Use Policy, Elsevier, vol. 114(C).
    6. Susan Athey & Guido W. Imbens & Stefan Wager, 2018. "Approximate residual balancing: debiased inference of average treatment effects in high dimensions," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 80(4), pages 597-623, September.
    7. Jasjeet Singh Sekhon & Richard D. Grieve, 2012. "A matching method for improving covariate balance in cost‐effectiveness analyses," Health Economics, John Wiley & Sons, Ltd., vol. 21(6), pages 695-714, June.
    8. Tingting Zhou & Michael R. Elliott & Roderick J. A. Little, 2021. "Robust Causal Estimation from Observational Studies Using Penalized Spline of Propensity Score for Treatment Comparison," Stats, MDPI, vol. 4(2), pages 1-21, June.
    9. Xun Lu, 2015. "A Covariate Selection Criterion for Estimation of Treatment Effects," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 33(4), pages 506-522, October.
    10. Valentina A. Assenova & Olav Sorenson, 2017. "Legitimacy and the Benefits of Firm Formalization," Organization Science, INFORMS, vol. 28(5), pages 804-818, October.
    11. Grilli, Gianluca & Curtis, John, 2021. "An evaluation of public initiatives to change behaviours that affect water quality," Papers WP696, Economic and Social Research Institute (ESRI).
    12. Frida Skog, 2019. "Sibling Effects on Adult Earnings Among Poor and Wealthy Children: Evidence from Sweden," Child Indicators Research, Springer;The International Society of Child Indicators (ISCI), vol. 12(3), pages 917-942, June.
    13. Angelov, Nikolay & Eliason, Marcus, 2014. "The effects of targeted labour market programs for job seekers with occupational disabilities," Working Paper Series 2014:27, IFAU - Institute for Evaluation of Labour Market and Education Policy.
    14. Gruber, Susan & van der Laan, Mark J., 2010. "An Application of Collaborative Targeted Maximum Likelihood Estimation in Causal Inference and Genomics," The International Journal of Biostatistics, De Gruyter, vol. 6(1), pages 1-31, May.
    15. Frölich, Markus & Huber, Martin & Wiesenfarth, Manuel, 2017. "The finite sample performance of semi- and non-parametric estimators for treatment effects and policy evaluation," Computational Statistics & Data Analysis, Elsevier, vol. 115(C), pages 91-102.
    16. Miranda, Juan Jose & Corral, Leonardo & Blackman, Allen & Asner, Gregory & Lima, Eirivelthon, 2014. "Effects of Protected Areas on Forest Cover Change and Local Communities," RFF Working Paper Series dp-14-14, Resources for the Future.
    17. Susan Gruber & Mark J. van der Laan, 2013. "An Application of Targeted Maximum Likelihood Estimation to the Meta-Analysis of Safety Data," Biometrics, The International Biometric Society, vol. 69(1), pages 254-262, March.
    18. Massimo Baldini & Daniele Pacifico & Federica Termini, 2015. "Imputation of missing expenditure information in standard household income surveys," Center for the Analysis of Public Policies (CAPP) 0116, Universita di Modena e Reggio Emilia, Dipartimento di Economia "Marco Biagi".
    19. Robert J. Johnston & Klaus Moeltner, 2019. "Special Flood Hazard Effects on Coastal and Interior Home Values: One Size Does Not Fit All," Environmental & Resource Economics, Springer;European Association of Environmental and Resource Economists, vol. 74(1), pages 181-210, September.
    20. Jan-Hinrik Meyer-Sahling & Will Lowe & Christian van Stolk, 2016. "Silent professionalization: EU integration and the professional socialization of public officials in Central and Eastern Europe," European Union Politics, vol. 17(1), pages 162-183, March.
