Printed from https://ideas.repec.org/a/plo/pcbi00/1014436.html

Variable selection-combined causal mediation analysis for continuous treatments with application to large-dimensional biomedical data

Authors

  • Yajing Zhou
  • Kecheng Wei
  • Yahang Liu
  • Zhaoyang Li
  • Chen Huang
  • Guoyou Qin
  • Yongfu Yu

Abstract

Substantial progress has been made in causal inference with large-scale data, and the estimation of causal mediation effects has attracted considerable attention. However, existing large-dimensional causal inference primarily focuses on total effects or on typical causal mediation effects under binary variable settings, placing less emphasis on large-scale covariate selection with a continuous treatment and mediator. To address this, we propose a weighted semiparametric estimation framework that integrates the generalized outcome-adaptive LASSO method into generalized propensity score modeling to estimate causal mediation effects under continuous variable settings. Simulation results show that our proposed method outperforms other regularization-based methods in selection accuracy and estimation efficiency, which is achieved by incorporating outcome-related key variables and excluding noise covariates. In terms of achieving a stable balance between efficiency and bias, as well as filtering high-dimensional information, our method may serve as a compelling alternative that balances estimation efficiency with model interpretability and inferential robustness. We further conduct a real-world application based on the UK Biobank database, quantifying the causal mediation effects of apolipoprotein B levels within the association between potential diabetes risk and cancer incidence using large-scale healthcare and medical data.

Author summary

Disease development and progression are well recognized to be influenced by multiple factors, and exploring the causal mediation effects of the mediator in the exposure-outcome association can help reveal etiological mechanisms. With the widespread use of large-scale biological and health data, it is challenging to select all important variables precisely from prior knowledge alone and thereby obtain accurate estimates. In this study, we propose a generalized outcome-adaptive LASSO (GOAL)-combined weighted semiparametric approach to estimate the natural direct and indirect effects of a continuous treatment and mediator in large-scale covariate settings. Our method extends previous work by allowing accurate causal mediation estimates for a continuous treatment and mediator with large-dimensional covariates, and it also improves estimation efficiency by precisely incorporating outcome-related variables. We apply the proposed method to investigate the mediating role of apolipoprotein B in the association between potential diabetes risk and cancer incidence, using an extensive set of candidate covariates from biomedical and healthcare data.
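
The general idea of outcome-adaptive covariate selection for a continuous treatment can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a plain outcome-adaptive LASSO (without the additional terms of GOAL), a Gaussian linear model for the generalized propensity score, and simulated data, and it covers only the treatment weights, not the mediator model or the natural direct and indirect effects. All names and settings (`adaptive_lasso`, `lam`, `gamma`, the data-generating model) are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def adaptive_lasso(X, y, penalty_weights, lam, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * sum_j w_j |b_j|."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.copy()  # current residual y - Xb
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]           # residual with covariate j removed
            rho = X[:, j] @ r / n
            b[j] = soft_threshold(rho, lam * penalty_weights[j]) / col_sq[j]
            r -= X[:, j] * b[j]
    return b

def normal_pdf(x, var):
    return np.exp(-0.5 * x ** 2 / var) / np.sqrt(2 * np.pi * var)

# Simulated data (hypothetical): X0 is a confounder, X1 affects only the
# treatment, X2 affects only the outcome, X3..X9 are pure noise.
rng = np.random.default_rng(0)
n, p = 2000, 10
X = rng.standard_normal((n, p))
A = X[:, 0] + X[:, 1] + rng.standard_normal(n)            # continuous treatment
Y = A + X[:, 0] + X[:, 2] + rng.standard_normal(n)        # outcome

# Step 1: outcome regression of Y on (A, X) supplies the adaptive penalties.
design = np.column_stack([np.ones(n), A, X])
coef = np.linalg.lstsq(design, Y, rcond=None)[0]
beta_tilde = coef[2:]
gamma = 2.0
penalty = np.abs(beta_tilde) ** (-gamma)  # weak outcome predictors are penalized heavily

# Step 2: penalized treatment model (Gaussian generalized propensity score).
Xc = X - X.mean(axis=0)
Ac = A - A.mean()
alpha = adaptive_lasso(Xc, Ac, penalty, lam=0.05)

# Step 3: stabilized weights f(A) / f(A | X), both densities assumed normal.
resid = Ac - Xc @ alpha
weights = normal_pdf(Ac, Ac.var()) / normal_pdf(resid, resid.var())

print("selected covariates:", np.flatnonzero(alpha))
```

In this simulated setting the penalty is driven by the outcome model, so covariates unrelated to the outcome (including the one that only predicts treatment) tend to be dropped from the treatment model, while the confounder is retained.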

Suggested Citation

  • Yajing Zhou & Kecheng Wei & Yahang Liu & Zhaoyang Li & Chen Huang & Guoyou Qin & Yongfu Yu, 2026. "Variable selection-combined causal mediation analysis for continuous treatments with application to large-dimensional biomedical data," PLOS Computational Biology, Public Library of Science, vol. 22(6), pages 1-31, June.
  • Handle: RePEc:plo:pcbi00:1014436
    DOI: 10.1371/journal.pcbi.1014436

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1014436
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1014436&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pcbi.1014436?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
