Printed from https://ideas.repec.org/a/nat/natcom/v16y2025i1d10.1038_s41467-025-62525-z.html

FedECA: federated external control arms for causal inference with time-to-event data in distributed settings

Author

Listed:
  • Jean Ogier du Terrail

    (Inc.)

  • Quentin Klopfenstein

    (Inc.)

  • Honghao Li

    (Inc.)

  • Imke Mayer

    (Inc.)

  • Nicolas Loiseau

    (Inc.)

  • Mohammad Hallal

    (Inc.)

  • Michael Debouver

    (Inc.)

  • Thibault Camalon

    (Inc.)

  • Thibault Fouqueray

    (Inc.)

  • Jorge Arellano Castro

    (Inc.)

  • Zahia Yanes

    (Inc.)

  • Laëtitia Dahan

    (Hôpital la Timone)

  • Julien Taïeb

    (Université Paris Cité)

  • Pierre Laurent-Puig

    (Sorbonne Université, Inserm, Université Paris Cité; AP-HP Centre, Hôpital Européen Georges Pompidou)

  • Jean-Baptiste Bachet

    (APHP)

  • Shulin Zhao

    (Sorbonne Université, Inserm, Université Paris Cité)

  • Remy Nicolle

    (CNRS)

  • Jérôme Cros

    (Université Paris Cité - FHU MOSAIC, Beaujon Hospital)

  • Daniel Gonzalez

    (Fédération Francophone de Cancérologie Digestive)

  • Robert Carreras-Torres

    (Institut d’Investigació Biomèdica de Girona (IDIBGI))

  • Adelaida Garcia Velasco

    (Institut d’Investigació Biomèdica de Girona (IDIBGI); Doctor Josep Trueta University Hospital)

  • Kawther Abdilleh

    (Pancreatic Cancer Action Network)

  • Sudheer Doss

    (Pancreatic Cancer Action Network)

  • Félix Balazard

    (Inc.)

  • Mathieu Andreux

    (Inc.)

Abstract

External control arms can inform early clinical development of experimental drugs and provide efficacy evidence for regulatory approval. However, accessing sufficient real-world or historical clinical trial data is challenging: regulations that protect patients’ rights by strictly controlling data processing often make it difficult to pool data from multiple sources on a central server. To address these limitations, we develop a method that leverages federated learning to enable inverse probability of treatment weighting for time-to-event outcomes on separate cohorts without needing to pool data. To showcase its potential, we apply it in settings of increasing complexity, culminating in a real-world use case in which our method is used to compare the treatment effect of two approved chemotherapy regimens using data from three separate cohorts of patients with metastatic pancreatic cancer. We hope that sharing our code will foster the creation of federated research networks and thus accelerate drug development.
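
The weighting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' code or the FedECA implementation: it assumes a plain logistic propensity model fitted by gradient ascent, where each center shares only its summed gradient and its sample count, and it covers only the weighting step, not the time-to-event analysis. All function names and the toy data (two centers, an intercept plus one binary covariate) are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def local_gradient(rows, beta):
    """Summed log-likelihood gradient for one center.

    rows: list of (x, treated), x being a list of floats with intercept first.
    Only this aggregate, not the patient rows, is sent to the server.
    """
    grad = [0.0] * len(beta)
    for x, treated in rows:
        p = sigmoid(sum(b * xi for b, xi in zip(beta, x)))
        for j, xi in enumerate(x):
            grad[j] += (treated - p) * xi
    return grad


def federated_fit(centers, n_features, lr=0.5, steps=2000):
    """Gradient ascent on the pooled likelihood using per-center aggregates."""
    beta = [0.0] * n_features
    n_total = sum(len(rows) for rows in centers)  # sample counts are shared
    for _ in range(steps):
        grads = [local_gradient(rows, beta) for rows in centers]
        total = [sum(g[j] for g in grads) for j in range(n_features)]
        beta = [b + lr * t / n_total for b, t in zip(beta, total)]
    return beta


def iptw_weight(x, treated, beta):
    """Inverse probability of treatment weight: 1/p if treated, 1/(1-p) otherwise."""
    p = sigmoid(sum(b * xi for b, xi in zip(beta, x)))
    return 1.0 / p if treated == 1 else 1.0 / (1.0 - p)


# Toy data (invented for illustration): x = [intercept, binary covariate].
center_a = [([1.0, 0.0], 1), ([1.0, 0.0], 0), ([1.0, 1.0], 1), ([1.0, 1.0], 1)]
center_b = [([1.0, 0.0], 0), ([1.0, 0.0], 0), ([1.0, 1.0], 1), ([1.0, 1.0], 0)]

beta = federated_fit([center_a, center_b], n_features=2)
pooled_beta = federated_fit([center_a + center_b], n_features=2)

print("federated coefficients:", beta)
print("pooled coefficients:   ", pooled_beta)
print("weight, treated patient with covariate 0:", iptw_weight([1.0, 0.0], 1, beta))
```

Because the log-likelihood gradient is a sum over patients, adding the per-center gradients reproduces the pooled gradient, so the federated and pooled fits agree up to floating-point error. In a time-to-event analysis, weights of this kind would then enter a weighted survival model.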

Suggested Citation

  • Jean Ogier du Terrail & Quentin Klopfenstein & Honghao Li & Imke Mayer & Nicolas Loiseau & Mohammad Hallal & Michael Debouver & Thibault Camalon & Thibault Fouqueray & Jorge Arellano Castro & Zahia Ya, 2025. "FedECA: federated external control arms for causal inference with time-to-event data in distributed settings," Nature Communications, Nature, vol. 16(1), pages 1-22, December.
  • Handle: RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-62525-z
    DOI: 10.1038/s41467-025-62525-z

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-025-62525-z
    File Function: Abstract
    Download Restriction: no


    References listed on IDEAS

    1. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    2. Guido W. Imbens, 2003. "Sensitivity to Exogeneity Assumptions in Program Evaluation," American Economic Review, American Economic Association, vol. 93(2), pages 126-132, May.
    3. DiMasi, Joseph A. & Grabowski, Henry G. & Hansen, Ronald W., 2016. "Innovation in the pharmaceutical industry: New estimates of R&D costs," Journal of Health Economics, Elsevier, vol. 47(C), pages 20-33.
    4. Karan Singhal & Shekoofeh Azizi & Tao Tu & S. Sara Mahdavi & Jason Wei & Hyung Won Chung & Nathan Scales & Ajay Tanwani & Heather Cole-Lewis & Stephen Pfohl & Perry Payne & Martin Seneviratne & Paul G, 2023. "Publisher Correction: Large language models encode clinical knowledge," Nature, Nature, vol. 620(7973), pages 19-19, August.
    5. Sarthak Pati & Ujjwal Baid & Brandon Edwards & Micah Sheller & Shih-Han Wang & G. Anthony Reina & Patrick Foley & Alexey Gruzdev & Deepthi Karkada & Christos Davatzikos & Chiharu Sako & Satyam Ghodasa, 2022. "Federated learning enables big data for rare cancer boundary detection," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    6. Xiong, Ruoxuan & Koenecke, Allison & Powell, Michael & Shen, Zhu & Vogelstein, Joshua T. & Athey, Susan, 2021. "Federated Causal Inference in Heterogeneous Observational Data," Research Papers 3990, Stanford University, Graduate School of Business.
    7. Philippe Pébay & Timothy B. Terriberry & Hemanth Kolla & Janine Bennett, 2016. "Numerically stable, scalable formulas for parallel and online computation of higher-order multivariate central moments with arbitrary weights," Computational Statistics, Springer, vol. 31(4), pages 1305-1325, December.
    8. Karan Singhal & Shekoofeh Azizi & Tao Tu & S. Sara Mahdavi & Jason Wei & Hyung Won Chung & Nathan Scales & Ajay Tanwani & Heather Cole-Lewis & Stephen Pfohl & Perry Payne & Martin Seneviratne & Paul G, 2023. "Large language models encode clinical knowledge," Nature, Nature, vol. 620(7972), pages 172-180, August.
    9. Lihui Zhao & Brian Claggett & Lu Tian & Hajime Uno & Marc A. Pfeffer & Scott D. Solomon & Lorenzo Trippa & L. J. Wei, 2016. "On the restricted mean survival time curve in survival analysis," Biometrics, The International Biometric Society, vol. 72(1), pages 215-221, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Maxime Griot & Coralie Hemptinne & Jean Vanderdonckt & Demet Yuksel, 2025. "Large Language Models lack essential metacognition for reliable medical reasoning," Nature Communications, Nature, vol. 16(1), pages 1-10, December.
    2. Arslon Ruziboev & Dilmurod Turimov & Jiyoun Kim & Wooseong Kim, 2025. "Multiclass Classification of Sarcopenia Severity in Korean Adults Using Machine Learning and Model Fusion Approaches," Mathematics, MDPI, vol. 13(18), pages 1-22, September.
    3. Ali Nemati & Mohammad Assadi Shalmani & Qiang Lu & Jake Luo, 2025. "Benchmarking Large Language Models from Open and Closed Source Models to Apply Data Annotation for Free-Text Criteria in Healthcare," Future Internet, MDPI, vol. 17(4), pages 1-27, March.
    4. Victor Chernozhukov & Carlos Cinelli & Whitney Newey & Amit Sharma & Vasilis Syrgkanis, 2021. "Long Story Short: Omitted Variable Bias in Causal Machine Learning," Papers 2112.13398, arXiv.org, revised May 2024.
    5. Aditya Ghosh & Dominik Rothenhausler, 2025. "Assumption-robust Causal Inference," Papers 2505.08729, arXiv.org, revised Jun 2025.
    6. Cheng-Yi Li & Kao-Jung Chang & Cheng-Fu Yang & Hsin-Yu Wu & Wenting Chen & Hritik Bansal & Ling Chen & Yi-Ping Yang & Yu-Chun Chen & Shih-Pin Chen & Shih-Jen Chen & Jiing-Feng Lirng & Kai-Wei Chang & , 2025. "Towards a holistic framework for multimodal LLM in 3D brain CT radiology report generation," Nature Communications, Nature, vol. 16(1), pages 1-14, December.
    7. Tingmingke Lu, 2025. "Maximum Hallucination Standards for Domain-Specific Large Language Models," Papers 2503.05481, arXiv.org.
    8. Zheng, Shuwen & Pan, Kai & Liu, Jie & Chen, Yunxia, 2024. "Empirical study on fine-tuning pre-trained large language models for fault diagnosis of complex systems," Reliability Engineering and System Safety, Elsevier, vol. 252(C).
    9. Xiangru Tang & Qiao Jin & Kunlun Zhu & Tongxin Yuan & Yichi Zhang & Wangchunshu Zhou & Meng Qu & Yilun Zhao & Jian Tang & Zhuosheng Zhang & Arman Cohan & Dov Greenbaum & Zhiyong Lu & Mark Gerstein, 2025. "Risks of AI scientists: prioritizing safeguarding over autonomy," Nature Communications, Nature, vol. 16(1), pages 1-11, December.
    10. Aldo Gael Carranza & Susan Athey, 2023. "Federated Offline Policy Learning," Papers 2305.12407, arXiv.org, revised Oct 2024.
    11. Zhou, Zhen & Gu, Ziyuan & Qu, Xiaobo & Liu, Pan & Liu, Zhiyuan & Yu, Wenwu, 2024. "Urban mobility foundation model: A literature review and hierarchical perspective," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 192(C).
    12. Qingyu Chen & Yan Hu & Xueqing Peng & Qianqian Xie & Qiao Jin & Aidan Gilson & Maxwell B. Singer & Xuguang Ai & Po-Ting Lai & Zhizheng Wang & Vipina K. Keloth & Kalpana Raja & Jimin Huang & Huan He & , 2025. "Benchmarking large language models for biomedical natural language processing applications and recommendations," Nature Communications, Nature, vol. 16(1), pages 1-16, December.
    13. Shi, Ruoyao, 2024. "An Averaging Estimator For Two-Step M-Estimation In Semiparametric Models," Econometric Theory, Cambridge University Press, vol. 40(3), pages 652-687, June.
    14. Zhenjia Chen & Zhenyuan Lin & Ji Yang & Cong Chen & Di Liu & Liuting Shan & Yuanyuan Hu & Tailiang Guo & Huipeng Chen, 2024. "Cross-layer transmission realized by light-emitting memristor for constructing ultra-deep neural network with transfer learning ability," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    15. Yujin Oh & Sangjoon Park & Hwa Kyung Byun & Yeona Cho & Ik Jae Lee & Jin Sung Kim & Jong Chul Ye, 2024. "LLM-driven multimodal target volume contouring in radiation oncology," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    16. Chen Gao & Xiaochong Lan & Nian Li & Yuan Yuan & Jingtao Ding & Zhilun Zhou & Fengli Xu & Yong Li, 2024. "Large language models empowered agent-based modeling and simulation: a survey and perspectives," Humanities and Social Sciences Communications, Palgrave Macmillan, vol. 11(1), pages 1-24, December.
    17. Juexiao Zhou & Xiaonan He & Liyuan Sun & Jiannan Xu & Xiuying Chen & Yuetan Chu & Longxi Zhou & Xingyu Liao & Bin Zhang & Shawn Afvari & Xin Gao, 2024. "Pre-trained multimodal large language model enhances dermatological diagnosis using SkinGPT-4," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    18. Qin, Hongyi & Zhu, Yifan & Jiang, Yan & Luo, Siqi & Huang, Cui, 2024. "Examining the impact of personalization and carefulness in AI-generated health advice: Trust, adoption, and insights in online healthcare consultations experiments," Technology in Society, Elsevier, vol. 79(C).
    19. Ching-Nam Hang & Pei-Duo Yu & Roberto Morabito & Chee-Wei Tan, 2024. "Large Language Models Meet Next-Generation Networking Technologies: A Review," Future Internet, MDPI, vol. 16(10), pages 1-29, October.
    20. Stéphane Bonhomme & Martin Weidner, 2020. "Minimizing Sensitivity to Model Misspecification," CeMMAP working papers CWP37/20, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.