
A foundation model for generalizable cancer diagnosis and survival prediction from histopathological images

Author

Listed:
  • Zhaochang Yang

    (Shanghai Jiao Tong University)

  • Ting Wei

    (Shanghai Jiao Tong University)

  • Ying Liang

    (Shanghai Jiao Tong University
    Fudan University)

  • Xin Yuan

    (Shanghai Jiao Tong University
    Shanghai Jiao Tong University School of Medicine)

  • RuiTian Gao

    (Shanghai Jiao Tong University)

  • Yujia Xia

    (Shanghai Jiao Tong University)

  • Jie Zhou

    (Shanghai Jiao Tong University)

  • Yue Zhang

    (Shanghai Jiao Tong University)

  • Zhangsheng Yu

    (Shanghai Jiao Tong University
    Shanghai Jiao Tong University School of Medicine)

Abstract

Computational pathology, which uses whole slide images (WSIs) for pathological diagnosis, has advanced the development of intelligent healthcare. However, the scarcity of annotated data and histological differences hinder the general application of existing methods. The availability of extensive histopathological data and the robustness of self-supervised models on small-scale data make the development of foundation pathology models promising. Here we present BEPH (BEiT-based model Pre-training on Histopathological image), a foundation model that leverages self-supervised learning to learn meaningful representations from 11 million unlabeled histopathological images. These representations are then efficiently adapted to various tasks, including patch-level cancer diagnosis, WSI-level cancer classification, and survival prediction for multiple cancer subtypes. By leveraging the masked image modeling (MIM) pre-training approach, BEPH offers an efficient solution to enhance model performance, reduce the reliance on expert annotations, and facilitate the broader application of artificial intelligence in clinical settings. The pre-trained model is available at https://github.com/Zhcyoung/BEPH.
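
To make the masked image modeling idea in the abstract concrete, the following is a minimal NumPy sketch of the general pattern: split an image into patches, hide a random subset, and score a prediction only on the hidden patches. It is not the BEPH or BEiT implementation. The function names, the 16-pixel patch size, the 40% mask ratio, and the pixel-level mean squared error are all assumptions chosen for illustration; BEiT-style models predict discrete visual tokens rather than raw pixels, and the "model" here is a placeholder that predicts the mean of the visible patches.

```python
import numpy as np

def patchify(image, patch):
    """Split an (H, W, C) image into (N, patch*patch*C) flattened patches."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    gh, gw = h // patch, w // patch
    x = image.reshape(gh, patch, gw, patch, c)
    x = x.transpose(0, 2, 1, 3, 4)  # (gh, gw, patch, patch, c)
    return x.reshape(gh * gw, patch * patch * c)

def random_mask(num_patches, mask_ratio, rng):
    """Boolean mask with round(num_patches * mask_ratio) entries set to True."""
    n_masked = int(round(num_patches * mask_ratio))
    mask = np.zeros(num_patches, dtype=bool)
    mask[rng.choice(num_patches, size=n_masked, replace=False)] = True
    return mask

def masked_mse(pred, target, mask):
    """Mean squared error computed over the masked patches only."""
    diff = pred[mask] - target[mask]
    return float(np.mean(diff ** 2))

rng = np.random.default_rng(0)
image = rng.random((224, 224, 3))           # stand-in for a histopathology tile
patches = patchify(image, 16)               # 14 x 14 grid of 16 x 16 x 3 patches
mask = random_mask(len(patches), 0.4, rng)  # hypothetical 40% mask ratio

corrupted = patches.copy()
corrupted[mask] = 0.0                       # what an encoder would actually see

# Placeholder "model": predict the mean visible patch everywhere.
pred = np.tile(patches[~mask].mean(axis=0), (len(patches), 1))
loss = masked_mse(pred, patches, mask)
```

Because the loss is computed only where `mask` is true, no labels are needed: the training signal comes from the image itself, which is why this kind of pre-training can use large unlabeled collections.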

Suggested Citation

  • Zhaochang Yang & Ting Wei & Ying Liang & Xin Yuan & RuiTian Gao & Yujia Xia & Jie Zhou & Yue Zhang & Zhangsheng Yu, 2025. "A foundation model for generalizable cancer diagnosis and survival prediction from histopathological images," Nature Communications, Nature, vol. 16(1), pages 1-16, December.
  • Handle: RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-57587-y
    DOI: 10.1038/s41467-025-57587-y

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-025-57587-y
    File Function: Abstract
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Cosmin I. Bercea & Benedikt Wiestler & Daniel Rueckert & Julia A. Schnabel, 2025. "Evaluating normative representation learning in generative AI for robust anomaly detection in brain imaging," Nature Communications, Nature, vol. 16(1), pages 1-10, December.
    2. Thiers, Fabio A. & Lucy, Kimberly, 2024. "A Distinct Approach to Clinical GenAI Oversight," OSF Preprints vm6zy, Center for Open Science.
    3. Gabriele Campanella & Shengjia Chen & Manbir Singh & Ruchika Verma & Silke Muehlstedt & Jennifer Zeng & Aryeh Stock & Matt Croken & Brandon Veremis & Abdulkadir Elmas & Ivan Shujski & Noora Neittaanmä, 2025. "A clinical benchmark of public self-supervised pathology foundation models," Nature Communications, Nature, vol. 16(1), pages 1-12, December.
    4. Ruixue Zhang & Huate Zhu & Qinglin Chang & Qirong Mao, 2025. "A Comprehensive Review of Digital Twins Technology in Agriculture," Agriculture, MDPI, vol. 15(9), pages 1-25, April.
    5. Bojing Liu & Meaghan Polack & Nicolas Coudray & Adalberto Claudio Quiros & Theodore Sakellaropoulos & Hortense Le & Afreen Karimkhan & Augustinus S. L. P. Crobach & J. Han J. M. Krieken & Ke Yuan & Ro, 2025. "Self-supervised learning reveals clinically relevant histomorphological patterns for therapeutic strategies in colon cancer," Nature Communications, Nature, vol. 16(1), pages 1-18, December.
    6. Zhilong Weng & Alexander Seper & Alexey Pryalukhin & Fabian Mairinger & Claudia Wickenhauser & Marcus Bauer & Lennert Glamann & Hendrik Bläker & Thomas Lingscheidt & Wolfgang Hulla & Danny Jonigk & Si, 2024. "GrandQC: A comprehensive solution to quality control problem in digital pathology," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    7. Sofía Ortín Vela & Michael J. Beyeler & Olga Trofimova & Ilaria Iuliani & Jose D. Vargas Quiros & Victor A. Vries & Ilenia Meloni & Adham Elwakil & Florence Hoogewoud & Bart Liefers & David Presby & W, 2024. "Phenotypic and genetic characteristics of retinal vascular parameters and their association with diseases," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
    8. Luyang Luo & Mingxiang Wu & Mei Li & Yi Xin & Qiong Wang & Varut Vardhanabhuti & Winnie CW Chu & Zhenhui Li & Juan Zhou & Pranav Rajpurkar & Hao Chen, 2025. "A large model for non-invasive and personalized management of breast cancer from multiparametric MRI," Nature Communications, Nature, vol. 16(1), pages 1-14, December.
    9. Juan Manuel Zambrano Chaves & Shih-Cheng Huang & Yanbo Xu & Hanwen Xu & Naoto Usuyama & Sheng Zhang & Fei Wang & Yujia Xie & Mahmoud Khademi & Ziyi Yang & Hany Awadalla & Julia Gong & Houdong Hu & Jia, 2025. "A clinically accessible small multimodal radiology model and evaluation metric for chest X-ray findings," Nature Communications, Nature, vol. 16(1), pages 1-15, December.
    10. Shaopeng Yang & Zhuoyao Xin & Weijing Cheng & Pingting Zhong & Riqian Liu & Ziyu Zhu & Lisa Zhuoting Zhu & Xianwen Shang & Shida Chen & Wenyong Huang & Lei Zhang & Wei Wang, 2025. "Photoreceptor metabolic window unveils eye–body interactions," Nature Communications, Nature, vol. 16(1), pages 1-16, December.
    11. Dyke Ferber & Georg Wölflein & Isabella C. Wiest & Marta Ligero & Srividhya Sainath & Narmin Ghaffari Laleh & Omar S. M. El Nahhas & Gustav Müller-Franzes & Dirk Jäger & Daniel Truhn & Jakob Nikolas K, 2024. "In-context learning enables multimodal large language models to classify cancer pathology images," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    12. Weijian Huang & Cheng Li & Hong-Yu Zhou & Hao Yang & Jiarun Liu & Yong Liang & Hairong Zheng & Shaoting Zhang & Shanshan Wang, 2024. "Enhancing representation in radiography-reports foundation model: a granular alignment algorithm using masked contrastive learning," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    13. Jay Jasti & Hua Zhong & Vandana Panwar & Vipul Jarmale & Jeffrey Miyata & Deyssy Carrillo & Alana Christie & Dinesh Rakheja & Zora Modrusan & Edward Ernest Kadel & Niha Beig & Mahrukh Huseni & James B, 2025. "Histopathology based AI model predicts anti-angiogenic therapy response in renal cancer clinical trial," Nature Communications, Nature, vol. 16(1), pages 1-13, December.
