Author
Listed:
- Yuansong Zeng
(Sun Yat-sen University
Chongqing University
Jinfeng Laboratory)
- Jiancong Xie
(Sun Yat-sen University)
- Ningyuan Shangguan
(Sun Yat-sen University)
- Zhuoyi Wei
(Sun Yat-sen University
Ltd)
- Wenbing Li
(Sun Yat-sen University)
- Yun Su
(Ltd)
- Shuangyu Yang
(Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University)
- Chengyang Zhang
(Chongqing University)
- Jinbo Zhang
(Nanjing)
- Nan Fang
(Nanjing)
- Hongyu Zhang
(Chongqing University)
- Yutong Lu
(Sun Yat-sen University)
- Huiying Zhao
(Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University)
- Jue Fan
(Nanjing)
- Weijiang Yu
(Sun Yat-sen University
Ltd)
- Yuedong Yang
(Sun Yat-sen University)
Abstract
Single-cell sequencing provides transcriptomic profiling at single-cell resolution, uncovering cellular heterogeneity with unprecedented precision. Yet current single-cell data analysis suffers from inherent noise, batch effects, and sparsity, highlighting the need for a unified model to represent cellular states. To address this problem, many recent efforts have focused on training single-cell foundation models on large datasets. However, current human foundation models are still limited by the size of their training data and the number of model parameters. Here, we have collected a diverse dataset of 100 million human cells, on which we train a single-cell foundation model (CellFM) containing 800 million parameters. To balance efficiency and performance, the model is trained with a modified RetNet framework on MindSpore. Extensive experiments show that CellFM outperforms existing models in cell annotation, perturbation prediction, gene function prediction, and capturing gene-gene relationships.
Suggested Citation
Yuansong Zeng & Jiancong Xie & Ningyuan Shangguan & Zhuoyi Wei & Wenbing Li & Yun Su & Shuangyu Yang & Chengyang Zhang & Jinbo Zhang & Nan Fang & Hongyu Zhang & Yutong Lu & Huiying Zhao & Jue Fan & We, 2025.
"CellFM: a large-scale foundation model pre-trained on transcriptomics of 100 million human cells,"
Nature Communications, Nature, vol. 16(1), pages 1-17, December.
Handle:
RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-59926-5
DOI: 10.1038/s41467-025-59926-5