IDEAS home Printed from https://ideas.repec.org/a/nat/natcom/v13y2022i1d10.1038_s41467-022-33758-z.html

Online single-cell data integration through projecting heterogeneous datasets into a common cell-embedding space

Author

Listed:
  • Lei Xiong

    (Tsinghua University
    Tsinghua-Peking Center for Life Sciences
    Shanghai Qi Zhi Institute)

  • Kang Tian

    (Tsinghua University
    Tsinghua-Peking Center for Life Sciences)

  • Yuzhe Li

    (Tsinghua University
    Peking University)

  • Weixi Ning

    (Tsinghua University)

  • Xin Gao

    (King Abdullah University of Science and Technology (KAUST)
    BioMap)

  • Qiangfeng Cliff Zhang

    (Tsinghua University
    Tsinghua-Peking Center for Life Sciences)

Abstract

Computational tools for integrative analyses of diverse single-cell experiments are facing formidable new challenges, including dramatic increases in data scale, sample heterogeneity, and the need to informatively cross-reference new data with foundational datasets. Here, we present SCALEX, a deep-learning method that integrates single-cell data by projecting cells into a batch-invariant, common cell-embedding space in a truly online manner (i.e., without retraining the model). SCALEX substantially outperforms online iNMF and other state-of-the-art non-online integration methods on benchmark single-cell datasets of diverse modalities (e.g., single-cell RNA sequencing, scRNA-seq, and single-cell assay for transposase-accessible chromatin using sequencing, scATAC-seq), especially for datasets with partial overlaps, accurately aligning similar cell populations while retaining true biological differences. We showcase SCALEX’s advantages by constructing continuously expandable single-cell atlases for human, mouse, and COVID-19 patients, each assembled from diverse data sources and growing with every new dataset. The online data integration capacity and superior performance make SCALEX particularly appropriate for large-scale single-cell applications to build upon previous scientific insights.

Suggested Citation

  • Lei Xiong & Kang Tian & Yuzhe Li & Weixi Ning & Xin Gao & Qiangfeng Cliff Zhang, 2022. "Online single-cell data integration through projecting heterogeneous datasets into a common cell-embedding space," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
  • Handle: RePEc:nat:natcom:v:13:y:2022:i:1:d:10.1038_s41467-022-33758-z
    DOI: 10.1038/s41467-022-33758-z

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-022-33758-z
    File Function: Abstract
    Download Restriction: no


    References listed on IDEAS

    1. Sonya A. MacParland & Jeff C. Liu & Xue-Zhong Ma & Brendan T. Innes & Agata M. Bartczak & Blair K. Gage & Justin Manuel & Nicholas Khuu & Juan Echeverri & Ivan Linares & Rahul Gupta & Michael L. Cheng, 2018. "Single cell RNA sequencing of human liver reveals distinct intrahepatic macrophage populations," Nature Communications, Nature, vol. 9(1), pages 1-21, December.
    2. Amos Tanay & Aviv Regev, 2017. "Scaling single-cell genomics from phenomenology to mechanism," Nature, Nature, vol. 541(7637), pages 331-338, January.
    3. Nadim Aizarani & Antonio Saviano & Sagar & Laurent Mailly & Sarah Durand & Josip S. Herman & Patrick Pessaux & Thomas F. Baumert & Dominic Grün, 2019. "A human liver cell atlas reveals heterogeneity and epithelial progenitors," Nature, Nature, vol. 572(7768), pages 199-204, August.
    4. Lindsey W. Plasschaert & Rapolas Žilionis & Rayman Choo-Wing & Virginia Savova & Judith Knehr & Guglielmo Roma & Allon M. Klein & Aron B. Jaffe, 2018. "A single-cell atlas of the airway epithelium reveals the CFTR-rich pulmonary ionocyte," Nature, Nature, vol. 560(7718), pages 377-381, August.
    5. Orit Rozenblatt-Rosen & Michael J. T. Stubbington & Aviv Regev & Sarah A. Teichmann, 2017. "The Human Cell Atlas: from vision to reality," Nature, Nature, vol. 550(7677), pages 451-453, October.
    6. Chuang Guo & Bin Li & Huan Ma & Xiaofang Wang & Pengfei Cai & Qiaoni Yu & Lin Zhu & Liying Jin & Chen Jiang & Jingwen Fang & Qian Liu & Dandan Zong & Wen Zhang & Yichen Lu & Kun Li & Xuyuan Gao & Binq, 2020. "Single-cell analysis of two severe COVID-19 patients reveals a monocyte-associated and tocilizumab-responding cytokine storm," Nature Communications, Nature, vol. 11(1), pages 1-11, December.
    7. Rongxin Fang & Sebastian Preissl & Yang Li & Xiaomeng Hou & Jacinta Lucero & Xinxin Wang & Amir Motamedi & Andrew K. Shiau & Xinzhu Zhou & Fangming Xie & Eran A. Mukamel & Kai Zhang & Yanxiao Zhang & , 2021. "Comprehensive analysis of single cell ATAC-seq data with SnapATAC," Nature Communications, Nature, vol. 12(1), pages 1-15, December.
    8. Jason D. Buenrostro & Beijing Wu & Ulrike M. Litzenburger & Dave Ruff & Michael L. Gonzales & Michael P. Snyder & Howard Y. Chang & William J. Greenleaf, 2015. "Single-cell chromatin accessibility reveals principles of regulatory variation," Nature, Nature, vol. 523(7561), pages 486-490, July.
    9. Lei Xiong & Kui Xu & Kang Tian & Yanqiu Shao & Lei Tang & Ge Gao & Michael Zhang & Tao Jiang & Qiangfeng Cliff Zhang, 2019. "SCALE method for single-cell ATAC-seq analysis via latent feature extraction," Nature Communications, Nature, vol. 10(1), pages 1-10, December.
    10. Lawrence Hubert & Phipps Arabie, 1985. "Comparing partitions," Journal of Classification, Springer;The Classification Society, vol. 2(1), pages 193-218, December.
    11. Xiaoping Han & Ziming Zhou & Lijiang Fei & Huiyu Sun & Renying Wang & Yao Chen & Haide Chen & Jingjing Wang & Huanna Tang & Wenhao Ge & Yincong Zhou & Fang Ye & Mengmeng Jiang & Junqing Wu & Yanyu Xia, 2020. "Construction of a human cell landscape at single-cell level," Nature, Nature, vol. 581(7808), pages 303-309, May.
    12. Nayoung Kim & Hong Kwan Kim & Kyungjong Lee & Yourae Hong & Jong Ho Cho & Jung Won Choi & Jung-Il Lee & Yeon-Lim Suh & Bo Mi Ku & Hye Hyeon Eum & Soyean Choi & Yoon-La Choi & Je-Gun Joung & Woong-Yang, 2020. "Single-cell RNA sequencing demonstrates the molecular and cellular reprogramming of metastatic lung adenocarcinoma," Nature Communications, Nature, vol. 11(1), pages 1-15, December.

    Citations

    Cited by:

    1. Songming Tang & Xuejian Cui & Rongxiang Wang & Sijie Li & Siyu Li & Xin Huang & Shengquan Chen, 2024. "scCASE: accurate and interpretable enhancement for single-cell chromatin accessibility sequencing data," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
    2. Andreas Fønss Møller & Jesper Grud Skat Madsen, 2023. "JOINTLY: interpretable joint clustering of single-cell transcriptomes," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    3. Kai Cao & Qiyu Gong & Yiguang Hong & Lin Wan, 2022. "A unified computational framework for single-cell data integration with optimal transport," Nature Communications, Nature, vol. 13(1), pages 1-15, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Songming Tang & Xuejian Cui & Rongxiang Wang & Sijie Li & Siyu Li & Xin Huang & Shengquan Chen, 2024. "scCASE: accurate and interpretable enhancement for single-cell chromatin accessibility sequencing data," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
    2. Zhijian Li & Christoph Kuppe & Susanne Ziegler & Mingbo Cheng & Nazanin Kabgani & Sylvia Menzel & Martin Zenke & Rafael Kramann & Ivan G. Costa, 2021. "Chromatin-accessibility estimation from single-cell ATAC-seq data with scOpen," Nature Communications, Nature, vol. 12(1), pages 1-14, December.
    3. Ajita Shree & Musale Krushna Pavan & Hamim Zafar, 2023. "scDREAMER for atlas-level integration of single-cell datasets using deep generative model paired with adversarial classifier," Nature Communications, Nature, vol. 14(1), pages 1-19, December.
    4. Franziska Hildebrandt & Alma Andersson & Sami Saarenpää & Ludvig Larsson & Noémi Van Hul & Sachie Kanatani & Jan Masek & Ewa Ellis & Antonio Barragan & Annelie Mollbrink & Emma R. Andersson & Joakim L, 2021. "Spatial Transcriptomics to define transcriptional patterns of zonation and structural components in the mouse liver," Nature Communications, Nature, vol. 12(1), pages 1-14, December.
    5. Han Luo & Xuyang Xia & Li-Bin Huang & Hyunsu An & Minyuan Cao & Gyeong Dae Kim & Hai-Ning Chen & Wei-Han Zhang & Yang Shu & Xiangyu Kong & Zhixiang Ren & Pei-Heng Li & Yang Liu & Huairong Tang & Rongh, 2022. "Pan-cancer single-cell analysis reveals the heterogeneity and plasticity of cancer-associated fibroblasts in the tumor microenvironment," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    6. T. Hautz & S. Salcher & M. Fodor & G. Sturm & S. Ebner & A. Mair & M. Trebo & G. Untergasser & S. Sopper & B. Cardini & A. Martowicz & J. Hofmann & S. Daum & M. Kalb & T. Resch & F. Krendl & A. Weisse, 2023. "Immune cell dynamics deconvoluted by single-cell RNA sequencing in normothermic machine perfusion of the liver," Nature Communications, Nature, vol. 14(1), pages 1-18, December.
    7. Md Tauhidul Islam & Jen-Yeu Wang & Hongyi Ren & Xiaomeng Li & Masoud Badiei Khuzani & Shengtian Sang & Lequan Yu & Liyue Shen & Wei Zhao & Lei Xing, 2022. "Leveraging data-driven self-consistency for high-fidelity gene expression recovery," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    8. Joyce B. Kang & Aparna Nathan & Kathryn Weinand & Fan Zhang & Nghia Millard & Laurie Rumker & D. Branch Moody & Ilya Korsunsky & Soumya Raychaudhuri, 2021. "Efficient and precise single-cell reference atlas mapping with Symphony," Nature Communications, Nature, vol. 12(1), pages 1-21, December.
    9. Ting Dong & Guangan Hu & Zhongqi Fan & Huirui Wang & Yinghui Gao & Sisi Wang & Hao Xu & Michael B. Yaffe & Matthew G. Vander Heiden & Guoyue Lv & Jianzhu Chen, 2024. "Activation of GPR3-β-arrestin2-PKM2 pathway in Kupffer cells stimulates glycolysis and inhibits obesity and liver pathogenesis," Nature Communications, Nature, vol. 15(1), pages 1-18, December.
    10. Xiang Lin & Tian Tian & Zhi Wei & Hakon Hakonarson, 2022. "Clustering of single-cell multi-omics data with a multimodal deep learning method," Nature Communications, Nature, vol. 13(1), pages 1-18, December.
    11. Yuanyuan Chen & Reka Toth & Sara Chocarro & Dieter Weichenhan & Joschka Hey & Pavlo Lutsik & Stefan Sawall & Georgios T. Stathopoulos & Christoph Plass & Rocio Sotillo, 2022. "Club cells employ regeneration mechanisms during lung tumorigenesis," Nature Communications, Nature, vol. 13(1), pages 1-16, December.
    12. Shengen Shawn Hu & Lin Liu & Qi Li & Wenjing Ma & Michael J. Guertin & Clifford A. Meyer & Ke Deng & Tingting Zhang & Chongzhi Zang, 2022. "Intrinsic bias estimation for improved analysis of bulk and single-cell chromatin accessibility profiles using SELMA," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    13. Ziye Xu & Tianyu Zhang & Hongyu Chen & Yuyi Zhu & Yuexiao Lv & Shunji Zhang & Jiaye Chen & Haide Chen & Lili Yang & Weiqin Jiang & Shengyu Ni & Fangru Lu & Zhaolun Wang & Hao Yang & Ling Dong & Feng C, 2023. "High-throughput single nucleus total RNA sequencing of formalin-fixed paraffin-embedded tissues by snRandom-seq," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    14. Alicja Grześkowiak, 2016. "Assessment of Participation in Cultural Activities in Poland by Selected Multivariate Methods," European Journal of Social Sciences Education and Research Articles, Revistia Research and Publishing, vol. 3, January -.
    15. Yunpeng Zhao & Qing Pan & Chengan Du, 2019. "Logistic regression augmented community detection for network data with application in identifying autism‐related gene pathways," Biometrics, The International Biometric Society, vol. 75(1), pages 222-234, March.
    16. Wu, Han-Ming & Tien, Yin-Jing & Chen, Chun-houh, 2010. "GAP: A graphical environment for matrix visualization and cluster analysis," Computational Statistics & Data Analysis, Elsevier, vol. 54(3), pages 767-778, March.
    17. José E. Chacón, 2021. "Explicit Agreement Extremes for a 2 × 2 Table with Given Marginals," Journal of Classification, Springer;The Classification Society, vol. 38(2), pages 257-263, July.
    18. F. Marta L. Di Lascio & Andrea Menapace & Roberta Pappadà, 2024. "A spatially‐weighted AMH copula‐based dissimilarity measure for clustering variables: An application to urban thermal efficiency," Environmetrics, John Wiley & Sons, Ltd., vol. 35(1), February.
    19. Yifan Zhu & Chongzhi Di & Ying Qing Chen, 2019. "Clustering Functional Data with Application to Electronic Medication Adherence Monitoring in HIV Prevention Trials," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 11(2), pages 238-261, July.
    20. Irene Vrbik & Paul McNicholas, 2015. "Fractionally-Supervised Classification," Journal of Classification, Springer;The Classification Society, vol. 32(3), pages 359-381, October.
