
scDREAMER for atlas-level integration of single-cell datasets using deep generative model paired with adversarial classifier

Author

Listed:
  • Ajita Shree

    (Indian Institute of Technology Kanpur)

  • Musale Krushna Pavan

    (Indian Institute of Technology Kanpur)

  • Hamim Zafar

    (Indian Institute of Technology Kanpur)

Abstract

Integration of heterogeneous single-cell sequencing datasets generated across multiple tissue locations, time, and conditions is essential for a comprehensive understanding of the cellular states and expression programs underlying complex biological systems. Here, we present scDREAMER ( https://github.com/Zafar-Lab/scDREAMER ), a data-integration framework that employs deep generative models and adversarial training for both unsupervised and supervised (scDREAMER-Sup) integration of multiple batches. Using six real benchmarking datasets, we demonstrate that scDREAMER can overcome critical challenges including skewed cell type distribution among batches, nested batch-effects, large number of batches and conservation of development trajectory across batches. Our experiments also show that scDREAMER and scDREAMER-Sup outperform state-of-the-art unsupervised and supervised integration methods respectively in batch-correction and conservation of biological variation. Using a 1 million cells dataset, we demonstrate that scDREAMER is scalable and can perform atlas-level cross-species (e.g., human and mouse) integration while being faster than other deep-learning-based methods.

Suggested Citation

  • Ajita Shree & Musale Krushna Pavan & Hamim Zafar, 2023. "scDREAMER for atlas-level integration of single-cell datasets using deep generative model paired with adversarial classifier," Nature Communications, Nature, vol. 14(1), pages 1-19, December.
  • Handle: RePEc:nat:natcom:v:14:y:2023:i:1:d:10.1038_s41467-023-43590-8
    DOI: 10.1038/s41467-023-43590-8

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-023-43590-8
    File Function: Abstract
    Download Restriction: no


    References listed on IDEAS

    1. Xiaokang Yu & Xinyi Xu & Jingxiao Zhang & Xiangjie Li, 2023. "Batch alignment of single-cell transcriptomics data using deep metric learning," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    2. Jiarui Ding & Aviv Regev, 2021. "Deep generative model embedding of single-cell RNA-Seq profiles on hyperspheres and hyperbolic spaces," Nature Communications, Nature, vol. 12(1), pages 1-17, December.
    3. Kazumasa Kanemaru & James Cranley & Daniele Muraro & Antonio M. A. Miranda & Siew Yen Ho & Anna Wilbrey-Clark & Jan Patrick Pett & Krzysztof Polanski & Laura Richardson & Monika Litvinukova & Natsuhik, 2023. "Spatially resolved multiomics of human cardiac niches," Nature, Nature, vol. 619(7971), pages 801-810, July.
    4. Blanca Pijuan-Sala & Jonathan A. Griffiths & Carolina Guibentif & Tom W. Hiscock & Wajid Jawaid & Fernando J. Calero-Nieto & Carla Mulas & Ximena Ibarra-Soria & Richard C. V. Tyser & Debbie Lee Lian H, 2019. "A single-cell molecular map of mouse gastrulation and early organogenesis," Nature, Nature, vol. 566(7745), pages 490-495, February.
    5. Zhe Sun & Li Chen & Hongyi Xin & Yale Jiang & Qianhui Huang & Anthony R. Cillo & Tracy Tabib & Jay K. Kolls & Tullia C. Bruno & Robert Lafyatis & Dario A. A. Vignali & Kong Chen & Ying Ding & Ming Hu , 2019. "A Bayesian mixture model for clustering droplet-based single-cell transcriptomic data from population studies," Nature Communications, Nature, vol. 10(1), pages 1-10, December.
    6. Orit Rozenblatt-Rosen & Michael J. T. Stubbington & Aviv Regev & Sarah A. Teichmann, 2017. "The Human Cell Atlas: from vision to reality," Nature, Nature, vol. 550(7677), pages 451-453, October.
    7. Xiangjie Li & Kui Wang & Yafei Lyu & Huize Pan & Jingxiao Zhang & Dwight Stambolian & Katalin Susztak & Muredach P. Reilly & Gang Hu & Mingyao Li, 2020. "Deep learning enables accurate clustering with batch effect removal in single-cell RNA-seq analysis," Nature Communications, Nature, vol. 11(1), pages 1-14, December.
    8. Lawrence Hubert & Phipps Arabie, 1985. "Comparing partitions," Journal of Classification, Springer;The Classification Society, vol. 2(1), pages 193-218, December.
    9. Xiaoping Han & Ziming Zhou & Lijiang Fei & Huiyu Sun & Renying Wang & Yao Chen & Haide Chen & Jingjing Wang & Huanna Tang & Wenhao Ge & Yincong Zhou & Fang Ye & Mengmeng Jiang & Junqing Wu & Yanyu Xia, 2020. "Construction of a human cell landscape at single-cell level," Nature, Nature, vol. 581(7808), pages 303-309, May.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Lei Xiong & Kang Tian & Yuzhe Li & Weixi Ning & Xin Gao & Qiangfeng Cliff Zhang, 2022. "Online single-cell data integration through projecting heterogeneous datasets into a common cell-embedding space," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    2. Yingxin Lin & Yue Cao & Elijah Willie & Ellis Patrick & Jean Y. H. Yang, 2023. "Atlas-scale single-cell multi-sample multi-condition data integration using scMerge2," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    3. Allen W. Lynch & Myles Brown & Clifford A. Meyer, 2023. "Multi-batch single-cell comparative atlas construction by deep learning disentanglement," Nature Communications, Nature, vol. 14(1), pages 1-22, December.
    4. Xiaokang Yu & Xinyi Xu & Jingxiao Zhang & Xiangjie Li, 2023. "Batch alignment of single-cell transcriptomics data using deep metric learning," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    5. Lucy Xia & Christy Lee & Jingyi Jessica Li, 2024. "Statistical method scDEED for detecting dubious 2D single-cell embeddings and optimizing t-SNE and UMAP hyperparameters," Nature Communications, Nature, vol. 15(1), pages 1-21, December.
    6. Md Tauhidul Islam & Jen-Yeu Wang & Hongyi Ren & Xiaomeng Li & Masoud Badiei Khuzani & Shengtian Sang & Lequan Yu & Liyue Shen & Wei Zhao & Lei Xing, 2022. "Leveraging data-driven self-consistency for high-fidelity gene expression recovery," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    7. Joyce B. Kang & Aparna Nathan & Kathryn Weinand & Fan Zhang & Nghia Millard & Laurie Rumker & D. Branch Moody & Ilya Korsunsky & Soumya Raychaudhuri, 2021. "Efficient and precise single-cell reference atlas mapping with Symphony," Nature Communications, Nature, vol. 12(1), pages 1-21, December.
    8. Qihuang Zhang & Shunzhou Jiang & Amelia Schroeder & Jian Hu & Kejie Li & Baohong Zhang & David Dai & Edward B. Lee & Rui Xiao & Mingyao Li, 2023. "Leveraging spatial transcriptomics data to recover cell locations in single-cell RNA-seq with CeLEry," Nature Communications, Nature, vol. 14(1), pages 1-19, December.
    9. Ziye Xu & Tianyu Zhang & Hongyu Chen & Yuyi Zhu & Yuexiao Lv & Shunji Zhang & Jiaye Chen & Haide Chen & Lili Yang & Weiqin Jiang & Shengyu Ni & Fangru Lu & Zhaolun Wang & Hao Yang & Ling Dong & Feng C, 2023. "High-throughput single nucleus total RNA sequencing of formalin-fixed paraffin-embedded tissues by snRandom-seq," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    10. Amir Alavi & Ziv Bar-Joseph, 2020. "Iterative point set registration for aligning scRNA-seq data," PLOS Computational Biology, Public Library of Science, vol. 16(10), pages 1-21, October.
    11. Alicja Grześkowiak, 2016. "Assessment of Participation in Cultural Activities in Poland by Selected Multivariate Methods," European Journal of Social Sciences Education and Research Articles, Revistia Research and Publishing, vol. 3, January -.
    12. Yunpeng Zhao & Qing Pan & Chengan Du, 2019. "Logistic regression augmented community detection for network data with application in identifying autism‐related gene pathways," Biometrics, The International Biometric Society, vol. 75(1), pages 222-234, March.
    13. Wu, Han-Ming & Tien, Yin-Jing & Chen, Chun-houh, 2010. "GAP: A graphical environment for matrix visualization and cluster analysis," Computational Statistics & Data Analysis, Elsevier, vol. 54(3), pages 767-778, March.
    14. José E. Chacón, 2021. "Explicit Agreement Extremes for a 2 × 2 Table with Given Marginals," Journal of Classification, Springer;The Classification Society, vol. 38(2), pages 257-263, July.
    15. F. Marta L. Di Lascio & Andrea Menapace & Roberta Pappadà, 2024. "A spatially‐weighted AMH copula‐based dissimilarity measure for clustering variables: An application to urban thermal efficiency," Environmetrics, John Wiley & Sons, Ltd., vol. 35(1), February.
    16. Yifan Zhu & Chongzhi Di & Ying Qing Chen, 2019. "Clustering Functional Data with Application to Electronic Medication Adherence Monitoring in HIV Prevention Trials," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 11(2), pages 238-261, July.
    17. Irene Vrbik & Paul McNicholas, 2015. "Fractionally-Supervised Classification," Journal of Classification, Springer;The Classification Society, vol. 32(3), pages 359-381, October.
    18. Maurizio Vichi & Carlo Cavicchia & Patrick J. F. Groenen, 2022. "Hierarchical Means Clustering," Journal of Classification, Springer;The Classification Society, vol. 39(3), pages 553-577, November.
    19. Batool, Fatima & Hennig, Christian, 2021. "Clustering with the Average Silhouette Width," Computational Statistics & Data Analysis, Elsevier, vol. 158(C).
    20. Patrick D. Shay & Stephen S. Farnsworth Mick, 2017. "Clustered and distinct: a taxonomy of local multihospital systems," Health Care Management Science, Springer, vol. 20(3), pages 303-315, September.
