
Deep learning enables accurate clustering with batch effect removal in single-cell RNA-seq analysis

Author

Listed:
  • Xiangjie Li

    (University of Pennsylvania
    Renmin University of China
    Chinese Academy of Medical Sciences and Peking Union Medical College)

  • Kui Wang

    (University of Pennsylvania
    Nankai University)

  • Yafei Lyu

    (University of Pennsylvania)

  • Huize Pan

    (Columbia University Medical Center)

  • Jingxiao Zhang

    (Renmin University of China)

  • Dwight Stambolian

    (University of Pennsylvania)

  • Katalin Susztak

    (University of Pennsylvania)

  • Muredach P. Reilly

    (Columbia University Medical Center)

  • Gang Hu

    (University of Pennsylvania
    Nankai University)

  • Mingyao Li

    (University of Pennsylvania)

Abstract

Single-cell RNA sequencing (scRNA-seq) can characterize cell types and states through unsupervised clustering, but the ever increasing number of cells and batch effect impose computational challenges. We present DESC, an unsupervised deep embedding algorithm that clusters scRNA-seq data by iteratively optimizing a clustering objective function. Through iterative self-learning, DESC gradually removes batch effects, as long as technical differences across batches are smaller than true biological variations. As a soft clustering algorithm, cluster assignment probabilities from DESC are biologically interpretable and can reveal both discrete and pseudotemporal structure of cells. Comprehensive evaluations show that DESC offers a proper balance of clustering accuracy and stability, has a small footprint on memory, does not explicitly require batch information for batch effect removal, and can utilize GPU when available. As the scale of single-cell studies continues to grow, we believe DESC will offer a valuable tool for biomedical researchers to disentangle complex cellular heterogeneity.
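
The abstract does not spell out the clustering objective. As a rough illustration of what iteratively optimizing a soft-clustering objective can look like, the sketch below assumes a DEC-style formulation: Student's t soft assignments are sharpened into a target distribution, and the two are compared by KL divergence. The function names, the toy 2-D "embedding", and the fixed centroids are all hypothetical and are not taken from the DESC software. A real implementation would also update an encoder network and the centroids by gradient descent on this loss.

```python
import math
import random

def soft_assign(z, centroids, alpha=1.0):
    """Soft cluster probabilities per cell from a Student's t kernel (assumed form)."""
    q = []
    for cell in z:
        row = []
        for c in centroids:
            d2 = sum((a - b) ** 2 for a, b in zip(cell, c))  # squared distance
            row.append((1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])  # each row sums to 1
    return q

def target_distribution(q):
    """Sharpen q: square each entry, divide by the soft cluster size, renormalize."""
    sizes = [sum(col) for col in zip(*q)]
    p = []
    for row in q:
        w = [v * v / s for v, s in zip(row, sizes)]
        total = sum(w)
        p.append([v / total for v in w])
    return p

def kl_divergence(p, q):
    """KL(P || Q) summed over all cells and clusters."""
    return sum(pi * math.log(pi / qi)
               for p_row, q_row in zip(p, q)
               for pi, qi in zip(p_row, q_row))

# Toy data: two well-separated groups of "cells" in a 2-D embedding (hypothetical).
random.seed(0)
z = [[random.gauss(0.0, 0.3), random.gauss(0.0, 0.3)] for _ in range(50)]
z += [[random.gauss(3.0, 0.3), random.gauss(3.0, 0.3)] for _ in range(50)]
centroids = [[0.0, 0.0], [3.0, 3.0]]

q = soft_assign(z, centroids)       # soft assignment probabilities
p = target_distribution(q)          # self-learning target
loss = kl_divergence(p, q)          # objective a training loop would minimize
labels = [max(range(len(row)), key=row.__getitem__) for row in q]
```

Each row of `q` is a probability vector over clusters, which is the kind of per-cell assignment probability the abstract describes as interpretable; taking the largest entry per row gives a hard label when one is needed.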

Suggested Citation

  • Xiangjie Li & Kui Wang & Yafei Lyu & Huize Pan & Jingxiao Zhang & Dwight Stambolian & Katalin Susztak & Muredach P. Reilly & Gang Hu & Mingyao Li, 2020. "Deep learning enables accurate clustering with batch effect removal in single-cell RNA-seq analysis," Nature Communications, Nature, vol. 11(1), pages 1-14, December.
  • Handle: RePEc:nat:natcom:v:11:y:2020:i:1:d:10.1038_s41467-020-15851-3
    DOI: 10.1038/s41467-020-15851-3

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-020-15851-3
    File Function: Abstract
    Download Restriction: no


    Citations

    Citations are extracted by the CitEc Project.


    Cited by:

    1. Yun-Tsan Chang & Pacôme Prompsy & Susanne Kimeswenger & Yi-Chien Tsai & Desislava Ignatova & Olesya Pavlova & Christoph Iselin & Lars E. French & Mitchell P. Levesque & François Kuonen & Malgorzata Bo, 2024. "MHC-I upregulation safeguards neoplastic T cells in the skin against NK cell-mediated eradication in mycosis fungoides," Nature Communications, Nature, vol. 15(1), pages 1-18, December.
    2. Xiaokang Yu & Xinyi Xu & Jingxiao Zhang & Xiangjie Li, 2023. "Batch alignment of single-cell transcriptomics data using deep metric learning," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    3. Ajita Shree & Musale Krushna Pavan & Hamim Zafar, 2023. "scDREAMER for atlas-level integration of single-cell datasets using deep generative model paired with adversarial classifier," Nature Communications, Nature, vol. 14(1), pages 1-19, December.
    4. Yasa Baig & Helena R. Ma & Helen Xu & Lingchong You, 2023. "Autoencoder neural networks enable low dimensional structure analyses of microbial growth dynamics," Nature Communications, Nature, vol. 14(1), pages 1-17, December.
    5. Qihuang Zhang & Shunzhou Jiang & Amelia Schroeder & Jian Hu & Kejie Li & Baohong Zhang & David Dai & Edward B. Lee & Rui Xiao & Mingyao Li, 2023. "Leveraging spatial transcriptomics data to recover cell locations in single-cell RNA-seq with CeLEry," Nature Communications, Nature, vol. 14(1), pages 1-19, December.
    6. Zhuohan Yu & Yanchi Su & Yifu Lu & Yuning Yang & Fuzhou Wang & Shixiong Zhang & Yi Chang & Ka-Chun Wong & Xiangtao Li, 2023. "Topological identification and interpretation for single-cell gene regulation elucidation across multiple platforms using scMGCA," Nature Communications, Nature, vol. 14(1), pages 1-18, December.
    7. Yingxin Lin & Yue Cao & Elijah Willie & Ellis Patrick & Jean Y. H. Yang, 2023. "Atlas-scale single-cell multi-sample multi-condition data integration using scMerge2," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
