
D3Impute: Dropout-aware discrimination, distribution-aware modeling, and density-guide imputation for scRNA-seq data

Author

Listed:
  • Siyi Huang
  • Linfeng Jiang
  • Ming Yi
  • Yuan Zhu

Abstract

Single-cell RNA sequencing (scRNA-seq) has revolutionized the study of cellular heterogeneity. A major challenge, however, lies in the prevalence of non-biological zeros: false measurements caused by technical limitations that mask a cell’s true transcriptome. Distinguishing these artifacts from true biological zeros, where a gene is genuinely absent, remains a key hurdle for computational methods, as misclassification can distort biological signals during data recovery. To overcome this, we introduce D3Impute, a discriminative imputation framework built on three key innovations: (1) a distribution-aware normalization step that adapts to dataset-specific characteristics while preserving meaningful biological variation; (2) a dual-network discriminator that uses bulk RNA-seq data as a biological reference to accurately identify non-biological zeros while retaining the true biological zeros; and (3) a density-guided imputation engine that recovers expression values while maintaining local cellular neighborhood structures. Through comprehensive benchmarking against 12 state-of-the-art methods across six diverse datasets, D3Impute demonstrates consistent and significant improvements in essential downstream analyses, including cell clustering, trajectory inference, and differential expression detection. Furthermore, we provide an extensive practical evaluation of D3Impute, demonstrating its robustness across varying data qualities and providing clear guidelines for optimal application. By offering a robust, biologically informed, and user-oriented solution, D3Impute not only enhances scRNA-seq data analysis but also offers a generalizable framework for handling zero-inflated data in computational biology.

Author summary

Single-cell RNA sequencing (scRNA-seq) reveals cellular heterogeneity but is compromised by technical “dropout” events, non-biological zeros that obscure true expression patterns. To address this, we developed D3Impute, a computational framework built on three core innovations: (1) Distribution-aware modeling adapts normalization to each cell’s statistical properties, moving beyond one-size-fits-all approaches; (2) Dropout-aware discrimination integrates cell–cell networks from scRNA-seq data with gene co-expression networks from bulk RNA-seq to accurately identify non-biological zeros; (3) Density-guided imputation employs a neighborhood-preserving algorithm with dynamic weighting to recover missing values while preventing over-smoothing and retaining meaningful cellular heterogeneity. Together, these components form a principled and interpretable framework that significantly enhances the accuracy of scRNA-seq data analysis.
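
The abstract does not give the algorithmic details of the imputation step, so the following is only a minimal sketch of the general idea of neighborhood-based imputation restricted to flagged zeros. It is not the D3Impute implementation. The function name `impute_flagged_zeros`, the Gaussian kernel, the use of the k-th neighbor distance as a local bandwidth, and the assumption that a boolean `dropout_mask` is already available from some discrimination step are all illustrative choices made here.

```python
import numpy as np

def impute_flagged_zeros(X, dropout_mask, k=3):
    """Fill flagged zeros with a distance-weighted average over nearby cells.

    X            : (cells x genes) array of normalized expression values.
    dropout_mask : boolean array of the same shape; True marks a zero that
                   some upstream step (assumed, not shown) judged technical.
    k            : number of neighboring cells to average over (hypothetical default).
    """
    X = np.asarray(X, dtype=float)
    dropout_mask = np.asarray(dropout_mask, dtype=bool)
    n_cells = X.shape[0]
    k = min(k, n_cells - 1)  # a cell cannot be its own neighbor

    # Pairwise Euclidean distances between cells.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # exclude each cell from its own neighborhood

    imputed = X.copy()
    for i in range(n_cells):
        genes = np.where(dropout_mask[i])[0]
        if genes.size == 0:
            continue  # nothing flagged in this cell; leave it untouched
        nbrs = np.argsort(dist[i])[:k]
        # Local bandwidth: distance to the k-th neighbor, so cells in dense
        # regions use a narrow kernel and cells in sparse regions a wide one.
        sigma = max(dist[i, nbrs[-1]], 1e-12)
        w = np.exp(-(dist[i, nbrs] / sigma) ** 2)
        w /= w.sum()
        # Only flagged entries are overwritten; unflagged zeros stay zero.
        imputed[i, genes] = w @ X[nbrs][:, genes]
    return imputed

# Toy usage: cell 0 has one flagged zero; cell 3 has unflagged (biological) zeros.
X = np.array([[5.0, 0.0], [5.0, 4.0], [5.0, 4.0], [0.0, 0.0]])
mask = np.zeros_like(X, dtype=bool)
mask[0, 1] = True
result = impute_flagged_zeros(X, mask, k=2)
print(result)
```

In this sketch, zeros that are not flagged are never modified, which mirrors the stated goal of retaining true biological zeros. How the flags are produced and how the weights are actually computed in D3Impute are described in the full article, not here.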

Suggested Citation

  • Siyi Huang & Linfeng Jiang & Ming Yi & Yuan Zhu, 2025. "D3Impute: Dropout-aware discrimination, distribution-aware modeling, and density-guide imputation for scRNA-seq data," PLOS Computational Biology, Public Library of Science, vol. 21(12), pages 1-38, December.
  • Handle: RePEc:plo:pcbi00:1013744
    DOI: 10.1371/journal.pcbi.1013744

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1013744
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1013744&type=printable
    Download Restriction: no

