
Multiplets in scRNA-seq data: Extent of the problem and efficacy of methods for removal

Author

Listed:
  • Dimitris Ttoouli
  • Daniel Hoffmann

Abstract

Multiplets, droplets that capture more than one cell, are a known artefact in droplet-based single-cell RNA sequencing (scRNA-seq), yet their prevalence and impact remain underestimated. In this study, we assess the frequency of multiplets across diverse publicly available datasets and evaluate how well commonly used detection tools identify them. Using cell hashing data to determine a lower bound of the true multiplet rate, we demonstrate that commonly used heuristic estimations systematically underestimate multiplet rates, and that existing tools, even with optimized parameters, detect only a small subset of cell-hashing multiplets. We further refine a Poisson-based model to estimate the true multiplet rate, revealing that actual rates can exceed heuristic predictions by more than twofold. Downstream analyses are significantly affected by multiplets: they are not confined to isolated clusters but are distributed throughout the transcriptional landscape, where they distort clustering and cell type annotation. In differential gene expression analysis, multiplets inflated artefactual signals while expected cell-type markers remained stable, leading to shifts in effect sizes and partial loss of significant genes despite high overall fold-change correlation. Using both quantitative and qualitative approaches, we visualize these effects and show that cell-hashing-informed multiplet removal eliminates artefactual clusters and improves annotation clarity, whereas removing only computationally detected multiplets fails to fully eliminate artefacts in the most common experimental contexts. Our findings confirm that multiplet contamination remains a pervasive and under-addressed issue in scRNA-seq analysis. Since most datasets lack multiplexing, researchers must often rely on heuristics and limited tools, leaving many multiplets unidentified. We advocate for more robust multiplet-detection strategies, including multimodal validation, to ensure more accurate and interpretable scRNA-seq results.
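The abstract names two quantities without giving formulas: a Poisson-based multiplet rate and a cell-hashing lower bound. The sketch below illustrates the textbook versions of both ideas and is not the authors' refined model. It assumes cells load into droplets as a Poisson process with mean `lam`, that all multiplets are doublets, and that cells from different hashed samples pair at random. All function and parameter names are hypothetical.

```python
import math


def multiplet_fraction(lam: float) -> float:
    """Fraction of cell-containing droplets that hold two or more cells.

    Assumes the number of cells per droplet is Poisson(lam). This is the
    standard droplet-loading model, not the refined model from the paper.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    # P(exactly 1 cell | at least 1 cell) = lam * e^-lam / (1 - e^-lam)
    #                                     = lam / (e^lam - 1)
    singlet_fraction = lam / math.expm1(lam)
    return 1.0 - singlet_fraction


def hashing_detectable_fraction(sample_props) -> float:
    """Probability that a random doublet joins cells from two different samples.

    Cell hashing can only flag cross-sample doublets, so same-sample
    doublets stay invisible. Assumes random pairing and doublets only.
    """
    total = sum(sample_props)
    return 1.0 - sum((p / total) ** 2 for p in sample_props)


def corrected_doublet_rate(observed_rate: float, sample_props) -> float:
    """Scale the hashing-observed doublet rate up to include same-sample doublets."""
    return observed_rate / hashing_detectable_fraction(sample_props)
```

Under these assumptions, four equally sized hashed samples expose three quarters of all doublets, which is why a hashing-derived rate is a lower bound on the true rate rather than an estimate of it.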

Suggested Citation

  • Dimitris Ttoouli & Daniel Hoffmann, 2025. "Multiplets in scRNA-seq data: Extent of the problem and efficacy of methods for removal," PLOS ONE, Public Library of Science, vol. 20(10), pages 1-24, October.
  • Handle: RePEc:plo:pone00:0333687
    DOI: 10.1371/journal.pone.0333687

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0333687
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0333687&type=printable
    Download Restriction: no

