Printed from https://ideas.repec.org/a/plo/pcbi00/1013697.html

Library size-stabilized metacells construction enhances co-expression network analysis in single-cell data

Author

Listed:
  • Tianjiao Zhang
  • Haibin Zhu

Abstract

Single-cell RNA sequencing (scRNA-seq) makes it possible to infer cell type-specific co-expression networks and thereby resolve biological functions, but such analyses remain constrained by data sparsity and compositional biases. Conventional metacell construction strategies mitigate sparsity by aggregating transcriptionally similar cells but often neglect the systematic biases introduced by compositional data. These biases produce spurious co-expression correlations and obscure biologically meaningful interactions. Through mathematical modeling and simulations, we demonstrate that uncontrolled library size variance in traditional metacells inflates false-positive correlations and distorts co-expression networks. Here, we present LSMetacell (Library Size-stabilized Metacells), a computational framework that explicitly stabilizes library sizes across metacells during aggregation. This reduces compositional noise while preserving cellular heterogeneity, and thereby enhances the accuracy of downstream analyses such as Weighted Gene Co-expression Network Analysis (WGCNA). Applied to a postmortem Alzheimer’s disease brain scRNA-seq dataset, LSMetacell revealed robust, cell type-specific co-expression modules enriched for disease-relevant pathways, outperforming the conventional metacell approach. Our work establishes a principled strategy for resolving compositional biases in scRNA-seq data, advancing the reliability of co-expression network inference in the study of complex biological systems. This framework provides a generalizable solution for improving transcriptional analyses in single-cell studies.

Author summary: Gene co-expression analysis is a widely used method for inferring functional relationships between genes by measuring correlations in their normalized expression levels. Through mathematical modeling and simulations, however, we demonstrate that these correlations are systematically skewed, particularly by variability in sequencing depth (library size). This bias distorts co-expression analysis results, inflating false correlations and masking true biological interactions. Traditional methods fail to address library size biases in single-cell studies, where data sparsity compounds the problem. We introduce LSMetacell, a computational framework that simultaneously tackles single-cell data sparsity and corrects for library size-induced correlation biases. By constructing metacells with stabilized sequencing depths, our method reduces technical noise while preserving biological heterogeneity. Applied to Alzheimer’s disease brain data, LSMetacell uncovered microglia-specific co-expression networks linking immune dysregulation to neurodegeneration. Our work provides a dual solution: it mitigates sparsity through cell aggregation, and it reduces the systematic biases that plague co-expression studies. LSMetacell integrates technical approaches with biological analysis, enabling researchers to extract precise and reproducible findings from compositional data.
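The central claim, that library size variance across metacells can inflate gene-gene correlations, can be illustrated with a toy simulation. The sketch below is not the LSMetacell algorithm and does not reproduce the paper's mathematical model: it assumes Poisson counts, a single gene composition shared by every cell (so genes are independent given depth), log-normally distributed cell depths, and correlations computed on raw aggregated counts rather than normalized expression. All function names and parameter values are hypothetical choices made for illustration. It compares metacells built from a fixed number of cells with metacells closed once they reach a target total depth.

```python
import numpy as np


def simulate_cells(n_cells=6000, n_genes=200, median_depth=250.0, seed=0):
    """Toy cells sharing one composition, so genes are independent given depth."""
    rng = np.random.default_rng(seed)
    props = rng.dirichlet(np.full(n_genes, 5.0))               # gene proportions
    depth = rng.lognormal(np.log(median_depth), 0.8, n_cells)  # per-cell depth
    return rng.poisson(depth[:, None] * props[None, :])


def fixed_size_metacells(counts, cells_per_metacell=20):
    """Sum consecutive groups containing a fixed number of cells."""
    n_meta = counts.shape[0] // cells_per_metacell
    trimmed = counts[: n_meta * cells_per_metacell]
    return trimmed.reshape(n_meta, cells_per_metacell, -1).sum(axis=1)


def depth_stabilized_metacells(counts, target_depth):
    """Add cells in order; close a metacell once it reaches target_depth counts."""
    metacells = []
    current = np.zeros(counts.shape[1], dtype=np.int64)
    for cell in counts:
        current = current + cell
        if current.sum() >= target_depth:
            metacells.append(current)
            current = np.zeros(counts.shape[1], dtype=np.int64)
    return np.array(metacells)  # the unfinished final group is dropped


def library_size_cv(metacells):
    """Coefficient of variation of metacell library sizes."""
    sizes = metacells.sum(axis=1)
    return sizes.std() / sizes.mean()


def mean_offdiag_corr(metacells):
    """Average Pearson correlation over all distinct gene pairs."""
    corr = np.corrcoef(metacells, rowvar=False)
    off_diag = ~np.eye(corr.shape[0], dtype=bool)
    return corr[off_diag].mean()


if __name__ == "__main__":
    cells = simulate_cells()
    fixed = fixed_size_metacells(cells)
    # Use the average fixed-metacell library size as the stabilization target.
    stabilized = depth_stabilized_metacells(cells, fixed.sum(axis=1).mean())
    for name, m in [("fixed cell count", fixed), ("stabilized depth", stabilized)]:
        print(f"{name}: library-size CV = {library_size_cv(m):.3f}, "
              f"mean gene-gene correlation = {mean_offdiag_corr(m):.3f}")
```

Because no true co-expression is simulated, any positive average correlation in this toy setting comes from shared depth variation alone. Under these assumptions the fixed-count metacells should show both the wider library size spread and the larger average correlation, while the depth-stabilized metacells keep a smaller residual correlation caused by overshooting the target when the last cell is added.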

Suggested Citation

  • Tianjiao Zhang & Haibin Zhu, 2025. "Library size-stabilized metacells construction enhances co-expression network analysis in single-cell data," PLOS Computational Biology, Public Library of Science, vol. 21(11), pages 1-16, November.
  • Handle: RePEc:plo:pcbi00:1013697
    DOI: 10.1371/journal.pcbi.1013697

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1013697
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1013697&type=printable
    Download Restriction: no