Author
Listed:
- Tianjiao Zhang
- Haibin Zhu
Abstract
Single-cell RNA sequencing (scRNA-seq) deciphers cell type-specific co-expression networks to resolve biological functions but remains constrained by data sparsity and compositional biases. Conventional metacell construction strategies mitigate sparsity by aggregating transcriptionally similar cells but often neglect the systematic biases introduced by compositional data. This neglect leads to spurious co-expression correlations and obscures biologically meaningful interactions. Through mathematical modeling and simulations, we demonstrate that uncontrolled library size variance in traditional metacells inflates false-positive correlations and distorts co-expression networks. Here, we present LSMetacell (Library Size-stabilized Metacells), a computational framework that explicitly stabilizes library sizes across metacells during aggregation, reducing compositional noise while preserving cellular heterogeneity and thereby enhancing the accuracy of downstream analyses such as Weighted Gene Co-expression Network Analysis (WGCNA). Applied to a postmortem Alzheimer’s disease brain scRNA-seq dataset, LSMetacell revealed robust, cell type-specific co-expression modules enriched for disease-relevant pathways, outperforming the conventional metacell approach. Our work establishes a principled strategy for resolving compositional biases in scRNA-seq data, advancing the reliability of co-expression network inference in studying complex biological systems. This framework provides a generalizable solution for improving transcriptional analyses in single-cell studies.
Author summary: Gene co-expression analysis is a widely used method to infer functional relationships between genes by measuring correlations in their normalized expression levels. In this paper, through mathematical modeling and simulations, we demonstrate that these correlations are systematically skewed, particularly by biases arising from variability in sequencing depth (library size). This issue distorts co-expression analysis results, inflating false correlations and masking true biological interactions. Traditional methods fail to address library size biases in single-cell studies, where data sparsity compounds these challenges. We introduce LSMetacell, a computational framework that simultaneously tackles single-cell data sparsity and corrects for library size-induced correlation biases. By constructing metacells with stabilized sequencing depths, our method reduces technical noise while preserving biological heterogeneity. Applied to Alzheimer’s disease brain data, LSMetacell uncovered microglia-specific co-expression networks linking immune dysregulation to neurodegeneration. Our work provides a dual solution: enhancing single-cell resolution through cell aggregation and mitigating systematic biases that plague co-expression studies. LSMetacell integrates technical approaches with biological analysis, enabling researchers to extract precise and reproducible findings from compositional data.
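The claim that library size variance inflates correlations can be illustrated with a toy simulation. The sketch below is not the authors' model or the LSMetacell algorithm. It assumes independent Poisson counts with identical gene proportions in every cell (so there is no true co-expression), a log1p counts-per-10,000 normalization, and hypothetical depth ranges and helper names (`simulate`, `mean_offdiag_corr`). Under these assumptions, any non-zero average gene-gene correlation is attributable to sequencing depth alone.

```python
import numpy as np


def mean_offdiag_corr(x):
    """Mean pairwise Pearson correlation between the columns (genes) of x."""
    c = np.corrcoef(x, rowvar=False)
    n = c.shape[0]
    return (c.sum() - np.trace(c)) / (n * (n - 1))


def simulate(depths, n_genes=2000, n_keep=100, scale=1e4, seed=0):
    """Simulate sparse counts with no true co-expression and return the
    mean gene-gene correlation after log-normalization.

    Assumptions (illustrative only): every gene has the same expected
    proportion in every cell, and counts are independent Poisson draws.
    """
    rng = np.random.default_rng(seed)
    p = np.full(n_genes, 1.0 / n_genes)
    # Expected count of a gene in a cell = cell depth * gene proportion.
    counts = rng.poisson(depths[:, None] * p[None, :])
    # Normalize by the observed library size, then log-transform.
    lib = counts.sum(axis=1, keepdims=True)
    lognorm = np.log1p(counts / lib * scale)
    # Correlate a subset of genes to keep the matrix small.
    return mean_offdiag_corr(lognorm[:, :n_keep])


n_cells = 2000
rng = np.random.default_rng(1)

# Library sizes spread over one order of magnitude (log-uniform).
variable_depths = np.exp(rng.uniform(np.log(500), np.log(5000), n_cells))
# Library sizes held constant across cells.
fixed_depths = np.full(n_cells, 2000.0)

r_variable = simulate(variable_depths)
r_fixed = simulate(fixed_depths)

print(f"mean correlation, variable library size: {r_variable:.3f}")
print(f"mean correlation, fixed library size:    {r_fixed:.3f}")
```

With variable depths, the mean correlation comes out clearly positive even though the genes are independent by construction, while with a fixed depth it stays close to zero. The effect arises because the expected log-normalized value of a sparse gene depends on the cell's library size, so all genes share depth as a hidden common factor.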
Suggested Citation
Tianjiao Zhang & Haibin Zhu, 2025.
"Library size-stabilized metacells construction enhances co-expression network analysis in single-cell data,"
PLOS Computational Biology, Public Library of Science, vol. 21(11), pages 1-16, November.
Handle:
RePEc:plo:pcbi00:1013697
DOI: 10.1371/journal.pcbi.1013697