Author
Listed:
- Gustavo Magaña-López
- Laurence Calzone
- Andrei Zinovyev
- Loïc Paulevé
Abstract
Boolean networks are widely used to model the qualitative dynamics of cell fate processes by describing how the binary activation states of genes and transcription factors change over time. Being able to bridge such qualitative states with quantitative measurements of gene expression in cells, such as scRNA-seq, is a cornerstone of data-driven model construction and validation. On one hand, scRNA-seq binarisation is a key step for inferring and validating Boolean models. On the other hand, the generation of synthetic scRNA-seq data from baseline Boolean models provides an important asset for benchmarking inference methods. However, linking characteristics of scRNA-seq datasets, including dropout events, with Boolean states is a challenging task. We present scBoolSeq, a method for the bidirectional linking of scRNA-seq data and the Boolean activation states of genes. Given a reference scRNA-seq dataset, scBoolSeq computes statistical criteria to classify the empirical gene pseudocount distributions as either unimodal, bimodal, or zero-inflated, and fits a probabilistic model of dropouts with gene-dependent parameters. From these learnt distributions, scBoolSeq can both binarise scRNA-seq datasets and generate synthetic scRNA-seq datasets from Boolean traces, such as those produced by Boolean networks, using biased sampling and dropout simulation. We present a case study demonstrating the application of scBoolSeq’s binarisation scheme in data-driven model inference. Furthermore, we compare synthetic scRNA-seq data generated by scBoolSeq with data generated by BoolODE for the same Boolean network model. The comparison shows that our method better reproduces the statistics of real scRNA-seq datasets, such as the mean-variance and mean-dropout relationships, while exhibiting clearly defined trajectories in two-dimensional projections of the data.
Author summary: The qualitative and logical modelling of cell dynamics has brought valuable insight into the gene regulatory mechanisms that drive cellular differentiation and fate decisions, by predicting cellular trajectories and mutations for their control. However, the design and validation of these models are impeded by the quantitative nature of experimental measurements of cellular states. In this paper, we provide and assess a new methodology, scBoolSeq, for bridging single-cell level pseudocounts of RNA transcripts with a Boolean classification of gene activity levels. Our method, implemented as a Python package, enables both the binarisation of scRNA-seq data, in order to match quantitative measurements with states of logical models, and the generation of synthetic data from Boolean traces to benchmark inference methods. We show that scBoolSeq accurately captures the main statistical features of scRNA-seq data, including measurement dropouts, significantly improving on the state of the art. Overall, scBoolSeq provides a statistically grounded method for the inference and validation of qualitative models from scRNA-seq data.
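The two directions described in the abstract (binarising pseudocounts, and sampling synthetic pseudocounts from Boolean states with simulated dropouts) can be illustrated with a toy sketch. This is not scBoolSeq's implementation or API: the function names, the two-threshold rule, the Gaussian sampling, and the single fixed dropout rate are all assumptions made for illustration, whereas the method itself learns gene-dependent parameters from a reference dataset.

```python
import numpy as np


def binarise_gene(values, low, high):
    """Toy binarisation of one gene's pseudocounts (hypothetical rule).

    Values at or below `low` map to 0.0, values at or above `high` map to 1.0,
    and anything in between is left undetermined (NaN).
    """
    values = np.asarray(values, dtype=float)
    out = np.full(values.shape, np.nan)
    out[values <= low] = 0.0
    out[values >= high] = 1.0
    return out


def sample_from_boolean(states, rng, off_mean=1.0, on_mean=6.0, sd=0.5,
                        dropout_rate=0.2):
    """Toy synthetic pseudocounts from Boolean states (all parameters assumed).

    Active states are sampled around `on_mean`, inactive states around
    `off_mean`; a fixed fraction of values is then zeroed to mimic dropouts.
    """
    states = np.asarray(states, dtype=bool)
    means = np.where(states, on_mean, off_mean)
    # Pseudocounts cannot be negative, so clip the Gaussian draws at zero.
    values = np.clip(rng.normal(means, sd), 0.0, None)
    dropped = rng.random(states.shape) < dropout_rate
    values[dropped] = 0.0
    return values


rng = np.random.default_rng(0)
synthetic = sample_from_boolean([True, False, True, True], rng)
recovered = binarise_gene(synthetic, low=2.0, high=5.0)
```

Note that a dropped-out active gene is binarised as inactive in this sketch, which is one reason the abstract stresses modelling dropouts explicitly rather than thresholding alone.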
Suggested Citation
Gustavo Magaña-López & Laurence Calzone & Andrei Zinovyev & Loïc Paulevé, 2024.
"scBoolSeq: Linking scRNA-seq statistics and Boolean dynamics,"
PLOS Computational Biology, Public Library of Science, vol. 20(7), pages 1-25, July.
Handle:
RePEc:plo:pcbi00:1011620
DOI: 10.1371/journal.pcbi.1011620