Author
Listed:
- Bradley Mason
- Laura Justham
- Liam Whitby
- Alison Whitby
- Stuart Scott
- Samuel Nti
- Jon Petzing
Abstract
Flow cytometry (FC) is essential for the precise quantification and characterisation of individual cell populations in a larger heterogeneous cell suspension. FC analysis provides a foundation for advanced clinical diagnostics and is a key component in many life-saving therapeutic strategies across a broad range of medical conditions. However, clinical, industrial and research laboratories alike face significant challenges in validating the metrological and biological accuracy of FC data analysis, due to the inherently relative nature of FC data and the lack of a definitive ‘ground truth’ associated with processed biological samples. This study focuses on generating realistic, fully synthetic flow cytometry cell clusters and demonstrating their suitability as substitutes for traditional FC data. The model-based heritage of synthetic data makes it possible to generate distributionally equivalent replicate datasets with explicit knowledge of cluster membership for each individual datapoint, thereby reducing the uncertainty associated with real cluster data and its analysis. This research uses meticulously optimised synthetic cluster-generating benchmarking software to simulate real monocyte clusters. A central component of the protocol is the ‘Rosetta-Routine’, a novel codebase which deciphers the statistical properties of real data and translates them into the computational coefficients required to generate accurate cluster-based synthetic replicates. This approach ensures that the synthetic datasets faithfully represent the statistical characteristics of real-world data while retaining the benefits of computational traceability. It addresses a critical gap in current practices by providing a controlled and reproducible validation framework for assessing clustering methods applied to FC data. These features make it possible to score, and subsequently enhance, analysis confidence in many FC applications, such as diagnostics or ‘mock-up’ training scenarios. Future synthetic-data-driven enhancements in FC analysis confidence will translate into more accurate clinical decision-making and subsequent overall improvements in patient care.
Author summary: In this study, we introduce a new method for generating realistic synthetic flow cytometry cell clusters. We utilise a series of robust algorithms and modelling functions to translate data properties from real sample cell clusters into a cluster generator, which computationally generates a synthetic replica of the real cluster. The approach demonstrates statistical and visual similarities between the synthetic and original real complex biological clusters, with close alignment in the forward and side scatter graph axes, which are often crucial for initial cell characterisation in flow cytometry. These results show promising future applications in flow cytometry analysis, with the end goal of facilitating more consistent diagnostics in clinical and industrial settings. Our paper contributes to the field by offering a unique method for generating synthetic data which statistically mirrors real-world measurements, thereby providing novel opportunities to evaluate manual and automated cluster analysis methodologies.
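The general idea of reading statistical properties from a real cluster and feeding them to a generator can be illustrated with a minimal sketch. This is not the Rosetta-Routine or the authors' benchmarking software: it assumes, purely for illustration, that a cluster is adequately described by a mean vector and covariance matrix (a Gaussian model), and the function names `fit_cluster` and `generate_synthetic`, along with the stand-in "real" forward/side scatter values, are hypothetical.

```python
import numpy as np

def fit_cluster(points):
    """Estimate the mean vector and covariance matrix of one cluster.

    Assumption: a Gaussian summary is used here only as an illustration;
    the published method may use different statistical properties.
    """
    points = np.asarray(points, dtype=float)
    return points.mean(axis=0), np.cov(points, rowvar=False)

def generate_synthetic(params, sizes, seed=0):
    """Draw synthetic points per cluster and record the known membership label."""
    rng = np.random.default_rng(seed)
    data, labels = [], []
    for label, ((mean, cov), n) in enumerate(zip(params, sizes)):
        data.append(rng.multivariate_normal(mean, cov, size=n))
        labels.append(np.full(n, label))
    return np.vstack(data), np.concatenate(labels)

# Hypothetical stand-in for a measured cluster (forward scatter, side scatter).
stand_in_rng = np.random.default_rng(42)
real = stand_in_rng.multivariate_normal(
    [50_000, 20_000], [[4e7, 1e7], [1e7, 2e7]], size=5000
)

params = [fit_cluster(real)]
synthetic, labels = generate_synthetic(params, sizes=[len(real)])
```

Because every synthetic point carries its generating label, the output of a clustering method run on `synthetic` can be scored directly against `labels`, which is the kind of ground truth that real FC samples lack.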
Suggested Citation
Bradley Mason & Laura Justham & Liam Whitby & Alison Whitby & Stuart Scott & Samuel Nti & Jon Petzing, 2026.
"Fully synthetic replication of complex real biological cell clusters using a novel cluster-based ‘Rosetta-Routine’ computational modelling process,"
PLOS Computational Biology, Public Library of Science, vol. 22(5), pages 1-30, May.
Handle:
RePEc:plo:pcbi00:1014280
DOI: 10.1371/journal.pcbi.1014280