Author
Listed:
- Safiye Celik
- Jan-Christian Hütter
- Sandra Melo Carlos
- Nathan H Lazar
- Rahul Mohan
- Conor Tillinghast
- Tommaso Biancalani
- Marta M Fay
- Berton A Earnshaw
- Imran S Haque
Abstract
The continued scaling of genetic perturbation technologies combined with high-dimensional assays such as cellular microscopy and RNA-sequencing has enabled genome-scale reverse-genetics experiments that go beyond single-endpoint measurements of growth or lethality. Datasets emerging from these experiments can be combined to construct perturbative “maps of biology”, in which readouts from various manipulations (e.g., CRISPR-Cas9 knockout, CRISPRi knockdown, compound treatment) are placed in unified, relatable embedding spaces allowing for the generation of genome-scale sets of pairwise comparisons. These maps of biology capture known biological relationships and uncover new associations which can be used for downstream discovery tasks. Construction of these maps involves many technical choices in both experimental and computational protocols, motivating the design of benchmark procedures to evaluate map quality in a systematic, unbiased manner. Here, we (1) establish a standardized terminology for the steps involved in perturbative map building, (2) introduce key classes of benchmarks to assess the quality of such maps, (3) construct 18 maps from four genome-scale datasets employing different cell types, perturbation technologies, and data readout modalities, (4) generate benchmark metrics for the constructed maps and investigate the reasons for performance variations, and (5) demonstrate utility of these maps to discover new biology by suggesting roles for two largely uncharacterized genes.
Author summary: Due to the rapid advancements in genetic perturbation, laboratory robotics, sequencing, and computer vision, more researchers are now generating datasets that capture cellular responses to genetic perturbations. These datasets can be powerful discovery tools for examining known biological relationships and revealing new associations in an unbiased manner when paired with a computational pipeline that can assemble the data into a digestible format. However, the challenge arises from the variety of cellular models, assay designs, terminologies, codebases, and analysis methods involved. In this work we define a unified framework for building and benchmarking perturbative maps, benchmark four different datasets assembled into 18 different maps, explore the impact of different design decisions, and demonstrate how these maps can be used to elucidate gene functions. Our goal is to facilitate comparisons across various technologies and methods by introducing a shared language for the field. The open-source codebase, capable of incorporating new methods, aims to be a resource for researchers developing laboratory or computational methodology. While we caution against definitive recommendations due to numerous variables at play, we hope to stimulate studies directly comparing methods under controlled conditions. Our framework can also help evaluate combining maps across modalities as the field progresses.
Suggested Citation
Safiye Celik & Jan-Christian Hütter & Sandra Melo Carlos & Nathan H Lazar & Rahul Mohan & Conor Tillinghast & Tommaso Biancalani & Marta M Fay & Berton A Earnshaw & Imran S Haque, 2024.
"Building, benchmarking, and exploring perturbative maps of transcriptional and morphological data,"
PLOS Computational Biology, Public Library of Science, vol. 20(10), pages 1-24, October.
Handle:
RePEc:plo:pcbi00:1012463
DOI: 10.1371/journal.pcbi.1012463