Author
Listed:
- Eric V Strobl
- Eric R Gamazon
Abstract
Root causal genes correspond to the first gene expression levels perturbed during pathogenesis by genetic or non-genetic factors. Targeting root causal genes has the potential to alleviate disease entirely by eliminating pathology near its onset. No existing algorithm has been designed to discover root causal genes from observational data alone. We therefore propose the Transcriptome-Wide Root Causal Inference (TWRCI) algorithm that identifies root causal genes and their causal graph using a combination of genetic variant and unperturbed bulk RNA sequencing data. TWRCI uses a novel competitive regression procedure to annotate cis and trans-genetic variants to the gene expression levels they directly cause. The algorithm simultaneously determines the sequence in which gene expression changes propagate through the system to pinpoint the underlying causal graph and estimate root causal effects. TWRCI outperforms alternative approaches across a diverse group of metrics by directly targeting root causal genes while accounting for distal relations, linkage disequilibrium, patient heterogeneity and widespread pleiotropy. We demonstrate the algorithm by uncovering the root causal mechanisms of two complex diseases, which we confirm by replication using independent genome-wide summary statistics.
Author summary: Many diseases progress through causal chains. The earliest step detectable in gene expression is a small set of root causal genes: expression levels that change first after genetic or non-genetic triggers. Because gene expression is relatively easy to perturb, focusing on these early changes offers a tractable route to stopping disease with a sparse set of interventions. Yet most existing tools either require expensive perturbation screens or fail to distinguish true early causes from downstream consequences.
Transcriptome-Wide Root Causal Inference (TWRCI) uses widely available genotype data and bulk RNA-seq to identify these first expression events and quantify their patient-specific effects. TWRCI assigns each genetic variant to the single target it most directly influences—either a gene or the disease outcome—via a head-to-head prediction test, reconstructs the causal chain among genes, and estimates each gene’s patient-specific root causal effect, integrating genetic and non-genetic drivers into an interpretable effect size. In simulations and two diseases, TWRCI outperformed alternatives, recovered compact sets of early-acting genes consistent with known biology, detected variants that act directly on disease outside expression, and replicated across cohorts. Most variation in root causal effects was non-genetic, pointing to environmental triggers.
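The idea of assigning each variant to a single target through a head-to-head comparison can be illustrated with a toy sketch. This is not the TWRCI competitive regression procedure, whose details are given in the paper; it is a much simpler stand-in that assumes a linear chain (gene 1 -> gene 2 -> outcome) with unit effect sizes and assigns each variant to the target with which it has the largest squared correlation. The function name `assign_variants`, the simulated data, and all parameter values are hypothetical. The heuristic can fail when effects are amplified downstream, which is one reason a more careful procedure is needed in practice.

```python
import numpy as np


def assign_variants(genotypes, targets):
    """Toy head-to-head assignment (NOT the TWRCI procedure).

    For each variant (column of `genotypes`), compute the squared Pearson
    correlation with every target (column of `targets`) and pick the
    target with the largest value.
    """
    G = (genotypes - genotypes.mean(axis=0)) / genotypes.std(axis=0)
    T = (targets - targets.mean(axis=0)) / targets.std(axis=0)
    r2 = (G.T @ T / len(G)) ** 2  # shape: (n_variants, n_targets)
    return r2.argmax(axis=1), r2


# Simulated data under an assumed linear chain: x1 -> x2 -> y.
rng = np.random.default_rng(0)
n = 5000
S = rng.binomial(2, 0.3, size=(n, 3)).astype(float)  # genotypes coded 0/1/2
S = (S - S.mean(axis=0)) / S.std(axis=0)             # standardize each variant

x1 = S[:, 0] + rng.normal(size=n)        # variant 0 acts directly on gene 1
x2 = x1 + S[:, 1] + rng.normal(size=n)   # variant 1 acts directly on gene 2
y = x2 + S[:, 2] + rng.normal(size=n)    # variant 2 acts directly on the outcome

targets = np.column_stack([x1, x2, y])
assignment, r2 = assign_variants(S, targets)
print(assignment)  # index of the winning target for each variant
```

In this simulation each variant's association weakens as the signal passes through additional noisy steps, so the strongest association falls on the direct target. Real data offer no such guarantee, since linkage disequilibrium and pleiotropy can blur the comparison.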
Suggested Citation
Eric V Strobl & Eric R Gamazon, 2025.
"Transcriptome-wide root causal inference,"
PLOS Computational Biology, Public Library of Science, vol. 21(9), pages 1-35, September.
Handle:
RePEc:plo:pcbi00:1013461
DOI: 10.1371/journal.pcbi.1013461