Author
Listed:
- Nathan Fortier
- Gabe Rudy
- Andreas Scherer
Abstract
SpliceAI is the leading tool for predicting splice-altering variants, but restrictive licensing limits clinical adoption. While open-source implementations have been published with author-reported comparisons, independent benchmarking across diverse datasets is needed to establish equivalence. We compared the original SpliceAI with two open-source implementations (OpenSpliceAI and CI-SpliceAI) and a baseline ensemble of four legacy splice-prediction tools across six datasets: a curated set of 1,316 validated variants, 213 variants with splice-assay data, 99,601 variants from the SPiP splicing prediction study, 242 manually curated deep intronic pathogenic variants, and two ClinVar-derived datasets comprising 53,600 intronic variants and 58,064 variants spanning all genomic contexts. Across all datasets, the deep learning algorithms outperformed the legacy ensemble. All three deep learning algorithms showed similar performance on the larger datasets dominated by canonical splice site variants (balanced accuracies 0.889-0.977). On the deep intronic benchmark, the original SpliceAI achieved the highest balanced accuracy (0.940), outperforming both CI-SpliceAI (0.890) and OpenSpliceAI (0.841). Critically, optimal thresholds for deep intronic variants were an order of magnitude lower than standard recommendations, indicating that default thresholds would miss the majority of pathogenic deep intronic variants. A correlation analysis showed that CI-SpliceAI maintained balanced concordance across event types, whereas OpenSpliceAI showed stronger correlation for loss events than gain events. Both implementations showed high positional agreement with SpliceAI, with exact splice-site match rates exceeding 90% across event types.
Together, these results demonstrate that both open-source reimplementations of SpliceAI successfully reproduce the predictive behavior of the original algorithm across multiple evaluation contexts, while consistently outperforming traditional splice prediction methods. However, performance diverges on deep intronic variants, and standard score thresholds are poorly calibrated for this variant class regardless of algorithm choice.
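The abstract reports balanced accuracy and notes that the best-performing score threshold for deep intronic variants is far below the usual default. The Python sketch below illustrates both ideas under stated assumptions: balanced accuracy is taken as the mean of sensitivity and specificity, and the "optimal" threshold is taken as the candidate score that maximizes it. The abstract does not describe the authors' actual selection procedure, so this is only one plausible reading. The function names and the toy labels and scores are invented for illustration and are not data from the paper.

```python
from typing import Sequence, Tuple


def balanced_accuracy(labels: Sequence[int], scores: Sequence[float],
                      threshold: float) -> float:
    """Mean of sensitivity and specificity, calling score >= threshold positive."""
    tp = fn = tn = fp = 0
    for label, score in zip(labels, scores):
        predicted_positive = score >= threshold
        if label and predicted_positive:
            tp += 1
        elif label:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    # Guard against a class being absent from the data.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return (sensitivity + specificity) / 2


def best_threshold(labels: Sequence[int],
                   scores: Sequence[float]) -> Tuple[float, float]:
    """Return (threshold, balanced accuracy) maximizing balanced accuracy.

    Assumption: every observed score is tried as a candidate cutoff;
    ties are resolved in favor of the lowest threshold.
    """
    candidates = sorted(set(scores))
    return max(((t, balanced_accuracy(labels, scores, t)) for t in candidates),
               key=lambda pair: pair[1])


# Hypothetical toy data: 1 = pathogenic, 0 = benign. Not from the paper.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.60, 0.08, 0.05, 0.02, 0.01, 0.00]

default_cutoff = 0.5  # an illustrative "standard" cutoff, assumed here
print(balanced_accuracy(labels, scores, default_cutoff))
print(best_threshold(labels, scores))
```

In this toy example the illustrative default cutoff detects only one of the three pathogenic variants, while a cutoff ten times lower separates the two classes. This mirrors, in miniature, the kind of miscalibration the abstract describes for deep intronic variants.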
Suggested Citation
Nathan Fortier & Gabe Rudy & Andreas Scherer, 2026.
"Analyzing the performance of deep learning splice prediction algorithms,"
PLOS ONE, Public Library of Science, vol. 21(5), pages 1-16, May.
Handle:
RePEc:plo:pone00:0348885
DOI: 10.1371/journal.pone.0348885