Authors
- Dominik Witczak
- Krzysztof Sychla
- Julia Wysocka
- Artur Laskowski
- Wojciech Frohmberg
- Marta Glowacka
- Alicja Dzik
- Piotr Lukasiak
- Jacek Blazewicz
- Aleksandra Swiercz
Abstract
Genetic diversity is crucial for populations to adapt and survive in dynamic environments. This diversity arises from genetic mutations, which manifest in the genome as structural variants (SVs). Several types of SVs exist, but not all are equally easy to detect. Current SV detection tools tend to specialize in certain SV types or require the use of multiple tools to obtain a comprehensive variant profile, which increases computational cost and complexity. While some methods excel at identifying breakpoints, they often struggle with accurately classifying variant types, and their precision depends strongly on data quality and sequencing technology. At present, the majority of available genomic data originates from high-quality short reads, which remain the most affordable sequencing technology. In this manuscript, we introduce GrassSV, a novel and computationally efficient method that employs a hybrid pattern-matching approach to detect all major classes of structural variants using short-read sequencing data. GrassSV integrates depth-of-coverage analysis with contig-based pattern recognition to ensure both sensitivity and precision while minimizing false positives and runtime. Its robustness was demonstrated on the human Genome in a Bottle dataset, as well as on synthetic data derived from the yeast genome, where it achieved high accuracy across all SV types at a lower computational cost compared to existing methods. This makes GrassSV a practical alternative to multi-tool pipelines typically required for comprehensive SV detection. GrassSV is available at https://github.com/Domomod/GrassSV under the GPL-3.0 license, and the benchmark is available at https://github.com/Domomod/GrassBenchmark.
Author summary
Structural variants (SVs) are large genomic alterations that can profoundly influence gene function, regulation, and phenotype. Despite their biological importance, accurately detecting SVs from sequencing data remains a major computational challenge. Existing tools are often optimized for specific types of variants or rely on multiple algorithms to achieve full coverage, which increases computational cost and complexity. In this study, we present GrassSV, a hybrid approach for structural variant detection using short-read sequencing data. Our method combines coverage-based analysis with pattern recognition from assembled contigs, enabling comprehensive identification of deletions, insertions, inversions, duplications, and translocations within a single pipeline. We developed GrassSV to provide researchers with a practical, accurate, and efficient tool for large-scale genome analyses. We evaluated GrassSV on both synthetic and real datasets, showing that it detects all major SV types with high precision while reducing false positives and runtime compared to existing methods. By balancing accuracy and efficiency, GrassSV offers a cost-effective solution for genomic research and supports ongoing efforts to understand genetic variability in human and model organism populations.
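As a rough illustration of what a depth-of-coverage analysis can look like in general, the sketch below scans a per-base depth array in fixed windows and flags windows whose mean depth is far below or above the median. This is not GrassSV's implementation: the function name `flag_coverage_windows`, the window size, and the 0.5x and 1.5x thresholds are all hypothetical choices made for this example, and the abstract does not describe how GrassSV performs this step.

```python
from statistics import median


def flag_coverage_windows(depth, window=100, low=0.5, high=1.5):
    """Flag windows whose mean depth deviates strongly from the median depth.

    Hypothetical helper for illustration only; thresholds are assumptions.
    Returns a list of (start, end, label) tuples with half-open coordinates.
    """
    if not depth:
        return []
    baseline = median(depth)
    calls = []
    for start in range(0, len(depth), window):
        chunk = depth[start:start + window]
        mean = sum(chunk) / len(chunk)
        if mean < low * baseline:
            label = "candidate_deletion"
        elif mean > high * baseline:
            label = "candidate_duplication"
        else:
            continue
        end = start + len(chunk)
        # Merge with the previous call when it is adjacent and has the same label.
        if calls and calls[-1][1] == start and calls[-1][2] == label:
            calls[-1] = (calls[-1][0], end, label)
        else:
            calls.append((start, end, label))
    return calls


# Toy depth profile: a 200 bp drop in coverage and a 100 bp spike.
toy_depth = [30] * 300 + [2] * 200 + [30] * 300 + [65] * 100 + [30] * 100
toy_calls = flag_coverage_windows(toy_depth)
```

On its own, a coverage signal of this kind only suggests copy-number changes such as deletions and duplications; it cannot place breakpoints precisely or reveal balanced events such as inversions, which is one reason a method might pair it with contig-based evidence.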
Suggested Citation
Dominik Witczak & Krzysztof Sychla & Julia Wysocka & Artur Laskowski & Wojciech Frohmberg & Marta Glowacka & Alicja Dzik & Piotr Lukasiak & Jacek Blazewicz & Aleksandra Swiercz, 2026.
"GrassSV – hybrid method to detect structural variants in high throughput DNA-seq data,"
PLOS Computational Biology, Public Library of Science, vol. 22(6), pages 1-14, June.
Handle:
RePEc:plo:pcbi00:1014406
DOI: 10.1371/journal.pcbi.1014406