Tradeoffs in alignment and assembly-based methods for structural variant detection with long-read sequencing data

Author

Listed:
  • Yichen Henry Liu

    (Vanderbilt University)

  • Can Luo

    (Vanderbilt University)

  • Staunton G. Golding

    (Vanderbilt University)

  • Jacob B. Ioffe

    (Vanderbilt University)

  • Xin Maizie Zhou

    (Vanderbilt University)

Abstract

Long-read sequencing offers long contiguous DNA fragments, facilitating diploid genome assembly and structural variant (SV) detection. Efficient and robust algorithms for SV identification are crucial with increasing data availability. Alignment-based methods, favored for their computational efficiency and lower coverage requirements, are prominent. Alternative approaches, relying solely on available reads for de novo genome assembly and employing assembly-based tools for SV detection via comparison to a reference genome, demand significantly more computational resources. However, the lack of comprehensive benchmarking constrains our comprehension and hampers further algorithm development. Here we systematically compare 14 read alignment-based SV calling methods (including 4 deep learning-based methods and 1 hybrid method), and 4 assembly-based SV calling methods, alongside 4 upstream aligners and 7 assemblers. Assembly-based tools excel in detecting large SVs, especially insertions, and exhibit robustness to evaluation parameter changes and coverage fluctuations. Conversely, alignment-based tools demonstrate superior genotyping accuracy at low sequencing coverage (5-10×) and excel in detecting complex SVs, like translocations, inversions, and duplications. Our evaluation provides performance insights, highlighting the absence of a universally superior tool. We furnish guidelines across 31 criteria combinations, aiding users in selecting the most suitable tools for diverse scenarios and offering directions for further method development.
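The abstract's remark about robustness to evaluation parameter changes can be illustrated with a minimal benchmarking sketch. It matches each called SV to a truth-set SV of the same type and then reports precision, recall, and F1. The names `SV`, `is_match`, `evaluate`, `max_dist`, and `min_size_ratio`, along with their default values, are hypothetical and chosen for illustration; they are not the evaluation tool or settings used in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    chrom: str
    pos: int      # start coordinate on the reference
    svtype: str   # e.g. "INS" or "DEL"
    length: int   # SV size in bp, assumed positive


def is_match(call, truth, max_dist=500, min_size_ratio=0.7):
    """Hypothetical matching rule: same chromosome and type, nearby
    breakpoints, and similar size. Both thresholds are illustrative."""
    if call.chrom != truth.chrom or call.svtype != truth.svtype:
        return False
    if abs(call.pos - truth.pos) > max_dist:
        return False
    ratio = min(call.length, truth.length) / max(call.length, truth.length)
    return ratio >= min_size_ratio


def evaluate(calls, truths, **params):
    """Greedy one-to-one matching; returns (precision, recall, F1)."""
    unmatched = list(truths)
    tp = 0
    for call in calls:
        for truth in unmatched:
            if is_match(call, truth, **params):
                unmatched.remove(truth)  # each truth SV is matched at most once
                tp += 1
                break
    fp = len(calls) - tp
    fn = len(unmatched)
    precision = tp / (tp + fp) if calls else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy data, invented for illustration only.
truths = [SV("chr1", 1000, "DEL", 300), SV("chr1", 5000, "INS", 1200)]
calls = [
    SV("chr1", 1040, "DEL", 280),
    SV("chr1", 5300, "INS", 1100),
    SV("chr2", 100, "DEL", 50),
]

print(evaluate(calls, truths))                # lenient breakpoint tolerance
print(evaluate(calls, truths, max_dist=100))  # stricter tolerance
```

In this toy data the insertion call sits 300 bp from its truth breakpoint, so tightening `max_dist` from 500 to 100 turns it from a true positive into a false positive and lowers all three metrics. A tool whose scores barely move under such changes is robust in the sense the abstract describes.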

Suggested Citation

  • Yichen Henry Liu & Can Luo & Staunton G. Golding & Jacob B. Ioffe & Xin Maizie Zhou, 2024. "Tradeoffs in alignment and assembly-based methods for structural variant detection with long-read sequencing data," Nature Communications, Nature, vol. 15(1), pages 1-22, December.
  • Handle: RePEc:nat:natcom:v:15:y:2024:i:1:d:10.1038_s41467-024-46614-z
    DOI: 10.1038/s41467-024-46614-z

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-024-46614-z
    File Function: Abstract
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yu Chen & Amy Y. Wang & Courtney A. Barkley & Yixin Zhang & Xinyang Zhao & Min Gao & Mick D. Edmonds & Zechen Chong, 2023. "Deciphering the exact breakpoints of structural variations using long sequencing reads with DeBreak," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    2. Xiaoling Tong & Min-Jin Han & Kunpeng Lu & Shuaishuai Tai & Shubo Liang & Yucheng Liu & Hai Hu & Jianghong Shen & Anxing Long & Chengyu Zhan & Xin Ding & Shuo Liu & Qiang Gao & Bili Zhang & Linli Zhou, 2022. "High-resolution silkworm pan-genome provides genetic insights into artificial selection and ecological adaptation," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    3. Marsha M. Wheeler & Adrienne M. Stilp & Shuquan Rao & Bjarni V. Halldórsson & Doruk Beyter & Jia Wen & Anna V. Mihkaylova & Caitlin P. McHugh & John Lane & Min-Zhi Jiang & Laura M. Raffield & Goo Jun, 2022. "Whole genome sequencing identifies structural variants contributing to hematologic traits in the NHLBI TOPMed program," Nature Communications, Nature, vol. 13(1), pages 1-18, December.
    4. Zhikun Wu & Zehang Jiang & Tong Li & Chuanbo Xie & Liansheng Zhao & Jiaqi Yang & Shuai Ouyang & Yizhi Liu & Tao Li & Zhi Xie, 2021. "Structural variants in the Chinese population and their impact on phenotypes, diseases and population adaptation," Nature Communications, Nature, vol. 12(1), pages 1-12, December.
    5. Albertas Dvirnas & Callum Stewart & Vilhelm Müller & Santosh Kumar Bikkarolla & Karolin Frykholm & Linus Sandegren & Erik Kristiansson & Fredrik Westerlund & Tobias Ambjörnsson, 2021. "Detection of structural variations in densely-labelled optical DNA barcodes: A hidden Markov model approach," PLOS ONE, Public Library of Science, vol. 16(11), pages 1-15, November.
    6. Liyuan Zhou & Qiongzi Qiu & Qing Zhou & Jianwei Li & Mengqian Yu & Kezhen Li & Lingling Xu & Xiaohui Ke & Haiming Xu & Bingjian Lu & Hui Wang & Weiguo Lu & Pengyuan Liu & Yan Lu, 2022. "Long-read sequencing unveils high-resolution HPV integration and its oncogenic progression in cervical cancer," Nature Communications, Nature, vol. 13(1), pages 1-18, December.
    7. Jinhyun Kim & Sungsik Kim & Huiran Yeom & Seo Woo Song & Kyoungseob Shin & Sangwook Bae & Han Suk Ryu & Ji Young Kim & Ahyoun Choi & Sumin Lee & Taehoon Ryu & Yeongjae Choi & Hamin Kim & Okju Kim & Yu, 2023. "Barcoded multiple displacement amplification for high coverage sequencing in spatial genomics," Nature Communications, Nature, vol. 14(1), pages 1-18, December.
    8. Jae Eun Lee & Jung Hye Sung & Daniel Sarpong & Jimmy T. Efird & Paul B. Tchounwou & Elizabeth Ofili & Keith Norris, 2018. "Knowledge Management for Fostering Biostatistical Collaboration within a Research Network: The RTRN Case Study," IJERPH, MDPI, vol. 15(11), pages 1-13, November.
    9. M. Mahmoud & Y. Huang & K. Garimella & P. A. Audano & W. Wan & N. Prasad & R. E. Handsaker & S. Hall & A. Pionzio & M. C. Schatz & M. E. Talkowski & E. E. Eichler & S. E. Levy & F. J. Sedlazeck, 2024. "Utility of long-read sequencing for All of Us," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    10. Yingyan Yu & Zhen Zhang & Xiaorui Dong & Ruixin Yang & Zhongqu Duan & Zhen Xiang & Jun Li & Guichao Li & Fazhe Yan & Hongzhang Xue & Du Jiao & Jinyuan Lu & Huimin Lu & Wenmin Zhang & Yangzhen Wei & Sh, 2022. "Pangenomic analysis of Chinese gastric cancer," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
    11. Yang Guo & Shuzhen Wang & A. K. Alvi Haque & Xiguo Yuan, 2022. "WAVECNV: A New Approach for Detecting Copy Number Variation by Wavelet Clustering," Mathematics, MDPI, vol. 10(12), pages 1-11, June.
    12. Zeyu Zheng & Mingjia Zhu & Jin Zhang & Xinfeng Liu & Liqiang Hou & Wenyu Liu & Shuai Yuan & Changhong Luo & Xinhao Yao & Jianquan Liu & Yongzhi Yang, 2024. "A sequence-aware merger of genomic structural variations at population scale," Nature Communications, Nature, vol. 15(1), pages 1-9, December.
    13. Ramesh Rajaby & Dong-Xu Liu & Chun Hang Au & Yuen-Ting Cheung & Amy Yuet Ting Lau & Qing-Yong Yang & Wing-Kin Sung, 2023. "INSurVeyor: improving insertion calling from short read sequencing data," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    14. Yoshitaka Sakamoto & Shuhei Miyake & Miho Oka & Akinori Kanai & Yosuke Kawai & Satoi Nagasawa & Yuichi Shiraishi & Katsushi Tokunaga & Takashi Kohno & Masahide Seki & Yutaka Suzuki & Ayako Suzuki, 2022. "Phasing analysis of lung cancer genomes using a long read sequencer," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    15. Xue Gao & Sheng Wang & Yan-Fen Wang & Shuang Li & Shi-Xin Wu & Rong-Ge Yan & Yi-Wen Zhang & Rui-Dong Wan & Zhen He & Ren-De Song & Xin-Quan Zhao & Dong-Dong Wu & Qi-En Yang, 2022. "Long read genome assemblies complemented by single cell RNA-sequencing reveal genetic and cellular mechanisms underlying the adaptive evolution of yak," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    16. Cristian Groza & Carl Schwendinger-Schreck & Warren A. Cheung & Emily G. Farrow & Isabelle Thiffault & Juniper Lake & William B. Rizzo & Gilad Evrony & Tom Curran & Guillaume Bourque & Tomi Pastinen, 2024. "Pangenome graphs improve the analysis of structural variants in rare genetic diseases," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    17. Heyang Cui & Yong Zhou & Fang Wang & Caixia Cheng & Weimin Zhang & Ruifang Sun & Ling Zhang & Yanghui Bi & Min Guo & Yan Zhou & Xinhui Wang & Jiaxin Ren & Ruibing Bai & Ning Ding & Chen Cheng & Longlo, 2022. "Characterization of somatic structural variations in 528 Chinese individuals with Esophageal squamous cell carcinoma," Nature Communications, Nature, vol. 13(1), pages 1-18, December.
    18. Qiliang Ding & Matthew M. Edwards & Ning Wang & Xiang Zhu & Alexa N. Bracci & Michelle L. Hulke & Ya Hu & Yao Tong & Joyce Hsiao & Christine J. Charvet & Sulagna Ghosh & Robert E. Handsaker & Kevin Eg, 2021. "The genetic architecture of DNA replication timing in human pluripotent stem cells," Nature Communications, Nature, vol. 12(1), pages 1-18, December.
    19. Shravan Leonard-Murali & Chetana Bhaskarla & Ghanshyam S. Yadav & Sudeep K. Maurya & Chenna R. Galiveti & Joshua A. Tobin & Rachel J. Kann & Eishan Ashwat & Patrick S. Murphy & Anish B. Chakka & Visha, 2024. "Uveal melanoma immunogenomics predict immunotherapy resistance and susceptibility," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
    20. Naser Ansari-Pour & Yonglan Zheng & Toshio F. Yoshimatsu & Ayodele Sanni & Mustapha Ajani & Jean-Baptiste Reynier & Avraam Tapinos & Jason J. Pitt & Stefan Dentro & Anna Woodard & Padma Sheila Rajagop, 2021. "Whole-genome analysis of Nigerian patients with breast cancer reveals ethnic-driven somatic evolution and distinct genomic subtypes," Nature Communications, Nature, vol. 12(1), pages 1-15, December.
