Printed from https://ideas.repec.org/a/nat/natcom/v16y2025i1d10.1038_s41467-025-59378-x.html

Restoring flowcell type and basecaller configuration from FASTQ files of nanopore sequencing data

Authors
  • Jun Mencius

    (Fudan University)

  • Wenjun Chen

    (Xinhua Hospital affiliated to Shanghai Jiao Tong University School of Medicine)

  • Youqi Zheng

    (Fudan University)

  • Tingyi An

    (Fudan University)

  • Yongguo Yu

    (Xinhua Hospital affiliated to Shanghai Jiao Tong University School of Medicine)

  • Kun Sun

    (Xinhua Hospital affiliated to Shanghai Jiao Tong University School of Medicine)

  • Huijuan Feng

    (Fudan University)

  • Zhixing Feng

    (Xinhua Hospital affiliated to Shanghai Jiao Tong University School of Medicine)

Abstract

As nanopore sequencing has been widely adopted, data accumulation has surged, resulting in over 700,000 public datasets. While these data hold immense potential for advancing genomic research, their utility is compromised by the absence of flowcell type and basecaller configuration in about 85% of the data and associated publications. These parameters are essential for many analysis algorithms, and their misapplication can lead to significant drops in performance. To address this issue, we present LongBow, designed to infer flowcell type and basecaller configuration directly from the base quality value patterns of FASTQ files. LongBow has been tested on 66 in-house basecalled FAST5/POD5 datasets and 1989 public FASTQ datasets, achieving accuracies of 95.33% and 91.45%, respectively. We demonstrate its utility by reanalyzing nanopore sequencing data from the COVID-19 Genomics UK (COG-UK) project. The results show that LongBow is essential for reproducing reported genomic variants and, through a LongBow-based analysis pipeline, we discovered substantially more functionally important variants while improving accuracy in lineage assignment. Overall, LongBow is poised to play a critical role in maximizing the utility of public nanopore sequencing data, while significantly enhancing the reproducibility of related research.
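To make "base quality value patterns" concrete, the following minimal sketch computes the normalized distribution of Phred quality values across the reads of a FASTQ file. It is not LongBow's implementation: the function names `iter_quality_strings` and `quality_histogram` are hypothetical, and the sketch assumes standard four-line FASTQ records with Phred+33 encoding. It only illustrates the kind of per-file quality summary that an inference tool could plausibly start from.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator


def iter_quality_strings(lines: Iterable[str]) -> Iterator[str]:
    """Yield the quality line of each record.

    Assumes strict four-line FASTQ records (header, sequence, '+', quality).
    """
    for index, line in enumerate(lines):
        if index % 4 == 3:
            yield line.rstrip("\r\n")


def quality_histogram(lines: Iterable[str], offset: int = 33) -> Dict[int, float]:
    """Return the fraction of bases observed at each Phred quality value.

    `offset` is the ASCII offset of the encoding (33 for Phred+33, assumed here).
    """
    counts: Counter = Counter()
    for quality in iter_quality_strings(lines):
        counts.update(ord(char) - offset for char in quality)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {q: n / total for q, n in sorted(counts.items())}


if __name__ == "__main__":
    # One toy record whose quality characters decode to Q0, Q10, Q20 and Q40.
    record = ["@read1\n", "ACGT\n", "+\n", "!+5I\n"]
    print(quality_histogram(record))
    # → {0: 0.25, 10: 0.25, 20: 0.25, 40: 0.25}
```

For a real file, the same function can be given an open file handle, for example `quality_histogram(open("reads.fastq"))`, where the path is a placeholder.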

Suggested Citation

  • Jun Mencius & Wenjun Chen & Youqi Zheng & Tingyi An & Yongguo Yu & Kun Sun & Huijuan Feng & Zhixing Feng, 2025. "Restoring flowcell type and basecaller configuration from FASTQ files of nanopore sequencing data," Nature Communications, Nature, vol. 16(1), pages 1-19, December.
  • Handle: RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-59378-x
    DOI: 10.1038/s41467-025-59378-x

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-025-59378-x
    File Function: Abstract
    Download Restriction: no

    File URL: https://libkey.io/10.1038/s41467-025-59378-x?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Aijing Feng & Sarah Bevins & Jeff Chandler & Thomas J. DeLiberto & Ria Ghai & Kristina Lantz & Julianna Lenoch & Adam Retchless & Susan Shriner & Cynthia Y. Tang & Suxiang Sue Tong & Mia Torchetti & A, 2023. "Transmission of SARS-CoV-2 in free-ranging white-tailed deer in the United States," Nature Communications, Nature, vol. 14(1), pages 1-17, December.
    2. Dillon S. McBride & Sofya K. Garushyants & John Franks & Andrew F. Magee & Steven H. Overend & Devra Huey & Amanda M. Williams & Seth A. Faith & Ahmed Kandeil & Sanja Trifkovic & Lance Miller & Trusha, 2023. "Accelerated evolution of SARS-CoV-2 in free-ranging white-tailed deer," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    3. Joanna Hui Juan Tan & Zhihui Li & Mar Gonzalez Porta & Ramesh Rajaby & Weng Khong Lim & Ye An Tan & Rodrigo Toro Jimenez & Renyi Teo & Maxime Hebrard & Jack Ling Ow & Shimin Ang & Justin Jeyakani & Ya, 2024. "A Catalogue of Structural Variation across Ancestrally Diverse Asian Genomes," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    4. Mingxi Li & Yifei Ren & Zhen Qin Aw & Bo Chen & Ziqing Yang & Yuqing Lei & Lin Cheng & Qingtai Liang & Junxian Hong & Yiling Yang & Jing Chen & Yi Hao Wong & Jing Wei & Sisi Shan & Senyan Zhang & Jiwa, 2022. "Broadly neutralizing and protective nanobodies against SARS-CoV-2 Omicron subvariants BA.1, BA.2, and BA.4/5 and diverse sarbecoviruses," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    5. Arthur Wickenhagen & Meaghan Flagg & Julia R. Port & Claude Kwe Yinda & Kerry Goldin & Shane Gallogly & Jonathan E. Schulz & Tessa Lutterman & Brandi N. Williamson & Franziska Kaiser & Reshma K. Mukes, 2025. "Evolution of Omicron lineage towards increased fitness in the upper respiratory tract in the absence of severe lung pathology," Nature Communications, Nature, vol. 16(1), pages 1-15, December.
    6. Jinlong Shi & Zhilong Jia & Jinxiu Sun & Xiaoreng Wang & Xiaojing Zhao & Chenghui Zhao & Fan Liang & Xinyu Song & Jiawei Guan & Xue Jia & Jing Yang & Qi Chen & Kang Yu & Qian Jia & Jing Wu & Depeng Wa, 2023. "Structural variants involved in high-altitude adaptation detected using single-molecule long-read sequencing," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    7. Stella Hartmann & Lisa Radochonski & Chengjin Ye & Luis Martinez-Sobrido & Jueqi Chen, 2025. "SARS-CoV-2 ORF3a drives dynamic dense body formation for optimal viral infectivity," Nature Communications, Nature, vol. 16(1), pages 1-22, December.
    8. Choi, Junhyeok & Park, Junpyo & Jang, Bongsoo, 2024. "Exploring the interplay of biodiversity and mutation in cyclic competition systems," Chaos, Solitons & Fractals, Elsevier, vol. 189(P1).
    9. Xiaoling Tong & Min-Jin Han & Kunpeng Lu & Shuaishuai Tai & Shubo Liang & Yucheng Liu & Hai Hu & Jianghong Shen & Anxing Long & Chengyu Zhan & Xin Ding & Shuo Liu & Qiang Gao & Bili Zhang & Linli Zhou, 2022. "High-resolution silkworm pan-genome provides genetic insights into artificial selection and ecological adaptation," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    10. Jiao Gong & Huiru Sun & Kaiyuan Wang & Yanhui Zhao & Yechao Huang & Qinsheng Chen & Hui Qiao & Yang Gao & Jialin Zhao & Yunchao Ling & Ruifang Cao & Jingze Tan & Qi Wang & Yanyun Ma & Jing Li & Jingch, 2025. "Long-read sequencing of 945 Han individuals identifies structural variants associated with phenotypic diversity and disease susceptibility," Nature Communications, Nature, vol. 16(1), pages 1-21, December.
    11. Mark P. Khurana & Jacob Curran-Sebastian & Neil Scheidwasser & Christian Morgenstern & Morten Rasmussen & Jannik Fonager & Marc Stegger & Man-Hung Eric Tang & Jonas L. Juul & Leandro Andrés Escobar-He, 2024. "High-resolution epidemiological landscape from ~290,000 SARS-CoV-2 genomes from Denmark," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
    12. Wenjuan Dong & Jing Wang & Lei Tian & Jianying Zhang & Erik W. Settles & Chao Qin & Daniel R. Steinken-Kollath & Ashley N. Itogawa & Kimberly R. Celona & Jinhee Yi & Mitchell Bryant & Heather Mead & S, 2023. "Factor Xa cleaves SARS-CoV-2 spike protein to block viral entry and infection," Nature Communications, Nature, vol. 14(1), pages 1-18, December.
    13. Rúbens Prince dos Santos Alves & Julia Timis & Robyn Miller & Kristen Valentine & Paolla Beatriz Almeida Pinto & Andrew Gonzalez & Jose Angel Regla-Nava & Erin Maule & Michael N. Nguyen & Norazizah Sh, 2024. "Human coronavirus OC43-elicited CD4+ T cells protect against SARS-CoV-2 in HLA transgenic mice," Nature Communications, Nature, vol. 15(1), pages 1-20, December.
    14. Chang Liu & Raksha Das & Aiste Dijokaite-Guraliuc & Daming Zhou & Alexander J. Mentzer & Piyada Supasa & Muneeswaran Selvaraj & Helen M. E. Duyvesteyn & Thomas G. Ritter & Nigel Temperton & Paul Klene, 2024. "Emerging variants develop total escape from potent monoclonal antibodies induced by BA.4/5 infection," Nature Communications, Nature, vol. 15(1), pages 1-10, December.
    15. Laura Heydemann & Małgorzata Ciurkiewicz & Theresa Störk & Isabel Zdora & Kirsten Hülskötter & Katharina Manuela Gregor & Lukas Mathias Michaely & Wencke Reineking & Tom Schreiner & Georg Beythien & A, 2025. "Respiratory long COVID in aged hamsters features impaired lung function post-exercise with bronchiolization and fibrosis," Nature Communications, Nature, vol. 16(1), pages 1-24, December.
