
ClairS-TO: a deep-learning method for long-read tumor-only somatic small variant calling

Authors

  • Lei Chen (The University of Hong Kong)
  • Zhenxian Zheng (The University of Hong Kong)
  • Junhao Su (The University of Hong Kong)
  • Xian Yu (The University of Hong Kong)
  • Angel On Ki Wong (The University of Hong Kong)
  • Jingcheng Zhang (The University of Hong Kong)
  • Yan-Lam Lee (The University of Hong Kong)
  • Ruibang Luo (The University of Hong Kong)

Abstract

Accurate detection of somatic variants in tumors is critically important and remains challenging. Current methods typically require a matched normal sample for reliable detection, but such samples are often unavailable in real-world research and clinical settings. Without a matched normal sample, more capable algorithms are required to distinguish true somatic variants from germline variants and technical artifacts. However, existing tumor-only somatic variant callers were designed for short-read sequencing data and do not work well with long-read data. To fill this gap, we present ClairS-TO, a deep-learning-based method for long-read tumor-only somatic variant calling. ClairS-TO uses an ensemble of two disparate neural networks trained on the same samples but for opposite tasks: one estimates how likely a candidate is to be a somatic variant, the other how likely it is not. Benchmarks using the COLO829 and HCC1395 cancer cell lines show that ClairS-TO outperforms DeepSomatic and smrest on ONT and PacBio long-read data. ClairS-TO is also applicable to short-read data, where it outperforms Mutect2, Octopus, Pisces, and DeepSomatic. Extensive experiments across sequencing coverages, variant allelic fractions, and tumor purities support ClairS-TO as a reliable tool for somatic variant discovery. ClairS-TO is open source and available at https://github.com/HKU-BAL/ClairS-TO.
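
The two-network ensemble described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the two outputs are combined, so the averaging rule below, the threshold, and every name (`Candidate`, `ensemble_score`, `call_somatic`) are assumptions made for illustration only and are not taken from ClairS-TO.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Candidate:
    """A candidate site scored by two hypothetical networks."""
    name: str
    p_somatic: float      # "affirmative" network: P(candidate is somatic)
    p_not_somatic: float  # "negational" network: P(candidate is NOT somatic)


def ensemble_score(c: Candidate) -> float:
    """Combine the two opposite-task outputs into one somatic score.

    Assumption: a plain average of p_somatic and (1 - p_not_somatic).
    The actual combination rule used by ClairS-TO is not given in the abstract.
    """
    return 0.5 * (c.p_somatic + (1.0 - c.p_not_somatic))


def call_somatic(cands: Iterable[Candidate], threshold: float = 0.5) -> List[str]:
    """Return names of candidates whose ensemble score reaches the threshold."""
    return [c.name for c in cands if ensemble_score(c) >= threshold]


if __name__ == "__main__":
    candidates = [
        Candidate("site_a", 0.75, 0.25),  # both networks agree: somatic
        Candidate("site_b", 0.25, 0.75),  # both networks agree: not somatic
        Candidate("site_c", 0.75, 0.75),  # networks disagree
    ]
    for c in candidates:
        print(c.name, ensemble_score(c))
    print(call_somatic(candidates, threshold=0.6))
```

Under this assumed rule, a candidate on which the two networks disagree receives an intermediate score and is dropped at a stricter threshold, which is one plausible reason to train networks for opposite tasks.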

Suggested Citation

  • Lei Chen & Zhenxian Zheng & Junhao Su & Xian Yu & Angel On Ki Wong & Jingcheng Zhang & Yan-Lam Lee & Ruibang Luo, 2025. "ClairS-TO: a deep-learning method for long-read tumor-only somatic small variant calling," Nature Communications, Nature, vol. 16(1), pages 1-15, December.
  • Handle: RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-64547-z
    DOI: 10.1038/s41467-025-64547-z

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-025-64547-z
    File Function: Abstract
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Leanne M. Brown & Ryan A. Hagenson & Tilen Koklič & Iztok Urbančič & Lu Qiao & Janez Strancar & Jason M. Sheltzer, 2024. "An elevated rate of whole-genome duplications in cancers from Black patients," Nature Communications, Nature, vol. 15(1), pages 1-18, December.
    2. Qian Zhou & Fahu Ji & Dongxiao Lin & Xianming Liu & Zexuan Zhu & Jue Ruan, 2024. "KSNP: a fast de Bruijn graph-based haplotyping tool approaching data-in time cost," Nature Communications, Nature, vol. 15(1), pages 1-7, December.
