Deep learning-driven fragment ion series classification enables highly precise and sensitive de novo peptide sequencing

Author

Listed:
  • Daniela Klaproth-Andrade

    (Technical University of Munich)

  • Johannes Hingerl

    (Technical University of Munich)

  • Yanik Bruns

    (Technical University of Munich)

  • Nicholas H. Smith

    (Technical University of Munich)

  • Jakob Träuble

    (Technical University of Munich)

  • Mathias Wilhelm

    (Technical University of Munich)

  • Julien Gagneur

    (Technical University of Munich
    Helmholtz Center Munich)

Abstract

Unlike for DNA and RNA, accurate and high-throughput sequencing methods for proteins are lacking, hindering the utility of proteomics in applications where the sequences are unknown including variant calling, neoepitope identification, and metaproteomics. We introduce Spectralis, a de novo peptide sequencing method for tandem mass spectrometry. Spectralis leverages several innovations including a convolutional neural network layer connecting peaks in spectra spaced by amino acid masses, proposing fragment ion series classification as a pivotal task for de novo peptide sequencing, and a peptide-spectrum confidence score. On spectra for which database search provided a ground truth, Spectralis surpassed 40% sensitivity at 90% precision, nearly doubling state-of-the-art sensitivity. Application to unidentified spectra confirmed its superiority and showcased its applicability to variant calling. Altogether, these algorithmic innovations and the substantial sensitivity increase in the high-precision range constitute an important step toward broadly applicable peptide sequencing.
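
The idea of connecting peaks spaced by amino acid masses can be illustrated with a small sketch. This is not the Spectralis implementation, which uses a convolutional neural network layer. It is only a plain-Python illustration of the underlying observation: consecutive fragment ions of the same series differ by the mass of one amino acid residue. The function name `residue_spaced_pairs`, the tolerance `tol`, the restriction to five residues, and the assumption of singly charged fragments are all choices made for this example.

```python
# Monoisotopic residue masses in daltons for a small subset of amino acids.
# A fuller table would cover all 20 standard residues and any modifications.
RESIDUE_MASSES = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "P": 97.05276,
    "V": 99.06841,
}


def residue_spaced_pairs(mz_values, tol=0.01):
    """Return (low_mz, high_mz, residue) for peak pairs whose m/z gap
    matches a residue mass within `tol`.

    Assumes singly charged fragments, so an m/z gap equals a mass gap.
    """
    peaks = sorted(mz_values)
    pairs = []
    for i, low in enumerate(peaks):
        for high in peaks[i + 1:]:
            gap = high - low
            for residue, mass in RESIDUE_MASSES.items():
                if abs(gap - mass) <= tol:
                    pairs.append((low, high, residue))
    return pairs


# Hypothetical peak list: the first three peaks are spaced by G, then A.
example_peaks = [100.0, 157.02146, 228.05857, 300.0]
example_pairs = residue_spaced_pairs(example_peaks)
```

Chains of such pairs form candidate fragment ion series. Deciding which peaks truly belong to a series is the classification task the abstract refers to.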

Suggested Citation

  • Daniela Klaproth-Andrade & Johannes Hingerl & Yanik Bruns & Nicholas H. Smith & Jakob Träuble & Mathias Wilhelm & Julien Gagneur, 2024. "Deep learning-driven fragment ion series classification enables highly precise and sensitive de novo peptide sequencing," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
  • Handle: RePEc:nat:natcom:v:15:y:2024:i:1:d:10.1038_s41467-023-44323-7
    DOI: 10.1038/s41467-023-44323-7

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-023-44323-7
    File Function: Abstract
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yi Yang & Qun Fang, 2024. "Prediction of glycopeptide fragment mass spectra by deep learning," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    2. Wen-Feng Zeng & Xie-Xuan Zhou & Sander Willems & Constantin Ammar & Maria Wahle & Isabell Bludau & Eugenia Voytik & Maximillian T. Strauss & Matthias Mann, 2022. "AlphaPeptDeep: a modular deep learning framework to predict peptide properties for proteomics," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    3. Celina Tretter & Niklas Andrade Krätzig & Matteo Pecoraro & Sebastian Lange & Philipp Seifert & Clara Frankenberg & Johannes Untch & Gabriela Zuleger & Mathias Wilhelm & Daniel P. Zolg & Florian S. Dr, 2023. "Proteogenomic analysis reveals RNA as a source for tumor-agnostic neoantigen identification," Nature Communications, Nature, vol. 14(1), pages 1-22, December.
    4. Kevin L. Yang & Fengchao Yu & Guo Ci Teo & Kai Li & Vadim Demichev & Markus Ralser & Alexey I. Nesvizhskii, 2023. "MSBooster: improving peptide identification rates using deep learning-based features," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    5. David Gomez-Zepeda & Danielle Arnold-Schild & Julian Beyrle & Arthur Declercq & Ralf Gabriels & Elena Kumm & Annica Preikschat & Mateusz Krzysztof Łącki & Aurélie Hirschler & Jeewan Babu Rijal & Chris, 2024. "Thunder-DDA-PASEF enables high-coverage immunopeptidomics and is boosted by MS2Rescore with MS2PIP timsTOF fragmentation prediction model," Nature Communications, Nature, vol. 15(1), pages 1-18, December.
    6. Charlotte Adams & Wassim Gabriel & Kris Laukens & Mario Picciani & Mathias Wilhelm & Wout Bittremieux & Kurt Boonen, 2024. "Fragment ion intensity prediction improves the identification rate of non-tryptic peptides in timsTOF," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    7. Tara N. Yankee & Sungryong Oh & Emma Wentworth Winchester & Andrea Wilderman & Kelsey Robinson & Tia Gordon & Jill A. Rosenfeld & Jennifer VanOudenhove & Daryl A. Scott & Elizabeth J. Leslie & Justin , 2023. "Integrative analysis of transcriptome dynamics during human craniofacial development identifies candidate disease genes," Nature Communications, Nature, vol. 14(1), pages 1-23, December.
    8. Lei Xin & Rui Qiao & Xin Chen & Hieu Tran & Shengying Pan & Sahar Rabinoviz & Haibo Bian & Xianliang He & Brenton Morse & Baozhen Shan & Ming Li, 2022. "A streamlined platform for analyzing tera-scale DDA and DIA mass spectrometry data enables highly sensitive immunopeptidomics," Nature Communications, Nature, vol. 13(1), pages 1-9, December.
    9. Gerard Llimos & Vincent Gardeux & Ute Koch & Judith F. Kribelbauer & Antonina Hafner & Daniel Alpern & Joern Pezoldt & Maria Litovchenko & Julie Russeil & Riccardo Dainese & Riccardo Moia & Abdurraouf, 2022. "A leukemia-protective germline variant mediates chromatin module formation via transcription factor nucleation," Nature Communications, Nature, vol. 13(1), pages 1-21, December.
    10. Weiping Sun & Qianqiu Zhang & Xiyue Zhang & Ngoc Hieu Tran & M. Ziaur Rahman & Zheng Chen & Chao Peng & Jun Ma & Ming Li & Lei Xin & Baozhen Shan, 2023. "Glycopeptide database search and de novo sequencing with PEAKS GlycanFinder enable highly sensitive glycoproteomics," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    11. Hanqing Liao & Carolina Barra & Zhicheng Zhou & Xu Peng & Isaac Woodhouse & Arun Tailor & Robert Parker & Alexia Carré & Persephone Borrow & Michael J. Hogan & Wayne Paes & Laurence C. Eisenlohr & Rob, 2024. "MARS an improved de novo peptide candidate selection method for non-canonical antigen target discovery in cancer," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
    12. Vincent Michaud & Eulalie Lasseaux & David J. Green & Dave T. Gerrard & Claudio Plaisant & Tomas Fitzgerald & Ewan Birney & Benoît Arveiler & Graeme C. Black & Panagiotis I. Sergouniotis, 2022. "The contribution of common regulatory and protein-coding TYR variants to the genetic architecture of albinism," Nature Communications, Nature, vol. 13(1), pages 1-8, December.
    13. Alexendar R. Perez & Laura Sala & Richard K. Perez & Joana A. Vidigal, 2021. "CSC software corrects off-target mediated gRNA depletion in CRISPR-Cas9 essentiality screens," Nature Communications, Nature, vol. 12(1), pages 1-11, December.
    14. Poonam Dhillon & Kelly Ann Mulholland & Hailong Hu & Jihwan Park & Xin Sheng & Amin Abedini & Hongbo Liu & Allison Vassalotti & Junnan Wu & Katalin Susztak, 2023. "Increased levels of endogenous retroviruses trigger fibroinflammation and play a role in kidney disease development," Nature Communications, Nature, vol. 14(1), pages 1-20, December.
    15. Andreas Herchenröther & Stefanie Gossen & Tobias Friedrich & Alexander Reim & Nadine Daus & Felix Diegmüller & Jörg Leers & Hakimeh Moghaddas Sani & Sarah Gerstner & Leah Schwarz & Inga Stellmacher & , 2023. "The H2A.Z and NuRD associated protein HMG20A controls early head and heart developmental transcription programs," Nature Communications, Nature, vol. 14(1), pages 1-20, December.
    16. Teresa Maria Rosaria Noviello & Anna Maria Giacomo & Francesca Pia Caruso & Alessia Covre & Roberta Mortarini & Giovanni Scala & Maria Claudia Costa & Sandra Coral & Wolf H. Fridman & Catherine Sautès, 2023. "Guadecitabine plus ipilimumab in unresectable melanoma: five-year follow-up and integrated multi-omic analysis in the phase 1b NIBIT-M4 trial," Nature Communications, Nature, vol. 14(1), pages 1-18, December.
    17. Michel S. Naslavsky & Marilia O. Scliar & Guilherme L. Yamamoto & Jaqueline Yu Ting Wang & Stepanka Zverinova & Tatiana Karp & Kelly Nunes & José Ricardo Magliocco Ceroni & Diego Lima Carvalho & Carlo, 2022. "Whole-genome sequencing of 1,171 elderly admixed individuals from Brazil," Nature Communications, Nature, vol. 13(1), pages 1-11, December.
    18. Nicole Deflaux & Margaret Sunitha Selvaraj & Henry Robert Condon & Kelsey Mayo & Sara Haidermota & Melissa A. Basford & Chris Lunt & Anthony A. Philippakis & Dan M. Roden & Joshua C. Denny & Anjene Mu, 2023. "Demonstrating paths for unlocking the value of cloud genomics through cross cohort analysis," Nature Communications, Nature, vol. 14(1), pages 1-10, December.
    19. Andrea Wilderman & Eva D’haene & Machteld Baetens & Tara N. Yankee & Emma Wentworth Winchester & Nicole Glidden & Ellen Roets & Jo Dorpe & Sandra Janssens & Danny E. Miller & Miranda Galey & Kari M. B, 2024. "A distant global control region is essential for normal expression of anterior HOXA genes during mouse and human craniofacial development," Nature Communications, Nature, vol. 15(1), pages 1-23, December.
    20. Ruoyu Tian & Tian Ge & Hyeokmoon Kweon & Daniel B. Rocha & Max Lam & Jimmy Z. Liu & Kritika Singh & Daniel F. Levey & Joel Gelernter & Murray B. Stein & Ellen A. Tsai & Hailiang Huang & Christopher F., 2024. "Whole-exome sequencing in UK Biobank reveals rare genetic architecture for depression," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
