Printed from https://ideas.repec.org/a/nat/natcom/v14y2023i1d10.1038_s41467-023-37446-4.html

MS2Query: reliable and scalable MS2 mass spectra-based analogue search

Author

Listed:
  • Niek F. de Jonge

    (Wageningen University & Research)

  • Joris J. R. Louwen

    (Wageningen University & Research)

  • Elena Chekmeneva

    (Digestion and Reproduction, Faculty of Medicine, Imperial College London, Hammersmith Hospital Campus)

  • Stephane Camuzeaux

    (Digestion and Reproduction, Faculty of Medicine, Imperial College London, Hammersmith Hospital Campus)

  • Femke J. Vermeir

    (Radboud University)

  • Robert S. Jansen

    (Radboud University)

  • Florian Huber

    (University of Applied Sciences Düsseldorf)

  • Justin J. J. van der Hooft

    (Wageningen University & Research; University of Johannesburg, Auckland Park)

Abstract

Metabolomics-driven discoveries in biological samples remain hampered by the grand challenge of metabolite annotation and identification. Only a few metabolites have an annotated spectrum in spectral libraries; hence, searching only for exact library matches generally returns few hits. An attractive alternative is searching for so-called analogues as a starting point for structural annotations; analogues are library molecules which are not exact matches but display a high chemical similarity. However, current analogue search implementations are not yet very reliable and are relatively slow. Here, we present MS2Query, a machine learning-based tool that integrates mass spectral embedding-based chemical similarity predictors (Spec2Vec and MS2Deepscore) as well as detected precursor masses to rank potential analogues and exact matches. Benchmarking MS2Query on reference mass spectra and experimental case studies demonstrates improved reliability and scalability. Thereby, MS2Query offers exciting opportunities to further increase the annotation rate of metabolomics profiles of complex metabolite mixtures and to discover new biology.
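
The ranking idea described in the abstract can be sketched roughly as follows. This is an illustrative toy, not MS2Query's implementation or API: the embeddings are made-up vectors standing in for the output of a predictor such as Spec2Vec or MS2Deepscore, the names `rank_candidates`, `mass_scale`, and `weight_embedding` are hypothetical, and a fixed weighted sum stands in for the trained machine learning model that the tool uses to combine its inputs.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(query, library, mass_scale=50.0, weight_embedding=0.8):
    """Rank library entries for one query spectrum, best candidate first.

    Assumption: the score is a hand-picked weighted sum of embedding
    similarity and a precursor-mass closeness term. A real tool would
    learn this combination from training data.
    """
    ranked = []
    for entry in library:
        embedding_term = cosine_similarity(query["embedding"], entry["embedding"])
        mass_diff = abs(query["precursor_mz"] - entry["precursor_mz"])
        # Closeness decays smoothly from 1 (same mass) towards 0 (very different).
        mass_term = math.exp(-mass_diff / mass_scale)
        score = weight_embedding * embedding_term + (1 - weight_embedding) * mass_term
        ranked.append((entry["name"], score))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Hypothetical data: three library spectra and one query spectrum.
query = {"embedding": [0.9, 0.1, 0.3], "precursor_mz": 301.14}
library = [
    {"name": "exact_match", "embedding": [0.9, 0.1, 0.3], "precursor_mz": 301.14},
    {"name": "close_analogue", "embedding": [0.8, 0.2, 0.3], "precursor_mz": 315.16},
    {"name": "unrelated", "embedding": [0.0, 1.0, 0.0], "precursor_mz": 612.40},
]

ranking = rank_candidates(query, library)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy setting the exact match ranks first, the structurally similar entry with a shifted precursor mass ranks second, and the unrelated entry ranks last, which mirrors the distinction the abstract draws between exact matches and analogues.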

Suggested Citation

  • Niek F. de Jonge & Joris J. R. Louwen & Elena Chekmeneva & Stephane Camuzeaux & Femke J. Vermeir & Robert S. Jansen & Florian Huber & Justin J. J. van der Hooft, 2023. "MS2Query: reliable and scalable MS2 mass spectra-based analogue search," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
  • Handle: RePEc:nat:natcom:v:14:y:2023:i:1:d:10.1038_s41467-023-37446-4
    DOI: 10.1038/s41467-023-37446-4

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-023-37446-4
    File Function: Abstract
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Wout Bittremieux & Nicole E. Avalon & Sydney P. Thomas & Sarvar A. Kakhkhorov & Alexander A. Aksenov & Paulo Wender P. Gomes & Christine M. Aceves & Andrés Mauricio Caraballo-Rodríguez & Julia M. Gaug, 2023. "Open access repository-scale propagated nearest neighbor suspect spectral library for untargeted metabolomics," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    2. Nicholas J. Morehouse & Trevor N. Clark & Emily J. McMann & Jeffrey A. Santen & F. P. Jake Haeckl & Christopher A. Gray & Roger G. Linington, 2023. "Annotation of natural product compound families using molecular networking topology and structural similarity fingerprinting," Nature Communications, Nature, vol. 14(1), pages 1-10, December.
    3. Ying-Li Zhou & Paraskevi Mara & Guo-Jie Cui & Virginia P. Edgcomb & Yong Wang, 2022. "Microbiomes in the Challenger Deep slope and bottom-axis sediments," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
    4. Qiong Yang & Hongchao Ji & Zhenbo Xu & Yiming Li & Pingshan Wang & Jinyu Sun & Xiaqiong Fan & Hailiang Zhang & Hongmei Lu & Zhimin Zhang, 2023. "Ultra-fast and accurate electron ionization mass spectrum matching for compound identification with million-scale in-silico library," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    5. Zhiwei Zhou & Mingdu Luo & Haosong Zhang & Yandong Yin & Yuping Cai & Zheng-Jiang Zhu, 2022. "Metabolite annotation from knowns to unknowns through knowledge-guided multi-layer metabolic networking," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    6. Yuan Wei & Yue Jin & Wenjie Zhang, 2020. "Domestic Sewage Treatment Using a One-Stage ANAMMOX Process," IJERPH, MDPI, vol. 17(9), pages 1-14, May.
    7. Daniel G. C. Treen & Mingxun Wang & Shipei Xing & Katherine B. Louie & Tao Huan & Pieter C. Dorrestein & Trent R. Northen & Benjamin P. Bowen, 2022. "SIMILE enables alignment of tandem mass spectra with statistical significance," Nature Communications, Nature, vol. 13(1), pages 1-10, December.
