
Library-based virtual match-between-runs quantification in GlyPep-Quant improves site-specific glycan identification

Author

Listed:
  • He Zhu

    (Chinese Academy of Sciences)

  • Zheng Fang

    (Chinese Academy of Sciences)

  • Lei Liu

    (Chinese Academy of Sciences; University of Chinese Academy of Sciences)

  • Yan Wang

    (Chinese Academy of Sciences)

  • Hongqiang Qin

    (Chinese Academy of Sciences)

  • Yongzhan Nie

    (Fourth Military Medical University)

  • Mingming Dong

    (Dalian University of Technology)

  • Mingliang Ye

    (Chinese Academy of Sciences; University of Chinese Academy of Sciences)

Abstract

Glycosylation changes are closely related to various diseases, including cancer. Quantitative analysis of site-specific glycans at the proteomic scale remains challenging because of the low interpretation rate of glycopeptide spectra. Here, we present GlyPep-Quant, a tool for sensitive quantification and identification of site-specific glycans. Using a well-trained machine learning model, GlyPep-Quant quantified 25.1%–178.9% more site-specific glycans without missing values than pGlycoQuant, MSFragger-Glyco, and Skyline. To reuse identifications from previous large-scale datasets, an MS1 feature library-based “virtual match-between-runs” quantification scheme was developed, allowing more than eightfold more site-specific glycans to be identified and quantified than with conventional MS2-based methods. The enhanced coverage prompted the development of a glycoproteomic biomarker discovery method that calculates ratios of site-specific glycan abundances at the same glycosylation site, minimizing variability from individual expression and experimental conditions. Two pairs of site-specific glycan ratios, on sites P01011-N127 and P08185-N96, were selected as high-performance biomarkers to classify gastric cancer (GC) from healthy controls (AUC > 0.95). Moreover, the two ratios performed well in distinguishing GC in an independent cohort quantified with the library-based strategy, with diagnostic accuracy of up to 85%. GlyPep-Quant is poised for broader glycoproteomic applications.
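The ratio-based biomarker idea described in the abstract can be illustrated with a minimal sketch. This is not GlyPep-Quant's code: the function names, the log2 transform, the pairwise rank-based AUC, and the toy abundances are all assumptions made for illustration. The sketch shows why a ratio of two glycans at the same site is insensitive to a sample-wide scaling of that site's abundance, and how a single ratio could be scored as a classifier.

```python
import math
from itertools import combinations


def log_ratios_per_site(abundances):
    """Return log2 ratios for every pair of glycans sharing a site.

    `abundances` maps (site, glycan) -> abundance for one sample.
    Both the input layout and the log2 transform are assumptions.
    """
    by_site = {}
    for (site, glycan), value in abundances.items():
        by_site.setdefault(site, {})[glycan] = value

    ratios = {}
    for site, glycans in by_site.items():
        for g1, g2 in combinations(sorted(glycans), 2):
            a, b = glycans[g1], glycans[g2]
            if a > 0 and b > 0:  # skip missing or zero values
                ratios[(site, g1, g2)] = math.log2(a / b)
    return ratios


def auc(case_scores, control_scores):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    wins = 0.0
    for c in case_scores:
        for h in control_scores:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical sample: two glycans on one hypothetical site.
sample = {("SITE_1", "glycan_A"): 200.0, ("SITE_1", "glycan_B"): 50.0}
# The same sample with the site's overall abundance tripled.
scaled = {key: value * 3.0 for key, value in sample.items()}

ratio_original = log_ratios_per_site(sample)[("SITE_1", "glycan_A", "glycan_B")]
ratio_scaled = log_ratios_per_site(scaled)[("SITE_1", "glycan_A", "glycan_B")]
```

In this toy case `ratio_original` and `ratio_scaled` are equal, because the common scaling factor cancels in the ratio. Calling `auc` on the per-sample ratios of cases and controls gives a value between 0 and 1, with 0.5 corresponding to no separation.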

Suggested Citation

  • He Zhu & Zheng Fang & Lei Liu & Yan Wang & Hongqiang Qin & Yongzhan Nie & Mingming Dong & Mingliang Ye, 2025. "Library-based virtual match-between-runs quantification in GlyPep-Quant improves site-specific glycan identification," Nature Communications, Nature, vol. 16(1), pages 1-17, December.
  • Handle: RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-61673-6
    DOI: 10.1038/s41467-025-61673-6

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-025-61673-6
    File Function: Abstract
    Download Restriction: no


    References listed on IDEAS

    1. Yi Yang & Guoquan Yan & Siyuan Kong & Mengxi Wu & Pengyuan Yang & Weiqian Cao & Liang Qiao, 2021. "GproDIA enables data-independent acquisition glycoproteomics with comprehensive statistical control," Nature Communications, Nature, vol. 12(1), pages 1-15, December.
    2. Siyuan Kong & Pengyun Gong & Wen-Feng Zeng & Biyun Jiang & Xinhang Hou & Yang Zhang & Huanhuan Zhao & Mingqi Liu & Guoquan Yan & Xinwen Zhou & Xihua Qiao & Mengxi Wu & Pengyuan Yang & Chao Liu & Weiqi, 2022. "pGlycoQuant with a deep residual network for quantitative glycoproteomics at intact glycopeptide level," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    3. Wen-Feng Zeng & Xie-Xuan Zhou & Sander Willems & Constantin Ammar & Maria Wahle & Isabell Bludau & Eugenia Voytik & Maximillian T. Strauss & Matthias Mann, 2022. "AlphaPeptDeep: a modular deep learning framework to predict peptide properties for proteomics," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    4. Ruedi Aebersold & Matthias Mann, 2003. "Mass spectrometry-based proteomics," Nature, Nature, vol. 422(6928), pages 198-207, March.
    5. Weiping Sun & Qianqiu Zhang & Xiyue Zhang & Ngoc Hieu Tran & M. Ziaur Rahman & Zheng Chen & Chao Peng & Jun Ma & Ming Li & Lei Xin & Baozhen Shan, 2023. "Glycopeptide database search and de novo sequencing with PEAKS GlycanFinder enable highly sensitive glycoproteomics," Nature Communications, Nature, vol. 14(1), pages 1-15, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yi Yang & Qun Fang, 2024. "Prediction of glycopeptide fragment mass spectra by deep learning," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    2. Wen-Feng Zeng & Guoquan Yan & Huan-huan Zhao & Chao Liu & Weiqian Cao, 2024. "Uncovering missing glycans and unexpected fragments with pGlycoNovo for site-specific glycosylation analysis across species," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    3. Siyuan Kong & Pengyun Gong & Wen-Feng Zeng & Biyun Jiang & Xinhang Hou & Yang Zhang & Huanhuan Zhao & Mingqi Liu & Guoquan Yan & Xinwen Zhou & Xihua Qiao & Mengxi Wu & Pengyuan Yang & Chao Liu & Weiqi, 2022. "pGlycoQuant with a deep residual network for quantitative glycoproteomics at intact glycopeptide level," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    4. Alexander Kaever & Manuel Landesfeind & Kirstin Feussner & Burkhard Morgenstern & Ivo Feussner & Peter Meinicke, 2014. "Meta-Analysis of Pathway Enrichment: Combining Independent and Dependent Omics Data Sets," PLOS ONE, Public Library of Science, vol. 9(2), pages 1-12, February.
    5. Dayle L Sampson & Tony J Parker & Zee Upton & Cameron P Hurst, 2011. "A Comparison of Methods for Classifying Clinical Samples Based on Proteomics Data: A Case Study for Statistical and Machine Learning Approaches," PLOS ONE, Public Library of Science, vol. 6(9), pages 1-11, September.
    6. Jiang Tan & Hui-Zhen Fu & Yuh-Shan Ho, 2014. "A bibliometric analysis of research on proteomics in Science Citation Index Expanded," Scientometrics, Springer;Akadémiai Kiadó, vol. 98(2), pages 1473-1490, February.
    7. Qiannan Liu & Xiaoyan Lu & Yao Deng & Han Zhang & Rumeng Wei & Hongrui Li & Ying Feng & Juan Wei & Fang Ma & Yan Zhang & Xia Zou, 2025. "Global characterization of mouse testis O-glycoproteome landscape during spermatogenesis," Nature Communications, Nature, vol. 16(1), pages 1-17, December.
    8. Jacques Colinge & Keiryn L Bennett, 2007. "Introduction to Computational Proteomics," PLOS Computational Biology, Public Library of Science, vol. 3(7), pages 1-10, July.
    9. Pan Fang & Xiangming Yu & MengYang Ding & Cong Qifei & Hongyu Jiang & Qi Shi & Weiwei Zhao & Weimin Zheng & Yingning Li & Zixiang Ling & Wei-Jun Kong & Pengyuan Yang & Huali Shen, 2025. "Ultradeep N-glycoproteome atlas of mouse reveals spatiotemporal signatures of brain aging and neurodegenerative diseases," Nature Communications, Nature, vol. 16(1), pages 1-16, December.
    10. Yun Xu & Wolfgang Schrader, 2021. "Studying the Complexity of Biomass Derived Biofuels," Energies, MDPI, vol. 14(8), pages 1-13, April.
    11. Łuksza Marta & Kluge Bogusław & Ostrowski Jerzy & Karczmarski Jakub & Gambin Anna, 2009. "Two-Stage Model-Based Clustering for Liquid Chromatography Mass Spectrometry Data Analysis," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 8(1), pages 1-36, February.
    12. Charlotte Adams & Wassim Gabriel & Kris Laukens & Mario Picciani & Mathias Wilhelm & Wout Bittremieux & Kurt Boonen, 2024. "Fragment ion intensity prediction improves the identification rate of non-tryptic peptides in timsTOF," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    13. Patrick Leopold Rüther & Immanuel Mirnes Husic & Pernille Bangsgaard & Kristian Murphy Gregersen & Pernille Pantmann & Milena Carvalho & Ricardo Miguel Godinho & Lukas Friedl & João Cascalheira & Albe, 2022. "SPIN enables high throughput species identification of archaeological bone by proteomics," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    14. Klemens Fröhlich & Eva Brombacher & Matthias Fahrner & Daniel Vogele & Lucas Kook & Niko Pinter & Peter Bronsert & Sylvia Timme-Bronsert & Alexander Schmidt & Katja Bärenfaller & Clemens Kreutz & Oliv, 2022. "Benchmarking of analysis strategies for data-independent acquisition proteomics using a large-scale dataset comprising inter-patient heterogeneity," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
    15. Ling Li & Mingming Niu & Alyssa Erickson & Jie Luo & Kincaid Rowbotham & Kai Guo & He Huang & Yuxin Li & Yi Jiang & Junguk Hur & Chunyu Liu & Junmin Peng & Xusheng Wang, 2022. "SMAP is a pipeline for sample matching in proteogenomics," Nature Communications, Nature, vol. 13(1), pages 1-9, December.
    16. Brar, D.S. & Mackill, D.J. & Hardy, Bill (ed.), 2007. "Rice Genetics V- Proceedings of the Fifth International Rice Genetics Symposium," IRRI Books, International Rice Research Institute (IRRI), number 164486, January.
    17. Xiang Zhang & Tianze Ling & Zhi Jin & Sheng Xu & Zhiqiang Gao & Boyan Sun & Zijie Qiu & Jiaqi Wei & Nanqing Dong & Guangshuai Wang & Guibin Wang & Leyuan Li & Muhammad Abdul-Mageed & Laks V. S. Lakshm, 2025. "π-PrimeNovo: an accurate and efficient non-autoregressive deep learning model for de novo peptide sequencing," Nature Communications, Nature, vol. 16(1), pages 1-16, December.
    18. Benjamin A Shoemaker & Anna R Panchenko, 2007. "Deciphering Protein–Protein Interactions. Part II. Computational Methods to Predict Protein and Domain Interaction Partners," PLOS Computational Biology, Public Library of Science, vol. 3(4), pages 1-7, April.
    19. Stephan Krueger & Patrick Giavalisco & Leonard Krall & Marie-Caroline Steinhauser & Dirk Büssis & Bjoern Usadel & Ulf-Ingo Flügge & Alisdair R Fernie & Lothar Willmitzer & Dirk Steinhauser, 2011. "A Topological Map of the Compartmentalized Arabidopsis thaliana Leaf Metabolome," PLOS ONE, Public Library of Science, vol. 6(3), pages 1-16, March.
    20. Kertcher, Zack & Venkatraman, Rohan & Coslor, Erica, 2020. "Pleasingly parallel: Early cross-disciplinary work for innovation diffusion across boundaries in grid computing," Journal of Business Research, Elsevier, vol. 116(C), pages 581-594.
