Authors listed:
- Shichao Feng
- Hong-Long Ji
- Huan Wang
- Bailu Zhang
- Ryan Sterzenbach
- Chongle Pan
- Xuan Guo
Abstract
Metaproteomics based on high-throughput tandem mass spectrometry (MS/MS) plays a crucial role in characterizing microbiome functions. The acquired MS/MS data are searched against a protein sequence database to identify peptides, which are then used to infer a list of proteins present in a metaproteome sample. While the problem of protein inference has been well studied for proteomics of single organisms, it remains a major challenge for metaproteomics of complex microbial communities because of the large number of degenerate peptides shared among homologous proteins in different organisms. This challenge calls for improved discrimination of true protein identifications from false ones, given a set of unique and degenerate peptides identified in metaproteomics. MetaLP was developed here for protein inference in metaproteomics using an integrative linear programming method. Taxonomic abundance information extracted from metagenomics shotgun sequencing or 16S rRNA gene amplicon sequencing was incorporated as prior information in MetaLP. Benchmarking with mock, human gut, soil, and marine microbial communities demonstrated significantly higher numbers of protein identifications by MetaLP than by ProteinLP, PeptideProphet, DeepPep, PIPQ, and Sipros Ensemble. In conclusion, MetaLP could substantially improve protein inference for complex metaproteomes by incorporating taxonomic abundance information in a linear programming model.
Author summary: Inferring a reliable list of proteins from identified peptides in metaproteomics is non-trivial because of the prevalence of degenerate peptides in many metaproteome databases. Degenerate peptides are shared among multiple proteins and, therefore, cannot be uniquely attributed to any protein. Here, we developed a protein inference algorithm, MetaLP, for shotgun proteomics analysis of microbial communities to better handle degenerate peptides. Two key innovations in MetaLP were the use of taxonomic abundances as prior information and the formulation of protein inference as a linear programming problem. These features enabled MetaLP to produce substantially more protein identifications in complex metaproteomic datasets than many existing protein inference algorithms.
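To make the distinction between unique and degenerate peptides concrete, the following is a minimal sketch and not part of MetaLP itself. The function name `split_peptides`, the protein identifiers, and the peptide sequences are all hypothetical and chosen only for illustration. It inverts a protein-to-peptide map and separates peptides that match exactly one protein from those shared by several.

```python
from collections import defaultdict


def split_peptides(protein_to_peptides):
    """Split peptides into unique (one protein) and degenerate (several proteins).

    Hypothetical helper for illustration only; it is not the MetaLP algorithm.
    """
    # Invert the mapping: peptide -> set of proteins that contain it.
    peptide_to_proteins = defaultdict(set)
    for protein, peptides in protein_to_peptides.items():
        for peptide in peptides:
            peptide_to_proteins[peptide].add(protein)

    # A unique peptide maps to exactly one protein.
    unique = {
        pep: next(iter(prots))
        for pep, prots in peptide_to_proteins.items()
        if len(prots) == 1
    }
    # A degenerate peptide is shared by two or more proteins.
    degenerate = {
        pep: sorted(prots)
        for pep, prots in peptide_to_proteins.items()
        if len(prots) > 1
    }
    return unique, degenerate


# Toy database with made-up identifiers: two organisms share one peptide.
toy_db = {
    "orgA_protein1": ["LVEAGK", "TTPSYVAFTDTER"],
    "orgB_protein1": ["TTPSYVAFTDTER", "AIDLGTTNSCVAVMEGK"],
    "orgB_protein2": ["GFQEVSR"],
}

unique, degenerate = split_peptides(toy_db)
print("unique:", unique)
print("degenerate:", degenerate)
```

In this toy case the shared peptide alone cannot decide between `orgA_protein1` and `orgB_protein1`. That ambiguity is what the abstract says MetaLP addresses by using taxonomic abundances as prior information.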
Suggested Citation
Shichao Feng & Hong-Long Ji & Huan Wang & Bailu Zhang & Ryan Sterzenbach & Chongle Pan & Xuan Guo, 2022.
"MetaLP: An integrative linear programming method for protein inference in metaproteomics,"
PLOS Computational Biology, Public Library of Science, vol. 18(10), pages 1-20, October.
Handle: RePEc:plo:pcbi00:1010603
DOI: 10.1371/journal.pcbi.1010603