Statistical prediction of microbial metabolic traits from genomes

Authors

  • Zeqian Li
  • Ahmed Selim
  • Seppe Kuehn

Abstract

The metabolic activity of microbial communities is central to their role in biogeochemical cycles, human health, and biotechnology. Despite the abundance of sequencing data characterizing these consortia, it remains a serious challenge to predict microbial metabolic traits from sequencing data alone. Here we culture 96 bacterial isolates individually and assay their ability to grow on 10 distinct compounds as a sole carbon source. Using these data as well as two existing datasets, we show that statistical approaches can accurately predict bacterial carbon utilization traits from genomes. First, we show that classifiers trained on gene content can accurately predict bacterial carbon utilization phenotypes by encoding phylogenetic information. These models substantially outperform predictions made by constraint-based metabolic models automatically constructed from genomes. This result solidifies our current knowledge about the strong connection between phylogeny and metabolic traits. However, phylogeny-based predictions fail to predict traits for taxa that are phylogenetically distant from any strains in the training set. To overcome this we train improved models on gene presence/absence to predict carbon utilization traits from gene content. We show that models that predict carbon utilization traits from gene presence/absence can generalize to taxa that are phylogenetically distant from the training set either by exploiting biochemical information for feature selection or by having sufficiently large datasets. In the latter case, we provide evidence that a statistical approach can identify putatively mechanistic genes involved in metabolic traits. Our study demonstrates the potential power for predicting microbial phenotypes from genotypes using statistical approaches.

Author summary

The metabolic activity of microbes is essential to sustaining life on Earth, biotechnological processes, and host fitness. As a result, the metabolic traits of microbes have been a focus of microbiology and microbial ecology for centuries, historically relying on painstaking laboratory experiments. Sequencing technologies have given us an unprecedented look at microbial genomes, but connecting genomes to specific traits in non-model bacteria remained a huge challenge.
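The contrast the abstract draws between in-clade and out-of-clade prediction can be illustrated with a small, self-contained sketch. It is not the authors' pipeline: the classifier is a plain nearest-neighbour rule on Hamming distance, used here only as a stand-in for "a classifier trained on gene content", and every strain, clade, gene vector, and growth label is invented. The toy data are deliberately built so that each clade's closest relative has the opposite phenotype, and gene 0 is assumed to be a hypothetical catabolic gene that a biochemistry-informed feature selection would pick.

```python
from collections import defaultdict

# Toy records: (strain, clade, gene presence/absence vector, grows on the carbon source).
# Gene 0 is a hypothetical catabolic gene, genes 1-4 act as clade markers,
# and gene 5 varies between strains. All values are invented for illustration.
STRAINS = [
    ("A1", "A", (1, 1, 1, 0, 0, 0), True),
    ("A2", "A", (1, 1, 1, 0, 0, 1), True),
    ("B1", "B", (0, 0, 0, 1, 1, 0), False),
    ("B2", "B", (0, 0, 0, 1, 1, 1), False),
    ("C1", "C", (1, 0, 1, 1, 1, 0), True),
    ("C2", "C", (1, 0, 1, 1, 1, 1), True),
    ("D1", "D", (0, 1, 1, 1, 0, 0), False),
    ("D2", "D", (0, 1, 1, 1, 0, 1), False),
]


def hamming(a, b, features):
    """Count the selected genes whose presence/absence differs between two genomes."""
    return sum(a[i] != b[i] for i in features)


def predict(train, genes, features):
    """Copy the phenotype of the training genome with the most similar gene content."""
    nearest = min(train, key=lambda rec: hamming(rec[2], genes, features))
    return nearest[3]


def accuracy(records, group_of, features):
    """Hold out one group at a time and score predictions on the held-out records."""
    groups = defaultdict(list)
    for rec in records:
        groups[group_of(rec)].append(rec)
    correct = 0
    for held_out, test in groups.items():
        train = [r for r in records if group_of(r) != held_out]
        correct += sum(predict(train, r[2], features) == r[3] for r in test)
    return correct / len(records)


ALL_GENES = range(6)
in_clade = accuracy(STRAINS, lambda r: r[0], ALL_GENES)      # leave one strain out
out_of_clade = accuracy(STRAINS, lambda r: r[1], ALL_GENES)  # leave one clade out
selected = accuracy(STRAINS, lambda r: r[1], [0])            # clade held out, gene 0 only
print(in_clade, out_of_clade, selected)  # prints: 1.0 0.0 1.0
```

With all genes, the rule succeeds whenever a close relative is in the training set and fails on every held-out clade, because similarity is dominated by the clade markers. Restricting the features to the single assumed catabolic gene restores accuracy on held-out clades, which mirrors the feature-selection route to generalization described in the abstract. The extreme scores are a property of this contrived data set, not a result from the paper.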

Suggested Citation

  • Zeqian Li & Ahmed Selim & Seppe Kuehn, 2023. "Statistical prediction of microbial metabolic traits from genomes," PLOS Computational Biology, Public Library of Science, vol. 19(12), pages 1-35, December.
  • Handle: RePEc:plo:pcbi00:1011705
    DOI: 10.1371/journal.pcbi.1011705

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1011705
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1011705&type=printable
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Lu Wu & Xu-Wen Wang & Zining Tao & Tong Wang & Wenlong Zuo & Yu Zeng & Yang-Yu Liu & Lei Dai, 2024. "Data-driven prediction of colonization outcomes for complex microbial communities," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    2. Guy Amit & Amir Bashan, 2023. "Top-down identification of keystone taxa in the microbiome," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    3. Ricardo Fariello & Marcus A.M. de Aguiar, 2024. "Third order interactions shift the critical coupling in multidimensional Kuramoto models," Chaos, Solitons & Fractals, Elsevier, vol. 187(C).
