
Rapid Prediction of Bacterial Heterotrophic Fluxomics Using Machine Learning and Constraint Programming

Author

Listed:
  • Stephen Gang Wu
  • Yuxuan Wang
  • Wu Jiang
  • Tolutola Oyetunde
  • Ruilian Yao
  • Xuehong Zhang
  • Kazuyuki Shimizu
  • Yinjie J Tang
  • Forrest Sheng Bao

Abstract

13C metabolic flux analysis (13C-MFA) has been widely used to measure in vivo enzyme reaction rates (i.e., metabolic flux) in microorganisms. Mining the relationship between environmental and genetic factors and metabolic fluxes hidden in existing fluxomic data will lead to predictive models that can significantly accelerate flux quantification. In this paper, we present a web-based platform, MFlux (http://mflux.org), that predicts bacterial central metabolism via machine learning, leveraging data from approximately 100 13C-MFA papers on heterotrophic bacterial metabolisms. Three machine learning methods, namely Support Vector Machine (SVM), k-Nearest Neighbors (k-NN), and Decision Tree, were employed to study the sophisticated relationship between influential factors and metabolic fluxes. We performed a grid search for the best parameter set for each algorithm and verified their performance through 10-fold cross-validation. SVM yields the highest accuracy of the three algorithms. Further, we employed quadratic programming to adjust flux profiles to satisfy stoichiometric constraints. Multiple case studies have shown that MFlux can reasonably predict fluxomes as a function of bacterial species, substrate types, growth rate, oxygen conditions, and cultivation methods. Because published studies concentrate on model organisms grown on particular carbon sources, the fluxome data in the dataset are biased, which may limit the applicability of the machine learning models. This problem can be resolved as more 13C-MFA papers are published for non-model species.

Author Summary: Metabolic information is important for disease treatment, bioprocess optimization, environmental remediation, biogeochemical cycle regulation, and our understanding of life’s origin and evolution. 13C-MFA can quantify microbial physiology at the level of metabolic reaction rates. To speed up microbial characterizations and fluxomic studies, we hypothesize that genetic and environmental factors generate specific fluxome patterns that can be recognized by machine learning. Aided by constraint programming and quadratic optimization, our platform based on machine learning (ML) can predict meaningful metabolic information about bacterial species in their environments. Further, it can offer constraints to improve the accuracy of flux balance analysis. This study suggests that the bacterial metabolic network has a certain degree of rigidity in allocating carbon fluxes, and that different microbial species may share common regulatory strategies for balancing carbon and energy metabolisms. As a proof of concept, we demonstrate that data-driven artificial intelligence (AI) approaches, e.g., ML, may assist mechanism-based models in elucidating the topology of microbial fluxomes.
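
The abstract mentions using quadratic programming to adjust predicted flux profiles so that they satisfy stoichiometric constraints. The following is a minimal sketch of that idea, not the MFlux implementation: it assumes the simplest possible case, in which the only constraint is the steady-state mass balance S v = 0 and the objective is to move the predicted fluxes as little as possible in the least-squares sense. Under those assumptions the quadratic program has the closed-form solution v = v0 - Sᵀ(S Sᵀ)⁻¹ S v0. The four-reaction toy network, the predicted values, and the function names `solve` and `balance_fluxes` are all hypothetical and chosen only for illustration; a realistic formulation would likely also include flux bounds and weights, which require a general QP solver.

```python
# Sketch: project ML-predicted fluxes onto the steady-state subspace S v = 0.
# Assumption: equality constraints only, unweighted least-squares objective.

def mat_vec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    aug = [[float(a) for a in row] + [float(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: constraint rows are dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def balance_fluxes(S, v_pred):
    """Return the flux vector closest to v_pred that satisfies S v = 0."""
    residual = mat_vec(S, v_pred)                      # mass-balance violation
    SSt = [[sum(a * b for a, b in zip(r1, r2)) for r2 in S] for r1 in S]
    lam = solve(SSt, residual)                         # Lagrange multipliers
    S_t = [list(col) for col in zip(*S)]               # transpose of S
    correction = mat_vec(S_t, lam)
    return [v - c for v, c in zip(v_pred, correction)]


# Hypothetical toy network (rows: metabolites A, B; columns: reactions v1..v4)
#   v1: -> A,  v2: A -> B,  v3: A -> (out),  v4: B -> (out)
S = [
    [1, -1, -1, 0],
    [0, 1, 0, -1],
]
V_PRED = [100.0, 60.0, 45.0, 55.0]   # made-up "predicted" fluxes, unbalanced

adjusted = balance_fluxes(S, V_PRED)
print("adjusted fluxes:", adjusted)                    # approximately [101, 57, 44, 57]
print("mass-balance residual:", mat_vec(S, adjusted))  # approximately [0, 0]
```

In this toy case the predicted profile violates both balances (A is over-consumed by 5 units, B is over-produced by 5 units), and the projection spreads the correction across all four reactions rather than forcing it onto a single flux.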

Suggested Citation

  • Stephen Gang Wu & Yuxuan Wang & Wu Jiang & Tolutola Oyetunde & Ruilian Yao & Xuehong Zhang & Kazuyuki Shimizu & Yinjie J Tang & Forrest Sheng Bao, 2016. "Rapid Prediction of Bacterial Heterotrophic Fluxomics Using Machine Learning and Constraint Programming," PLOS Computational Biology, Public Library of Science, vol. 12(4), pages 1-22, April.
  • Handle: RePEc:plo:pcbi00:1004838
    DOI: 10.1371/journal.pcbi.1004838

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004838
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1004838&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Adi L Tarca & Vincent J Carey & Xue-wen Chen & Roberto Romero & Sorin Drăghici, 2007. "Machine Learning and Its Applications to Biology," PLOS Computational Biology, Public Library of Science, vol. 3(6), pages 1-11, June.

    Citations



    Cited by:

    1. Tolutola Oyetunde & Di Liu & Hector Garcia Martin & Yinjie J Tang, 2019. "Machine learning framework for assessment of microbial factory performance," PLOS ONE, Public Library of Science, vol. 14(1), pages 1-15, January.
    2. Guido Zampieri & Supreeta Vijayakumar & Elisabeth Yaneske & Claudio Angione, 2019. "Machine and deep learning meet genome-scale metabolic modeling," PLOS Computational Biology, Public Library of Science, vol. 15(7), pages 1-24, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Früh, Linus & Kampen, Helge & Kerkow, Antje & Schaub, Günter A. & Walther, Doreen & Wieland, Ralf, 2018. "Modelling the potential distribution of an invasive mosquito species: comparative evaluation of four machine learning methods and their combinations," Ecological Modelling, Elsevier, vol. 388(C), pages 136-144.
    2. Asa Ben-Hur & Cheng Soon Ong & Sören Sonnenburg & Bernhard Schölkopf & Gunnar Rätsch, 2008. "Support Vector Machines and Kernels for Computational Biology," PLOS Computational Biology, Public Library of Science, vol. 4(10), pages 1-10, October.
    3. Wang, Jia & Hu, Jun & Shen, Shifei & Zhuang, Jun & Ni, Shunjiang, 2020. "Crime risk analysis through big data algorithm with urban metrics," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 545(C).
    4. Lior Shamir & John D Delaney & Nikita Orlov & D Mark Eckley & Ilya G Goldberg, 2010. "Pattern Recognition Software and Techniques for Biological Image Analysis," PLOS Computational Biology, Public Library of Science, vol. 6(11), pages 1-10, November.
    5. Joana Rosado Coelho & João André Carriço & Daniel Knight & Jose-Luis Martínez & Ian Morrissey & Marco Rinaldo Oggioni & Ana Teresa Freitas, 2013. "The Use of Machine Learning Methodologies to Analyse Antibiotic and Biocide Susceptibility in Staphylococcus aureus," PLOS ONE, Public Library of Science, vol. 8(2), pages 1-10, February.
    6. Shun Adachi, 2017. "Rigid geometry solves “curse of dimensionality” effects in clustering methods: An application to omics data," PLOS ONE, Public Library of Science, vol. 12(6), pages 1-20, June.
    7. Parag Parashar & Chun Han Chen & Chandni Akbar & Sze Ming Fu & Tejender S Rawat & Sparsh Pratik & Rajat Butola & Shih Han Chen & Albert S Lin, 2019. "Analytics-statistics mixed training and its fitness to semisupervised manufacturing," PLOS ONE, Public Library of Science, vol. 14(8), pages 1-18, August.
    8. Ribeiro, Haroldo V. & Lopes, Diego D. & Pessa, Arthur A.B. & Martins, Alvaro F. & da Cunha, Bruno R. & Gonçalves, Sebastián & Lenzi, Ervin K. & Hanley, Quentin S. & Perc, Matjaž, 2023. "Deep learning criminal networks," Chaos, Solitons & Fractals, Elsevier, vol. 172(C).
    9. Dolores Wolfram & Ravi Starzl & Hubert Hackl & Derek Barclay & Theresa Hautz & Bettina Zelger & Gerald Brandacher & W P Andrew Lee & Nadine Eberhart & Yoram Vodovotz & Johann Pratschke & Gerhard Piere, 2014. "Insights from Computational Modeling in Inflammation and Acute Rejection in Limb Transplantation," PLOS ONE, Public Library of Science, vol. 9(6), pages 1-11, June.
    10. Lyaqini, S. & Nachaoui, M. & Hadri, A., 2022. "An efficient primal-dual method for solving non-smooth machine learning problem," Chaos, Solitons & Fractals, Elsevier, vol. 155(C).
    11. Malka N. Halgamuge, 2020. "Supervised Machine Learning Algorithms for Bioelectromagnetics: Prediction Models and Feature Selection Techniques Using Data from Weak Radiofrequency Radiation Effect on Human and Animals Cells," IJERPH, MDPI, vol. 17(12), pages 1-27, June.
    12. Dennis Pischel & Jörn H Buchbinder & Kai Sundmacher & Inna N Lavrik & Robert J Flassig, 2018. "A guide to automated apoptosis detection: How to make sense of imaging flow cytometry data," PLOS ONE, Public Library of Science, vol. 13(5), pages 1-17, May.
    13. Willcock, Simon & Martínez-López, Javier & Hooftman, Danny A.P. & Bagstad, Kenneth J. & Balbi, Stefano & Marzo, Alessia & Prato, Carlo & Sciandrello, Saverio & Signorello, Giovanni & Voigt, Brian & , 2018. "Machine learning for ecosystem services," Ecosystem Services, Elsevier, vol. 33(PB), pages 165-174.
    14. Takaya Saito & Marc Rehmsmeier, 2015. "The Precision-Recall Plot Is More Informative than the ROC Plot When Evaluating Binary Classifiers on Imbalanced Datasets," PLOS ONE, Public Library of Science, vol. 10(3), pages 1-21, March.
    15. Guido Zampieri & Supreeta Vijayakumar & Elisabeth Yaneske & Claudio Angione, 2019. "Machine and deep learning meet genome-scale metabolic modeling," PLOS Computational Biology, Public Library of Science, vol. 15(7), pages 1-24, July.
    16. Bahareh Torkzaban & Amir Hossein Kayvanjoo & Arman Ardalan & Soraya Mousavi & Roberto Mariotti & Luciana Baldoni & Esmaeil Ebrahimie & Mansour Ebrahimi & Mehdi Hosseini-Mazinani, 2015. "Machine Learning Based Classification of Microsatellite Variation: An Effective Approach for Phylogeographic Characterization of Olive Populations," PLOS ONE, Public Library of Science, vol. 10(11), pages 1-17, November.
