Printed from https://ideas.repec.org/a/plo/pone00/0028966.html

Understanding and Classifying Metabolite Space and Metabolite-Likeness

Author

Listed:
  • Julio E Peironcely
  • Theo Reijmers
  • Leon Coulier
  • Andreas Bender
  • Thomas Hankemeier

Abstract

While the entirety of ‘Chemical Space’ is huge (and assumed to contain between 10^63 and 10^200 ‘small molecules’), distinct subsets of this space can nonetheless be defined according to certain structural parameters. An example of such a subspace is the chemical space spanned by endogenous metabolites, defined as ‘naturally occurring’ products of an organism's metabolism. To understand this part of chemical space in more detail, we analyzed the chemical space populated by human metabolites in two ways. Firstly, we performed Principal Component Analysis (PCA), hierarchical clustering and scaffold analysis of metabolites and non-metabolites to determine which chemical features are characteristic of each class of compounds. Here we found that heteroatom (both oxygen and nitrogen) content, as well as the presence of particular ring systems, was able to distinguish the two groups of compounds. Secondly, we established which molecular descriptors and classifiers are capable of distinguishing metabolites from non-metabolites, by assigning a ‘metabolite-likeness’ score. It was found that the combination of MDL Public Keys and Random Forest exhibited the best overall classification performance, with an AUC value of 99.13%, a specificity of 99.84% and a selectivity of 88.79%. This performance is slightly better than that of previous classifiers; interestingly, we found that drugs occupy two distinct areas of metabolite-likeness, one being more ‘synthetic’ and the other more ‘metabolite-like’. Also, on a truly prospective dataset of 457 compounds, 95.84% correct classification was achieved. Overall, we are confident that we have contributed to the task of classifying metabolites, as well as to a better understanding of metabolite chemical space. 
This knowledge can now be used in the development of new drugs that need to resemble metabolites, and in our work particularly for assessing the metabolite-likeness of candidate molecules during metabolite identification in the metabolomics field.
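
As a rough illustration of the performance figures quoted in the abstract, the following minimal Python sketch shows how specificity, sensitivity and AUC can be computed from a list of ‘metabolite-likeness’ scores. It is not the authors' code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for this example, and the abstract's ‘selectivity’ is read here as sensitivity (true-positive rate), which is also an assumption.

```python
# Illustrative sketch only; names, data and threshold are hypothetical,
# not taken from the paper.

def confusion_counts(labels, scores, threshold=0.5):
    """Count TP, FP, TN, FN, treating score >= threshold as 'metabolite'."""
    tp = fp = tn = fn = 0
    for label, score in zip(labels, scores):
        predicted_metabolite = score >= threshold
        if label == 1 and predicted_metabolite:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted_metabolite:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    """Fraction of true metabolites that are classified as metabolites."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-metabolites that are classified as non-metabolites."""
    return tn / (tn + fp)


def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a randomly chosen metabolite scores
    higher than a randomly chosen non-metabolite (ties count as 0.5).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data (invented): 1 = metabolite, 0 = non-metabolite.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

tp, fp, tn, fn = confusion_counts(toy_labels, toy_scores)
print("sensitivity:", sensitivity(tp, fn))
print("specificity:", specificity(tn, fp))
print("AUC:", auc(toy_labels, toy_scores))
```

Unlike specificity and sensitivity, the AUC does not depend on the chosen threshold, which is one reason it is commonly reported alongside them.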

Suggested Citation

  • Julio E Peironcely & Theo Reijmers & Leon Coulier & Andreas Bender & Thomas Hankemeier, 2011. "Understanding and Classifying Metabolite Space and Metabolite-Likeness," PLOS ONE, Public Library of Science, vol. 6(12), pages 1-14, December.
  • Handle: RePEc:plo:pone00:0028966
    DOI: 10.1371/journal.pone.0028966

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0028966
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0028966&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Lê, Sébastien & Josse, Julie & Husson, François, 2008. "FactoMineR: An R Package for Multivariate Analysis," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 25(i01).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Surun, Clément & Drechsler, Martin, 2018. "Effectiveness of Tradable Permits for the Conservation of Metacommunities With Two Competing Species," Ecological Economics, Elsevier, vol. 147(C), pages 189-196.
    2. Alexander Platzer & Thomas Nussbaumer & Thomas Karonitsch & Josef S Smolen & Daniel Aletaha, 2019. "Analysis of gene expression in rheumatoid arthritis and related conditions offers insights into sex-bias, gene biotypes and co-expression patterns," PLOS ONE, Public Library of Science, vol. 14(7), pages 1-23, July.
    3. Baccar, Mariem & Raynal, Hélène & Sekhar, Muddu & Bergez, Jacques-Eric & Willaume, Magali & Casel, Pierre & Giriraj, P. & Murthy, Sanjeeva & Ruiz, Laurent, 2023. "Dynamics of crop category choices reveal strategies and tactics used by smallholder farmers in India to cope with unreliable water availability," Agricultural Systems, Elsevier, vol. 211(C).
    4. Aditi Sahu & Kivanc Kose & Lukas Kraehenbuehl & Candice Byers & Aliya Holland & Teguru Tembo & Anthony Santella & Anabel Alfonso & Madison Li & Miguel Cordova & Melissa Gill & Christi Fox & Salvador G, 2022. "In vivo tumor immune microenvironment phenotypes correlate with inflammation and vasculature to predict immunotherapy response," Nature Communications, Nature, vol. 13(1), pages 1-19, December.
    5. Roopam Shukla & Ankit Agarwal & Kamna Sachdeva & Juergen Kurths & P. K. Joshi, 2019. "Climate change perception: an analysis of climate change and risk perceptions among farmer types of Indian Western Himalayas," Climatic Change, Springer, vol. 152(1), pages 103-119, January.
    6. Cholez, Celia & Pauly, Olivier & Mahdad, Maral & Mehrabi, Sepide & Giagnocavo, Cynthia & Bijman, Jos, 2023. "Heterogeneity of inter-organizational collaborations in agrifood chain sustainability-oriented innovations," Agricultural Systems, Elsevier, vol. 212(C).
    7. Munten, Pauline & Swaen, Valérie & Vanhamme, Joëlle, 2024. "Exploring rebound effects in Access-Based services (ABS)," Journal of Business Research, Elsevier, vol. 182(C).
    8. Florence Jacquet & A Aboul-Naga & Bernard Hubert, 2020. "The contribution of ARIMNet to address livestock systems resilience in the Mediterranean region," Post-Print hal-03625860, HAL.
    9. Latifa Chaouachi & Miriam Marín-Sanz & Francisco Barro & Chahine Karmous, 2024. "Genetic diversity of durum wheat (Triticum turgidum ssp. durum) to mitigate abiotic stress: Drought, heat, and their combination," PLOS ONE, Public Library of Science, vol. 19(4), pages 1-20, April.
    10. Marika Vitali & Paolo Bosi & Elena Santacroce & Paolo Trevisi, 2021. "The multivariate approach identifies relationships between pre-slaughter factors, body lesions, ham defects and carcass traits in pigs," PLOS ONE, Public Library of Science, vol. 16(5), pages 1-14, May.
    11. Silvana Nisgoski & Joielan Xipaia dos Santos & Helena Cristina Vieira & Tawani Lorena Naide & Rafaela Stange & Washington Duarte Silva da Silva & Deivison Venicio Souza & Natally Celestino Gama & Márc, 2023. "Provenance Identification of Leaves and Nuts of Bertholletia excelsa Bonpl by Near-Infrared Spectroscopy and Color Parameters for Sustainable Extraction," Sustainability, MDPI, vol. 15(21), pages 1-15, November.
    12. repec:plo:pcbi00:1007496 is not listed on IDEAS
    13. Alessandro Bonadonna & Stefano Duglio & Luigi Bollani & Giovanni Peira, 2022. "Mountain Food Products: A Cluster Analysis Based on Young Consumers’ Perceptions," Sustainability, MDPI, vol. 14(19), pages 1-14, September.
    14. Cyrille Bassolo Baki & Joost Wellens & Farid Traoré & Sié Palé & Bakary Djaby & Apolline Bambara & Nguyen T. T. Thao & Missa Hié & Bernard Tychon, 2022. "Assessment of Hydro-Agricultural Infrastructures in Burkina Faso by Using Multiple Correspondence Analysis Approach," Sustainability, MDPI, vol. 14(20), pages 1-20, October.
    15. Gennifer Meldrum & Dunja Mijatović & Wilfredo Rojas & Juana Flores & Milton Pinto & Grover Mamani & Eleuterio Condori & David Hilaquita & Helga Gruberg & Stefano Padulosi, 2018. "Climate change and crop diversity: farmers’ perceptions and adaptation on the Bolivian Altiplano," Environment, Development and Sustainability: A Multidisciplinary Approach to the Theory and Practice of Sustainable Development, Springer, vol. 20(2), pages 703-730, April.
    16. Claire H Luby & Julie C Dawson & Irwin L Goldman, 2016. "Assessment and Accessibility of Phenotypic and Genotypic Diversity of Carrot (Daucus carota L. var. sativus) Cultivars Commercially Available in the United States," PLOS ONE, Public Library of Science, vol. 11(12), pages 1-19, December.
    17. Hugo R Oliveira & Diana Tomás & Manuela Silva & Susana Lopes & Wanda Viegas & Maria Manuela Veloso, 2016. "Genetic Diversity and Population Structure in Vicia faba L. Landraces and Wild Related Species Assessed by Nuclear SSRs," PLOS ONE, Public Library of Science, vol. 11(5), pages 1-18, May.
    18. Bottaro, Giorgia & Liagre, Ludwig & Pettenella, Davide, 2024. "The Forest Sector in EU Member States' National Recovery and Resilience Plans: a preliminary analysis," Forest Policy and Economics, Elsevier, vol. 160(C).
    19. Elio Romano & Rocco Roma & Flavio Tidona & Giorgio Giraffa & Andrea Bragaglio, 2021. "Dairy Farms and Life Cycle Assessment (LCA): The Allocation Criterion Useful to Estimate Undesirable Products," Sustainability, MDPI, vol. 13(8), pages 1-24, April.
    20. repec:plo:pone00:0224079 is not listed on IDEAS
    21. Louise Chavarie & Kimberly L Howland & Les N Harris & Michael J Hansen & William J Harford & Colin P Gallagher & Shauna M Baillie & Brendan Malley & William M Tonn & Andrew M Muir & Charles C Krueger, 2018. "From top to bottom: Do Lake Trout diversify along a depth gradient in Great Bear Lake, NT, Canada?," PLOS ONE, Public Library of Science, vol. 13(3), pages 1-28, March.
    22. Ettie M. Lipner & Joshua French & Carleton R. Bern & Katherine Walton-Day & David Knox & Michael Strong & D. Rebecca Prevots & James L. Crooks, 2020. "Nontuberculous Mycobacterial Disease and Molybdenum in Colorado Watersheds," IJERPH, MDPI, vol. 17(11), pages 1-15, May.
