
Bioactivity assessment of natural compounds using machine learning models trained on target similarity between drugs

Author

Listed:
  • Vinita Periwal
  • Stefan Bassler
  • Sergej Andrejev
  • Natalia Gabrielli
  • Kaustubh Raosaheb Patil
  • Athanasios Typas
  • Kiran Raosaheb Patil

Abstract

Natural compounds constitute a rich resource of potential small-molecule therapeutics. While experimental access to this resource is limited by its vast diversity and by difficulties in systematic purification, computational assessment of structural similarity with known therapeutic molecules offers a scalable approach. Here, we assessed functional similarity between natural compounds and approved drugs by combining multiple chemical similarity metrics and physicochemical properties using a machine-learning approach. We computed pairwise similarities between 1410 drugs for training classification models and used the drugs' shared protein targets as class labels. The best-performing models were random forests, which gave an average area under the ROC curve of 0.9, a Matthews correlation coefficient of 0.35, and an F1 score of 0.33, suggesting that they captured the structure-activity relation well. The models were then used to predict protein targets of circa 11k natural compounds by comparing them with the drugs. This revealed the therapeutic potential of several natural compounds, including those with support from previously published sources as well as those hitherto unexplored. We experimentally validated the activity of one predicted pair, viz., Cox-1 inhibition by 5-methoxysalicylic acid, a molecule commonly found in tea, herbs and spices. In contrast, another natural compound, 4-isopropylbenzoic acid, which had the highest similarity score under the most heavily weighted similarity metric but was not picked by our models, did not inhibit Cox-1. Our results demonstrate the utility of a machine-learning approach combining multiple chemical features for uncovering the protein-binding potential of natural compounds.

Author summary

A large fraction of small-molecule drugs originated from natural compounds, making them an attractive resource in the search for potential lead compounds. Yet this resource has not been extensively explored because of the vast number of natural compounds and the technical barriers to obtaining them in pure form. Computational approaches can expedite exploration of natural compounds and their derivatives at a much larger scale. To this end, we took advantage of the known protein targets of drugs to mine natural compounds for similarity to known small-molecule drugs. The underlying hypothesis is that two compounds binding to the same protein target are similar from a bioactivity viewpoint. To identify high-dimensional structural features of the compounds underlying their bioactivity, we computed various structural features of paired drugs (i.e., drugs sharing a common protein target) and used these to train machine-learning classifiers. The trained classification models were then used to predict similarity between drugs and natural compounds. We assessed the resulting predictions (protein target binding by natural compounds) through an extensive literature survey and experimentally validated a novel prediction. Together, our results outline a workflow and provide a resource for exploring the therapeutic potential of natural compounds.
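The pair-labelling idea described above (drug pairs as samples, a shared protein target as the positive class) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy `drugs` dictionary, the fingerprint bit sets, the target names, and the helper functions are hypothetical, and a single Tanimoto threshold stands in for the random forest over multiple similarity metrics and physicochemical properties that the article reports. It is not the authors' pipeline.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy data: each drug has a set of "on" fingerprint bits
# and a set of known protein targets (names are illustrative only).
drugs = {
    "drug_a": {"bits": {1, 2, 3, 4}, "targets": {"COX1"}},
    "drug_b": {"bits": {2, 3, 4, 5}, "targets": {"COX1", "COX2"}},
    "drug_c": {"bits": {7, 8, 9}, "targets": {"HRH1"}},
    "drug_d": {"bits": {1, 8, 9, 10}, "targets": {"HRH1"}},
}


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def build_pairs(drugs):
    """Return (drug_x, drug_y, similarity, label) for every unordered pair.

    The label is 1 when the two drugs share at least one protein target.
    """
    rows = []
    for x, y in combinations(sorted(drugs), 2):
        sim = tanimoto(drugs[x]["bits"], drugs[y]["bits"])
        label = int(bool(drugs[x]["targets"] & drugs[y]["targets"]))
        rows.append((x, y, sim, label))
    return rows


def confusion(rows, threshold):
    """Count (tp, fp, tn, fn) when predicting label 1 for sim >= threshold."""
    tp = fp = tn = fn = 0
    for _, _, sim, label in rows:
        pred = int(sim >= threshold)
        if pred and label:
            tp += 1
        elif pred and not label:
            fp += 1
        elif not pred and not label:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def f1(tp, fp, fn):
    """F1 score from confusion-matrix counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


rows = build_pairs(drugs)
tp, fp, tn, fn = confusion(rows, threshold=0.5)  # threshold is arbitrary
print(rows)
print("MCC:", mcc(tp, fp, tn, fn), "F1:", f1(tp, fp, fn))
```

In a fuller version, each pair would carry a vector of several similarity scores and property differences rather than a single number, and a trained classifier would replace the fixed threshold; the same pair-scoring step could then be applied to drug and natural-compound pairs.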

Suggested Citation

  • Vinita Periwal & Stefan Bassler & Sergej Andrejev & Natalia Gabrielli & Kaustubh Raosaheb Patil & Athanasios Typas & Kiran Raosaheb Patil, 2022. "Bioactivity assessment of natural compounds using machine learning models trained on target similarity between drugs," PLOS Computational Biology, Public Library of Science, vol. 18(4), pages 1-21, April.
  • Handle: RePEc:plo:pcbi00:1010029
    DOI: 10.1371/journal.pcbi.1010029

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1010029
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1010029&type=printable
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Anna Cichonska & Balaguru Ravikumar & Elina Parri & Sanna Timonen & Tapio Pahikkala & Antti Airola & Krister Wennerberg & Juho Rousu & Tero Aittokallio, 2017. "Computational-experimental approach to drug-target interaction mapping: A case study on kinase inhibitors," PLOS Computational Biology, Public Library of Science, vol. 13(8), pages 1-28, August.
    2. Alina Bărbulescu & Lucica Barbeș & Cristian Ștefan Dumitriu, 2022. "Computer-Aided Methods for Molecular Classification," Mathematics, MDPI, vol. 10(9), pages 1-19, May.
    3. Jenna E. Leeuwen & Wail Ba-Alawi & Emily Branchard & Jennifer Cruickshank & Wiebke Schormann & Joseph Longo & Jennifer Silvester & Peter L. Gross & David W. Andrews & David W. Cescon & Benjamin Haibe-, 2022. "Computational pharmacogenomic screen identifies drugs that potentiate the anti-breast cancer activity of statins," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    4. Jiangyong Gu & Yuanshen Gui & Lirong Chen & Gu Yuan & Hui-Zhe Lu & Xiaojie Xu, 2013. "Use of Natural Products as Chemical Library for Drug Discovery and Network Pharmacology," PLOS ONE, Public Library of Science, vol. 8(4), pages 1-10, April.
    5. Jiadong Hu & Shi Qiu & Feiyan Wang & Qing Li & Chun-Lei Xiang & Peng Di & Ziding Wu & Rui Jiang & Jinxing Li & Zhen Zeng & Jing Wang & Xingxing Wang & Yuchen Zhang & Shiyuan Fang & Yuqi Qiao & Jie Din, 2023. "Functional divergence of CYP76AKs shapes the chemodiversity of abietane-type diterpenoids in genus Salvia," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    6. Aleksandr Ianevski & Kristen Nader & Kyriaki Driva & Wojciech Senkowski & Daria Bulanova & Lidia Moyano-Galceran & Tanja Ruokoranta & Heikki Kuusanmäki & Nemo Ikonen & Philipp Sergeev & Markus Vähä-Ko, 2024. "Single-cell transcriptomes identify patient-tailored therapies for selective co-inhibition of cancer clones," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
