Printed from https://ideas.repec.org/a/nat/natcom/v14y2023i1d10.1038_s41467-023-37572-z.html
Improving the generalizability of protein-ligand binding predictions with AI-Bind

Author

Listed:
  • Ayan Chatterjee

    (Northeastern University)

  • Robin Walters

    (Northeastern University)

  • Zohair Shafi

    (Northeastern University)

  • Omair Shafi Ahmed

    (Northeastern University)

  • Michael Sebek

    (Northeastern University)

  • Deisy Gysi

    (Northeastern University
    Brigham and Women’s Hospital, Harvard Medical School)

  • Rose Yu

    (University of California)

  • Tina Eliassi-Rad

    (Northeastern University
    Santa Fe Institute)

  • Albert-László Barabási

    (Northeastern University
    Central European University)

  • Giulia Menichetti

    (Northeastern University
    Brigham and Women’s Hospital, Harvard Medical School)

Abstract

Identifying novel drug-target interactions is a critical and rate-limiting step in drug discovery. While deep learning models have been proposed to accelerate the identification process, here we show that state-of-the-art models fail to generalize to novel (i.e., never-before-seen) structures. We unveil the mechanisms responsible for this shortcoming, demonstrating how models rely on shortcuts that leverage the topology of the protein-ligand bipartite network, rather than learning the node features. Here we introduce AI-Bind, a pipeline that combines network-based sampling strategies with unsupervised pre-training to improve binding predictions for novel proteins and ligands. We validate AI-Bind predictions via docking simulations and comparison with recent experimental evidence, and step up the process of interpreting machine learning prediction of protein-ligand binding by identifying potential active binding sites on the amino acid sequence. AI-Bind is a high-throughput approach to identify drug-target combinations with the potential of becoming a powerful tool in drug discovery.
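
To make the "shortcut" described in the abstract concrete, the following is a minimal sketch of a predictor that uses only the topology of the protein-ligand bipartite network and ignores node features entirely. It is not the AI-Bind pipeline or any model from the paper; the function names, the averaging rule, and the `prior` parameter are assumptions chosen for illustration. Each node is scored by the fraction of its training annotations that are positive, and a pair is scored by averaging the two fractions.

```python
from collections import defaultdict


def fit_annotation_ratios(train_pairs):
    """Compute, per node, the fraction of its training pairs labeled as binding.

    train_pairs: iterable of (protein_id, ligand_id, label), label 1 = binds, 0 = does not.
    Returns (protein_ratio, ligand_ratio), each a dict of node id -> positive fraction.
    Hypothetical helper for illustration; not taken from the paper.
    """
    protein_counts = defaultdict(lambda: [0, 0])  # [negatives, positives]
    ligand_counts = defaultdict(lambda: [0, 0])
    for protein, ligand, label in train_pairs:
        protein_counts[protein][label] += 1
        ligand_counts[ligand][label] += 1

    def to_ratio(counts):
        return {node: pos / (neg + pos) for node, (neg, pos) in counts.items()}

    return to_ratio(protein_counts), to_ratio(ligand_counts)


def topology_only_score(protein, ligand, protein_ratio, ligand_ratio, prior=0.5):
    """Score a pair from annotation ratios alone (assumed rule: simple average).

    Nodes absent from training have no network information, so they fall
    back to the uninformative prior.
    """
    p = protein_ratio.get(protein, prior)
    l = ligand_ratio.get(ligand, prior)
    return (p + l) / 2
```

On pairs whose nodes appear in training, such a score can look informative even though no chemistry or sequence is used. For a never-before-seen protein and ligand it returns only the prior, which mirrors the generalization failure the abstract attributes to topology-driven shortcuts.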

Suggested Citation

  • Ayan Chatterjee & Robin Walters & Zohair Shafi & Omair Shafi Ahmed & Michael Sebek & Deisy Gysi & Rose Yu & Tina Eliassi-Rad & Albert-László Barabási & Giulia Menichetti, 2023. "Improving the generalizability of protein-ligand binding predictions with AI-Bind," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
  • Handle: RePEc:nat:natcom:v:14:y:2023:i:1:d:10.1038_s41467-023-37572-z
    DOI: 10.1038/s41467-023-37572-z

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-023-37572-z
    File Function: Abstract
    Download Restriction: no
