Author
Listed:
- Wei He
- Xu Tian
- Xue Li
- Peifu Han
- Shuang Wang
- Lin Liu
- Tao Song
Abstract
Accurate prediction of molecular properties is a key component of Artificial Intelligence-driven Drug Design (AIDD). Despite significant progress in improving these predictive models, balancing accuracy with computational complexity remains a challenge. Molecular topological and geometric features provide rich spatial information, crucial for improving prediction accuracy, but their extraction typically increases model complexity. To address this, we propose TGF-M (Topology-augmented Geometric Features for Molecular Property Prediction), a novel predictive model that optimizes feature extraction to enhance information capture and improve model accuracy, and reduces model complexity to lower computational cost. This approach enhances the model’s ability to leverage both topological and geometric features without unnecessary complexity. On the re-segmented PCQM4Mv2 dataset, TGF-M performs remarkably, achieving a low mean absolute error (MAE) of 0.0647 in the HOMO-LUMO gap prediction task with only 6.4M parameters. Compared to two recent state-of-the-art models evaluated within a unified validation framework, TGF-M demonstrates comparable performance with less than one-tenth of the parameters. We conducted an in-depth analysis of TGF-M’s chemical interpretability. The results further validate the method’s effectiveness in leveraging complex molecular topology and geometry during model learning, underscoring its potential and advantages. The trained models and source code of TGF-M are publicly available at https://github.com/TiAW-Go/TGF-M.
Author summary: Predicting molecular properties is a cornerstone of drug discovery, directly influencing the development of new medicines. Current approaches often rely heavily on computationally expensive 3D structural data, posing challenges for large-scale or real-time applications.
In the context of molecular modeling, topology represents the atom-to-atom connectivity within a molecule, while geometry describes the precise spatial arrangement of these atoms. Combining these two aspects allows for a more comprehensive understanding of molecular properties, as topology captures structural relationships and geometry encodes spatial interactions. This work introduces a novel method that combines molecular geometric and topological features to enhance prediction accuracy while significantly reducing computational complexity. By bridging the gap between molecular connectivity (2D topology) and spatial arrangements (3D geometry), our approach not only offers a more efficient pathway to understanding molecular behavior but also demonstrates the potential to make advanced predictive models more accessible. This work paves the way for scalable and interpretable molecular modeling, addressing key challenges in data-driven biology and providing new tools for applications in drug design.
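The distinction between topology and geometry drawn above can be made concrete with a minimal sketch. It is not TGF-M's feature extraction, which is not described here; it only contrasts a bond-count (topological) distance with a Euclidean (geometric) distance for a single water molecule. The atom list, bond list, approximate coordinates, and all function names are illustrative assumptions.

```python
import math
from collections import deque

# Illustrative water molecule (assumed values): O-H length ~0.96 angstrom,
# H-O-H angle ~104.5 degrees. Not taken from the paper.
atoms = ["O", "H", "H"]
bonds = [(0, 1), (0, 2)]  # topology: which atoms are bonded
angle = math.radians(104.5)
coords = [  # geometry: where each atom sits in 3D space
    (0.0, 0.0, 0.0),
    (0.96, 0.0, 0.0),
    (0.96 * math.cos(angle), 0.96 * math.sin(angle), 0.0),
]


def adjacency(n, bonds):
    """Build a symmetric 0/1 adjacency matrix from a bond list."""
    adj = [[0] * n for _ in range(n)]
    for i, j in bonds:
        adj[i][j] = adj[j][i] = 1
    return adj


def hop_distances(adj):
    """Topological distance: fewest bonds between each atom pair (BFS)."""
    n = len(adj)
    dist = [[-1] * n for _ in range(n)]  # -1 marks unreachable pairs
    for start in range(n):
        dist[start][start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if adj[u][v] and dist[start][v] == -1:
                    dist[start][v] = dist[start][u] + 1
                    queue.append(v)
    return dist


def euclidean_distances(coords):
    """Geometric distance: straight-line separation of each atom pair."""
    return [[math.dist(a, b) for b in coords] for a in coords]


topology = hop_distances(adjacency(len(atoms), bonds))
geometry = euclidean_distances(coords)

# The two hydrogens are two bonds apart, yet their spatial separation
# depends on the bond angle, which the bond list alone does not encode.
print("H-H bond hops:", topology[1][2])
print("H-H distance (angstrom):", round(geometry[1][2], 3))
```

In this toy case the bond list alone cannot distinguish a bent molecule from a linear one, while the coordinates can; that is the sense in which the two views carry complementary information.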
Suggested Citation
Wei He & Xu Tian & Xue Li & Peifu Han & Shuang Wang & Lin Liu & Tao Song, 2025.
"TGF-M: Topology-augmented geometric features enhance molecular property prediction,"
PLOS Computational Biology, Public Library of Science, vol. 21(4), pages 1-22, April.
Handle:
RePEc:plo:pcbi00:1013004
DOI: 10.1371/journal.pcbi.1013004