Accurate prediction of synthesizability and precursors of 3D crystal structures via large language models

Authors

  • Zhilong Song (Southeast University)
  • Shuaihua Lu (Southeast University)
  • Minggang Ju (Southeast University)
  • Qionghua Zhou (Southeast University; Suzhou Laboratory)
  • Jinlan Wang (Southeast University; Suzhou Laboratory)

Abstract

Assessing the synthesizability of crystal structures is crucial for transforming theoretical materials into real-world applications. Nevertheless, there is a significant gap between actual synthesizability and the thermodynamic or kinetic stability commonly used to screen for synthesizable structures. Herein, we develop the Crystal Synthesis Large Language Models (CSLLM) framework, which utilizes three specialized LLMs to predict the synthesizability of arbitrary 3D crystal structures, possible synthetic methods, and suitable precursors, respectively. We construct a comprehensive dataset including synthesizable/non-synthesizable crystal structures and develop an efficient text representation for crystal structures to fine-tune LLMs. Our Synthesizability LLM achieves state-of-the-art accuracy (98.6%), significantly outperforming traditional synthesizability screening based on thermodynamic and kinetic stability. Its outstanding generalization ability is further demonstrated on experimental structures with complexity considerably exceeding that of the training data. Furthermore, both the Method and Precursor LLMs exceed 90% accuracy in classifying possible synthetic methods and identifying solid-state synthetic precursors for common binary and ternary compounds, respectively. Leveraging CSLLM, tens of thousands of synthesizable theoretical structures are successfully identified, with their 23 key properties predicted using accurate graph neural network models.

Suggested Citation

  • Zhilong Song & Shuaihua Lu & Minggang Ju & Qionghua Zhou & Jinlan Wang, 2025. "Accurate prediction of synthesizability and precursors of 3D crystal structures via large language models," Nature Communications, Nature, vol. 16(1), pages 1-11, December.
  • Handle: RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-61778-y
    DOI: 10.1038/s41467-025-61778-y

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-025-61778-y
    File Function: Abstract
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Zhilong Song & Linfeng Fan & Shuaihua Lu & Chongyi Ling & Qionghua Zhou & Jinlan Wang, 2025. "Inverse design of promising electrocatalysts for CO2 reduction via generative models and bird swarm algorithm," Nature Communications, Nature, vol. 16(1), pages 1-10, December.
    2. Xinyu Chen & Shuaihua Lu & Qian Chen & Qionghua Zhou & Jinlan Wang, 2024. "From bulk effective mass to 2D carrier mobility accurate prediction via adversarial transfer learning," Nature Communications, Nature, vol. 15(1), pages 1-9, December.
    3. Luis M. Antunes & Keith T. Butler & Ricardo Grau-Crespo, 2024. "Crystal structure generation with autoregressive large language modeling," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
    4. Luozhijie Jin & Zijian Du & Le Shu & Yan Cen & Yuanfeng Xu & Yongfeng Mei & Hao Zhang, 2025. "Transformer-generated atomic embeddings to enhance prediction accuracy of crystal properties with machine learning," Nature Communications, Nature, vol. 16(1), pages 1-11, December.
    5. Yilei Wu & Chang-Feng Wang & Ming-Gang Ju & Qiangqiang Jia & Qionghua Zhou & Shuaihua Lu & Xinying Gao & Yi Zhang & Jinlan Wang, 2024. "Universal machine learning aided synthesis approach of two-dimensional perovskites in a typical laboratory," Nature Communications, Nature, vol. 15(1), pages 1-10, December.
    6. Xiaoxin Zhang & Hongyuan He & Yu Chen & Guangming Yang & Xiao Xiao & Haiping Lv & Yongkang Xiang & Shuxiong Wang & Chang Jiang & Jianhui Li & Zhou Chen & Subiao Liu & Ning Yan & Xue Yong & Abdullah N., 2025. "Co-expression of multi-genes for polynary perovskite electrocatalysts for reversible solid oxide cells," Nature Communications, Nature, vol. 16(1), pages 1-14, December.
    7. Haotian Chen & Xinjie Shen & Zeqi Ye & Wenjun Feng & Haoxue Wang & Xiao Yang & Xu Yang & Weiqing Liu & Jiang Bian, 2024. "Towards Data-Centric Automatic R&D," Papers 2404.11276, arXiv.org, revised Jul 2024.
    8. Han Li & Ruotian Zhang & Yaosen Min & Dacheng Ma & Dan Zhao & Jianyang Zeng, 2023. "A knowledge-guided pre-training framework for improving molecular representation learning," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    9. Nilmani Singh & Stephan Lane & Tianhao Yu & Jingxia Lu & Adrianna Ramos & Haiyang Cui & Huimin Zhao, 2025. "A generalized platform for artificial intelligence-powered autonomous enzyme engineering," Nature Communications, Nature, vol. 16(1), pages 1-13, December.
    10. Tian Xie & Arthur France-Lanord & Yanming Wang & Jeffrey Lopez & Michael A. Stolberg & Megan Hill & Graham Michael Leverick & Rafael Gomez-Bombarelli & Jeremiah A. Johnson & Yang Shao-Horn & Jeffrey C, 2022. "Accelerating amorphous polymer electrolyte screening by learning to reduce errors in molecular dynamics simulated properties," Nature Communications, Nature, vol. 13(1), pages 1-10, December.
    11. Li, Yi & Liu, Kailong & Foley, Aoife M. & Zülke, Alana & Berecibar, Maitane & Nanini-Maury, Elise & Van Mierlo, Joeri & Hoster, Harry E., 2019. "Data-driven health estimation and lifetime prediction of lithium-ion batteries: A review," Renewable and Sustainable Energy Reviews, Elsevier, vol. 113(C), pages 1-1.
    12. O. V. Mythreyi & M. Rohith Srinivaas & Tigga Amit Kumar & R. Jayaganthan, 2021. "Machine-Learning-Based Prediction of Corrosion Behavior in Additively Manufactured Inconel 718," Data, MDPI, vol. 6(8), pages 1-16, July.
    13. Youssef El Arfaoui & Mohammed Khenfouch & Nabil Habiballah & Simone Giusepponi, 2025. "Engineering optoelectronic properties of the Pb-free perovskite FASiBr3−xIx (x = 0, 1, 2 or 3) for photovoltaic applications: first principle analysis," The European Physical Journal B: Condensed Matter and Complex Systems, Springer;EDP Sciences, vol. 98(4), pages 1-15, April.
    14. Xu Liu & Yihan Zhang & Yifan Xie & Ledu Wang & Liyu Gan & Jialei Li & Jiahe Li & Hongli Zhang & Linjiang Chen & Weiwei Shang & Jun Jiang & Gang Zou, 2025. "Design of circularly polarized phosphorescence materials guided by transfer learning," Nature Communications, Nature, vol. 16(1), pages 1-10, December.
    15. Sarmad Dashti Latif & Ali Najah Ahmed, 2023. "A review of deep learning and machine learning techniques for hydrological inflow forecasting," Environment, Development and Sustainability: A Multidisciplinary Approach to the Theory and Practice of Sustainable Development, Springer, vol. 25(11), pages 12189-12216, November.
    16. Wang, Zixuan & Chen, Zijian & Wang, Boyuan & Wu, Chuang & Zhou, Chao & Peng, Yang & Zhang, Xinyu & Ni, Zongming & Chung, Chi-yung & Chan, Ching-chuen & Yang, Jian & Zhao, Haitao, 2025. "Digital manufacturing of perovskite materials and solar cells," Applied Energy, Elsevier, vol. 377(PB).
    17. Li, Jing & Yu, Qian, 2024. "Scientists’ disciplinary characteristics and collaboration behaviour under the convergence paradigm: A multilevel network perspective," Journal of Informetrics, Elsevier, vol. 18(1).
    18. Xiaoyun Lin & Xiaowei Du & Shican Wu & Shiyu Zhen & Wei Liu & Chunlei Pei & Peng Zhang & Zhi-Jian Zhao & Jinlong Gong, 2024. "Machine learning-assisted dual-atom sites design with interpretable descriptors unifying electrocatalytic reactions," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    19. Snehi Shrestha & Kieran James Barvenik & Tianle Chen & Haochen Yang & Yang Li & Meera Muthachi Kesavan & Joshua M. Little & Hayden C. Whitley & Zi Teng & Yaguang Luo & Eleonora Tubaldi & Po-Yen Chen, 2024. "Machine intelligence accelerated design of conductive MXene aerogels with programmable properties," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    20. Oscar Méndez-Lucio & Christos A. Nicolaou & Berton Earnshaw, 2024. "MolE: a foundation model for molecular graphs using disentangled attention," Nature Communications, Nature, vol. 15(1), pages 1-9, December.
