Printed from https://ideas.repec.org/a/nat/natcom/v16y2025i1d10.1038_s41467-025-57148-3.html

Rapid and accurate prediction of protein homo-oligomer symmetry using Seq2Symm

Author

Listed:
  • Meghana Kshirsagar

    (Microsoft Corporation)

  • Artur Meller

    (Washington University in St. Louis)

  • Ian R. Humphreys

    (University of Washington)

  • Samuel Sledzieski

    (Microsoft Corporation
    Massachusetts Institute of Technology)

  • Yixi Xu

    (Microsoft Corporation)

  • Rahul Dodhia

    (Microsoft Corporation)

  • Eric Horvitz

    (Microsoft Corp
    Stanford)

  • Bonnie Berger

    (Massachusetts Institute of Technology)

  • Gregory R. Bowman

    (University of Pennsylvania)

  • Juan Lavista Ferres

    (Microsoft Corporation)

  • David Baker

    (University of Washington)

  • Minkyung Baek

    (Seoul National University)

Abstract

The majority of proteins must form higher-order assemblies to perform their biological functions, yet few machine learning models can accurately and rapidly predict the symmetry of assemblies involving multiple copies of the same protein chain. Here, we address this gap by fine-tuning several classes of protein foundation models to predict homo-oligomer symmetry. Our best model, Seq2Symm, which uses ESM2, outperforms existing template-based and deep learning methods, achieving an average AUC-PR of 0.47, 0.44 and 0.49 across homo-oligomer symmetries on three held-out test sets, compared to 0.24, 0.24 and 0.25 with template-based search. Seq2Symm takes a single sequence as input and can predict at a rate of ~80,000 proteins/hour. We apply this method to 5 proteomes and ~3.5 million unlabeled protein sequences, showing its promise for use in conjunction with downstream, computationally intensive all-atom structure generation methods such as RoseTTAFold2 and AlphaFold2-multimer. Code, datasets, and the model are available at: https://github.com/microsoft/seq2symm .
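
For readers unfamiliar with the metric, the sketch below shows one common way an "average AUC-PR across symmetries" can be computed: average precision (a step-wise estimate of the area under the precision-recall curve) is calculated per symmetry class and then averaged with equal weight. This is an illustrative assumption, not the authors' evaluation code; the function names, the class labels C2 and C3, and the toy scores are all hypothetical, and tied scores are not handled specially.

```python
from typing import Dict, Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise AUC-PR estimate: mean of precision at the rank of each positive."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    # Rank examples from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


def macro_auc_pr(
    labels_by_class: Dict[str, Sequence[int]],
    scores_by_class: Dict[str, Sequence[float]],
) -> float:
    """Unweighted mean of per-class average precision."""
    per_class = [
        average_precision(labels_by_class[name], scores_by_class[name])
        for name in labels_by_class
    ]
    return sum(per_class) / len(per_class)


# Toy data (hypothetical): binary ground truth and predicted scores per class.
toy_labels = {"C2": [1, 0, 1, 0], "C3": [0, 1]}
toy_scores = {"C2": [0.9, 0.8, 0.7, 0.1], "C3": [0.6, 0.4]}

if __name__ == "__main__":
    print(round(macro_auc_pr(toy_labels, toy_scores), 4))
```

Macro-averaging gives rare symmetry classes the same weight as common ones, which is one reason a per-class average can look low even when overall accuracy is high.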

Suggested Citation

  • Meghana Kshirsagar & Artur Meller & Ian R. Humphreys & Samuel Sledzieski & Yixi Xu & Rahul Dodhia & Eric Horvitz & Bonnie Berger & Gregory R. Bowman & Juan Lavista Ferres & David Baker & Minkyung Baek, 2025. "Rapid and accurate prediction of protein homo-oligomer symmetry using Seq2Symm," Nature Communications, Nature, vol. 16(1), pages 1-11, December.
  • Handle: RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-57148-3
    DOI: 10.1038/s41467-025-57148-3

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-025-57148-3
    File Function: Abstract
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Arne Matthys & Jan Felix & Joao Paulo Portela Catani & Kenny Roose & Wim Nerinckx & Benthe Buyten & Daria Fijalkowska & Nico Callewaert & Savvas N. Savvides & Xavier Saelens, 2025. "Single-domain antibodies directed against hemagglutinin and neuraminidase protect against influenza B viruses," Nature Communications, Nature, vol. 16(1), pages 1-19, December.
    2. Daniel R. Fox & Kazem Asadollahi & Imogen Samuels & Bradley A. Spicer & Ashleigh Kropp & Christopher J. Lupton & Kevin Lim & Chunxiao Wang & Hari Venugopal & Marija Dramicanin & Gavin J. Knott & Rhys , 2025. "Inhibiting heme piracy by pathogenic Escherichia coli using de novo-designed proteins," Nature Communications, Nature, vol. 16(1), pages 1-15, December.
    3. Yash Chainani & Jacob Diaz & Margaret Guilarte-Silva & Vincent Blay & Quan Zhang & William Sprague & Keith E. J. Tyo & Linda J. Broadbelt & Aindrila Mukhopadhyay & Jay D. Keasling & Hector Garcia Mart, 2025. "Merging the computational design of chimeric type I polyketide synthases with enzymatic pathways for chemical biosynthesis," Nature Communications, Nature, vol. 16(1), pages 1-17, December.
    4. Aika Iwama & Ryoji Kise & Hiroaki Akasaka & Fumiya K. Sano & Hidetaka S. Oshima & Asuka Inoue & Wataru Shihoya & Osamu Nureki, 2024. "Structure and dynamics of the pyroglutamylated RF-amide peptide QRFP receptor GPR103," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    5. Wei Lu & Jixian Zhang & Weifeng Huang & Ziqiao Zhang & Xiangyu Jia & Zhenyu Wang & Leilei Shi & Chengtao Li & Peter G. Wolynes & Shuangjia Zheng, 2024. "DynamicBind: predicting ligand-specific protein-ligand complex structure with a deep equivariant generative model," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    6. Laura Shub & Wenjin Liu & Georgios Skiniotis & Michael J. Keiser & Michael J. Robertson, 2025. "MIC: A deep learning tool for assigning ions and waters in cryo-EM and crystal structures," Nature Communications, Nature, vol. 16(1), pages 1-14, December.
    7. Chase R. Freschlin & Sarah A. Fahlberg & Pete Heinzelman & Philip A. Romero, 2024. "Neural network extrapolation to distant regions of the protein fitness landscape," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    8. Isak S. Pretorius & Thomas A. Dixon & Michael Boers & Ian T. Paulsen & Daniel L. Johnson, 2025. "The coming wave of confluent biosynthetic, bioinformational and bioengineering technologies," Nature Communications, Nature, vol. 16(1), pages 1-8, December.
    9. David Moi & Shunsuke Nishio & Xiaohui Li & Clari Valansi & Mauricio Langleib & Nicolas G. Brukman & Kateryna Flyak & Christophe Dessimoz & Daniele de Sanctis & Kathryn Tunyasuvunakool & John Jumper & , 2022. "Discovery of archaeal fusexins homologous to eukaryotic HAP2/GCS1 gamete fusion proteins," Nature Communications, Nature, vol. 13(1), pages 1-18, December.
    10. Enrico Orsi & Lennart Schada von Borzyskowski & Stephan Noack & Pablo I. Nikel & Steffen N. Lindner, 2024. "Automated in vivo enzyme engineering accelerates biocatalyst optimization," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    11. Wenjun Zheng, 2024. "Predicting hotspots for disease-causing single nucleotide variants using sequences-based coevolution, network analysis, and machine learning," PLOS ONE, Public Library of Science, vol. 19(5), pages 1-21, May.
    12. Lucien F. Krapp & Fernando A. Meireles & Luciano A. Abriata & Jean Devillard & Sarah Vacle & Maria J. Marcaida & Matteo Dal Peraro, 2024. "Context-aware geometric deep learning for protein sequence design," Nature Communications, Nature, vol. 15(1), pages 1-10, December.
    13. Jeffrey A. Ruffolo & Lee-Shin Chu & Sai Pooja Mahajan & Jeffrey J. Gray, 2023. "Fast, accurate antibody structure prediction from deep learning on massive set of natural antibodies," Nature Communications, Nature, vol. 14(1), pages 1-13, December.
    14. Ivan Koludarov & Tobias Senoner & Timothy N. W. Jackson & Daniel Dashevsky & Michael Heinzinger & Steven D. Aird & Burkhard Rost, 2023. "Domain loss enabled evolution of novel functions in the snake three-finger toxin gene superfamily," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    15. Simon d’Oelsnitz & Daniel J. Diaz & Wantae Kim & Daniel J. Acosta & Tyler L. Dangerfield & Mason W. Schechter & Matthew B. Minus & James R. Howard & Hannah Do & James M. Loy & Hal S. Alper & Y. Jessie, 2024. "Biosensor and machine learning-aided engineering of an amaryllidaceae enzyme," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    16. Julia Koehler Leman & Pawel Szczerbiak & P. Douglas Renfrew & Vladimir Gligorijevic & Daniel Berenberg & Tommi Vatanen & Bryn C. Taylor & Chris Chandler & Stefan Janssen & Andras Pataki & Nick Carrier, 2023. "Sequence-structure-function relationships in the microbial protein universe," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    17. Junhui Peng & Li Zhao, 2024. "The origin and structural evolution of de novo genes in Drosophila," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    18. Ye Yuan & Lei Chen & Kexu Song & Miaomiao Cheng & Ling Fang & Lingfei Kong & Lanlan Yu & Ruonan Wang & Zhendong Fu & Minmin Sun & Qian Wang & Chengjun Cui & Haojue Wang & Jiuyang He & Xiaonan Wang & Y, 2024. "Stable peptide-assembled nanozyme mimicking dual antifungal actions," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
    19. Ivica Odorčić & Mohamed Belal Hamed & Sam Lismont & Lucía Chávez-Gutiérrez & Rouslan G. Efremov, 2024. "Apo and Aβ46-bound γ-secretase structures provide insights into amyloid-β processing by the APH-1B isoform," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    20. Pantelis Livanos & Choy Kriechbaum & Sophia Remers & Arvid Herrmann & Sabine Müller, 2025. "Kinesin-12 POK2 polarization is a prerequisite for a fully functional division site and aids cell plate positioning," Nature Communications, Nature, vol. 16(1), pages 1-17, December.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.