
Predicting sequence-specific amplification efficiency in multi-template PCR with deep learning

Author

Listed:
  • Andreas L. Gimpel

    (ETH Zurich)

  • Bowen Fan

    (ETH Zurich
    Swiss Institute for Bioinformatics (SIB))

  • Dexiong Chen

    (ETH Zurich
    Swiss Institute for Bioinformatics (SIB)
    Max Planck Institute of Biochemistry)

  • Laetitia O. D. Wölfle

    (University of Stuttgart)

  • Max Horn

    (ETH Zurich
    Swiss Institute for Bioinformatics (SIB))

  • Laetitia Meng-Papaxanthos

    (ETH Zurich
    Swiss Institute for Bioinformatics (SIB))

  • Philipp L. Antkowiak

    (ETH Zurich)

  • Wendelin J. Stark

    (ETH Zurich)

  • Beat Christen

    (University of Stuttgart)

  • Karsten Borgwardt

    (ETH Zurich
    Swiss Institute for Bioinformatics (SIB)
    Max Planck Institute of Biochemistry)

  • Robert N. Grass

    (ETH Zurich)

Abstract

Multi-template polymerase chain reaction (PCR) is a critical technique enabling the parallel amplification of diverse DNA molecules, thereby facilitating applications in fields from quantitative molecular biology to DNA data storage. However, non-homogeneous amplification due to sequence-specific amplification efficiencies often results in skewed abundance data, compromising accuracy and sensitivity. In this study, we address amplification efficiency in complex amplicon libraries by employing one-dimensional convolutional neural networks (1D-CNNs) to predict sequence-specific amplification efficiencies, based on sequence information alone. Trained on reliably annotated datasets derived from synthetic DNA pools, these models achieve high predictive performance (AUROC: 0.88, AUPRC: 0.44), thereby enabling the design of inherently homogeneous amplicon libraries. We further introduce CluMo, a deep learning interpretation framework that identifies specific motifs adjacent to adapter priming sites as closely associated with poor amplification. This insight leads to the elucidation of adapter-mediated self-priming as the major mechanism causing low amplification efficiency, challenging long-standing PCR design assumptions. By addressing the basis for non-homogeneous amplification in multi-template PCR, our deep-learning approach reduces fourfold the sequencing depth required to recover 99% of amplicon sequences, and opens new avenues to improve the efficiency of DNA amplification in fields such as genomics, diagnostics, and synthetic biology.
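
To illustrate why small differences in sequence-specific efficiency skew abundance data, the following is a minimal sketch using the textbook exponential PCR model, in which a template with per-cycle efficiency e grows by a factor (1 + e) each cycle. It is not the authors' model or code. The function name `relative_abundance`, the equal starting copy numbers, the constant efficiency across cycles, and the example efficiency values are all assumptions made for illustration.

```python
def relative_abundance(efficiencies, cycles):
    """Return each template's share of the pool after `cycles` PCR cycles.

    Assumption: every template starts at the same copy number and keeps a
    constant per-cycle efficiency e in [0, 1], so it grows by (1 + e) per cycle.
    """
    growth = [(1.0 + e) ** cycles for e in efficiencies]
    total = sum(growth)
    return [g / total for g in growth]


# Hypothetical efficiencies: two well-amplifying templates and one poor one.
example_efficiencies = [0.95, 0.95, 0.80]

for n_cycles in (0, 15, 30):
    shares = relative_abundance(example_efficiencies, n_cycles)
    # The share of the third template shrinks as the cycle count grows.
    print(n_cycles, [round(s, 3) for s in shares])
```

Under these assumptions, the poorly amplifying template loses share with every additional cycle, so a deeper sequencing run is needed to observe it at all. This is the kind of skew that a predictor of sequence-specific efficiency is meant to help avoid at the library design stage.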

Suggested Citation

  • Andreas L. Gimpel & Bowen Fan & Dexiong Chen & Laetitia O. D. Wölfle & Max Horn & Laetitia Meng-Papaxanthos & Philipp L. Antkowiak & Wendelin J. Stark & Beat Christen & Karsten Borgwardt & Robert N. G, 2025. "Predicting sequence-specific amplification efficiency in multi-template PCR with deep learning," Nature Communications, Nature, vol. 16(1), pages 1-14, December.
  • Handle: RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-64221-4
    DOI: 10.1038/s41467-025-64221-4

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-025-64221-4
    File Function: Abstract
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Afsaneh Sadremomtaz & Robert F. Glass & Jorge Eduardo Guerrero & Dennis R. LaJeunesse & Eric A. Josephs & Reza Zadegan, 2023. "Digital data storage on DNA tape using CRISPR base editors," Nature Communications, Nature, vol. 14(1), pages 1-10, December.
    2. Zihui Yan & Guanjin Qu & Xin Chen & Gang Zheng & Huaming Wu, 2025. "DNA StairLoop: enabling high-fidelity data recovery and robust error correction in DNA-based data storage," Nature Communications, Nature, vol. 16(1), pages 1-10, December.
    3. Qingyuan Fan & Xuyang Zhao & Junyao Li & Ronghui Liu & Ming Liu & Qishun Feng & Yanping Long & Yang Fu & Jixian Zhai & Qing Pan & Yi Li, 2025. "De novo non-canonical nanopore basecalling enables private communication using heavily-modified DNA data at single-molecule level," Nature Communications, Nature, vol. 16(1), pages 1-11, December.
    4. Cauã Antunes Westmann & Leander Goldbach & Andreas Wagner, 2024. "The highly rugged yet navigable regulatory landscape of the bacterial transcription factor TetR," Nature Communications, Nature, vol. 15(1), pages 1-20, December.
    5. Weigang Chen & Rui Qin & Quan Guo & Jian Guo & Qi Ge & Yingjin Yuan, 2025. "Approaching single-molecule assembly-free readout from medium-length encoded DNA," Nature Communications, Nature, vol. 16(1), pages 1-11, December.
    6. Isak S. Pretorius & Thomas A. Dixon & Michael Boers & Ian T. Paulsen & Daniel L. Johnson, 2025. "The coming wave of confluent biosynthetic, bioinformational and bioengineering technologies," Nature Communications, Nature, vol. 16(1), pages 1-8, December.
    7. Lifu Song & Feng Geng & Zi-Yi Gong & Xin Chen & Jijun Tang & Chunye Gong & Libang Zhou & Rui Xia & Ming-Zhe Han & Jing-Yi Xu & Bing-Zhi Li & Ying-Jin Yuan, 2022. "Robust data storage in DNA by de Bruijn graph-based de novo strand assembly," Nature Communications, Nature, vol. 13(1), pages 1-9, December.
    8. Zhi Weng & Jiangxue Li & Yi Wu & Xuehao Xiu & Fei Wang & Xiaolei Zuo & Ping Song & Chunhai Fan, 2025. "Massively parallel homogeneous amplification of chip-scale DNA for DNA information storage (MPHAC-DIS)," Nature Communications, Nature, vol. 16(1), pages 1-11, December.
    9. Andreas L. Gimpel & Wendelin J. Stark & Reinhard Heckel & Robert N. Grass, 2023. "A digital twin for DNA data storage based on comprehensive quantification of errors and biases," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
