
A machine learning approach for predicting CRISPR-Cas9 cleavage efficiencies and patterns underlying its mechanism of action

Author

Listed:
  • Shiran Abadi
  • Winston X Yan
  • David Amar
  • Itay Mayrose

Abstract

The adaptation of the CRISPR-Cas9 system as a genome editing technique has generated much excitement in recent years owing to its ability to manipulate targeted genes and genomic regions that are complementary to a programmed single guide RNA (sgRNA). However, the efficacy of a specific sgRNA is not uniquely defined by exact sequence homology to the target site, thus unintended off-targets might additionally be cleaved. Current methods for sgRNA design are mainly concerned with predicting off-targets for a given sgRNA using basic sequence features and employ elementary rules for ranking possible sgRNAs. Here, we introduce CRISTA (CRISPR Target Assessment), a novel algorithm within the machine learning framework that determines the propensity of a genomic site to be cleaved by a given sgRNA. We show that the predictions made with CRISTA are more accurate than other available methodologies. We further demonstrate that the occurrence of bulges is not a rare phenomenon and should be accounted for in the prediction process. Beyond predicting cleavage efficiencies, the learning process provides inferences regarding patterns that underlie the mechanism of action of the CRISPR-Cas9 system. We discover that attributes that describe the spatial structure and rigidity of the entire genomic site as well as those surrounding the PAM region are a major component of the prediction capabilities.

Author summary: The CRISPR-Cas9 system, a microbial adaptive immune system, was recently exploited for modulating DNA sequences within the endogenous genome in many organisms. This system has emerged as a technology of choice for genome editing with promising therapeutic and research advancements. However, these exciting developments were not paralleled by deep understanding of CRISPR-Cas9 cleavage efficiency. Indeed, while numerous studies have been conducted in order to define better guidelines to determine CRISPR-Cas9 specificity, much ambiguity remains surrounding its mechanism of action. Here, we present a machine-learning-based algorithm that was trained on genome-wide experimental data. The algorithm considers a broad range of features that describe different attributes that potentially impact the cleavage efficacy of CRISPR-Cas9, including genomic attributes, RNA thermodynamics, and those concerning sequence similarity. We further found that incorporating the possibility of DNA or RNA bulges plays an important role in prediction accuracy. Together, these result in a predictive model that can be used both to predict the cleavage propensity of a new genomic site according to the genomic context and to learn about the importance of different features for CRISPR-Cas9 efficiency and selectivity.
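
The abstract notes that sequence-similarity features should allow for DNA or RNA bulges in addition to mismatches. As a rough illustration of what such features could look like, the following sketch aligns an sgRNA with a candidate site by a plain global alignment and counts mismatches and bulges. It is not CRISTA's implementation: the function name `align_counts`, the cost values, and the convention of writing the sgRNA in the DNA alphabet (T in place of U) are assumptions made for this example only.

```python
from typing import Tuple


def align_counts(sgrna: str, site: str,
                 mismatch_cost: int = 1,
                 bulge_cost: int = 2) -> Tuple[int, int, int]:
    """Globally align sgrna with site and count (mismatches, rna_bulges, dna_bulges).

    Hypothetical helper: costs are illustrative, not taken from the paper.
    An RNA bulge is an sgRNA base with no partner in the DNA site;
    a DNA bulge is a site base with no partner in the sgRNA.
    """
    n, m = len(sgrna), len(site)
    # cost[i][j] = minimal cost of aligning sgrna[:i] with site[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * bulge_cost
    for j in range(1, m + 1):
        cost[0][j] = j * bulge_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = cost[i - 1][j - 1] + (0 if sgrna[i - 1] == site[j - 1] else mismatch_cost)
            cost[i][j] = min(sub,
                             cost[i - 1][j] + bulge_cost,
                             cost[i][j - 1] + bulge_cost)

    # Trace back one optimal alignment and tally the events along it.
    i, j = n, m
    mismatches = rna_bulges = dna_bulges = 0
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = 0 if sgrna[i - 1] == site[j - 1] else mismatch_cost
            if cost[i][j] == cost[i - 1][j - 1] + step:
                mismatches += 1 if step else 0
                i -= 1
                j -= 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + bulge_cost:
            rna_bulges += 1  # unpaired sgRNA base
            i -= 1
        else:
            dna_bulges += 1  # unpaired DNA base
            j -= 1
    return mismatches, rna_bulges, dna_bulges


if __name__ == "__main__":
    # Toy 20-nt guide against a site that carries one extra base.
    guide = "GACGCATAAAGATGAGACGC"
    site = "GACGCATAAAGATTGAGACGC"
    print(align_counts(guide, site))
```

Counts like these could serve as a few input features among many for a learned model; the genomic and thermodynamic attributes mentioned in the abstract would have to be computed separately.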

Suggested Citation

  • Shiran Abadi & Winston X Yan & David Amar & Itay Mayrose, 2017. "A machine learning approach for predicting CRISPR-Cas9 cleavage efficiencies and patterns underlying its mechanism of action," PLOS Computational Biology, Public Library of Science, vol. 13(10), pages 1-24, October.
  • Handle: RePEc:plo:pcbi00:1005807
    DOI: 10.1371/journal.pcbi.1005807

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005807
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1005807&type=printable
    Download Restriction: no



    Citations

    Cited by:

    1. Qinchang Chen & Guohui Chuai & Haihang Zhang & Jin Tang & Liwen Duan & Huan Guan & Wenhui Li & Wannian Li & Jiaying Wen & Erwei Zuo & Qing Zhang & Qi Liu, 2023. "Genome-wide CRISPR off-target prediction and optimization using RNA-DNA interaction fingerprints," Nature Communications, Nature, vol. 14(1), pages 1-17, December.
    2. Xiaoguang Pan & Kunli Qu & Hao Yuan & Xi Xiang & Christian Anthon & Liubov Pashkova & Xue Liang & Peng Han & Giulia I. Corsi & Fengping Xu & Ping Liu & Jiayan Zhong & Yan Zhou & Tao Ma & Hui Jiang & J, 2022. "Massively targeted evaluation of therapeutic CRISPR off-targets in cells," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    3. Hsiu-Hui Tsai & Hsiao-Jung Kao & Ming-Wei Kuo & Chin-Hsien Lin & Chun-Min Chang & Yi-Yin Chen & Hsiao-Huei Chen & Pui-Yan Kwok & Alice L. Yu & John Yu, 2023. "Whole genomic analysis reveals atypical non-homologous off-target large structural variants induced by CRISPR-Cas9-mediated genome editing," Nature Communications, Nature, vol. 14(1), pages 1-9, December.
    4. Sebastian M. Siegner & Laura Ugalde & Alexandra Clemens & Laura Garcia-Garcia & Juan A. Bueren & Paula Rio & Mehmet E. Karasu & Jacob E. Corn, 2022. "Adenine base editing efficiently restores the function of Fanconi anemia hematopoietic stem and progenitor cells," Nature Communications, Nature, vol. 13(1), pages 1-15, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Zhaohui Zhong & Guanqing Liu & Zhongjie Tang & Shuyue Xiang & Liang Yang & Lan Huang & Yao He & Tingting Fan & Shishi Liu & Xuelian Zheng & Tao Zhang & Yiping Qi & Jian Huang & Yong Zhang, 2023. "Efficient plant genome engineering using a probiotic sourced CRISPR-Cas9 system," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    2. Jian Wang & Yuxi Teng & Ruihua Zhang & Yifei Wu & Lei Lou & Yusong Zou & Michelle Li & Zhong-Ru Xie & Yajun Yan, 2021. "Engineering a PAM-flexible SpdCas9 variant as a universal gene repressor," Nature Communications, Nature, vol. 12(1), pages 1-10, December.
    3. Dawn G. L. Thean & Hoi Yee Chu & John H. C. Fong & Becky K. C. Chan & Peng Zhou & Cynthia C. S. Kwok & Yee Man Chan & Silvia Y. L. Mak & Gigi C. G. Choi & Joshua W. K. Ho & Zongli Zheng & Alan S. L. W, 2022. "Machine learning-coupled combinatorial mutagenesis enables resource-efficient engineering of CRISPR-Cas9 genome editor activities," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    4. Dalton T. Ham & Tyler S. Browne & Pooja N. Banglorewala & Tyler L. Wilson & Richard K. Michael & Gregory B. Gloor & David R. Edgell, 2023. "A generalizable Cas9/sgRNA prediction model using machine transfer learning with small high-quality datasets," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    5. Maarten H. Geurts & Shashank Gandhi & Matteo G. Boretto & Ninouk Akkerman & Lucca L. M. Derks & Gijs Son & Martina Celotti & Sarina Harshuk-Shabso & Flavia Peci & Harry Begthel & Delilah Hendriks & Pa, 2023. "One-step generation of tumor models by base editor multiplexing in adult stem cell-derived organoids," Nature Communications, Nature, vol. 14(1), pages 1-18, December.
    6. Shunsuke Kawasaki & Hiroki Ono & Moe Hirosawa & Takeru Kuwabara & Shunsuke Sumi & Suji Lee & Knut Woltjen & Hirohide Saito, 2023. "Programmable mammalian translational modulators by CRISPR-associated proteins," Nature Communications, Nature, vol. 14(1), pages 1-17, December.
    7. Fang Liang & Yu Zhang & Lin Li & Yexin Yang & Ji-Feng Fei & Yanmei Liu & Wei Qin, 2022. "SpG and SpRY variants expand the CRISPR toolbox for genome editing in zebrafish," Nature Communications, Nature, vol. 13(1), pages 1-10, December.
    8. Raed Ibraheim & Phillip W. L. Tai & Aamir Mir & Nida Javeed & Jiaming Wang & Tomás C. Rodríguez & Suk Namkung & Samantha Nelson & Eraj Shafiq Khokhar & Esther Mintzer & Stacy Maitland & Zexiang Chen &, 2021. "Self-inactivating, all-in-one AAV vectors for precision Cas9 genome editing via homology-directed repair in vivo," Nature Communications, Nature, vol. 12(1), pages 1-17, December.
    9. Xiangjun He & Zhenjie Zhang & Junyi Xue & Yaofeng Wang & Siqi Zhang & Junkang Wei & Chenzi Zhang & Jue Wang & Brian Anugerah Urip & Chun Christopher Ngan & Junjiang Sun & Yuefeng Li & Zhiqian Lu & Hui, 2022. "Low-dose AAV-CRISPR-mediated liver-specific knock-in restored hemostasis in neonatal hemophilia B mice with subtle antibody response," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    10. Qiao Liu & Di He & Lei Xie, 2019. "Prediction of off-target specificity and cell-specific fitness of CRISPR-Cas System using attention boosted deep learning and network-based gene feature," PLOS Computational Biology, Public Library of Science, vol. 15(10), pages 1-22, October.
    11. Margot Karlikow & Evan Amalfitano & Xiaolong Yang & Jennifer Doucet & Abigail Chapman & Peivand Sadat Mousavi & Paige Homme & Polina Sutyrina & Winston Chan & Sofia Lemak & Alexander F. Yakunin & Adam, 2023. "CRISPR-induced DNA reorganization for multiplexed nucleic acid detection," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    12. Nathan Bamidele & Han Zhang & Xiaolong Dong & Haoyang Cheng & Nicholas Gaston & Hailey Feinzig & Hanbing Cao & Karen Kelly & Jonathan K. Watts & Jun Xie & Guangping Gao & Erik J. Sontheimer, 2024. "Domain-inlaid Nme2Cas9 adenine base editors with improved activity and targeting scope," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    13. Kazuki Kato & Sae Okazaki & Soumya Kannan & Han Altae-Tran & F. Esra Demircioglu & Yukari Isayama & Junichiro Ishikawa & Masahiro Fukuda & Rhiannon K. Macrae & Tomohiro Nishizawa & Kira S. Makarova & , 2022. "Structure of the IscB–ωRNA ribonucleoprotein complex, the likely ancestor of CRISPR-Cas9," Nature Communications, Nature, vol. 13(1), pages 1-10, December.
    14. Ang Li & Hitoshi Mitsunobu & Shin Yoshioka & Takahisa Suzuki & Akihiko Kondo & Keiji Nishida, 2022. "Cytosine base editing systems with minimized off-target effect and molecular size," Nature Communications, Nature, vol. 13(1), pages 1-8, December.
    15. Hongzhi Zeng & Qichen Yuan & Fei Peng & Dacheng Ma & Ananya Lingineni & Kelly Chee & Peretz Gilberd & Emmanuel C. Osikpa & Zheng Sun & Xue Gao, 2023. "A split and inducible adenine base editor for precise in vivo base editing," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    16. Péter István Kulcsár & András Tálas & Zoltán Ligeti & Eszter Tóth & Zsófia Rakvács & Zsuzsa Bartos & Sarah Laura Krausz & Ágnes Welker & Vanessza Laura Végi & Krisztina Huszár & Ervin Welker, 2023. "A cleavage rule for selection of increased-fidelity SpCas9 variants with high efficiency and no detectable off-targets," Nature Communications, Nature, vol. 14(1), pages 1-20, December.
    17. Matteo Ciciani & Michele Demozzi & Eleonora Pedrazzoli & Elisabetta Visentin & Laura Pezzè & Lorenzo Federico Signorini & Aitor Blanco-Miguez & Moreno Zolfo & Francesco Asnicar & Antonio Casini & Anna, 2022. "Automated identification of sequence-tailored Cas9 proteins using massive metagenomic data," Nature Communications, Nature, vol. 13(1), pages 1-8, December.
    18. Giulia I. Corsi & Kunli Qu & Ferhat Alkan & Xiaoguang Pan & Yonglun Luo & Jan Gorodkin, 2022. "CRISPR/Cas9 gRNA activity depends on free energy changes and on the target PAM context," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    19. Behrouz Eslami-Mossallam & Misha Klein & Constantijn V. D. Smagt & Koen V. D. Sanden & Stephen K. Jones & John A. Hawkins & Ilya J. Finkelstein & Martin Depken, 2022. "A kinetic model predicts SpCas9 activity, improves off-target classification, and reveals the physical basis of targeting fidelity," Nature Communications, Nature, vol. 13(1), pages 1-10, December.
    20. Lin Zhao & Sabrina R. T. Koseki & Rachel A. Silverstein & Nadia Amrani & Christina Peng & Christian Kramme & Natasha Savic & Martin Pacesa & Tomás C. Rodríguez & Teodora Stan & Emma Tysinger & Lauren , 2023. "PAM-flexible genome editing with an engineered chimeric Cas9," Nature Communications, Nature, vol. 14(1), pages 1-8, December.
