Predicting prime editing efficiency and product purity by deep learning

Cited by: 71
Authors
Mathis, Nicolas [1]
Allam, Ahmed [2]
Kissling, Lucas [1]
Marquart, Kim Fabiano [1, 3]
Schmidheini, Lukas [1, 3]
Solari, Cristina [1]
Balazs, Zsolt [2]
Krauthammer, Michael [2]
Schwank, Gerald [1]
Affiliations
[1] Univ Zurich, Inst Pharmacol & Toxicol, Zurich, Switzerland
[2] Univ Zurich, Dept Quant Biomed, Zurich, Switzerland
[3] Swiss Fed Inst Technol, Inst Mol Hlth Sci, Zurich, Switzerland
Funding
Swiss National Science Foundation;
Keywords
REPAIR;
DOI
10.1038/s41587-022-01613-7
Chinese Library Classification
Q81 [Bioengineering (biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005; 0836; 090102; 100705;
Abstract
Prime editing is a versatile genome editing tool but requires experimental optimization of the prime editing guide RNA (pegRNA) to achieve high editing efficiency. Here we conducted a high-throughput screen to analyze prime editing outcomes of 92,423 pegRNAs on a highly diverse set of 13,349 human pathogenic mutations that include base substitutions, insertions and deletions. Based on this dataset, we identified sequence context features that influence prime editing and trained PRIDICT (prime editing guide prediction), an attention-based bidirectional recurrent neural network. PRIDICT reliably predicts editing rates for all small-sized genetic changes with a Spearman's R of 0.85 and 0.78 for intended and unintended edits, respectively. We validated PRIDICT on endogenous editing sites as well as an external dataset and showed that pegRNAs with high (>70) versus low (<70) PRIDICT scores showed substantially increased prime editing efficiencies in different cell types in vitro (12-fold) and in hepatocytes in vivo (tenfold), highlighting the value of PRIDICT for basic and for translational research applications.
Pages: 1151-+
Page count: 22