Synergizing CRISPR/Cas9 off-target predictions for ensemble insights and practical applications

Cited by: 23
Authors
Zhang, Shixiong [1]
Li, Xiangtao [1, 2]
Lin, Qiuzhen [3]
Wong, Ka-Chun [1]
Affiliations
[1] City Univ Hong Kong, Dept Comp Sci, Kowloon Tong, Hong Kong, Peoples R China
[2] Northeast Normal Univ, Dept Comp Sci & Informat Technol, Changchun 130117, Jilin, Peoples R China
[3] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Peoples R China
Keywords
CHROMATIN-STATE DISCOVERY; GUIDE RNA; GENOME; CRISPR-CAS9; CLEAVAGE; SEQ; NUCLEASES; DYNAMICS; SYSTEMS; DNA;
DOI
10.1093/bioinformatics/bty748
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: The RNA-guided CRISPR/Cas9 system has been widely applied to genome editing and can effectively edit on-target genes. Nonetheless, it has recently been demonstrated that many homologous off-target genomic sequences could be mutated, leading to unexpected gene-editing outcomes. Therefore, a plethora of tools have been proposed for predicting the off-target activities of CRISPR/Cas9. Each computational tool, however, has its own advantages and drawbacks under diverse conditions, and it is unlikely that a single tool is optimal for all conditions. Hence, we explore the potential of ensemble learning to synergize multiple tools with genomic annotations and thereby enhance predictive ability.

Results: We propose an ensemble learning framework that synergizes multiple tools, in different combinations, to predict the off-target activities of CRISPR/Cas9. The ensemble learned with AdaBoost outperformed the individual off-target prediction tools. We also investigated the effect of evolutionary conservation (PhyloP and PhastCons) and chromatin annotations (ChromHMM and Segway) and found that only PhyloP further enhances predictive capability. Case studies reveal ensemble insights into off-target prediction, demonstrating how the current study can be applied in different genomic contexts. The best AdaBoost predictions reach up to 0.9383 (AUC) and 0.2998 (PRC), outperforming the other classifiers. This is ascribable to the fact that AdaBoost introduces a new weak classifier (i.e. decision stump) in each iteration to learn the DNA sequences that were misclassified as off-targets, iterating until a small error rate is reached.

Availability and implementation: The source code is freely available on GitHub at https://github.com/Alexzsx/CRISPR.

Supplementary information: Supplementary data are available at Bioinformatics online.
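The boosting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (that is in the GitHub repository named in the abstract); it is a plain-Python AdaBoost over decision stumps, run on synthetic data. The three "tool scores" per candidate site, the 20% off-target rate, the noise level, and every function name below are assumptions made purely for illustration.

```python
import math
import random

def stump_predict(stump, row):
    """A decision stump: threshold one feature and return +1 or -1."""
    feature, threshold, sign = stump
    return sign if row[feature] >= threshold else -sign

def train_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best_err, best_stump = float("inf"), None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for sign in (1, -1):
                stump = (feature, threshold, sign)
                err = sum(wi for wi, row, yi in zip(w, X, y)
                          if stump_predict(stump, row) != yi)
                if err < best_err:
                    best_err, best_stump = err, stump
    return best_err, best_stump

def adaboost_fit(X, y, n_rounds=10):
    """Add one stump per round, then up-weight the samples it misclassified."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        err, stump = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)   # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)  # vote weight of this stump
        ensemble.append((alpha, stump))
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_score(ensemble, row):
    """Weighted vote of all stumps; larger means more off-target-like."""
    return sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)

def roc_auc(scores, y):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, yi in zip(scores, y) if yi == 1]
    neg = [s for s, yi in zip(scores, y) if yi == -1]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Synthetic stand-in data (assumption): 120 candidate sites, 3 tool scores each.
rng = random.Random(0)
y = [1 if rng.random() < 0.2 else -1 for _ in range(120)]
X = [[(0.6 if yi == 1 else 0.4) + rng.gauss(0, 0.15) for _ in range(3)]
     for yi in y]

ensemble = adaboost_fit(X, y, n_rounds=10)
ensemble_auc = roc_auc([adaboost_score(ensemble, row) for row in X], y)
single_aucs = [roc_auc([row[j] for row in X], y) for j in range(3)]

print("single-tool training AUCs:", [round(a, 3) for a in single_aucs])
print("ensemble training AUC:", round(ensemble_auc, 3))
```

The AUCs printed here are measured on the training data itself, so they only show the mechanics; a real comparison would need held-out sites, and the numbers bear no relation to the 0.9383 AUC reported in the abstract.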
Pages: 1108-1115
Page count: 8