Interpretable CRISPR/Cas9 off-target activities with mismatches and indels prediction using BERT

Cited by: 4
Authors
Luo, Ye [1]
Chen, Yaowen [1]
Xie, HuanZeng [1]
Zhu, Wentao [1]
Zhang, Guishan [1]
Affiliations
[1] Shantou Univ, Coll Engn, Shantou 515063, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
CRISPR/Cas9; Off-target; BERT; Adaptive batch-wise class balancing; Deep learning; GENOME EDITING TECHNOLOGIES; CLASSIFICATION; CRISPR-CAS9; SPECIFICITY; DESIGN; CAS9; SYSTEMS; DNA
DOI
10.1016/j.compbiomed.2024.107932
Chinese Library Classification
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Off-target effects of CRISPR/Cas9 can lead to suboptimal genome editing outcomes. Numerous deep learning-based approaches have achieved excellent performance in off-target prediction; however, few can predict off-target activities with both mismatches and indels between a single guide RNA (sgRNA) and target DNA sequence pair. In addition, data imbalance is a common pitfall in off-target prediction. Moreover, due to the complexity of genomic contexts, building an interpretable model remains challenging. To address these issues, we first developed a BERT-based model called CRISPR-BERT to improve the prediction of off-target activities with both mismatches and indels. Second, we proposed an adaptive batch-wise class balancing strategy to combat the noise that exists in imbalanced off-target data. Finally, we applied a visualization approach to investigate generalizable nucleotide position-dependent patterns of sgRNA-DNA pairs for off-target activity. In a comprehensive comparison with existing methods on five mismatches-only datasets and two mismatches-and-indels datasets, CRISPR-BERT achieved the best performance in terms of AUROC and PRAUC. In addition, the visualization analysis demonstrated how implicit knowledge learned by CRISPR-BERT facilitates off-target prediction, which shows potential for model interpretability. Collectively, CRISPR-BERT provides an accurate and interpretable framework for off-target prediction and further contributes to sgRNA optimization in practical use for improved target specificity in CRISPR/Cas9 genome editing. The source code is available at https://github.com/BrokenStringx/CRISPR-BERT
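The abstract names a batch-wise class balancing strategy but does not describe how it works. As a rough illustration of the general idea only, the sketch below reweights each sample by the inverse frequency of its class within the current batch, so that rare off-target positives and abundant negatives contribute equally to the loss. The function names `batch_class_weights` and `weighted_bce` are hypothetical, and this is not claimed to be the strategy used in CRISPR-BERT.

```python
import math
from collections import Counter


def batch_class_weights(labels):
    """Per-sample weights so that every class present in this batch
    has the same total weight (hypothetical helper, not from the paper)."""
    counts = Counter(labels)
    n_classes = len(counts)
    n = len(labels)
    # weight for class c is n / (n_classes * count_c), recomputed per batch
    return [n / (n_classes * counts[y]) for y in labels]


def weighted_bce(probs, labels, eps=1e-12):
    """Binary cross-entropy averaged over the batch, using the
    batch-wise weights above (assumed loss form, for illustration)."""
    if not labels:
        return 0.0
    weights = batch_class_weights(labels)
    total = 0.0
    for p, y, w in zip(probs, labels, weights):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


# One imbalanced batch: three negatives, one positive (off-target) sample.
labels = [0, 0, 0, 1]
weights = batch_class_weights(labels)
loss = weighted_bce([0.1, 0.2, 0.1, 0.7], labels)
```

In this batch each negative receives weight 2/3 and the single positive receives weight 2, so both classes carry a total weight of 2. Because the weights are recomputed for every batch, they follow whatever class ratio that batch happens to contain.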
Pages: 15