A systematic method for solving data imbalance in CRISPR off-target prediction tasks

Authors
Guan Z. [1]
Jiang Z. [1]
Affiliation
[1] School of Computer Science and Technology, East China Normal University, Shanghai
Keywords
CRISPR/Cas9 system; Data imbalance; Off-target prediction
DOI
10.1016/j.compbiomed.2024.108781
Abstract
Accurately identifying potential off-target sites in the CRISPR/Cas9 system is crucial for improving the efficiency and safety of editing. However, the imbalance of available off-target datasets has posed a major obstacle to improving prediction performance. Although several prediction models have been developed to address this issue, there remains a lack of systematic research on handling data imbalance in off-target prediction. This article systematically investigates the data imbalance issue in off-target datasets and explores numerous methods for handling it from a novel perspective. First, we highlight the impact of the imbalance problem on off-target prediction tasks by determining the imbalance ratios present in these datasets. Then, we provide a comprehensive review of various sampling techniques and cost-sensitive methods to mitigate class imbalance in off-target datasets. Finally, systematic experiments are conducted on several state-of-the-art prediction models to illustrate the impact of applying data imbalance solutions. The results show that class imbalance processing methods significantly improve the off-target prediction capabilities of the models across multiple testing datasets. The code and datasets used in this study are available at https://github.com/gzrgzx/CRISPR_Data_Imbalance. © 2024 Elsevier Ltd
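The abstract names three ingredients: imbalance ratios, sampling techniques, and cost-sensitive methods. The sketch below illustrates one simple instance of each on toy labels. It is not taken from the paper or its repository. The function names, the toy data, and the choice of random oversampling and inverse-frequency weighting are all assumptions made for illustration.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of each smaller class
    until every class is as large as the majority class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(samples), list(labels)
    for cls, n in counts.items():
        pool = [x for x, y in zip(samples, labels) if y == cls]
        for _ in range(target - n):
            out_x.append(rng.choice(pool))
            out_y.append(cls)
    return out_x, out_y


def inverse_frequency_weights(labels):
    """Cost-sensitive class weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}


# Hypothetical toy data: 2 off-target sites (label 1) vs 8 non-off-target sites (label 0).
labels = [1, 1] + [0] * 8
samples = [f"site_{i}" for i in range(len(labels))]

ratio = imbalance_ratio(labels)
bal_x, bal_y = random_oversample(samples, labels)
weights = inverse_frequency_weights(labels)
print(ratio, Counter(bal_y), weights)
```

Oversampling changes the training set itself, whereas class weights leave the data untouched and instead scale each class's contribution to the loss. Which of the two (or which combination) works best for a given off-target model is an empirical question of the kind the study examines.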