A systematic method for solving data imbalance in CRISPR off-target prediction tasks

Authors
Guan Z. [1]
Jiang Z. [1]
Affiliation
[1] School of Computer Science and Technology, East China Normal University, Shanghai
Keywords
CRISPR/Cas9 system; Data imbalance; Off-target prediction
DOI
10.1016/j.compbiomed.2024.108781
Abstract
Accurately identifying potential off-target sites in the CRISPR/Cas9 system is crucial for improving the efficiency and safety of editing. However, the imbalance of available off-target datasets is a major obstacle to improving prediction performance. Although several prediction models have been developed to address this issue, systematic research on handling data imbalance in off-target prediction is still lacking. This article systematically investigates the data imbalance issue in off-target datasets and explores a range of methods for handling it from a novel perspective. First, we highlight the impact of the imbalance problem on off-target prediction tasks by determining the imbalance ratios present in these datasets. Then, we provide a comprehensive review of sampling techniques and cost-sensitive methods for mitigating class imbalance in off-target datasets. Finally, we conduct systematic experiments on several state-of-the-art prediction models to illustrate the impact of applying data imbalance solutions. The results show that class imbalance processing methods significantly improve the off-target prediction capabilities of the models across multiple testing datasets. The code and datasets used in this study are available at https://github.com/gzrgzx/CRISPR_Data_Imbalance. © 2024 Elsevier Ltd
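The three ideas named in the abstract (imbalance ratio, sampling, and cost-sensitive weighting) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy labels, and the choice of random oversampling and inverse-frequency weights are all assumptions made here for illustration, using only the Python standard library.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority samples until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        pool = [s for s, y in zip(samples, labels) if y == cls]
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_samples.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_samples, out_labels


def inverse_frequency_weights(labels):
    """Cost-sensitive class weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}


# Hypothetical toy data: 2 off-target sites (label 1) among 8 candidates.
labels = [0, 0, 0, 0, 0, 0, 1, 1]
samples = [f"site_{i}" for i in range(len(labels))]

ratio = imbalance_ratio(labels)
balanced_samples, balanced_labels = random_oversample(samples, labels)
weights = inverse_frequency_weights(labels)
```

Sampling changes the training set itself, while the class weights leave the data untouched and would instead scale each class's contribution to a model's loss; which of the two suits a given off-target model better is an empirical question.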