A Machine Learning Approach to Identify the Importance of Novel Features for CRISPR/Cas9 Activity Prediction

Cited by: 5
Authors
Vora, Dhvani Sandip [1]
Verma, Yugesh [1]
Sundar, Durai [1, 2]
Affiliations
[1] Indian Inst Technol Delhi, Dept Biochem Engn & Biotechnol, New Delhi 110016, India
[2] Indian Inst Technol Delhi, Yardi Sch Artificial Intelligence, New Delhi 110016, India
Keywords
CRISPR/Cas9; genome editing; machine learning; SHAP values; binding energy; off-targets; OFF-TARGET CLEAVAGE; UNBIASED DETECTION; SGRNA DESIGN; DNA CLEAVAGE; RNA; CRISPR-CAS9; COMPLEX; PROTEIN; CAS9; BINDING;
DOI
10.3390/biom12081123
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
The growing popularity of the reprogrammable CRISPR/Cas9 genome editing tool is hindered by unwanted off-target effects. Efforts have been directed toward designing efficient guide RNAs and identifying potential off-target threats, yet the factors that determine efficiency and off-target activity remain obscure. Previous machine learning models based on sequence features performed poorly on new datasets, so novel features need to be incorporated. Estimating the binding energy of the gRNA-DNA hybrid and of the Cas9-gRNA-DNA hybrid allowed better-performing machine learning models to be generated for the prediction of Cas9 activity. Analysis of feature contributions to the model output on a limited dataset indicated that energy features played a determining role along with the sequence features. The binding energy features proved essential for the prediction of on-target activity and off-target sites. The plateau in the performance of current machine learning models on unseen datasets could be overcome by incorporating novel features, such as binding energy, among others. The models are provided on GitHub (GitHub Inc., San Francisco, CA, USA).
Pages: 15