CNNsite: Prediction of DNA-binding Residues in Proteins Using Convolutional Neural Network with Sequence Features

Cited by: 0
Authors
Zhou, Jiyun [1, 2]
Lu, Qin [2 ]
Xu, Ruifeng [1 ]
Gui, Lin [1 ]
Wang, Hongpeng [1 ]
Affiliations
[1] Harbin Inst Technol, Shenzhen Grad Sch, Sch Comp Sci & Technol, Shenzhen, Peoples R China
[2] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
EFFICIENT PREDICTION; ACCURATE PREDICTION; SITES;
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Protein-DNA complexes play crucial roles in gene regulation. Predicting the residues involved in protein-DNA interactions is therefore critical for understanding gene regulation. Although many methods have been proposed, most of them overlook motif features. Motif features are subsequences that are important for recognition between a protein and DNA. To use motif features efficiently for the prediction of DNA-binding residues, we first apply a Convolutional Neural Network (CNN) to capture motif features from the sequences around the target residues. The CNN consists of a set of learnable motif detectors that capture the important motif features by scanning the sequences around the target residues. We then build a neural network classifier, referred to as CNNsite, that combines the captured motif features, sequence features and evolutionary features to predict binding residues from sequences. The datasets PDNA-62 and PDNA-224 are used to evaluate the performance of CNNsite by five-fold cross-validation. The evaluation shows that the motif features perform better than the sequence features and evolutionary features, by at least 6.73% on ST, 0.097 on MCC and 0.069 on AUC. Compared with previously published methods, CNNsite performs better by at least 0.019 on MCC, 4.37% on ST and 0.040 on AUC. CNNsite is also evaluated on an independent dataset, TS-72, where it outperforms the previous methods by at least 0.012 on AUC. The discriminant powers of motif features ranging in size from 2 to 6 residues show that many motif features with large discriminant power are composed of residues that play important roles in DNA-protein interactions. The standalone version of CNNsite is available at http://hlt.hitsz.edu.cn:8080/CNNsite/.
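The idea of a motif detector scanning the sequence around a target residue can be illustrated with a small sketch. This is not the CNNsite implementation: the one-hot encoding, the window half-width, the zero-padding with `X`, the max-over-positions pooling and all function names are assumptions made for illustration, and the detector weights here are set by hand, whereas in a CNN they would be learned.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(window):
    """Encode each residue as a 20-dim one-hot vector.

    Unknown symbols (such as the padding character 'X') become all zeros.
    """
    return [[1.0 if aa == ch else 0.0 for aa in AMINO_ACIDS] for ch in window]


def extract_window(sequence, target, half_width, pad="X"):
    """Return the residues centred on `target`, padded at the sequence ends."""
    padded = pad * half_width + sequence + pad * half_width
    return padded[target:target + 2 * half_width + 1]


def scan_motif(encoded, detector):
    """Slide one detector (k rows of 20 weights) along the encoded window.

    Returns the largest activation over all positions; starting from 0.0
    acts like a ReLU, so negative scores are clipped.
    """
    k = len(detector)
    best = 0.0
    for start in range(len(encoded) - k + 1):
        score = sum(
            detector[i][j] * encoded[start + i][j]
            for i in range(k)
            for j in range(len(AMINO_ACIDS))
        )
        best = max(best, score)
    return best


def motif_features(sequence, target, detectors, half_width=5):
    """One feature per detector for the residue at index `target`."""
    encoded = one_hot(extract_window(sequence, target, half_width))
    return [scan_motif(encoded, d) for d in detectors]


def make_detector(motif):
    """Hand-built (hypothetical) detector that fires on an exact motif."""
    return one_hot(motif)
```

A detector built with `make_detector("RK")` responds when the dipeptide RK appears anywhere in the window, for example `motif_features("AARKAA", 2, [make_detector("RK")], half_width=2)`. In a trained model, the resulting feature vector would be passed to the classifier together with the sequence and evolutionary features.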
Pages: 78-85
Number of pages: 8