CNNsite: Prediction of DNA-binding Residues in Proteins Using Convolutional Neural Network with Sequence Features

Cited by: 0
Authors
Zhou, Jiyun [1 ,2 ]
Lu, Qin [2 ]
Xu, Ruifeng [1 ]
Gui, Lin [1 ]
Wang, Hongpeng [1 ]
Affiliations
[1] Harbin Inst Technol, Shenzhen Grad Sch, Sch Comp Sci & Technol, Shenzhen, Peoples R China
[2] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
EFFICIENT PREDICTION; ACCURATE PREDICTION; SITES;
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Protein-DNA complexes play crucial roles in gene regulation. Predicting the residues involved in protein-DNA interactions is therefore critical for understanding gene regulation. Although many prediction methods have been proposed, most of them overlook motif features. Motif features are subsequences that are important for recognition between a protein and DNA. To use motif features efficiently for the prediction of DNA-binding residues, we first apply a Convolutional Neural Network (CNN) to capture the motif features from the sequences around the target residues. The CNN model consists of a set of learnable motif detectors that capture the important motif features by scanning the sequences around the target residues. We then build a neural network classifier, referred to as CNNsite, that combines the captured motif features with sequence features and evolutionary features to predict binding residues from sequences. The datasets PDNA-62 and PDNA-224 are used to evaluate the performance of CNNsite by five-fold cross-validation. The evaluation shows that the motif features perform better than sequence features and evolutionary features by at least 6.73% on ST, 0.097 on MCC and 0.069 on AUC. Compared with previously published methods, CNNsite performs better by at least 0.019 on MCC, 4.37% on ST and 0.040 on AUC. CNNsite is also evaluated on an independent dataset, TS-72, where it outperforms the previous methods by at least 0.012 on AUC. The discriminant powers of motif features of 2 to 6 residues show that many motif features with large discriminant power are composed of residues that play important roles in DNA-protein interactions. The standalone version of CNNsite is available at http://hlt.hitsz.edu.cn:8080/CNNsite/.
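As a rough illustration of what "motif detectors scanning the sequences around the target residues" could look like, the following minimal Python sketch slides small weight matrices over a one-hot encoded residue window and keeps each detector's strongest response. It is not the CNNsite implementation. The function names, the window size, the ReLU activation, the max-pooling step and the hand-built detectors are all assumptions made for illustration; in a trained CNN the detector weights would be learned from data.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(residue):
    """Encode one residue as a 20-dimensional indicator vector."""
    return [1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]


def encode_window(sequence, center, half_width):
    """One-hot encode the residues around `center`.

    Positions that fall outside the sequence are zero-padded
    (an assumed convention, not taken from the paper).
    """
    rows = []
    for i in range(center - half_width, center + half_width + 1):
        if 0 <= i < len(sequence):
            rows.append(one_hot(sequence[i]))
        else:
            rows.append([0.0] * len(AMINO_ACIDS))
    return rows


def scan_detector(window, detector):
    """Slide a k x 20 detector along the window (a 1D convolution).

    Returns the ReLU activation at every offset.
    """
    k = len(detector)
    activations = []
    for start in range(len(window) - k + 1):
        score = sum(
            detector[j][c] * window[start + j][c]
            for j in range(k)
            for c in range(len(AMINO_ACIDS))
        )
        activations.append(max(0.0, score))
    return activations


def motif_features(window, detectors):
    """Max-pool each detector's activations into one feature per detector."""
    return [max(scan_detector(window, d)) for d in detectors]


def make_detector(motif):
    """Hypothetical hand-built detector that responds to an exact motif.

    A trained CNN would learn these weights instead.
    """
    return [one_hot(r) for r in motif]


if __name__ == "__main__":
    sequence = "MKRKGSAAL"  # made-up example sequence
    window = encode_window(sequence, center=3, half_width=3)
    detectors = [make_detector("RK"), make_detector("GG")]
    print(motif_features(window, detectors))
```

In this toy setup the resulting feature vector has one entry per detector. A classifier such as the one described in the abstract would take features like these together with sequence and evolutionary features as its input.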
Pages: 78-85
Page count: 8