A Novel Sequence-Based Method of Predicting Protein DNA-Binding Residues, Using a Machine Learning Approach

Cited by: 5
Authors
Cai, Yudong [1,2]
He, ZhiSong [3]
Shi, Xiaohe [4,5]
Kong, Xiangying [4,5,6]
Gu, Lei [7]
Xie, Lu [8]
Affiliations
[1] Shanghai Univ, Inst Syst Biol, Shanghai 200244, Peoples R China
[2] Fudan Univ, Ctr Computat Syst Biol, Shanghai 200433, Peoples R China
[3] Zhejiang Univ, Dept Bioinformat, Coll Life Sci, Hangzhou 310058, Zhejiang, Peoples R China
[4] Chinese Acad Sci, Shanghai Inst Biol Sci, Inst Hlth Sci, Beijing 100864, Peoples R China
[5] Shanghai Jiao Tong Univ, Sch Med, Shanghai, Peoples R China
[6] Shanghai Jiao Tong Univ, Ruijin Hosp, State Key Lab Med Genom, Shanghai 200025, Peoples R China
[7] Fraunhofer Inst Algorithms & Sci Comp, Dept Bioinformat, Aachen, Germany
[8] Shanghai Ctr Bioinformat Technol, Shanghai 200235, Peoples R China
Keywords
bioinformatics; data mining; machine learning; mRMR; protein-DNA interaction; SITES; INFORMATION; IDENTIFICATION; RECOGNITION; MODELS; MOTIFS; DOMAIN; P53
DOI
10.1007/s10059-010-0093-0
Chinese Library Classification (CLC)
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Protein-DNA interactions play an essential role in transcriptional regulation, DNA repair, and many other vital biological processes. The mechanism of protein-DNA binding, however, remains unclear. To study many diseases, researchers need a better understanding of the amino acid motifs that recognize DNA. Because identifying these motifs experimentally is expensive and time-consuming, a computational prediction approach is needed. Some in silico methods have been developed, but they still have considerable limitations. In this study, we used a machine learning approach to develop a new sequence-based method of predicting protein-DNA binding residues. As features, we used properties of the micro-environment of each amino acid, taken from the AAIndex, together with conservation scores. In cross-validation testing, we obtained an overall accuracy of 94.89%. Our results indicate that the amino acid micro-environment is important for DNA binding and that it can be used to identify protein-DNA binding sites.
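The abstract does not spell out how the micro-environment of a residue is encoded. One common way to do this is a sliding window over the sequence, sketched below under stated assumptions: the function name `window_features`, the window half-width, the zero padding at the sequence ends, and the choice of the Kyte-Doolittle hydropathy scale as the example AAindex property are all illustrative and are not taken from the paper.

```python
# Kyte-Doolittle hydropathy values, used here only as one example of a
# per-residue property scale of the kind collected in AAindex.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_features(sequence, half_width=2, pad_value=0.0):
    """Return one feature vector per residue.

    Each vector holds the property value of the residue itself and of its
    neighbours within +/- half_width positions (hypothetical encoding).
    Positions past either end of the sequence are filled with pad_value.
    """
    values = [HYDROPATHY[aa] for aa in sequence]
    padded = [pad_value] * half_width + values + [pad_value] * half_width
    size = 2 * half_width + 1
    return [padded[i:i + size] for i in range(len(sequence))]


if __name__ == "__main__":
    # Each row describes one residue of the toy sequence and its neighbours.
    for residue, row in zip("ARKV", window_features("ARKV", half_width=1)):
        print(residue, row)
```

In a pipeline of this kind, each row would be extended with further property scales and a conservation score, then passed to feature selection and a classifier that labels the central residue as binding or non-binding.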
Pages: 99-105
Page count: 7
Related papers
50 in total
  • [31] Evolutionary approach to predicting the binding site residues of a protein from its primary sequence
    Tseng, Yan Yuan
    Li, Wen-Hsiung
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2011, 108 (13) : 5313 - 5318
  • [32] A SVM-based Approach for Predicting DNA-binding Residues in Proteins from Amino Acid Sequences
    Ma, Xin
    Wu, Jian-Sheng
    Liu, Hong-De
    Yang, Xi-Nan
    Xie, Jian-Ming
    Sun, Xiao
    2009 INTERNATIONAL JOINT CONFERENCE ON BIOINFORMATICS, SYSTEMS BIOLOGY AND INTELLIGENT COMPUTING, PROCEEDINGS, 2009, : 225 - 229
  • [33] DNABP: Identification of DNA-Binding Proteins Based on Feature Selection Using a Random Forest and Predicting Binding Residues
    Ma, Xin
    Guo, Jing
    Sun, Xiao
    PLOS ONE, 2016, 11 (12):
  • [34] DRNApred, fast sequence-based method that accurately predicts and discriminates DNA- and RNA-binding residues
    Yan, Jing
    Kurgan, Lukasz
    NUCLEIC ACIDS RESEARCH, 2017, 45 (10)
  • [35] Predicting DNA-binding Locations and Orientation on Proteins Using Knowledge-based Learning of Geometric Properties
    Wang, Chien-Chih
    Chen, Chien-Yu
    2010 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE, 2010, : 3 - 8
  • [36] gDNA-Prot: Predict DNA-binding proteins by employing support vector machine and a novel numerical characterization of protein sequence
    Zhang, Yan-ping
    Wuyunqiqige
    Zheng, Wei
    Liu, Shuyi
    Zhao, Chunguang
    JOURNAL OF THEORETICAL BIOLOGY, 2016, 406 : 8 - 16
  • [37] Sequence-based Detection of DNA-binding Proteins using Multiple-View Features Allied with Feature Selection
    Zhou, Liling
    Song, Xiaoning
    Yu, Dong-Jun
    Sun, Jun
    MOLECULAR INFORMATICS, 2020, 39 (08)
  • [38] Prediction of DNA-binding residues from sequence information using convolutional neural network
    Zhou, Jiyun
    Lu, Qin
    Xu, Ruifeng
    Gui, Lin
    Wang, Hongpeng
    INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS, 2017, 17 (02) : 132 - 152
  • [39] Predicting subcellular location of protein with evolution information and sequence-based deep learning
    Liao, Zhijun
    Pan, Gaofeng
    Sun, Chao
    Tang, Jijun
    BMC BIOINFORMATICS, 2021, 22 (SUPPL 10)
  • [40] DNA-binding protein prediction based on deep transfer learning
    Yan, Jun
    Jiang, Tengsheng
    Liu, Junkai
    Lu, Yaoyao
    Guan, Shixuan
    Li, Haiou
    Wu, Hongjie
    Ding, Yijie
    MATHEMATICAL BIOSCIENCES AND ENGINEERING, 2022, 19 (08) : 7719 - 7736