A Novel Sequence-Based Method of Predicting Protein DNA-Binding Residues, Using a Machine Learning Approach

Cited by: 5
Authors
Cai, Yudong [1, 2]
He, ZhiSong [3 ]
Shi, Xiaohe [4, 5]
Kong, Xiangying [4, 5, 6]
Gu, Lei [7 ]
Xie, Lu [8 ]
Affiliations
[1] Shanghai Univ, Inst Syst Biol, Shanghai 200244, Peoples R China
[2] Fudan Univ, Ctr Computat Syst Biol, Shanghai 200433, Peoples R China
[3] Zhejiang Univ, Dept Bioinformat, Coll Life Sci, Hangzhou 310058, Zhejiang, Peoples R China
[4] Chinese Acad Sci, Shanghai Inst Biol Sci, Inst Hlth Sci, Beijing 100864, Peoples R China
[5] Shanghai Jiao Tong Univ, Sch Med, Shanghai, Peoples R China
[6] Shanghai Jiao Tong Univ, Ruijin Hosp, State Key Lab Med Genom, Shanghai 200025, Peoples R China
[7] Fraunhofer Inst Algorithms & Sci Comp, Dept Bioinformat, Aachen, Germany
[8] Shanghai Ctr Bioinformat Technol, Shanghai 200235, Peoples R China
Keywords
bioinformatics; data mining; machine learning; mRMR; protein-DNA interaction; SITES; INFORMATION; IDENTIFICATION; RECOGNITION; MODELS; MOTIFS; DOMAIN; P53
DOI
10.1007/s10059-010-0093-0
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Protein-DNA interactions play an essential role in transcriptional regulation, DNA repair, and many other vital biological processes. The mechanism of protein-DNA binding, however, remains unclear. To study many diseases, researchers need a better understanding of the amino acid motifs that recognize DNA. Because identifying these motifs experimentally is expensive and time-consuming, computational prediction is needed. Some in silico methods have been developed, but they still have considerable limitations. In this study, we used a machine learning approach to develop a new sequence-based method of predicting protein-DNA binding residues. As features, we used the properties of each amino acid's micro-environment, taken from the AAIndex, together with conservation scores. In cross-validation tests, the method achieved an overall accuracy of 94.89%. Our results show that the amino acid micro-environment is important for DNA binding and that it can be used to identify protein-DNA binding sites.
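The abstract does not spell out how a residue's micro-environment is encoded. One common way to do this in sequence-based predictors is a sliding window over the neighbouring residues, and the sketch below illustrates that idea only; it is not the authors' implementation. The function name `window_features`, the parameters `half_width` and `pad_value`, and the choice of the Kyte-Doolittle hydropathy scale (one of the many properties catalogued in AAindex) as the single stand-in property are all assumptions made for illustration. Conservation scores, mRMR feature selection, and the classifier itself are left out.

```python
# Kyte-Doolittle hydropathy values, used here only as a stand-in
# for whichever AAindex properties a real predictor would select.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_features(sequence, half_width=2, pad_value=0.0):
    """Return one feature vector per residue.

    Each vector holds the property value of the residue and of its
    `half_width` neighbours on either side. Positions that fall off
    either end of the sequence are filled with `pad_value`.
    """
    features = []
    for i in range(len(sequence)):
        row = []
        for j in range(i - half_width, i + half_width + 1):
            if 0 <= j < len(sequence):
                row.append(HYDROPATHY[sequence[j]])
            else:
                row.append(pad_value)
        features.append(row)
    return features


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from the paper's data set.
    for residue, vector in zip("MKRAL", window_features("MKRAL")):
        print(residue, vector)
```

Each residue then becomes one training example: its window vector is the input and a binding/non-binding label is the target. In a fuller pipeline, several properties and a per-position conservation score would be concatenated into the same vector before feature selection.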
Pages: 99-105
Page count: 7