Inadequacy of Evolutionary Profiles Vis-a-vis Single Sequences in Predicting Transient DNA-Binding Sites in Proteins

Cited by: 2
Authors
Arya, Ajay [1]
Varghese, Dana Mary [1]
Verma, Ajay Kumar [1]
Ahmad, Shandar [1]
Affiliations
[1] Jawaharlal Nehru Univ, Sch Computat & Integrat Sci, New Delhi 110067, India
Keywords
Protein-DNA interactions; DNA-binding sites; Specificity determining positions; Transient and conserved binding sites; WEB SERVER; RESIDUES;
DOI
10.1016/j.jmb.2022.167640
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Sequence-based prediction of DNA-binding residues in a protein is a widely studied problem for which machine learning methods with continuously improving predictive power have been developed. Concatenated rows within a sliding window of a Position Specific Substitution Matrix (PSSM) of the protein are currently used as the primary feature set in almost all methods of predicting DNA-binding residues. Here we report that these evolutionary profiles are powerful only for identifying conserved binding sites and fall short for residue positions that undergo binding-to-non-binding transitions in closely related proteins. We created a database of highly similar protein pairs with known protein-DNA complexes and investigated the differential predictability of conserved and transient binding residues within each pair. Retraining machine learning models uniformly, we compared the predictive power of models trained on PSSMs against similarly trained models on sparse-encoded single sequences. We found that, under controlled experiments, single-sequence-based models outperform evolutionary profiles in predicting transient binding sites by as much as 8 percentage points. Thus, we conclude that PSSM-based models are inadequate for predicting high-specificity DNA-binding residues. These findings are of critical significance for the design of mutant- and species-specific DNA ligands and for homology-based modeling of protein-DNA complexes. © 2022 Elsevier Ltd. All rights reserved.
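The two feature sets compared in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the window half-width, and the zero-padding at sequence ends are assumptions made for illustration, and the PSSM is taken to be a plain list of per-residue score rows.

```python
# Illustrative sketch only; names, window size, and padding are assumptions,
# not details taken from the paper.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(residue):
    """Sparse (one-hot) encoding of one residue; unknown residues give all zeros."""
    return [1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]


def window_features(rows, half_window):
    """Concatenate the per-residue rows inside a window centred on each position.

    Window positions that fall outside the sequence are zero-padded (assumed).
    """
    if not rows:
        return []
    padding = [0.0] * len(rows[0])
    features = []
    for center in range(len(rows)):
        vector = []
        for pos in range(center - half_window, center + half_window + 1):
            vector.extend(rows[pos] if 0 <= pos < len(rows) else padding)
        features.append(vector)
    return features


def sparse_sequence_features(sequence, half_window=3):
    """Single-sequence features: windowed one-hot rows."""
    return window_features([one_hot(r) for r in sequence], half_window)


def pssm_features(pssm_rows, half_window=3):
    """Evolutionary-profile features: windowed PSSM rows."""
    return window_features([[float(v) for v in row] for row in pssm_rows], half_window)
```

Because both functions return one fixed-length vector per residue, the same classifier can be retrained on either representation, which is the kind of controlled comparison the abstract describes.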
Pages: 11
Related papers
18 in total
  • [1] Using evolutionary and structural information to predict DNA-binding sites on DNA-binding proteins
    Kuznetsov, Igor B.
    Gou, Zhenkun
    Li, Run
    Hwang, Seungwoo
    PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2006, 64 (01) : 19 - 27
  • [2] Predicting Target DNA Sequences of DNA-Binding Proteins Based on Unbound Structures
    Chen, Chien-Yu
    Chien, Ting-Ying
    Lin, Chih-Kang
    Lin, Chih-Wei
    Weng, Yi-Zhong
    Chang, Darby Tien-Hao
    PLOS ONE, 2012, 7 (02):
  • [4] Predicting DNA-binding sites of proteins from amino acid sequence
    Yan, Changhui
    Terribilini, Michael
    Wu, Feihong
    Jernigan, Robert L.
    Dobbs, Drena
    Honavar, Vasant
    BMC BIOINFORMATICS, 2006, 7 (1)
  • [5] Identification of DNA-binding proteins using support vector machines and evolutionary profiles
    Kumar, Manish
    Gromiha, Michael M.
    Raghava, Gajendra P. S.
    BMC BIOINFORMATICS, 2007, 8 (1)
  • [7] Combining Biochemical Features and Evolutionary Information for Predicting DNA-Binding Residues in Protein Sequences
    Wang, Liangjiang
    ADVANCES IN COMPUTATIONAL SCIENCE AND ENGINEERING, 2009, 28 : 176 - 189
  • [9] Analysis and classification of DNA-binding sites in single-stranded and double-stranded DNA-binding proteins using protein information
    Wang, Wei
    Liu, Juan
    Xiong, Yi
    Zhu, Lida
    Zhou, Xionghui
    IET SYSTEMS BIOLOGY, 2014, 8 (04) : 176 - 183
  • [10] Predicting DNA-binding sites of proteins based on sequential and 3D structural information
    Li, Bi-Qing
    Feng, Kai-Yan
    Ding, Juan
    Cai, Yu-Dong
    MOLECULAR GENETICS AND GENOMICS, 2014, 289 (03) : 489 - 499