A novel prediction method for protein DNA-binding residues based on neighboring residue correlations

Cited by: 1
Authors
Song, Jiazhi [1, 2, 3]
Liu, Guixia [1, 3]
Jiang, Jingqing [2]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, 2699 Qianjin St, Changchun 130012, Jilin, Peoples R China
[2] Inner Mongolia Minzu Univ, Coll Comp Sci & Technol, Tongliao, Inner Mongolia, Peoples R China
[3] Jilin Univ, Coll Comp Sci & Technol, Dept Key Lab Symbol Computat & Knowledge Engn, Minist Educ, Changchun, Jilin, Peoples R China
Keywords
Bioinformatics; protein; machine learning; binding sites; sequence information; INTEGRATING SEQUENCE; DOMAIN; SITES;
DOI
10.1080/13102818.2022.2122871
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Accurately identifying protein DNA-binding residues is important for understanding the protein-DNA recognition mechanism and for annotating protein function. Many computational methods have been proposed to predict protein-DNA binding residues from protein sequence information; however, for severely imbalanced data such as protein-DNA binding datasets, the under-sampling technique applied by most previous methods cannot achieve satisfactory performance. In this study, an adjustment algorithm is proposed to offset the biased prediction results from the classifier. The adjustment algorithm uses the binding probability between the target residue and its neighboring residues to recover true binding residues that were wrongly predicted as non-binding. The proposed prediction method with the adjustment algorithm achieves an area under the ROC curve (AUC) of 0.926 and 0.866 on two benchmark datasets and 0.882 on the independent testing set, which demonstrates that the proposed method can efficiently predict specific residues for protein-DNA interactions.
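
The abstract does not spell out the adjustment rule, so the following is only an illustrative sketch of how a neighbor-based correction of this kind might look, not the authors' algorithm. It assumes a classifier has already produced a per-residue binding probability; the function name `adjust_predictions`, the window size, the rescue margin, and the neighbor count are all hypothetical choices made for the example.

```python
from typing import List


def adjust_predictions(
    probs: List[float],
    threshold: float = 0.5,          # assumed decision threshold of the base classifier
    window: int = 2,                 # assumed number of neighbors inspected on each side
    rescue_margin: float = 0.15,     # assumed: how far below threshold a residue may be rescued
    min_binding_neighbors: int = 2,  # assumed: neighbors that must already be predicted binding
) -> List[int]:
    """Hypothetical neighbor-based adjustment (NOT the paper's actual rule).

    A residue first predicted as non-binding is flipped to binding when
    enough residues in its sequence window are predicted binding and its
    own probability is only slightly below the threshold.
    """
    # Initial labels straight from the classifier probabilities.
    initial = [1 if p >= threshold else 0 for p in probs]
    adjusted = list(initial)
    n = len(probs)

    for i in range(n):
        if initial[i] == 1:
            continue  # only non-binding predictions are candidates for rescue
        lo, hi = max(0, i - window), min(n, i + window + 1)
        # Count binding predictions among neighbors, using the initial labels
        # so that one rescue cannot trigger a chain of further rescues.
        binding_neighbors = sum(initial[j] for j in range(lo, hi) if j != i)
        if (binding_neighbors >= min_binding_neighbors
                and probs[i] >= threshold - rescue_margin):
            adjusted[i] = 1

    return adjusted


# Toy per-residue probabilities (made up for illustration).
probs = [0.10, 0.70, 0.40, 0.80, 0.20, 0.05]
labels = adjust_predictions(probs)
print(labels)
```

In this toy input the third residue falls just below the threshold but sits between two residues predicted as binding, so it is relabeled as binding; residues with low probability are left unchanged even when their neighbors bind. Reading the neighbor labels from the initial predictions rather than the adjusted ones is a design choice that keeps the correction from propagating along the sequence.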
Pages: 865-877
Page count: 13