An ensemble of reduced alphabets with protein encoding based on grouped weight for predicting DNA-binding proteins

Cited by: 24
Authors
Nanni, Loris [1]
Lumini, Alessandra [1]
Affiliations
[1] Univ Bologna, DEIS, CNR, IEIIT, I-40136 Bologna, Italy
Keywords
Multi-classifier; Amino-acid alphabets; Support vector machine; DNA-binding proteins; Ensemble classifier; AMINO-ACID-COMPOSITION; SUPPORT VECTOR MACHINE; SUBCELLULAR LOCATION PREDICTION; STRUCTURAL CLASS PREDICTION; COMPLEXITY MEASURE FACTOR; ENZYME SUBFAMILY CLASSES; IMPROVED HYBRID APPROACH; WEB-SERVER; CELLULAR-AUTOMATA; SUBNUCLEAR LOCALIZATION;
DOI
10.1007/s00726-008-0044-7
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
It is well known in the literature that an ensemble of classifiers obtains good performance with respect to that obtained by a stand-alone method. Hence, it is very important to develop ensemble methods well suited for bioinformatics data. In this work, we propose to combine the feature extraction method based on grouped weight with a set of amino-acid alphabets obtained by a Genetic Algorithm. The proposed method is applied for predicting DNA-binding proteins. As classifiers, the linear support vector machine and the radial basis function support vector machine are tested. As performance indicators, the accuracy and Matthews's correlation coefficient are reported. Matthews's correlation coefficient obtained by our ensemble method is approximately 0.97 when the jackknife cross-validation is used. This result outperforms the performance obtained in the literature using the same dataset where the features are extracted directly from the amino-acid sequence.
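For readers unfamiliar with the two performance indicators named in the abstract, the following is a minimal sketch of how accuracy and Matthews's correlation coefficient are computed from binary confusion-matrix counts. The function names and the example counts are illustrative assumptions; they are not taken from the paper and do not reproduce its results.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of correctly classified samples."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews's correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention,
    assumed here), i.e. when any row or column of the matrix is empty.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(accuracy(45, 45, 5, 5))
    print(matthews_corrcoef(45, 45, 5, 5))
```

Under jackknife (leave-one-out) cross-validation, each protein is predicted by a model trained on all the others, and the counts above are accumulated over those held-out predictions.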
Pages: 167 - 175
Page count: 9
Related papers
50 entries in total
  • [1] An ensemble of reduced alphabets with protein encoding based on grouped weight for predicting DNA-binding proteins
    Loris Nanni
    Alessandra Lumini
    Amino Acids, 2009, 36 : 167 - 175
  • [2] StackPDB: Predicting DNA-binding proteins based on XGB-RFE feature optimization and stacked ensemble classifier
    Zhang, Qingmei
    Liu, Peishun
    Wang, Xue
    Zhang, Yaqun
    Han, Yu
    Yu, Bin
    APPLIED SOFT COMPUTING, 2021, 99
  • [3] A sequence-based multiple kernel model for identifying DNA-binding proteins
    Qian, Yuqing
    Jiang, Limin
    Ding, Yijie
    Tang, Jijun
    Guo, Fei
    BMC BIOINFORMATICS, 2021, 22 (SUPPL 3)
  • [4] PseKNC and Adaboost-Based Method for DNA-Binding Proteins Recognition
    Yang, Lina
    Li, Xiangyu
    Shu, Ting
    Wang, Patrick
    Li, Xichun
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2021, 35 (07)
  • [5] Combing ontologies and dipeptide composition for predicting DNA-binding proteins
    Nanni, Loris
    Lumini, Alessandra
    AMINO ACIDS, 2008, 34 (04) : 635 - 641
  • [7] StackDPP: a stacking ensemble based DNA-binding protein prediction model
    Ahmed, Sheikh Hasib
    Bose, Dibyendu Brinto
    Khandoker, Rafi
    Rahman, M. Saifur
    BMC BIOINFORMATICS, 2024, 25 (01)
  • [9] Predicting Functional Interactions Among DNA-Binding Proteins
    Khushi, Matloob
    Choudhury, Nazim
    Arthur, Jonathan W.
    Clarke, Christine L.
    Graham, J. Dinny
    NEURAL INFORMATION PROCESSING (ICONIP 2018), PT V, 2018, 11305 : 70 - 80
  • [10] Prediction of DNA-binding proteins by interaction fusion feature representation and selective ensemble
    You, Wenjie
    Yang, Zijiang
    Guo, Guangbao
    Wan, Xiu-Feng
    Ji, Guoli
    KNOWLEDGE-BASED SYSTEMS, 2019, 163 : 598 - 610