Combing ontologies and dipeptide composition for predicting DNA-binding proteins

Cited by: 29
Authors
Nanni, Loris [1]
Lumini, Alessandra [1]
Affiliations
[1] Univ Bologna, DEIS, CNR, IEIIT, I-40136 Bologna, Italy
Keywords
DNA-binding proteins; gene ontology; dipeptide composition; Chou's pseudo amino acid composition; multi-classifier
DOI
10.1007/s00726-007-0016-3
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Given a novel protein, it is important to know whether it is a DNA-binding protein, because DNA-binding proteins play a fundamental role in regulating gene expression. In this work, we propose a parallel fusion of a classifier trained on features extracted from the gene ontology database and a classifier trained on the dipeptide composition of the protein. The support vector machine (SVM) and the 1-nearest neighbour classifier are used. With jackknife cross-validation, the Matthews correlation coefficient obtained by our fusion method is approximately 0.97; this outperforms the best result reported in the literature on the same dataset (0.924), where the SVM is trained using only features based on Chou's pseudo amino acid composition. We also report the area under the ROC curve (AUC), and our results show that the fusion achieves an AUC of 0.995. In particular, our fusion obtains a 5% false negative rate at a 0% false positive rate. The Matthews correlation coefficient obtained using the single best GO number is only 0.7211, so the gene ontology database cannot be used as a simple lookup table. Finally, we test the complementarity of the two feature extraction methods using the Q-statistic. We obtain a value of 0.58, which means that the features extracted from the gene ontology database and those extracted from the amino acid sequence are partially independent and that their parallel fusion deserves further study.
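The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, and it uses the standard textbook definitions of dipeptide composition (fraction of adjacent residue pairs, 400 dimensions), the Matthews correlation coefficient, and Yule's Q-statistic between two classifiers, assuming these match the paper's usage. Skipping pairs that contain non-standard residues is also an assumption.

```python
from itertools import product
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
# All 400 ordered residue pairs, in a fixed order (AA, AC, ..., YY).
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-dim vector: fraction of adjacent pairs equal to each dipeptide."""
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for a, b in zip(sequence, sequence[1:]):
        pair = a + b
        if pair in counts:  # assumption: ignore pairs with non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def q_statistic(correct_a, correct_b):
    """Yule's Q between two classifiers, given per-sample correctness flags.

    Q near 1 means the classifiers err on the same samples; Q near 0 means
    their errors are statistically independent.
    """
    n11 = n00 = n10 = n01 = 0
    for a, b in zip(correct_a, correct_b):
        if a and b:
            n11 += 1      # both correct
        elif not a and not b:
            n00 += 1      # both wrong
        elif a:
            n10 += 1      # only the first classifier correct
        else:
            n01 += 1      # only the second classifier correct
    denom = n11 * n00 + n01 * n10
    return 0.0 if denom == 0 else (n11 * n00 - n01 * n10) / denom
```

Under these definitions, a Q-statistic of 0.58 lies between fully correlated errors (1) and independent errors (0), which is consistent with the abstract's description of the two feature sets as partially independent.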
Pages: 635-641
Page count: 7