Coding of amino acids by texture descriptors

Cited by: 4
Authors
Nanni, Loris [1]
Lumini, Alessandra [1]
Affiliations
[1] Univ Bologna, DEIS, I-47023 Cesena, Italy
Keywords
Protein classification; Peptide classification; Vaccine development; Local binary patterns; Discrete cosine transform; Support vector machine; PROTEASE CLEAVAGE SITES; SUPPORT VECTOR MACHINES; SUBCELLULAR LOCATION; NEURAL-NETWORK; WEB-SERVER; PREDICTION; ENSEMBLE; MODEL
DOI
10.1016/j.artmed.2009.10.001
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Objective: In this paper we propose a new feature extractor for peptide/protein classification based on the calculation of texture descriptors. Representing a peptide/protein by a matrix descriptor, instead of a vector, makes it possible to treat the peptide/protein as an image and to use texture descriptors to represent it. Methods and materials: A matrix descriptor, a square matrix whose dimension equals the length of the peptide/protein, is obtained from a partial ordering of the amino acids of the peptide/protein according to their value of a given physicochemical property. Each matrix descriptor is treated as a texture image, and several texture descriptors are evaluated to obtain a compact representation that is scale invariant (i.e. independent of the length of the peptide/protein). The texture descriptors tested in this work are local binary patterns (LBP), the discrete cosine transform (DCT) and Daubechies wavelets. Results and conclusion: The experimental section reports several tests, aimed at supporting our ideas, performed on the following datasets: a vaccine dataset for the prediction of peptides that bind human leukocyte antigens; a human immunodeficiency virus (HIV-1) protease cleavage site prediction dataset; and a membrane protein type dataset. The experimental results confirm the usefulness of the novel descriptors: the performance obtained by our system on the three difficult datasets is quite high, indicating that the proposed method is a feasible system for extracting information from peptides and proteins. The performance obtained by each of the three texture descriptors calculated from the matrix-based representation, and coupled to a support vector machine classifier, is lower than the performance obtained by other vector-based descriptors based on physicochemical properties proposed in the literature. Nevertheless, the new descriptors carry different information, and our tests show that the texture descriptors and the vector-based descriptors can be combined to improve the overall performance of the system. In particular, the proposed approach improves on the state-of-the-art results in two of the three tested problems (the HIV-1 protease cleavage site prediction dataset and the membrane protein type dataset). (C) 2009 Elsevier B.V. All rights reserved.
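The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact construction of the matrix descriptor, so the pairwise rule used here (entry (i, j) is 1 when residue i has a property value at least as large as residue j) is an assumption, as is the choice of the Kyte-Doolittle hydropathy scale as the physicochemical property. The function names `matrix_descriptor` and `lbp_histogram` are hypothetical, and the plain 8-neighbour LBP below may differ from the variant used by the authors.

```python
# Kyte-Doolittle hydropathy values, used here only as one example
# of a physicochemical property (an assumption, not taken from the abstract).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Offsets of the 8 neighbours of a pixel, visited clockwise from top-left.
NEIGHBOUR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                     (1, 1), (1, 0), (1, -1), (0, -1)]


def matrix_descriptor(sequence, scale=KYTE_DOOLITTLE):
    """Build an L x L binary matrix for a sequence of length L.

    Assumed rule: entry (i, j) is 1 if the property value of residue i
    is >= that of residue j, and 0 otherwise.
    """
    values = [scale[aa] for aa in sequence]
    return [[1 if vi >= vj else 0 for vj in values] for vi in values]


def lbp_histogram(image):
    """Return the normalised 256-bin histogram of basic 8-neighbour LBP codes.

    The output length is fixed at 256 whatever the image size, which is
    what makes the representation independent of the sequence length.
    """
    rows, cols = len(image), len(image[0])
    if rows < 3 or cols < 3:
        raise ValueError("image must be at least 3x3")
    hist = [0] * 256
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            code = 0
            for bit, (dr, dc) in enumerate(NEIGHBOUR_OFFSETS):
                # Set the bit when the neighbour is not smaller than the centre.
                if image[r + dr][c + dc] >= image[r][c]:
                    code |= 1 << bit
            hist[code] += 1
    total = (rows - 2) * (cols - 2)
    return [count / total for count in hist]


# Two arbitrary example sequences of different length.
short_features = lbp_histogram(matrix_descriptor("ACDEFGHIK"))
long_features = lbp_histogram(matrix_descriptor("ACDEFGHIKLMNPQR"))
print(len(short_features), len(long_features))
```

Both feature vectors have 256 entries even though the sequences differ in length, so they could be passed to a fixed-input classifier such as a support vector machine. A DCT- or wavelet-based descriptor would need its own rule for producing a fixed-length vector, for example keeping a fixed number of low-frequency coefficients; the abstract does not say how this was done.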
Pages: 43-50
Number of pages: 8