Gram-positive and gram-negative subcellular localization using rotation forest and physicochemical-based features

Cited by: 24
Authors
Dehzangi, Abdollah [1, 2]
Sohrabi, Sohrab [3]
Heffernan, Rhys [3]
Sharma, Alok [1, 4]
Lyons, James [3]
Paliwal, Kuldip [3]
Sattar, Abdul [1, 2]
Affiliations
[1] Griffith Univ, Inst Integrated & Intelligent Syst, Brisbane, Qld 4111, Australia
[2] Natl ICT Australia NICTA, Brisbane, Qld, Australia
[3] Griffith Univ, Sch Engn, Brisbane, Qld 4111, Australia
[4] Univ S Pacific, Sch Engn, Suva, Fiji
Funding
Australian Research Council
Keywords
FOLD PREDICTION-PROBLEM; AMINO-ACID-COMPOSITION; PROTEINS; ENSEMBLE; CLASSIFIER; LOCATIONS; FUSION; PLOC;
DOI
10.1186/1471-2105-16-S4-S1
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: The function of a protein depends on its location in the cell, so predicting protein subcellular localization is an important step towards protein function prediction. Recent studies have shown that using Gene Ontology (GO) for feature extraction can improve prediction performance. However, GO annotations are not available for newly sequenced proteins, and in these cases the prediction performance of GO-based methods degrades significantly. Results: In this study, we develop a method to effectively employ physicochemical and evolutionary-based information in the protein sequence. To do this, we propose a segmentation-based feature extraction method that explores potential discriminatory information based on the physicochemical properties of the amino acids, and we apply it to Gram-positive and Gram-negative subcellular localization. We explore the proposed feature extraction techniques using 10 attributes that were experimentally selected from a wide range of physicochemical attributes. Finally, by applying the Rotation Forest classification technique to the extracted features, we improve Gram-positive and Gram-negative subcellular localization accuracies by up to 3.4% over previous studies that used GO for feature extraction. Conclusion: By proposing a segmentation-based feature extraction method that explores potential discriminatory information based on the physicochemical properties of the amino acids, and by using the Rotation Forest classification technique, we significantly improve Gram-positive and Gram-negative subcellular localization prediction accuracies.
Pages: 8