Predicting Apoptosis Protein Subcellular Locations based on the Protein Overlapping Property Matrix and Tri-Gram Encoding

Cited by: 3
Authors
Yang, Yang [1]
Zheng, Huiwen [2]
Wang, Chunhua [3]
Xiao, Wanyue [1]
Liu, Taigang [3, 4]
Affiliations
[1] Shanghai Ocean Univ, AIEN Inst, Shanghai 201306, Peoples R China
[2] Univ Tasmania, Coll Sci & Engn, Hobart, Tas 7001, Australia
[3] Shanghai Ocean Univ, Coll Informat Technol, Shanghai 201306, Peoples R China
[4] Minist Agr, Key Lab Fisheries Informat, Shanghai 201306, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
tri-gram; protein overlapping property matrix; subcellular location; support vector machine; recursive feature elimination; AMINO-ACID-COMPOSITION; SEQUENCE-BASED PREDICTOR; LOCALIZATION PREDICTION; REPRESENTATION; CLASSIFICATION; IDENTIFICATION; PSEAAC; SITES;
DOI
10.3390/ijms20092344
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Understanding how programmed cell death works requires knowing the subcellular location of apoptosis proteins. Because experimental determination is costly and time-consuming, computational prediction schemes have become popular in recent decades; this research focuses mainly on new representation techniques for protein sequences and on the selection of classification algorithms. In this study, we propose a novel tri-gram encoding model based on the protein overlapping property matrix (POPM) for predicting the subcellular location of apoptosis proteins. The model represents each protein as a 1000-dimensional feature vector. We then use support vector machine-recursive feature elimination (SVM-RFE) to select the optimal features and feed them into a support vector machine (SVM) classifier for prediction. Jackknife tests on two benchmark datasets show that the proposed method achieves satisfactory prediction performance while requiring less computing capacity, and that it could serve as a promising tool for predicting the subcellular locations of apoptosis proteins.
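The abstract does not spell out how the 1000-dimensional vector is built, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the POPM is a binary L x 10 matrix marking which of ten overlapping physicochemical groups each residue belongs to, and that a tri-gram feature counts co-occurrences of groups (p, q, r) at three consecutive positions, giving 10 x 10 x 10 = 1000 features. The group memberships in `PROPERTY_GROUPS` and the function names `popm` and `trigram_features` are illustrative assumptions. Feature selection (SVM-RFE) and the SVM classifier are not shown.

```python
# Assumed overlapping property groups (illustrative; the abstract does not list them).
PROPERTY_GROUPS = {
    "polar": "NQSDECTKRHYW",
    "positive": "KHR",
    "negative": "DE",
    "charged": "KHRDE",
    "hydrophobic": "AGCTIVLKHFYWM",
    "aliphatic": "IVL",
    "aromatic": "FYWH",
    "small": "PNDTCAGSV",
    "tiny": "ASGC",
    "proline": "P",
}
GROUP_NAMES = list(PROPERTY_GROUPS)


def popm(sequence):
    """Return an L x 10 binary matrix: row i flags the groups containing residue i."""
    return [
        [1 if residue in PROPERTY_GROUPS[name] else 0 for name in GROUP_NAMES]
        for residue in sequence
    ]


def trigram_features(sequence):
    """Count group triples (p, q, r) over every window of three consecutive residues."""
    matrix = popm(sequence)
    k = len(GROUP_NAMES)
    features = [0] * (k ** 3)
    for i in range(len(matrix) - 2):
        first, second, third = matrix[i], matrix[i + 1], matrix[i + 2]
        for p in range(k):
            if not first[p]:
                continue
            for q in range(k):
                if not second[q]:
                    continue
                for r in range(k):
                    if third[r]:
                        # Flatten the (p, q, r) index into a single position.
                        features[(p * k + q) * k + r] += 1
    return features


if __name__ == "__main__":
    vector = trigram_features("MKVLAAGIVGLLLAQPS")  # hypothetical toy sequence
    print(len(vector), sum(1 for v in vector if v))
```

Because a residue can belong to several groups at once, a single three-residue window can increment many features; this overlap is what distinguishes the encoding from a plain tri-gram count over a reduced alphabet. In a full pipeline these raw counts would typically be normalized before feature selection, which is again an assumption rather than something the abstract states.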
Pages: 9