Identification of Secretory Proteins in Mycobacterium tuberculosis Using Pseudo Amino Acid Composition

Cited by: 145
Authors
Yang, Huan [1]
Tang, Hua [2]
Chen, Xin-Xin [1]
Zhang, Chang-Jian [1]
Zhu, Pan-Pan [1,3]
Ding, Hui [1]
Chen, Wei [1,4]
Lin, Hao [1]
Affiliations
[1] Univ Elect Sci & Technol China, Ctr Informat Biol, Sch Life Sci & Technol, Minist Educ,Key Lab Neuroinformat, Chengdu 610054, Peoples R China
[2] Southwest Med Univ, Dept Pathophysiol, Luzhou 646000, Peoples R China
[3] Harbin Inst Technol, Key Lab Network Oriented Intelligent Computat, Shenzhen Grad Sch, Shenzhen 518055, Guangdong, Peoples R China
[4] North China Univ Sci & Technol, Ctr Genom & Computat Biol, Sch Sci, Dept Phys, Tangshan 063000, Peoples R China
Keywords
SUPPORT VECTOR MACHINES; PREDICTION; DIMENSION; SETS;
DOI
10.1155/2016/5413903
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Tuberculosis kills millions of people every year and ranks among the most serious public health problems. Recent findings suggest that the secretory proteins of Mycobacterium tuberculosis may be useful for developing specific vaccines and drugs because of their antigenicity. In response to this global infectious disease, we focused on identifying secretory proteins in Mycobacterium tuberculosis. A novel method called MycoSec was designed by incorporating g-gap dipeptide compositions into pseudo amino acid composition. An analysis of variance-based technique was applied for feature selection, and a total of 374 optimal features were obtained and used to construct the final prediction model. In the jackknife test, MycoSec achieved an area under the receiver operating characteristic curve of 0.93, demonstrating that the proposed system is powerful and robust. For users' convenience, the MycoSec web server was established, together with a manual on how to use it.
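The abstract names two building blocks, g-gap dipeptide composition and analysis of variance-based feature ranking, without defining them. The sketch below illustrates the commonly used definitions and is not the authors' MycoSec implementation. It assumes the 20 standard amino acids, a 400-dimensional vector per gap value g, and a one-way ANOVA F-score per feature. The function names are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumption: 20 standard residues only
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def g_gap_dipeptide_composition(sequence, g):
    """Frequency of residue pairs separated by g intervening residues.

    g = 0 gives the ordinary adjacent dipeptide composition.
    Pairs containing a non-standard residue are skipped, so the
    frequencies are normalized by the number of valid pairs.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - g - 1):
        pair = seq[i] + seq[i + g + 1]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def anova_f_score(groups):
    """One-way ANOVA F-score for a single feature.

    `groups` holds one list of feature values per class, for example
    secretory and non-secretory proteins. A larger score means the
    feature separates the classes better.
    """
    k = len(groups)
    n_total = sum(len(values) for values in groups)
    means = [sum(values) / len(values) for values in groups]
    grand_mean = sum(sum(values) for values in groups) / n_total
    between = sum(
        len(values) * (m - grand_mean) ** 2 for values, m in zip(groups, means)
    ) / (k - 1)
    within = sum(
        (x - m) ** 2 for values, m in zip(groups, means) for x in values
    ) / (n_total - k)
    if within == 0:
        return 0.0 if between == 0 else float("inf")
    return between / within


if __name__ == "__main__":
    features = g_gap_dipeptide_composition("ACAC", g=1)
    print(features[DIPEPTIDES.index("AA")])
    print(anova_f_score([[1, 2, 3], [4, 5, 6]]))
```

In a pipeline of this kind, features would be sorted by F-score and the top-ranked subset passed to the classifier. How the 374 features reported above were actually chosen is not detailed in this record.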
Pages: 7