Prediction of Essential Proteins in Prokaryotes by Incorporating Various Physico-chemical Features into the General form of Chou's Pseudo Amino Acid Composition

Cited by: 26
Authors
Sarangi, Aditya Narayan [1]
Lohani, Mohtashim [2]
Aggarwal, Rakesh [1]
Affiliations
[1] Sanjay Gandhi Postgrad Inst Med Sci, Sch Telemed & Biomed Informat, Biomed Informat Ctr, Lucknow 226014, Uttar Pradesh, India
[2] Integral Univ, Dept Biotechnol, Lucknow 226026, Uttar Pradesh, India
Keywords
Machine learning; support vector machine; essential protein; classification; SUPPORT VECTOR MACHINES; OUTER-MEMBRANE PROTEINS; POTENTIAL-DRUG TARGETS; SUBCELLULAR-LOCALIZATION; CRYSTALLIZATION PROPENSITY; NETWORK TOPOLOGY; WEB SERVER; SEQUENCE; PSEAAC; IDENTIFICATION;
DOI
10.2174/0929866511320070008
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Prediction of the essential proteins of a pathogenic organism is key to identifying potential drug targets, because inhibiting these proteins would be fatal for the pathogen. Identifying them experimentally requires complex techniques that are expensive and time-consuming. We implemented a Support Vector Machine algorithm to develop a classifier model for in silico prediction of prokaryotic essential proteins based on the physico-chemical properties of their amino acid sequences. The classifier was designed around a set of 10 physico-chemical descriptor vectors (DVs) and 4 hybrid DVs calculated from amino acid sequences using the PROFEAT and PseAAC servers. It was trained on a data set of 500 known essential and 500 non-essential proteins (n=1,000) and evaluated on an external validation set of 3,462 essential and 5,538 non-essential proteins (n=9,000). The performance of each individual DV set was evaluated. DV set 13, which combines the composition, transition and distribution descriptor set with the hybrid autocorrelation descriptor set, achieved an accuracy of 91.2% in 10-fold cross-validation of the training set and 89.7% on the external validation set, and accuracies of 91.8% and 88.1% using a different yeast protein dataset. Our results indicate that this classification model can be used to identify novel prokaryotic essential proteins.
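To illustrate the kind of sequence-derived feature the abstract refers to, the following is a minimal sketch of the composition and transition parts of a composition/transition/distribution descriptor for a single property. It is not the authors' implementation and does not reproduce PROFEAT output: the three-class hydrophobicity grouping, the function names, and the example sequence are assumptions made here for illustration, and the distribution part is omitted.

```python
from collections import Counter

# Assumed three-class hydrophobicity grouping (illustrative only; the
# abstract does not state which groupings were used).
GROUPS = {
    "1": "RKEDQN",   # polar
    "2": "GASTPHY",  # neutral
    "3": "CLVIMFW",  # hydrophobic
}
AA_TO_CLASS = {aa: cls for cls, members in GROUPS.items() for aa in members}
CLASS_PAIRS = [("1", "2"), ("1", "3"), ("2", "3")]


def encode(sequence):
    """Map each residue to its class label, skipping unknown residues."""
    return [AA_TO_CLASS[aa] for aa in sequence.upper() if aa in AA_TO_CLASS]


def composition(classes):
    """Fraction of residues that fall in each class."""
    counts = Counter(classes)
    n = len(classes)
    return {c: (counts[c] / n if n else 0.0) for c in GROUPS}


def transition(classes):
    """Fraction of adjacent residue pairs that switch between two classes."""
    switches = Counter(
        frozenset((a, b)) for a, b in zip(classes, classes[1:]) if a != b
    )
    n = len(classes) - 1
    return {
        a + b: (switches[frozenset((a, b))] / n if n > 0 else 0.0)
        for a, b in CLASS_PAIRS
    }


def feature_vector(sequence):
    """Concatenate composition (3 values) and transition (3 values)."""
    classes = encode(sequence)
    comp = composition(classes)
    tran = transition(classes)
    return [comp[c] for c in GROUPS] + [tran[a + b] for a, b in CLASS_PAIRS]


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from any dataset in the paper.
    print(feature_vector("RGCRGC"))
```

Fixed-length vectors of this kind, computed per protein, are what a classifier such as a support vector machine would take as input; a real descriptor set would repeat the calculation over several properties.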
Pages: 781-795
Number of pages: 15
Related papers
50 in total
  • [31] OligoPred: A web-server for predicting homo-oligomeric proteins by incorporating discrete wavelet transform into Chou's pseudo amino acid composition
    Qiu, Jian-Ding
    Suo, Sheng-Bao
    Sun, Xing-Yu
    Shi, Shao-Ping
    Liang, Ru-Ping
    JOURNAL OF MOLECULAR GRAPHICS & MODELLING, 2011, 30 : 129 - 134
  • [32] Predict protein structural class by incorporating two different modes of evolutionary information into Chou's general pseudo amino acid composition
    Liang, Yunyun
    Zhang, Shengli
    JOURNAL OF MOLECULAR GRAPHICS & MODELLING, 2017, 78 : 110 - 117
  • [33] Prediction of G-protein coupled receptors and their subfamilies by incorporating various sequence features into Chou's general PseAAC
    Tiwari, Arvind Kumar
    COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2016, 134 : 197 - 213
  • [34] Prediction of Subcellular Localization of Apoptosis Protein Using Chou’s Pseudo Amino Acid Composition
    Hao Lin
    Hao Wang
    Hui Ding
    Ying-Li Chen
    Qian-Zhong Li
    Acta Biotheoretica, 2009, 57 : 321 - 330
  • [35] Predict protein structural class for low-similarity sequences by evolutionary difference information into the general form of Chou's pseudo amino acid composition
    Zhang, Lichao
    Zhao, Xiqiang
    Kong, Liang
    JOURNAL OF THEORETICAL BIOLOGY, 2014, 355 : 105 - 110
  • [36] Predicting Protein Fold Types by the General Form of Chou's Pseudo Amino Acid Composition: Approached from Optimal Feature Extractions
    Liu, Lei
    Hu, Xiu-Zhen
    Liu, Xing-Xing
    Wang, Ying
    Li, Shao-Bo
    PROTEIN AND PEPTIDE LETTERS, 2012, 19 (04) : 439 - 449
  • [37] Probabilistic expression of spatially varied amino acid dimers into general form of Chou's pseudo amino acid composition for protein fold recognition
    Saini, Harsh
    Raicar, Gaurav
    Sharma, Alok
    Lal, Sunil
    Dehzangi, Abdollah
    Lyons, James
    Paliwal, Kuldip K.
    Imoto, Seiya
    Miyano, Satoru
    JOURNAL OF THEORETICAL BIOLOGY, 2015, 380 : 291 - 298
  • [38] Predicting the Classification of Transcription Factors by Incorporating their Binding Site Properties into a Novel Mode of Chou's Pseudo Amino Acid Composition
    Ren, Liang-Yun
    Zhang, Yu-Sen
    Gutman, Ivan
    PROTEIN AND PEPTIDE LETTERS, 2012, 19 (11) : 1170 - 1176
  • [39] Predicting protein submitochondria locations by combining different descriptors into the general form of Chou’s pseudo amino acid composition
    Guo-Liang Fan
    Qian-Zhong Li
    Amino Acids, 2012, 43 : 545 - 555
  • [40] SecretP: Identifying bacterial secreted proteins by fusing Chou's pseudo-amino acid composition
    Yu, Lezheng
    Guo, Yanzhi
    Li, Yizhou
    Li, Gongbing
    Li, Menglong
    Luo, Jiesi
    Xiong, Wenjia
    Qin, Wenli
    JOURNAL OF THEORETICAL BIOLOGY, 2010, 267 (01) : 1 - 6