Prediction of Essential Proteins in Prokaryotes by Incorporating Various Physico-chemical Features into the General form of Chou's Pseudo Amino Acid Composition

Cited by: 26
Authors
Sarangi, Aditya Narayan [1]
Lohani, Mohtashim [2]
Aggarwal, Rakesh [1]
Affiliations
[1] Sanjay Gandhi Postgrad Inst Med Sci, Sch Telemed & Biomed Informat, Biomed Informat Ctr, Lucknow 226014, Uttar Pradesh, India
[2] Integral Univ, Dept Biotechnol, Lucknow 226026, Uttar Pradesh, India
Keywords
Machine learning; support vector machine; essential protein; classification; SUPPORT VECTOR MACHINES; OUTER-MEMBRANE PROTEINS; POTENTIAL-DRUG TARGETS; SUBCELLULAR-LOCALIZATION; CRYSTALLIZATION PROPENSITY; NETWORK TOPOLOGY; WEB SERVER; SEQUENCE; PSEAAC; IDENTIFICATION;
DOI
10.2174/0929866511320070008
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Predicting the essential proteins of a pathogenic organism is key to identifying potential drug targets, because inhibiting these proteins would be fatal to the pathogen. Identifying them experimentally requires complex techniques that are expensive and time-consuming. We implemented a Support Vector Machine (SVM) algorithm to develop a classifier for in silico prediction of prokaryotic essential proteins based on the physico-chemical properties of their amino acid sequences. The classifier was designed using a set of 10 physico-chemical descriptor vectors (DVs) and 4 hybrid DVs calculated from amino acid sequences using the PROFEAT and PseAAC servers. It was trained on a data set of 500 known essential and 500 non-essential proteins (n=1,000) and evaluated on an external validation set of 3,462 essential and 5,538 non-essential proteins (n=9,000). The performance of each individual DV set was evaluated. DV set 13, which combines the composition, transition and distribution descriptor set with the hybrid autocorrelation descriptor set, achieved an accuracy of 91.2% in 10-fold cross-validation of the training set and 89.7% on the external validation set, and accuracies of 91.8% and 88.1% using a different yeast protein dataset. Our results indicate that this classification model can be used to identify novel prokaryotic essential proteins.
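To make the composition and transition descriptors mentioned in the abstract concrete, the following is a minimal Python sketch that computes them for a single sequence. It assumes the commonly used three-class hydrophobicity grouping (polar, neutral, hydrophobic). The function name `ctd_composition_transition` is hypothetical, the distribution descriptor is omitted, and this is not the PROFEAT implementation or necessarily the exact feature set used in the paper.

```python
from itertools import combinations

# Three-class hydrophobicity grouping often used for CTD descriptors.
# Assumption: the paper's PROFEAT run may use other or additional properties.
GROUPS = {
    "polar": "RKEDQN",
    "neutral": "GASTPHY",
    "hydrophobic": "CLVIMFW",
}
CLASS_OF = {aa: name for name, members in GROUPS.items() for aa in members}


def ctd_composition_transition(sequence):
    """Return (composition, transition) fractions for one protein sequence."""
    # Map each standard residue to its class; non-standard symbols are skipped.
    classes = [CLASS_OF[aa] for aa in sequence.upper() if aa in CLASS_OF]
    if len(classes) < 2:
        raise ValueError("need at least two standard residues")

    names = list(GROUPS)
    # Composition: fraction of residues that fall in each class.
    composition = {n: classes.count(n) / len(classes) for n in names}

    # Transition: fraction of adjacent residue pairs that switch between
    # two given classes, counted in either direction.
    pairs = list(zip(classes, classes[1:]))
    transition = {}
    for a, b in combinations(names, 2):
        hits = sum(1 for x, y in pairs if {x, y} == {a, b})
        transition[f"{a}-{b}"] = hits / len(pairs)
    return composition, transition


if __name__ == "__main__":
    comp, trans = ctd_composition_transition("RKGACL")
    print(comp)
    print(trans)
```

Vectors of this kind, concatenated across several physico-chemical properties, are the sort of fixed-length input that an SVM classifier can be trained on.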
Pages: 781-795
Number of pages: 15
Related papers
50 in total
  • [41] Using the concept of Chou's pseudo amino acid composition for risk type prediction of human papillomaviruses
    Esmaeili, Maryam
    Mohabatkar, Hassan
    Mohsenzadeh, Sasan
    JOURNAL OF THEORETICAL BIOLOGY, 2010, 263 (02) : 203 - 209
  • [42] Prediction of Cell Wall Lytic Enzymes Using Chou's Amphiphilic Pseudo Amino Acid Composition
    Ding, Hui
    Luo, Liaofu
    Lin, Hao
    PROTEIN AND PEPTIDE LETTERS, 2009, 16 (04) : 351 - 355
  • [43] Chou's pseudo amino acid composition improves sequence-based antifreeze protein prediction
    Mondal, Sukanta
    Pai, Priyadarshini P.
    JOURNAL OF THEORETICAL BIOLOGY, 2014, 356 : 30 - 35
  • [44] Protein sequence analysis by incorporating modified chaos game and physicochemical properties into Chou's general pseudo amino acid composition
    Xu, Chunrui
    Sun, Dandan
    Liu, Shenghui
    Zhang, Yusen
    JOURNAL OF THEORETICAL BIOLOGY, 2016, 406 : 105 - 115
  • [45] Analysis and prediction of ion channel inhibitors by using feature selection and Chou's general pseudo amino acid composition
    Mei, Juan
    Fu, Yi
    Zhao, Ji
    JOURNAL OF THEORETICAL BIOLOGY, 2018, 456 : 41 - 48
  • [46] Using Chou's General Pseudo Amino Acid Composition to Classify Laccases from Bacterial and Fungal Sources via Chou's Five-Step Rule
    Behbahani, Mandana
    Nosrati, Mokhtar
    Moradi, Mohammad
    Mohabatkar, Hassan
    APPLIED BIOCHEMISTRY AND BIOTECHNOLOGY, 2020, 190 (03) : 1035 - 1048
  • [47] Prediction subcellular localization of Gram-negative bacterial proteins by support vector machine using wavelet denoising and Chou's pseudo amino acid composition
    Yu, Bin
    Li, Shan
    Chen, Cheng
    Xu, Jiameng
    Qiu, Wenying
    Wu, Xue
    Chen, Ruixin
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2017, 167 : 102 - 112
  • [48] Predicting subcellular localization of mycobacterial proteins by using Chou's pseudo amino acid composition
    Lin, Hao
    Ding, Hui
    Guo, Feng-Biao
    Zhang, An-Ying
    Huang, Jian
    PROTEIN AND PEPTIDE LETTERS, 2008, 15 (07) : 739 - 744
  • [49] OOgenesis_Pred: A sequence-based method for predicting oogenesis proteins by six different modes of Chou's pseudo amino acid composition
    Rahimi, Maryam
    Bakhtiarizadeh, Mohammad Reza
    Mohammadi-Sangcheshmeh, Abdollah
    JOURNAL OF THEORETICAL BIOLOGY, 2017, 414 : 128 - 136
  • [50] iPhosH-PseAAC: Identify Phosphohistidine Sites in Proteins by Blending Statistical Moments and Position Relative Features According to the Chou's 5-Step Rule and General Pseudo Amino Acid Composition
    Awais, Muhammad
    Hussain, Waqar
    Khan, Yaser Daanial
    Rasool, Nouman
    Khan, Sher Afzal
    Chou, Kuo-Chen
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2021, 18 (02) : 596 - 610