Prediction of subcellular location of apoptosis proteins combining tri-gram encoding based on PSSM and recursive feature elimination

Cited by: 17
Authors
Liu, Taigang [1]
Tao, Peiying [2]
Li, Xiaowei [2]
Qin, Yufang [1]
Wang, Chunhua [1]
Affiliations
[1] Shanghai Ocean Univ, Coll Informat Technol, Shanghai 201306, Peoples R China
[2] Shanghai Ocean Univ, Coll Food Sci & Technol, Shanghai 201306, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Feature selection; Position-specific score matrix; Protein sequence representation; Support vector machine; AMINO-ACID-COMPOSITION; LOCALIZATION
DOI
10.1016/j.jtbi.2014.11.010
Chinese Library Classification number
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Knowledge of apoptosis proteins is important for understanding the mechanism of programmed cell death. Information on the subcellular location of apoptosis proteins helps reveal the apoptosis mechanism and clarify the function of these proteins. Because large-scale wet-bench experiments are costly in time and labor, computational prediction of the subcellular location of apoptosis proteins has become very important, and many computational tools have been developed in recent decades. Existing methods differ in the protein sequence representation techniques and classification algorithms they adopt. In this study, we first introduce a sequence encoding scheme based on tri-grams computed directly from position-specific score matrices (PSSMs), which incorporates both the evolutionary information represented in the PSI-BLAST profile and sequence-order information. The SVM-RFE algorithm is then applied for feature selection, and the reduced feature vectors are input to a support vector machine (SVM) classifier to predict the subcellular location of apoptosis proteins. Jackknife tests on three widely used datasets show that our method achieves state-of-the-art performance in comparison with other existing methods. (C) 2014 Elsevier Ltd. All rights reserved.
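The tri-gram encoding mentioned in the abstract can be illustrated with a short sketch. The abstract does not give the formula, so this assumes one common formulation of PSSM tri-grams: for an L x 20 matrix P, the feature T[m, n, r] is the sum over positions i of P[i, m] * P[i+1, n] * P[i+2, r], which yields 20 x 20 x 20 = 8000 features. The function name `trigram_pssm`, the random toy matrix, and the absence of any PSSM normalization are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def trigram_pssm(pssm):
    """Encode an (L, 20) PSSM as an 8000-dimensional tri-gram vector.

    Assumed formulation (not confirmed by the abstract):
    T[m, n, r] = sum_i P[i, m] * P[i + 1, n] * P[i + 2, r]
    """
    pssm = np.asarray(pssm, dtype=float)
    if pssm.ndim != 2 or pssm.shape[1] != 20:
        raise ValueError("expected a matrix of shape (L, 20)")
    if pssm.shape[0] < 3:
        raise ValueError("need at least 3 residues to form a tri-gram")

    # Three views of the matrix, shifted by 0, 1 and 2 positions.
    first, second, third = pssm[:-2], pssm[1:-1], pssm[2:]

    # Sum the triple products over all consecutive position triples.
    tensor = np.einsum("im,in,ir->mnr", first, second, third)

    # Flatten in row-major order: index = m * 400 + n * 20 + r.
    return tensor.ravel()


# Toy stand-in for a real PSI-BLAST profile (hypothetical data).
rng = np.random.default_rng(0)
toy_pssm = rng.random((50, 20))
features = trigram_pssm(toy_pssm)
print(features.shape)  # (8000,)
```

The vector length is independent of the sequence length, so proteins of different sizes map to the same feature space. In the pipeline the abstract describes, such vectors would then be reduced by SVM-RFE before being passed to the SVM classifier.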
Pages: 8-12
Page count: 5