Feature Extraction Techniques for Protein Subcellular Localization Prediction

Cited by: 3
Authors
Gao, Qing-Bin [1]
Jin, Zhi-Chao [1]
Wu, Cheng [1]
Sun, Ya-Lin [1]
He, Jia [1]
He, Xiang [2]
Affiliations
[1] Second Mil Med Univ, Dept Hlth Stat, Shanghai 200433, Peoples R China
[2] Second Mil Med Univ, Fac Hlth Serv, Shanghai 200433, Peoples R China
Funding
Shanghai Natural Science Foundation; National Natural Science Foundation of China;
Keywords
Subcellular localization; protein function; feature extraction; protein encoding; computational biology; bioinformatics; SUPPORT VECTOR MACHINES; AMINO-ACID-COMPOSITION; FUNCTIONAL DOMAIN COMPOSITION; GRAM-NEGATIVE-BACTERIA; WEB-SERVER; ENSEMBLE CLASSIFIER; LOCATION PREDICTION; CLEAVAGE SITES; APOPTOSIS PROTEINS; FUSION CLASSIFIER;
DOI
10.2174/157489309788184765
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
To understand the structure and function of a protein, it is important to know where it occurs in the cell. A computational method that accurately predicts the subcellular location of proteins would therefore be valuable for interpreting the raw data produced by large-scale genome sequencing projects. Prediction of protein subcellular localization is now a hot topic in the bioinformatics community and has been studied extensively over the past several years. Investigators have proposed many computational methods, but the problem remains far from solved. Apart from the prediction algorithm itself, the main factor influencing the performance of these methods is the technique used to extract features that characterize proteins, i.e. the protein encoding scheme. To improve prediction performance, many different approaches have been taken, ranging from sorting-signal-based systems to machine learning, as well as a variety of alignment-free techniques based on the physicochemical properties of amino acid sequences. This review describes the inherent difficulties in developing a protein subcellular localization prediction method and surveys the feature extraction techniques previously employed in this area. It is intended to serve as a guide for readers working in this field.
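As a rough illustration of what a protein encoding scheme can look like, the sketch below computes two simple alignment-free feature vectors: amino acid composition (listed among the keywords of this record) and dipeptide composition. The function names, the 20-letter alphabet ordering, and the choice to skip non-standard residues are assumptions made for this example; they are not taken from the review itself.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code (ordering is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
_STANDARD = set(AMINO_ACIDS)


def amino_acid_composition(sequence):
    """Return a 20-dimensional vector of residue frequencies.

    Non-standard symbols (e.g. X, B, Z) are ignored, which is one of
    several possible conventions.
    """
    residues = [aa for aa in sequence.upper() if aa in _STANDARD]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    counts = Counter(residues)
    return [counts[aa] / len(residues) for aa in AMINO_ACIDS]


def dipeptide_composition(sequence):
    """Return a 400-dimensional vector of adjacent-pair frequencies.

    Only pairs in which both residues are standard are counted, so a
    non-standard symbol never joins two residues that were not adjacent.
    """
    seq = sequence.upper()
    pairs = [
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in _STANDARD and seq[i + 1] in _STANDARD
    ]
    if not pairs:
        raise ValueError("sequence contains no standard dipeptides")
    counts = Counter(pairs)
    return [counts[a + b] / len(pairs) for a, b in product(AMINO_ACIDS, repeat=2)]


if __name__ == "__main__":
    toy = "ACCA"  # toy sequence, not a real protein
    print(amino_acid_composition(toy)[:2])
    print(len(dipeptide_composition(toy)))
```

Vectors like these have a fixed length regardless of protein length, which is why they can be passed directly to a classifier such as a support vector machine. Composition features discard residue order, so they are usually combined with other encodings.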
Pages: 120-128
Number of pages: 9