Multi-Label Learning With Fuzzy Hypergraph Regularization for Protein Subcellular Location Prediction

Cited by: 12
Authors
Chen, Jing [1 ,2 ]
Tang, Yuan Yan [1 ,2 ]
Chen, C. L. Philip [1 ]
Fang, Bin [3 ]
Lin, Yuewei [4 ]
Shang, Zhaowei [3 ]
Affiliations
[1] Univ Macau, Fac Sci & Technol, Taipa, Macau, Peoples R China
[2] Chongqing Univ, Chongqing 400030, Peoples R China
[3] Chongqing Univ, Coll Comp Sci, Chongqing 400030, Peoples R China
[4] Univ S Carolina, Columbia, SC 29208 USA
Funding
National Natural Science Foundation of China (Major Program);
Keywords
Dictionary learning; hypergraph regularization; multi-label learning; protein subcellular localization; AMINO-ACID-COMPOSITION; SUPPORT VECTOR MACHINES; POSITIVE BACTERIAL PROTEINS; AVERAGE CHEMICAL-SHIFT; GRAM-NEGATIVE-BACTERIA; GENERAL-FORM; EVOLUTIONARY INFORMATION; ENSEMBLE CLASSIFIER; CHOUS PSEAAC; LOCALIZATION PREDICTION;
DOI
10.1109/TNB.2014.2341111
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Protein subcellular location prediction aims to predict, by computational means, where a protein resides within a cell. To address the main limitations of existing methods, we propose FHML, a hierarchical multi-label learning model that handles both single-location and multi-location proteins. Latent concepts are extracted through feature space decomposition and label space decomposition under a nonnegative data factorization framework, and are then used as a codebook that indirectly connects protein features to their annotations. We construct dual fuzzy hypergraphs to capture the intrinsic high-order relations embedded in both the feature space and the label space. Finally, subcellular location annotations are propagated from labeled proteins to unlabeled proteins through dual fuzzy hypergraph Laplacian regularization. Experimental results on six protein benchmark datasets show that the proposed method outperforms state-of-the-art methods and illustrate the benefit of exploiting both feature correlations and label correlations.
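The propagation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the FHML model: it omits the nonnegative factorization, the codebook, and the label-space hypergraph, and it assumes the widely used normalized hypergraph Laplacian applied to a single fuzzy incidence matrix. The function names, the toy incidence matrix `H`, and the trade-off parameter `mu` are hypothetical choices made for illustration only.

```python
import numpy as np

def hypergraph_laplacian(H, w=None):
    """Normalized hypergraph Laplacian L = I - Dv^-1/2 H W De^-1 H^T Dv^-1/2.

    H is an (n_vertices, n_edges) incidence matrix whose entries in [0, 1]
    act as fuzzy membership degrees (an assumption of this sketch).
    Every vertex and every hyperedge must have nonzero degree.
    """
    H = np.asarray(H, dtype=float)
    n, m = H.shape
    w = np.ones(m) if w is None else np.asarray(w, dtype=float)
    dv = H @ w                        # vertex degrees
    de = H.sum(axis=0)                # hyperedge degrees
    Hn = H / np.sqrt(dv)[:, None]     # Dv^-1/2 H
    theta = (Hn * (w / de)) @ Hn.T    # Dv^-1/2 H W De^-1 H^T Dv^-1/2
    return np.eye(n) - theta

def propagate_labels(L, Y, mu=1.0):
    """Minimize tr(F^T L F) + mu * ||F - Y||_F^2 over F.

    Setting the gradient to zero gives (L + mu * I) F = mu * Y.
    """
    n = L.shape[0]
    return np.linalg.solve(L + mu * np.eye(n), mu * Y)

# Toy data (hypothetical): 5 proteins, 2 fuzzy hyperedges, 2 locations.
H = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.8, 0.0],
    [0.0, 1.0],
    [0.2, 0.7],
])
# Only protein 0 (location 0) and protein 3 (location 1) are labeled.
Y = np.zeros((5, 2))
Y[0, 0] = 1.0
Y[3, 1] = 1.0

L = hypergraph_laplacian(H)
F = propagate_labels(L, Y, mu=1.0)
print(np.round(F, 3))  # soft scores per protein (rows) and location (columns)
```

In this toy setting, unlabeled proteins inherit higher scores for the location of the labeled protein with which they share a hyperedge most strongly; a multi-label prediction could then be obtained by thresholding each row of `F`, where the threshold is a further assumption.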
Pages: 438-447
Page count: 10