Multi-Label Learning With Fuzzy Hypergraph Regularization for Protein Subcellular Location Prediction

Cited by: 12
Authors
Chen, Jing [1 ,2 ]
Tang, Yuan Yan [1 ,2 ]
Chen, C. L. Philip [1 ]
Fang, Bin [3 ]
Lin, Yuewei [4 ]
Shang, Zhaowei [3 ]
Affiliations
[1] Univ Macau, Fac Sci & Technol, Taipa, Macau, Peoples R China
[2] Chongqing Univ, Chongqing 400030, Peoples R China
[3] Chongqing Univ, Coll Comp Sci, Chongqing 400030, Peoples R China
[4] Univ S Carolina, Columbia, SC 29208 USA
Funding
Major Program of the National Natural Science Foundation of China;
Keywords
Dictionary learning; hypergraph regularization; multi-label learning; protein subcellular localization; AMINO-ACID-COMPOSITION; SUPPORT VECTOR MACHINES; POSITIVE BACTERIAL PROTEINS; AVERAGE CHEMICAL-SHIFT; GRAM-NEGATIVE-BACTERIA; GENERAL-FORM; EVOLUTIONARY INFORMATION; ENSEMBLE CLASSIFIER; CHOUS PSEAAC; LOCALIZATION PREDICTION;
DOI
10.1109/TNB.2014.2341111
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Protein subcellular location prediction aims to predict where a protein resides within a cell using computational methods. To address the main limitations of existing methods, we propose FHML, a hierarchical multi-label learning model for both single-location and multi-location proteins. Latent concepts are extracted through feature space decomposition and label space decomposition under a nonnegative data factorization framework. The extracted latent concepts serve as a codebook that indirectly connects protein features to their annotations. We construct dual fuzzy hypergraphs to capture the intrinsic high-order relations embedded not only in the feature space but also in the label space. Finally, subcellular location annotations are propagated from labeled proteins to unlabeled proteins through dual fuzzy hypergraph Laplacian regularization. Experimental results on six protein benchmark datasets show that the proposed method outperforms state-of-the-art methods and illustrate the benefit of exploiting both feature correlations and label correlations.
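The propagation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' FHML model: it uses a single hypergraph rather than dual ones, omits the nonnegative factorization stage, and assumes the widely used normalized hypergraph Laplacian with soft (fuzzy) membership values in the incidence matrix. The function names, the toy incidence matrix, and the parameters `mu` and `ridge` are all hypothetical choices made for illustration.

```python
import numpy as np

def fuzzy_hypergraph_laplacian(H, w=None):
    """Normalized Laplacian L = I - Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}.

    H is an (n_vertices, n_edges) incidence matrix whose entries in [0, 1]
    are read as fuzzy membership degrees (an assumption of this sketch).
    """
    H = np.asarray(H, dtype=float)
    n, m = H.shape
    w = np.ones(m) if w is None else np.asarray(w, dtype=float)
    de = H.sum(axis=0)   # hyperedge degrees
    dv = H @ w           # weighted vertex degrees
    inv_de = np.divide(1.0, de, out=np.zeros_like(de), where=de > 0)
    inv_sqrt_dv = np.divide(1.0, np.sqrt(dv), out=np.zeros_like(dv), where=dv > 0)
    S = inv_sqrt_dv[:, None] * H          # Dv^{-1/2} H
    theta = (S * (w * inv_de)) @ S.T      # Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}
    return np.eye(n) - theta

def propagate_labels(L, Y, labeled, mu=1.0, ridge=1e-8):
    """Minimize tr(F^T L F) + mu * ||M (F - Y)||_F^2 over F.

    M is the diagonal 0/1 mask of labeled vertices. The small ridge term
    keeps the linear system well conditioned; it is a numerical convenience.
    """
    n = L.shape[0]
    M = np.diag(np.asarray(labeled, dtype=float))
    A = L + mu * M + ridge * np.eye(n)
    return np.linalg.solve(A, mu * (M @ Y))

# Toy data: 4 proteins, 2 fuzzy hyperedges, 2 candidate locations.
H = np.array([[1.0, 0.1],
              [0.9, 0.2],
              [0.2, 0.9],
              [0.1, 1.0]])
Y = np.array([[1.0, 0.0],   # protein 0 is labeled with location 0
              [0.0, 0.0],   # unlabeled
              [0.0, 0.0],   # unlabeled
              [0.0, 1.0]])  # protein 3 is labeled with location 1
labeled = np.array([True, False, False, True])

L = fuzzy_hypergraph_laplacian(H)
F = propagate_labels(L, Y, labeled)
pred = F.argmax(axis=1)  # single best location per protein
print(pred)
```

In this toy setting the two unlabeled proteins take the location of the labeled protein with which they share the stronger hyperedge membership. A multi-label variant would threshold each row of `F` rather than take the argmax.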
Pages: 438-447
Number of pages: 10