Identification of Multi-Functional Enzyme with Multi-Label Classifier

Cited by: 21
Authors
Che, Yuxin [1]
Ju, Ying [1]
Xuan, Ping [2]
Long, Ren [3]
Xing, Fei [4]
Affiliations
[1] Xiamen Univ, Sch Informat Sci & Technol, Xiamen 361005, Fujian, Peoples R China
[2] Heilongjiang Univ, Sch Comp Sci & Technol, Harbin 150080, Peoples R China
[3] Harbin Inst Technol, Shenzhen Grad Sch, Sch Comp Sci & Technol, Shenzhen 518055, Guangdong, Peoples R China
[4] Xiamen Univ, Sch Aerosp Engn, Xiamen 361005, Fujian, Peoples R China
Keywords
SEQUENCE-BASED PREDICTOR; MODIFIED MAHALANOBIS DISCRIMINANT; SUBCELLULAR-LOCALIZATION; MYCOBACTERIAL PROTEINS; P SYSTEMS; PSEUDO; DNA; GENERATION; MODES;
DOI
10.1371/journal.pone.0153503
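Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code
07; 0710; 09;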
Abstract
Enzymes are important and effective biological catalyst proteins that participate in almost all active cell processes. Identifying multi-functional enzymes is essential to understanding enzyme function. Machine learning methods perform better in protein structure and function prediction than traditional biological wet experiments. Thus, in this study, we explore an efficient and effective machine learning method to categorize enzymes according to their function. Multi-functional enzymes are predicted with a special machine learning strategy, namely, a multi-label classifier. Sequence features are extracted from a position-specific scoring matrix (PSSM) with the auto-cross covariance transformation. Experimental results show that the proposed method obtains an accuracy rate of 94.1% in classifying six main functional classes through five cross-validation tests and outperforms state-of-the-art methods. In addition, 91.25% accuracy is achieved in multi-functional enzyme prediction, which is often ignored in other enzyme function prediction studies. The online prediction server and datasets can be accessed at http://server.malab.cn/MEC/.
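For readers unfamiliar with the feature extraction step named in the abstract, the following is a minimal sketch of an auto-cross covariance transformation applied to a PSSM. It is not the authors' implementation. The function name `acc_features`, the `max_lag` parameter, and the toy matrix are hypothetical, and the normalization (dividing by the number of residue pairs at each lag) follows a commonly used definition that the abstract itself does not spell out.

```python
def acc_features(pssm, max_lag):
    """Auto-cross covariance features for a PSSM given as a list of rows.

    Assumption: `pssm` has one row per residue and one column per amino
    acid type (20 columns for a real PSSM). For each lag in 1..max_lag and
    each ordered column pair (i1, i2), the feature is the mean product of
    the centered score of column i1 at position j and the centered score
    of column i2 at position j + lag. Pairs with i1 == i2 are the
    auto-covariance terms; the rest are cross-covariance terms.
    """
    length = len(pssm)
    n_cols = len(pssm[0])
    if not 0 < max_lag < length:
        raise ValueError("max_lag must be between 1 and sequence length - 1")

    # Center each column on its mean over the whole sequence.
    means = [sum(row[i] for row in pssm) / length for i in range(n_cols)]
    centered = [[row[i] - means[i] for i in range(n_cols)] for row in pssm]

    features = []
    for lag in range(1, max_lag + 1):
        span = length - lag  # number of residue pairs separated by `lag`
        for i1 in range(n_cols):
            for i2 in range(n_cols):
                total = sum(
                    centered[j][i1] * centered[j + lag][i2] for j in range(span)
                )
                features.append(total / span)
    return features


# Toy 4-residue, 2-column matrix (hypothetical values, not real PSSM scores).
toy_pssm = [[1, 0], [2, 1], [3, 0], [4, 1]]
toy_features = acc_features(toy_pssm, max_lag=1)
```

Under these assumptions the output is a fixed-length vector of `max_lag * n_cols * n_cols` values regardless of sequence length, which is what lets proteins of different lengths be fed to a single classifier.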
Pages: 13