Prediction of antioxidant proteins using hybrid feature representation method and random forest

Citations: 55
Authors
Ao, Chunyan [1, 2]
Zhou, Wenyang [3 ]
Gao, Lin [1 ]
Dong, Benzhi [4 ]
Yu, Liang [1 ]
Affiliations
[1] Xidian Univ, Sch Comp Sci & Technol, Xian, Peoples R China
[2] Univ Elect Sci & Technol China, Inst Fundamental & Frontier Sci, Chengdu, Peoples R China
[3] Harbin Inst Technol, Sch Life Sci & Technol, Ctr Bioinformat, Harbin, Peoples R China
[4] Northeast Forestry Univ, Coll Comp Sci & Engn, Harbin, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Antioxidant protein; Hybrid feature representation methods; MRMD; Random forest; AMINO-ACID-COMPOSITION; FREE-RADICALS; CLASSIFICATION; IDENTIFICATION; DISEASE; COVARIANCE; MECHANISM; SOFTWARE; DRUGS;
DOI
10.1016/j.ygeno.2020.08.016
Chinese Library Classification number
Q81 [Bioengineering (biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Natural antioxidant proteins, found mainly in plants and animals, interact to eliminate excess free radicals, protect cells and DNA from damage, and help prevent and treat some diseases. Accurate identification of antioxidant proteins is therefore important for developing new drugs and for research on related diseases. This article proposes a novel method that combines random forest with hybrid features to predict antioxidant proteins accurately. Four single feature extraction methods (188D, profile-based auto-cross covariance (ACC-PSSM), N-gram, and g-gap) and hybrid feature representation methods were used for feature extraction. Three feature selection methods (MRMD, t-SNE, and optimal feature set selection) were adopted to determine the optimal features. The new hybrid feature vectors derived by combining 188D with the other three features all achieved performance indicators ranging from 0.9550 to 0.9990. The novel method outperformed the other methods it was compared with.
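As an illustration of one of the single feature extraction methods named in the abstract, the following is a minimal sketch of g-gap dipeptide composition. It assumes the commonly used definition (the frequency of each ordered pair of the 20 standard amino acids separated by g residues); the abstract does not give the authors' exact formulation, and the function name, parameter names, and normalization choice here are hypothetical.

```python
from itertools import product

# The 20 standard amino acids, in a fixed order so feature indices are stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def g_gap_dipeptide_composition(sequence, gap=1):
    """Return a 400-dimensional g-gap dipeptide frequency vector.

    Assumption (not from the source): the feature for pair (a, b) is the
    number of positions i with sequence[i] == a and sequence[i + gap + 1] == b,
    divided by the number of valid pairs. Pairs containing a non-standard
    residue are skipped.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    seq = sequence.upper()
    total = 0
    for i in range(len(seq) - gap - 1):
        pair = seq[i] + seq[i + gap + 1]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        # Sequence too short (or no standard residues): return an all-zero vector.
        return [0.0] * len(pairs)
    return [counts[p] / total for p in pairs]


# Usage: a toy sequence, with one residue skipped between the paired positions.
features = g_gap_dipeptide_composition("ACDA", gap=1)
```

Vectors of this kind could be concatenated with other descriptors to form a hybrid representation and then passed to a classifier such as a random forest; that pipeline is not shown here.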
Pages: 4666-4674
Page count: 9