Predicting subcellular localization of multi-label proteins by incorporating the sequence features into Chou's PseAAC

Cited by: 50
Authors
Javed, Faisal [1]
Hayat, Maqsood [1]
Affiliations
[1] Abdul Wali Khan Univ Mardan, Dept Comp Sci, Mardan, Pakistan
Keywords
Split Amino Acid Composition; Pseudo Amino Acid Composition; ML-KNN; Rank-SVM; SMOTE; AMINO-ACID-COMPOSITION; PSEUDO NUCLEOTIDE COMPOSITION; IDENTIFY RECOMBINATION SPOTS; FUSION CLASSIFIER; SITES; PLOC; BIOINFORMATICS; DISCRIMINATION; LOCATIONS; MODES;
DOI
10.1016/j.ygeno.2018.09.004
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005; 0836; 090102; 100705;
Abstract
The emergence of numerous genome projects has caused the number of known protein sequences to grow exponentially, making experimental determination of protein subcellular localization almost impossible. However, most existing predictors were developed only for single-location (single-plex) proteins and completely ignore proteins that occur at two or more locations in a cell. Few attempts have been made to predict multi-label protein localization, and those have achieved unsatisfactory accuracy. This paper presents a novel approach in which a discrete feature extraction method is fused with physicochemical properties of amino acids by using Chou's general form of Pseudo Amino Acid Composition (PseAAC). The technique is tested on two benchmark datasets, Gpos-mPLoc and Virus-mPLoc. The empirical results demonstrate that the proposed method yields better results with both examined classifiers, ML-KNN and Rank-SVM. The proposed model shows improved values on all performance measures considered in the comparison.
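The abstract does not name the discrete feature extraction method, but the keyword list mentions Split Amino Acid Composition. As a rough illustration only, the sketch below computes a split composition over three segments of a sequence (N-terminal, middle, C-terminal). The function names, the `terminal_len` default of 25, and the fallback for short sequences are assumptions made for this example, not parameters taken from the paper, and the fusion with physicochemical properties is not shown.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so feature positions are stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(segment):
    """Return the 20-dimensional relative frequency vector of a segment.

    Non-standard residues count toward the segment length but get no slot,
    so the vector may sum to less than 1 when they are present.
    """
    counts = Counter(segment)
    n = len(segment)
    return [counts[a] / n if n else 0.0 for a in AMINO_ACIDS]


def split_aac(sequence, terminal_len=25):
    """Concatenate compositions of the N-terminal, middle, and C-terminal parts.

    terminal_len is a hypothetical default; sequences too short for two
    terminal segments are split into rough thirds instead (an assumption).
    """
    seq = sequence.upper()
    if len(seq) < 2 * terminal_len:
        terminal_len = len(seq) // 3
    cut = len(seq) - terminal_len
    n_term = seq[:terminal_len]
    middle = seq[terminal_len:cut]
    c_term = seq[cut:]
    return composition(n_term) + composition(middle) + composition(c_term)


if __name__ == "__main__":
    features = split_aac("MKKLLPTAAAGLLLLAAQPAMAMKKLLPTAAAGLLLLAAQPAMAACDEFGHIKLMNPQRSTVWY")
    print(len(features))  # 60 features: 20 per segment
```

Each protein yields a fixed 60-dimensional vector regardless of its length, which is what allows such features to be passed to a multi-label classifier.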
Pages: 1325-1332
Number of pages: 8