Prediction of subcellular location of apoptosis proteins by incorporating PsePSSM and DCCA coefficient based on LFDA dimensionality reduction

Cited by: 57
Authors
Yu, Bin [1, 2, 3]
Li, Shan [1, 2]
Qiu, Wenying [1, 2]
Wang, Minghui [1, 2]
Du, Junwei [4]
Zhang, Yusen [5]
Chen, Xing [6]
Affiliations
[1] Qingdao Univ Sci & Technol, Coll Math & Phys, Qingdao 266061, Peoples R China
[2] Qingdao Univ Sci & Technol, Artificial Intelligence & Biomed Big Data Res Ctr, Qingdao 266061, Peoples R China
[3] Univ Sci & Technol China, Sch Life Sci, Hefei 230027, Anhui, Peoples R China
[4] Qingdao Univ Sci & Technol, Coll Informat Sci & Technol, Qingdao 266061, Peoples R China
[5] Shandong Univ Weihai, Sch Math & Stat, Weihai 264209, Peoples R China
[6] China Univ Min & Technol, Sch Informat & Control Engn, Xuzhou 21116, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Apoptosis proteins; Subcellular localization; Pseudo-position specific scoring matrix; Detrended cross-correlation analysis coefficient; Local fisher discriminant analysis; Support vector machine; SUPPORT VECTOR MACHINE; AMINO-ACID-COMPOSITION; SECONDARY STRUCTURE PREDICTION; NEGATIVE BACTERIAL PROTEINS; MULTI-LABEL PREDICTOR; PROGRAMMED CELL-DEATH; LOCALIZATION PREDICTION; ENSEMBLE CLASSIFIER; STRUCTURAL CLASS; ACCURATE PREDICTION;
DOI
10.1186/s12864-018-4849-9
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Subject classification codes
071005; 0836; 090102; 100705
Abstract
Background: Apoptosis is associated with several human diseases, including cancer, autoimmune disease, neurodegenerative disease and ischemic damage. Information on the subcellular localization of apoptosis proteins is very important for understanding the mechanism of programmed cell death and for drug development, yet predicting this localization remains a challenging task. Results: In this paper, we propose a novel method for predicting apoptosis protein subcellular localization, called PsePSSM-DCCA-LFDA. First, features are extracted from the protein sequences by combining the pseudo-position specific scoring matrix (PsePSSM) and the detrended cross-correlation analysis coefficient (DCCA coefficient); the dimensionality of the extracted features is then reduced by local Fisher discriminant analysis (LFDA). Finally, the optimal feature vectors are input to a support vector machine (SVM) classifier to predict the subcellular location of the apoptosis proteins. Overall prediction accuracies of 99.7%, 99.6% and 100% are achieved on the three benchmark datasets, respectively, under the most rigorous jackknife test, which is better than other state-of-the-art methods. Conclusion: The experimental results indicate that our method can significantly improve the prediction accuracy of subcellular localization of apoptosis proteins, and that this accuracy is high enough for the method to serve as a promising tool for further proteomics studies. The source code and all datasets are available at https://github.com/QUST-BSBRC/PsePSSM-DCCA-LFDA/.
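The abstract names its steps without giving formulas, so the following is only a minimal sketch of two of them, under stated assumptions. `pse_pssm` uses the commonly cited PsePSSM definition (20 column means plus, for each lag, 20 mean squared differences between rows that lag apart) and assumes the L x 20 matrix has already been normalized. `jackknife_accuracy` is a plain leave-one-out loop. All function names are hypothetical, the 1-nearest-neighbour classifier is a stand-in for the SVM, and the DCCA and LFDA steps are omitted. The authors' actual implementation is in the repository linked above and may differ.

```python
from typing import Callable, List

Matrix = List[List[float]]


def pse_pssm(pssm: Matrix, max_lag: int) -> List[float]:
    """Return the 20 + 20 * max_lag PsePSSM features of an L x 20 matrix."""
    length = len(pssm)
    if length == 0 or any(len(row) != 20 for row in pssm):
        raise ValueError("expected a non-empty L x 20 matrix")
    if not 0 <= max_lag < length:
        raise ValueError("max_lag must satisfy 0 <= max_lag < L")

    # Part 1: mean score of each of the 20 columns over the whole sequence.
    features = [sum(row[j] for row in pssm) / length for j in range(20)]

    # Part 2: for each lag, the per-column mean squared difference between
    # scores of residues that are `lag` positions apart.
    for lag in range(1, max_lag + 1):
        for j in range(20):
            total = sum(
                (pssm[i][j] - pssm[i + lag][j]) ** 2
                for i in range(length - lag)
            )
            features.append(total / (length - lag))
    return features


def nearest_neighbour(train_x: Matrix, train_y: List[str],
                      query: List[float]) -> str:
    """Stand-in classifier (1-NN, squared Euclidean); NOT the paper's SVM."""
    distances = [sum((a - b) ** 2 for a, b in zip(x, query)) for x in train_x]
    return train_y[distances.index(min(distances))]


def jackknife_accuracy(samples: Matrix, labels: List[str],
                       fit_predict: Callable[..., str]) -> float:
    """Leave-one-out accuracy: each sample is predicted from all the others."""
    correct = 0
    for k in range(len(samples)):
        train_x = samples[:k] + samples[k + 1:]
        train_y = labels[:k] + labels[k + 1:]
        if fit_predict(train_x, train_y, samples[k]) == labels[k]:
            correct += 1
    return correct / len(samples)
```

In this sketch each protein would contribute one `pse_pssm` vector, and `jackknife_accuracy` would be called on the list of vectors with their location labels. Swapping `nearest_neighbour` for an SVM wrapper with the same three-argument signature leaves the leave-one-out loop unchanged.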
Pages: 17