Long sequence feature extraction based on deep learning neural network for protein secondary structure prediction

Citations: 0
Authors
Chen, Yehong [1]
Affiliation
[1] Qilu Univ Technol, Sch Printing & Packaging, Jinan, Shandong, Peoples R China
Source
2017 IEEE 3RD INFORMATION TECHNOLOGY AND MECHATRONICS ENGINEERING CONFERENCE (ITOEC) | 2017
Funding
National Natural Science Foundation of China;
Keywords
Sparse auto-encoder; Convolutional neural network; Self-taught learning; Feature extraction; Protein secondary structure prediction; Softmax classifier;
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
In this paper, a long sequence feature extraction method (LSFE) is proposed for protein secondary structure prediction. The proposed method is based on a deep learning architecture composed mainly of three layers: a sparse auto-encoder, a convolutional feature extraction layer, and a softmax classifier. A PSSM (position-specific scoring matrix) is used as the raw sequence representation. The sparse auto-encoder layer learns two groups of self-taught feature filters, one from 5-polypeptides and one from 13-polypeptides. Finally, the new representations of 35-polypeptides produced by the convolution layer are fed into the softmax classifier, which serves as the top shallow classifier, for fast prediction. The experimental results indicate that an overall accuracy (Q3) of around 74% on 25PDB is obtained within a very short waiting time. Hence this deep learning architecture breaks through the upper bound on window size in the state-of-the-art SVM+PSSM classifier and shows potential for future work on larger datasets.
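The data flow described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the author's implementation: the sigmoid activation, the number of filters per group (4), the random placeholder weights standing in for trained sparse auto-encoder filters, the absence of pooling, and all function names are assumptions. Only the window length (35), the patch lengths (5 and 13), the PSSM input, the softmax output, and the Q3 measure come from the abstract.

```python
import math
import random

N_AA = 20               # PSSM columns per residue (20 standard amino acids)
WINDOW = 35             # residues per input window, from the abstract
PATCH_SIZES = (5, 13)   # patch lengths the filters are learned from, from the abstract


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def convolve(window, filters, patch_len):
    """Slide each filter over the window; return activations[filter][position]."""
    n_pos = len(window) - patch_len + 1
    out = []
    for weights, bias in filters:
        row = []
        for start in range(n_pos):
            # Flatten the patch (patch_len residues x N_AA scores) into one vector.
            patch = [v for residue in window[start:start + patch_len] for v in residue]
            row.append(sigmoid(sum(w * v for w, v in zip(weights, patch)) + bias))
        out.append(row)
    return out


def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def q3(predicted, observed):
    """Fraction of residues whose three-state label (H, E, C) is predicted correctly."""
    assert len(predicted) == len(observed)
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)


def random_filters(rng, n_filters, patch_len):
    """Hypothetical placeholder for weights a trained sparse auto-encoder would supply."""
    size = patch_len * N_AA
    return [([rng.uniform(-0.1, 0.1) for _ in range(size)], 0.0) for _ in range(n_filters)]


rng = random.Random(0)
# Random stand-in for one 35-residue PSSM window.
window = [[rng.uniform(-1, 1) for _ in range(N_AA)] for _ in range(WINDOW)]

# Concatenate the feature maps of both filter groups into one representation.
features = []
for patch_len in PATCH_SIZES:
    fmap = convolve(window, random_filters(rng, 4, patch_len), patch_len)
    features.extend(v for row in fmap for v in row)

# Untrained softmax layer over the three states, again with placeholder weights.
class_weights = [[rng.uniform(-0.1, 0.1) for _ in features] for _ in range(3)]
probs = softmax([sum(w * f for w, f in zip(ws, features)) for ws in class_weights])
```

With these assumed sizes, a width-5 filter fits at 31 positions in a 35-residue window and a width-13 filter at 23, so four filters per group give 4 × 31 + 4 × 23 = 216 features per window. Any pooling or normalization step the paper may apply is not modeled here.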
Pages: 843-847
Page count: 5