Two-stage support vector machines to protein relative solvent accessibility prediction

Citations: 0
Authors
Nguyen, MN [1 ]
Rajapakse, JC [1 ]
Affiliation
[1] Nanyang Technol Univ, Sch Comp Engn, BioInformat Res Ctr, Singapore 639798, Singapore
Source
PROCEEDINGS OF THE 2004 IEEE SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE IN BIOINFORMATICS AND COMPUTATIONAL BIOLOGY | 2004
Keywords
protein structure prediction; solvent accessibility; support vector machines; PSI-BLAST;
DOI
Not available
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
Bioinformatics techniques for Relative Solvent Accessibility (RSA) prediction are mostly single-stage approaches: they predict the solvent accessibility of proteins by taking into account only the information available in amino acid sequences. In this paper, we propose to use Support Vector Machines (SVMs) as a second stage following the existing single-stage approaches to the RSA prediction problem in order to improve accuracy. The purpose of the second stage is to capture the contextual relationship among solvent accessibility elements in a neighborhood when determining the solvent accessibility at a particular site. We demonstrate our approach by applying SVMs to the output of a single-stage SVM classifier. The two-stage SVM approach achieves accuracies of up to 90.4% and 90.2% on the Manesh dataset of 215 protein structures and the RS126 dataset of 126 nonhomologous globular proteins, respectively, which are better than the highest previously reported scores on both datasets.
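The second-stage idea in the abstract, using the first-stage outputs of neighboring residues to decide the accessibility at one site, can be illustrated with a minimal sketch of how such neighborhood inputs might be assembled. This is not the authors' implementation: the function name `second_stage_windows`, the window half-width, the zero padding at the chain ends, and the example scores are all assumptions made for illustration, since the abstract does not state these details.

```python
from typing import List, Sequence


def second_stage_windows(
    first_stage_scores: Sequence[float],
    half_width: int = 2,      # assumed neighborhood size, not from the paper
    pad_value: float = 0.0,   # assumed padding for positions off the chain
) -> List[List[float]]:
    """Build one feature vector per residue from neighboring first-stage outputs."""
    if half_width < 0:
        raise ValueError("half_width must be non-negative")
    n = len(first_stage_scores)
    windows = []
    for i in range(n):
        window = []
        for j in range(i - half_width, i + half_width + 1):
            # Positions beyond either end of the chain are padded.
            window.append(first_stage_scores[j] if 0 <= j < n else pad_value)
        windows.append(window)
    return windows


# Hypothetical first-stage decision values for a five-residue fragment.
scores = [0.8, -0.3, 0.5, -1.2, 0.1]
features = second_stage_windows(scores, half_width=1)
```

Each row of `features` would then serve as the input vector for a second classifier (an SVM in the paper) that predicts the accessibility class of the central residue.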
Pages: 67-72
Number of pages: 6