Secondary structure prediction with support vector machines

Cited by: 165
Authors
Ward, JJ [1]
McGuffin, LJ [1]
Buxton, BF [1]
Jones, DT [1]
Affiliations
[1] UCL, Dept Comp Sci, Bioinformat Grp, London WC1E 6BT, England
Funding
UK Medical Research Council; UK Biotechnology and Biological Sciences Research Council
DOI
10.1093/bioinformatics/btg223
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: A new method that uses support vector machines (SVMs) to predict protein secondary structure is described and evaluated. The study is designed to develop a reliable prediction method using an alternative technique and to investigate the applicability of SVMs to this type of bioinformatics problem.
Methods: Binary SVMs are trained to discriminate between two structural classes. The binary classifiers are combined in several ways to predict multi-class secondary structure.
Results: The average three-state prediction accuracy per protein (Q3) is estimated by cross-validation to be 77.07 ± 0.26% with a segment overlap (Sov) score of 73.32 ± 0.39%. The SVM performs similarly to the 'state-of-the-art' PSIPRED prediction method on a non-homologous test set of 121 proteins despite being trained on substantially fewer examples. A simple consensus of the SVM, PSIPRED and PROFsec achieves significantly higher prediction accuracy than the individual methods.
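Illustrative sketch (not taken from the paper): the abstract mentions combining binary classifiers into a three-state prediction, scoring it with Q3, and forming a consensus of several methods, but it does not say how any of these is done. The Python below assumes one possible reading: a one-versus-rest combination that picks the state with the largest decision value, Q3 as the percentage of residues whose predicted state matches the observed one, and a per-residue majority vote for the consensus. The function names, the H/E/C state letters, and the tie-breaking rule are all hypothetical choices made for illustration.

```python
from typing import Dict, Sequence

# Assumed three-state alphabet: helix, strand, coil.
STATES = ("H", "E", "C")


def combine_one_vs_rest(scores: Dict[str, Sequence[float]]) -> str:
    """Combine three binary classifiers (one per state) into one prediction.

    scores[s][i] is the decision value of the hypothetical "state s vs rest"
    classifier at residue i. The state with the largest value wins; ties go
    to the state listed first in STATES.
    """
    length = len(scores[STATES[0]])
    return "".join(
        max(STATES, key=lambda s: scores[s][i]) for i in range(length)
    )


def q3(predicted: str, observed: str) -> float:
    """Percentage of residues predicted in the correct state for one protein."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


def majority_consensus(predictions: Sequence[str], fallback: int = 0) -> str:
    """Per-residue majority vote over several methods' predictions.

    If no state is predicted by more than one method at a position, the
    prediction of the method at index `fallback` is used (an assumption).
    """
    consensus = []
    for column in zip(*predictions):
        best = max(set(column), key=column.count)
        consensus.append(best if column.count(best) > 1 else column[fallback])
    return "".join(consensus)


if __name__ == "__main__":
    toy_scores = {
        "H": [0.9, -0.2, -1.0, 0.1],
        "E": [-0.5, 0.4, -0.3, 0.0],
        "C": [-0.8, -0.1, 0.6, 0.1],
    }
    pred = combine_one_vs_rest(toy_scores)
    print(pred, q3(pred, "HECC"))
```

To obtain a per-protein average like the one quoted in the abstract, q3 would be computed for each protein separately and the resulting values averaged.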
Pages: 1650-1655
Number of pages: 6