Combining discriminant models with new multi-class SVMs

Cited by: 73
Author
Guermeur, Y [1]
Affiliation
[1] LORIA, F-54506 Vandoeuvre Les Nancy, France
Keywords
classifier fusion; generalisation performance; hierarchical sequence processing systems; protein secondary structure prediction; statistical learning theory; Support Vector Machines
DOI
10.1007/s100440200015
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The idea of performing model combination, instead of model selection, has a long theoretical background in statistics. However, making use of theoretical results is ordinarily subject to the satisfaction of strong hypotheses (weak error correlation, availability of large training sets, possibility to rerun the training procedure an arbitrary number of times, etc.). In contrast, the practitioner is frequently faced with the problem of combining a given set of pre-trained classifiers, with highly correlated errors, using only a small training sample. Overfitting is then the main risk, which cannot be overcome but with a strict complexity control of the combiner selected. This suggests that SVMs should be well suited for these difficult situations. Investigating this idea, we introduce a family of multi-class SVMs and assess them as ensemble methods on a real-world problem. This task, protein secondary structure prediction, is an open problem in biocomputing for which model combination appears to be an issue of central importance. Experimental evidence highlights the gain in quality resulting from combining some of the most widely used prediction methods with our SVMs rather than with the ensemble methods traditionally used in the field. The gain increases when the outputs of the combiners are post-processed with a DP algorithm.
Pages: 168-179
Number of pages: 12