Combining discriminant models with new multi-class SVMs

Cited by: 73

Authors:

Guermeur, Y [1]

Affiliations:

[1] LORIA, F-54506 Vandoeuvre Les Nancy, France

Source:

PATTERN ANALYSIS AND APPLICATIONS | 2002 / Vol. 5 / No. 2

Keywords:

classifier fusion; generalisation performance; hierarchical sequence processing systems; protein secondary structure prediction; statistical learning theory; Support Vector Machines;

DOI:

10.1007/s100440200015

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

The idea of performing model combination, instead of model selection, has a long theoretical background in statistics. However, making use of theoretical results is ordinarily subject to the satisfaction of strong hypotheses (weak error correlation, availability of large training sets, possibility to rerun the training procedure an arbitrary number of times, etc.). In contrast, the practitioner is frequently faced with the problem of combining a given set of pre-trained classifiers, with highly correlated errors, using only a small training sample. Overfitting is then the main risk, which cannot be overcome but with a strict complexity control of the combiner selected. This suggests that SVMs should be well suited for these difficult situations. Investigating this idea, we introduce a family of multi-class SVMs and assess them as ensemble methods on a real-world problem. This task, protein secondary structure prediction, is an open problem in biocomputing for which model combination appears to be an issue of central importance. Experimental evidence highlights the gain in quality resulting from combining some of the most widely used prediction methods with our SVMs rather than with the ensemble methods traditionally used in the field. The gain increases when the outputs of the combiners are post-processed with a DP algorithm.


Pages: 168-179

Page count: 12

References (88 in total):

[1] AIZERMAN MA, 1965, AUTOMAT REM CONTR+, V25, P821

[2] Scale-sensitive dimensions, uniform convergence, and learnability [J].