On the algorithmic implementation of stochastic discrimination

Cited by: 73
Authors
Kleinberg, EM [1]
Affiliations
[1] SUNY Buffalo, Dept Math, Buffalo, NY 14214 USA
Keywords
pattern recognition; classification algorithms; stochastic discrimination; SD
DOI
10.1109/34.857004
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Stochastic discrimination is a general methodology for constructing classifiers appropriate for pattern recognition. It is based on combining arbitrary numbers of very weak components, which are usually generated by some pseudorandom process, and it has the property that the very complex and accurate classifiers produced in this way retain the ability, characteristic of their weak component pieces, to generalize to new data. In fact, it is often observed in practice that classifier performance on test sets continues to rise as more weak components are added, even after performance on training sets seems to have reached a maximum. This is predicted by the underlying theory, for even though the formal error rate on the training set may have reached a minimum, more sophisticated measures intrinsic to this method indicate that classifier performance on both training and test sets continues to improve as complexity increases. In this paper, we begin with a review of the method of stochastic discrimination as applied to pattern recognition. Through a progression of examples keyed to various theoretical issues, we discuss considerations involved with its algorithmic implementation. We then take such an algorithmic implementation and compare its performance, on a large set of standardized pattern recognition problems from the University of California, Irvine, and Statlog collections, to many other techniques reported on in the literature, including boosting and bagging. In doing these studies, we compare our results to those reported in the literature by the various authors for the other methods, using the same data and study paradigms used by them. Included in this paper is an outline of the underlying mathematical theory of stochastic discrimination and a remark concerning boosting, which provides a theoretical justification for properties of that method observed in practice, including its ability to generalize.
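The sketch below illustrates, in Python, the general idea described in the abstract: generate many very weak components by a pseudorandom process, retain the weakly useful ones, and classify by their averaged, normalized coverage. The choice of weak component (random axis-aligned boxes), the retention rule, and all parameter values here are assumptions made for illustration only; this is not the specific algorithmic implementation studied in the paper.

import numpy as np

rng = np.random.default_rng(0)

def make_random_box(X, min_width=0.2, max_width=0.8):
    # Sample a random axis-aligned box inside the bounding box of the data.
    lo, hi = X.min(axis=0), X.max(axis=0)
    widths = rng.uniform(min_width, max_width, size=X.shape[1]) * (hi - lo)
    centers = rng.uniform(lo, hi)
    return centers - widths / 2, centers + widths / 2

def covers(box, X):
    # Boolean mask of the points of X that fall inside the box.
    low, high = box
    return np.all((X >= low) & (X <= high), axis=1)

def fit_sd(X, y, n_components=500, max_tries=100000):
    # Keep only boxes that are weakly "enriched" for class 1, i.e. cover a
    # larger fraction of class-1 training points than of class-0 points.
    kept, tries = [], 0
    while len(kept) < n_components and tries < max_tries:
        tries += 1
        box = make_random_box(X)
        inside = covers(box, X)
        r1 = inside[y == 1].mean()  # fraction of class-1 points covered
        r0 = inside[y == 0].mean()  # fraction of class-0 points covered
        if r1 > r0:                 # weak enrichment condition
            kept.append((box, r0, r1))
    return kept

def predict_sd(models, X):
    # Average the normalized coverage over all weak components; each term is
    # shifted and scaled so that, on the training data, it averages to 1 over
    # class-1 points and to 0 over class-0 points.  Thresholding the average
    # at 1/2 gives the predicted class.
    scores = np.zeros(len(X))
    for box, r0, r1 in models:
        scores += (covers(box, X).astype(float) - r0) / (r1 - r0)
    scores /= len(models)
    return (scores > 0.5).astype(int)

# Toy usage on two Gaussian blobs.
X0 = rng.normal(0.0, 1.0, size=(200, 2))
X1 = rng.normal(2.0, 1.0, size=(200, 2))
X, y = np.vstack([X0, X1]), np.array([0] * 200 + [1] * 200)
models = fit_sd(X, y)
print("training accuracy:", (predict_sd(models, X) == y).mean())

As in the description above, adding more retained components only refines the averaged discriminant; the per-component normalization is what lets very weak pieces combine into an accurate whole.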
Pages: 473-490
Page count: 18