Combining predictions in pairwise classification: An optimal adaptive voting strategy and its relation to weighted voting

Cited by: 94
Authors
Huellermeier, Eyke [1]
Vanderlooy, Stijn [2]
Affiliations
[1] Univ Marburg, Dept Math & Comp Sci, D-35032 Marburg, Germany
[2] Maastricht Univ, Dept Knowledge Engn, Maastricht ICT Competence Ctr, Maastricht, Netherlands
Keywords
Learning by pairwise comparison; Label ranking; Aggregation strategies; Classifier combination; Weighted voting; MAP prediction; CLASSIFIERS
DOI
10.1016/j.patcog.2009.06.013
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Weighted voting is the commonly used strategy for combining predictions in pairwise classification. Even though it shows good classification performance in practice, it is often criticized for lacking a sound theoretical justification. In this paper, we study the problem of combining predictions within a formal framework of label ranking and, under some model assumptions, derive a generalized voting strategy in which predictions are properly adapted according to the strengths of the corresponding base classifiers. We call this strategy adaptive voting and show that it is optimal in the sense of yielding a MAP prediction of the class label of a test instance. Moreover, we offer a theoretical justification for weighted voting by showing that it yields a good approximation of the optimal adaptive voting prediction. This result is further corroborated by empirical evidence from experiments with real and synthetic data sets showing that, even though adaptive voting is sometimes able to achieve consistent improvements, weighted voting is in general quite competitive, all the more in cases where the aforementioned model assumptions underlying adaptive voting are not met. In this sense, weighted voting appears to be a more robust aggregation strategy. (C) 2009 Elsevier Ltd. All rights reserved.
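To make the aggregation step described in the abstract concrete, the following is a minimal sketch of weighted voting next to plain binary voting, under the usual pairwise-classification setup: each base classifier for a class pair (i, j) outputs a score in [0, 1] for class i, and the score for class j is taken to be its complement. The function names and the example score matrix are hypothetical and chosen only for illustration. The adaptive voting strategy is not sketched, since this record does not state its formula.

```python
# Hypothetical illustration; names and numbers are not from the paper.

def weighted_voting(scores):
    """Sum the raw pairwise scores per class and pick the largest total.

    scores[i][j] is the output in [0, 1] of the base classifier for the
    pair (i, j) in favour of class i; scores[j][i] is assumed to equal
    1 - scores[i][j]. Diagonal entries are ignored.
    """
    k = len(scores)
    totals = [sum(scores[i][j] for j in range(k) if j != i) for i in range(k)]
    winner = max(range(k), key=totals.__getitem__)
    return winner, totals


def binary_voting(scores):
    """Threshold each pairwise score at 0.5 and count whole votes per class."""
    k = len(scores)
    totals = [
        sum(1 for j in range(k) if j != i and scores[i][j] > 0.5)
        for i in range(k)
    ]
    winner = max(range(k), key=totals.__getitem__)
    return winner, totals


# Made-up scores for three classes (assumption, for illustration only).
example_scores = [
    [0.00, 0.90, 0.45],
    [0.10, 0.00, 0.60],
    [0.55, 0.40, 0.00],
]

if __name__ == "__main__":
    print("weighted:", weighted_voting(example_scores))
    print("binary:  ", binary_voting(example_scores))
```

In this made-up example every class wins exactly one pairwise comparison, so binary voting ends in a three-way tie, while weighted voting separates the classes because it keeps the strength of each pairwise score.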
Pages: 128-142
Page count: 15