Risk Bounds for the Majority Vote: From a PAC-Bayesian Analysis to a Learning Algorithm

Cited by: 0
Authors
Germain, Pascal [1]
Lacasse, Alexandre [1]
Laviolette, Francois [1]
Marchand, Mario [1]
Roy, Jean-Francis [1]
Affiliations
[1] Univ Laval, Dept Informat & Genie Logiciel, Quebec City, PQ G1V 0A6, Canada
Funding
Canada Foundation for Innovation; Natural Sciences and Engineering Research Council of Canada
Keywords
majority vote; ensemble methods; learning theory; PAC-Bayesian theory; sample compression; COMPRESSION;
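DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract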
We propose an extensive analysis of the behavior of majority votes in binary classification. In particular, we introduce a risk bound for majority votes, called the C-bound, that takes into account the average quality of the voters and their average disagreement. We also propose an extensive PAC-Bayesian analysis that shows how the C-bound can be estimated from various observations contained in the training data. The analysis intends to be self-contained and can be used as introductory material to PAC-Bayesian statistical learning theory. It starts from a general PAC-Bayesian perspective and ends with uncommon PAC-Bayesian bounds. Some of these bounds contain no Kullback-Leibler divergence and others allow kernel functions to be used as voters (via the sample compression setting). Finally, out of the analysis, we propose the MinCq learning algorithm that basically minimizes the C-bound. MinCq reduces to a simple quadratic program. Aside from being theoretically grounded, MinCq achieves state-of-the-art performance, as shown in our extensive empirical comparison with both AdaBoost and the Support Vector Machine.
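The abstract names the two quantities behind the C-bound without giving the formula. As a rough illustration, the sketch below computes an empirical version, assuming the commonly stated form C = 1 - (1 - 2R)^2 / (1 - 2d), where R is the Gibbs risk (average voter error under the weights Q) and d is the expected disagreement between two voters drawn from Q. The function name `empirical_c_bound` and its arguments are hypothetical, chosen for this example only. The sketch covers neither the PAC-Bayesian bounds nor the MinCq quadratic program.

```python
def empirical_c_bound(votes, labels, weights):
    """Empirical C-bound of a weighted majority vote (illustrative sketch).

    votes[i][j] is the output (-1 or +1) of voter j on example i,
    labels[i] is the true label (-1 or +1), and weights is an assumed
    distribution Q over voters (non-negative, summing to 1).
    Returns (gibbs_risk, disagreement, c_bound).
    """
    n = len(labels)
    first_moment = 0.0   # mean of the margin y * E_Q[h(x)]
    second_moment = 0.0  # mean of the squared margin
    for row, y in zip(votes, labels):
        margin = y * sum(q * h for q, h in zip(weights, row))
        first_moment += margin / n
        second_moment += margin ** 2 / n
    if first_moment <= 0:
        # The bound is only stated for a Gibbs risk strictly below 1/2.
        raise ValueError("C-bound requires a Gibbs risk below 1/2")
    gibbs_risk = (1 - first_moment) / 2      # R = (1 - first moment) / 2
    disagreement = (1 - second_moment) / 2   # d = (1 - second moment) / 2
    c_bound = 1 - first_moment ** 2 / second_moment
    return gibbs_risk, disagreement, c_bound


# Toy data: three voters with uniform weights on four examples.
votes = [[1, 1, 1], [1, 1, -1], [-1, -1, 1], [1, -1, -1]]
labels = [1, 1, -1, 1]
weights = [1 / 3, 1 / 3, 1 / 3]
gibbs_risk, disagreement, c_bound = empirical_c_bound(votes, labels, weights)
```

On this toy data the majority vote errs on one example out of four, which stays below the computed value of the bound. For a fixed Gibbs risk, higher disagreement between voters gives a smaller value.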
Pages: 787-860
Number of pages: 74