Bayes error rate estimation using classifier ensembles

Authors
Tumer, Kagan [2]
Ghosh, Joydeep [1]
Affiliations
[1] Department of Electrical Engineering, University of Texas, Austin, TX, United States
[2] NASA Ames Research Center, Mail Stop 269-4, Moffett Field, CA 94035-1000, United States
Keywords
Approximation theory; Learning systems; Neural networks; Probability; Statistical methods
DOI
10.1080/10255810305042
Abstract
The Bayes error rate gives a statistical lower bound on the error achievable for a given classification problem and the associated choice of features. By reliably estimating this rate, one can assess the usefulness of the feature set that is being used for classification. Moreover, by comparing the accuracy achieved by a given classifier with the Bayes rate, one can quantify how effective that classifier is. Classical approaches for estimating or bounding the Bayes error generally yield rather weak results for small sample sizes, unless the problem has some simple characteristics, such as Gaussian class-conditional likelihoods. This article shows how the outputs of a classifier ensemble can be used to provide reliable and easily obtainable estimates of the Bayes error with negligible extra computation. Three methods of varying sophistication are described. First, we present a framework that estimates the Bayes error when multiple classifiers, each providing an estimate of the a posteriori class probabilities, are combined through averaging. Second, we bolster this approach by adding an information-theoretic measure of output correlation to the estimate. Finally, we discuss a more general method that looks only at the class labels indicated by ensemble members and provides error estimates based on the disagreements among classifiers. The methods are illustrated for artificial data, a difficult four-class problem involving underwater acoustic data, and two problems from the Proben1 benchmarks. For data sets with known Bayes error, the combiner-based methods introduced in this article outperform existing methods. The estimates obtained by the proposed methods also seem quite reliable for the real-life data sets for which the true Bayes rates are unknown.
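The abstract does not give the estimators themselves, so the following is only a rough sketch of the idea behind the first and third methods, under assumptions that are not taken from this record. It assumes a simple model in which each classifier's error is the Bayes error plus an "added" error, and in which averaging N classifiers shrinks that added error by the factor (1 + δ(N − 1)) / N, where δ is an assumed average correlation between classifier errors. Solving the two resulting equations for the Bayes error gives the first function. The second function computes a plain mean pairwise disagreement rate from class labels; it is the raw quantity a disagreement-based method would start from, not the article's estimator. All function and parameter names are hypothetical.

```python
from itertools import combinations


def bayes_error_from_averaging(e_individual, e_combined, n_classifiers,
                               correlation=0.0):
    """Solve for the Bayes error under an ASSUMED added-error model.

    Assumed model (not stated in the abstract):
        e_individual = e_bayes + e_add
        e_combined   = e_bayes + e_add * (1 + correlation * (N - 1)) / N
    """
    n = n_classifiers
    if n < 2:
        raise ValueError("need at least two classifiers")
    if not 0.0 <= correlation < 1.0:
        raise ValueError("correlation must lie in [0, 1)")
    # Factor by which averaging shrinks the added error under the model.
    shrink = (1.0 + correlation * (n - 1)) / n
    # Subtracting the two model equations isolates the added error.
    e_add = (e_individual - e_combined) / (1.0 - shrink)
    return e_individual - e_add


def mean_pairwise_disagreement(label_sets):
    """Average fraction of samples on which two ensemble members disagree."""
    pairs = list(combinations(label_sets, 2))
    if not pairs:
        raise ValueError("need at least two label sequences")
    total = 0.0
    for a, b in pairs:
        if len(a) != len(b) or len(a) == 0:
            raise ValueError("label sequences must be non-empty and equal length")
        total += sum(x != y for x, y in zip(a, b)) / len(a)
    return total / len(pairs)
```

Under this model, the estimate becomes unstable as the assumed correlation approaches 1, because the combined and individual errors then carry almost the same information. This is one reason a correlation measure, as in the second method of the abstract, matters in practice.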
Pages: 95-109