A power-controlled reliability assessment for multi-class probabilistic classifiers

Cited by: 2
Author
Gweon, Hyukjun [1]
Affiliation
[1] Western Univ, Dept Stat & Actuarial Sci, 1151 Richmond St, London, ON N6A 3K7, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Reliability assessment; Multi-class classification; Expected power; Bayesian approach; Goodness-of-fit tests
DOI
10.1007/s11634-022-00528-0
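Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract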
In multi-class classification, the output of a probabilistic classifier is a probability distribution over the classes. In this work, we focus on a statistical assessment of the reliability of probabilistic classifiers for multi-class problems. Our approach generates a Pearson χ² statistic based on the k nearest neighbors in the prediction space. Further, we develop a Bayesian approach for estimating the expected power of the reliability test, which can be used to select an appropriate sample size k. We propose a sampling algorithm and demonstrate that this algorithm obtains a valid prior distribution. The effectiveness of the proposed reliability test and expected power is evaluated through a simulation study. We also provide illustrative examples of the proposed methods with practical applications.
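As a rough illustration of the idea in the abstract, the sketch below computes a Pearson χ² statistic over the k nearest neighbors of a query point in prediction space. The abstract does not give the exact construction, so everything here is an assumption rather than the paper's method: the function name `knn_chi2_statistic` is hypothetical, neighbors are found by Euclidean distance between predicted probability vectors, and the expected count for each class is taken to be the sum of the neighbors' predicted probabilities for that class.

```python
import math


def knn_chi2_statistic(query, probs, labels, k):
    """Pearson chi-square statistic over the k nearest neighbors of `query`.

    query  : predicted probability vector of the point being assessed
    probs  : list of predicted probability vectors for evaluated points
    labels : list of true class indices (0-based) for those points
    k      : number of neighbors to use

    Assumption (not from the source): Euclidean distance in prediction
    space, and expected counts equal to summed predicted probabilities.
    """
    n_classes = len(query)
    # Rank evaluated points by distance to the query prediction.
    order = sorted(range(len(probs)), key=lambda i: math.dist(query, probs[i]))
    neighbors = order[:k]
    # Observed class counts among the k neighbors.
    observed = [0] * n_classes
    for i in neighbors:
        observed[labels[i]] += 1
    # Expected class counts if the predicted probabilities were reliable.
    expected = [sum(probs[i][c] for i in neighbors) for c in range(n_classes)]
    # Classes with zero expected count are skipped to avoid division by zero.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Toy usage: four points that all receive the prediction (0.5, 0.5).
toy_probs = [[0.5, 0.5]] * 4
balanced = knn_chi2_statistic([0.5, 0.5], toy_probs, [0, 1, 0, 1], k=4)
one_sided = knn_chi2_statistic([0.5, 0.5], toy_probs, [0, 0, 0, 0], k=4)
```

In the toy data, observed counts that match the predicted probabilities give a statistic of zero, while four observations of a single class against an expected two per class give a larger value. Turning the statistic into a p-value, and choosing k through the expected power, depends on details of the paper that the abstract does not state.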
Pages: 927-949
Page count: 23