Reliably Calibrated Isotonic Regression

Cited by: 1
Authors
Nyberg, Otto [1]
Klami, Arto [1]
Affiliations
[1] Univ Helsinki, Helsinki, Finland
Source
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2021, PT I | 2021 / Vol. 12712
Funding
Academy of Finland
Keywords
Isotonic regression; Calibration; E-commerce
DOI
10.1007/978-3-030-75762-5_46
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Using classifiers for decision making requires well-calibrated probabilities for estimation of expected utility. Furthermore, knowledge of the reliability is needed to quantify uncertainty. Outputs of most classifiers can be calibrated, typically by using isotonic regression, which bins classifier outputs together to form empirical probability estimates. However, especially for highly imbalanced problems, it produces bins with few samples, resulting in probability estimates with very large uncertainty. We provide a formal method for quantifying the reliability of calibration and extend isotonic regression to provide reliable calibration with guarantees for the width of credible intervals of the probability estimates. We demonstrate the method in calibrating purchase probabilities in e-commerce and achieve a significant reduction in uncertainty without compromising accuracy.
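To illustrate the issue the abstract describes, the following is a minimal sketch, not the authors' method: it bins classifier scores with a plain pool-adjacent-violators pass and then attaches a credible interval to each bin. The function names, the uniform Beta(1, 1) prior, and the Monte Carlo quantile estimate are all assumptions made for this example; tied scores are not pooled, which a full implementation would handle.

```python
import random


def pav_bins(scores, labels):
    """Pool-adjacent-violators over binary labels sorted by score.

    Returns blocks [positives, count] in order of increasing score,
    with non-decreasing empirical rates positives / count.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    blocks = []
    for i in order:
        blocks.append([labels[i], 1])
        # Merge while the previous block's rate exceeds the last block's rate
        # (compared by cross-multiplication to avoid division).
        while (len(blocks) > 1
               and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
            positives, count = blocks.pop()
            blocks[-1][0] += positives
            blocks[-1][1] += count
    return blocks


def credible_interval(positives, count, level=0.95, draws=20000, seed=0):
    """Approximate equal-tailed credible interval for a bin's probability.

    Assumes a uniform Beta(1, 1) prior, so the posterior is
    Beta(1 + positives, 1 + count - positives); quantiles are
    estimated from Monte Carlo draws.
    """
    rng = random.Random(seed)
    samples = sorted(
        rng.betavariate(1 + positives, 1 + count - positives)
        for _ in range(draws)
    )
    lower = samples[int((1 - level) / 2 * draws)]
    upper = samples[int((1 + level) / 2 * draws) - 1]
    return lower, upper


if __name__ == "__main__":
    scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
    labels = [0, 1, 0, 0, 1, 1]
    for positives, count in pav_bins(scores, labels):
        lower, upper = credible_interval(positives, count)
        print(f"n={count} rate={positives / count:.2f} "
              f"interval=({lower:.2f}, {upper:.2f})")
```

Bins holding only a handful of samples yield wide intervals, which is the uncertainty the paper sets out to bound; how the authors constrain interval width is described in the paper itself and is not reproduced here.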
Pages: 578-589
Page count: 12