Reliable Classifications with Guaranteed Confidence Using the Dempster-Shafer Theory of Evidence

Cited: 0
Authors
Kempkes, Marie C. [1 ,2 ]
Dunjko, Vedran [2 ]
van Nieuwenburg, Evert
Spiegelberg, Jakob [1 ]
Affiliations
[1] Volkswagen Grp Innovat, Berliner Ring 2, D-38440 Wolfsburg, Germany
[2] Leiden Univ, Niels Bohrweg 1, NL-2333 CA Leiden, Netherlands
Source
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: RESEARCH TRACK, PT II, ECML PKDD 2024 | 2024 / Vol. 14942
Funding
Dutch Research Council
Keywords
Uncertainty Quantification; Dempster-Shafer Theory; Conformal Prediction; Framework
DOI
10.1007/978-3-031-70344-7_6
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Reliably capturing predictive uncertainty is indispensable for the deployment of machine learning (ML) models in safety-critical domains. The most commonly used approaches to uncertainty quantification are, however, either computationally costly at inference time or incapable of capturing different types of uncertainty (i.e., aleatoric and epistemic). In this paper, we tackle this issue using the Dempster-Shafer theory of evidence, which only recently gained attention as a tool to estimate uncertainty in ML. By training a neural network to return a generalized probability measure and combining it with conformal prediction, we obtain set predictions with guaranteed user-specified confidence. We test our method on various datasets and empirically show that it reflects uncertainty more reliably than a calibrated classifier with softmax output, since our approach yields smaller and hence more informative prediction sets at the same bounded error level, in particular for samples with high epistemic uncertainty. To deal with the exponential scaling inherent to classifiers within Dempster-Shafer theory, we introduce a second approach with reduced complexity, which also returns smaller sets than the comparative method, even on large classification tasks with more than 40 distinct labels. Our results indicate that the proposed methods are promising approaches to obtain reliable and informative predictions in the presence of both aleatoric and epistemic uncertainty in only one forward pass through the network.
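The coverage guarantee the abstract refers to comes from conformal prediction: nonconformity scores on a held-out calibration set determine a threshold, and every label scoring below it enters the prediction set. The paper's own scores are derived from a Dempster-Shafer belief network; the sketch below instead uses plain softmax probabilities to illustrate only the generic split-conformal mechanism, with a toy synthetic dataset (all names and data here are illustrative, not the authors' implementation).

```python
import numpy as np

rng = np.random.default_rng(0)

def conformal_sets(cal_probs, cal_labels, test_probs, alpha=0.1):
    """Split conformal prediction for classification.

    cal_probs  : (n, K) predicted class probabilities on a calibration set
    cal_labels : (n,)   true labels of the calibration set
    test_probs : (m, K) predicted class probabilities on test points
    alpha      : target miscoverage, e.g. 0.1 for >= 90% coverage
    """
    n = len(cal_labels)
    # Nonconformity score: 1 minus the probability assigned to the true class.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample-corrected (1 - alpha) quantile of the calibration scores.
    level = np.ceil((n + 1) * (1 - alpha)) / n
    q = np.quantile(scores, level, method="higher")
    # Prediction set: every class whose score falls below the threshold.
    return [np.where(1.0 - p <= q)[0] for p in test_probs]

# Toy data: a well-separated 3-class problem with noisy "probabilities".
labels = rng.integers(0, 3, size=200)
probs = rng.dirichlet(np.ones(3), size=200) * 0.2
probs[np.arange(200), labels] += 0.8  # concentrate mass on the true class

sets = conformal_sets(probs[:100], labels[:100], probs[100:], alpha=0.1)
covered = np.mean([labels[100 + i] in s for i, s in enumerate(sets)])
print(f"empirical coverage: {covered:.2f}")
```

The marginal guarantee holds regardless of how the scores are produced; the paper's contribution is a score function built from Dempster-Shafer belief and plausibility that yields smaller sets under high epistemic uncertainty.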
Pages: 89-105 (17 pages)