Exploring Evaluation Methods for Interpretable Machine Learning: A Survey

Cited by: 8
Authors
Alangari, Nourah [1 ]
Menai, Mohamed El Bachir [1 ]
Mathkour, Hassan [1 ]
Almosallam, Ibrahim [2 ]
Affiliations
[1] King Saud Univ, Coll Comp & Informat Sci, Dept Comp Sci, Riyadh 11543, Saudi Arabia
[2] Saudi Informat Technol Co SITE, Riyadh 12382, Saudi Arabia
Keywords
interpretability; explainable AI; evaluating interpretability; black-box; rules; classification; accuracy; issues
DOI
10.3390/info14080469
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
Recent progress in machine learning has enabled decision support systems whose predictive accuracy surpasses human performance in certain scenarios. This improvement, however, has come at the cost of increased model complexity, yielding black-box models whose internal logic is hidden from users. Because these black boxes are designed primarily to optimize predictive accuracy, their applicability is limited in critical domains such as medicine, law, and finance, where both accuracy and interpretability are crucial for model acceptance. Despite the growing body of research on interpretability, evaluation methods for the proposed approaches remain scarce. This survey sheds light on the evaluation methods employed for interpreting models. Two primary procedures are prevalent in the literature: qualitative and quantitative evaluation. Qualitative evaluation relies on human assessment, while quantitative evaluation uses computational metrics. Human evaluation commonly takes the form of either researcher intuition or well-designed experiments; however, it is susceptible to human bias and fatigue and cannot adequately compare two models. Consequently, the use of human evaluation has recently declined, and computational metrics have gained prominence as a more rigorous way to compare and assess different approaches. These metrics are designed to serve specific goals, such as fidelity, comprehensibility, or stability, but they often face challenges when scaled or applied to different types of model outputs and alternative approaches. A further concern is that the results of evaluating interpretability methods may not always be accurate: for instance, relying on the drop in predicted probability to assess fidelity can be problematic, particularly in the presence of out-of-distribution data. Finally, a fundamental challenge in the interpretability domain is the lack of consensus on its definition and requirements, an issue that is compounded in the evaluation process and becomes particularly apparent when assessing comprehensibility.
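The deletion-style fidelity test mentioned in the abstract can be made concrete with a short sketch. The snippet below is an illustrative implementation, not the survey's own protocol: it assumes a scikit-learn-style classifier exposing `predict_proba`, and the `baseline_value` used to "delete" features is a hypothetical choice. The inline comments mark where the out-of-distribution caveat bites.

```python
# A minimal sketch of a deletion-based fidelity metric: mask the features
# an explanation ranks as most important and measure the drop in the
# model's predicted probability for the explained class.

import numpy as np

def probability_drop_fidelity(model, x, attributions, k, baseline_value=0.0):
    """Fidelity as the drop in predicted class probability after
    replacing the k most-attributed features with a baseline value.

    model          -- classifier with a scikit-learn-style predict_proba
    x              -- 1-D feature vector
    attributions   -- importance score per feature, same shape as x
    k              -- number of top-ranked features to mask
    baseline_value -- value used to 'delete' a feature (an assumption;
                      the choice itself can push inputs off-distribution)
    """
    x = np.asarray(x, dtype=float)
    att = np.asarray(attributions, dtype=float)

    probs = model.predict_proba(x.reshape(1, -1))[0]
    target = int(np.argmax(probs))      # class being explained
    p_before = probs[target]

    top_k = np.argsort(np.abs(att))[::-1][:k]
    x_masked = x.copy()
    x_masked[top_k] = baseline_value    # caveat: the masked input may lie
                                        # outside the training distribution,
                                        # so the measured drop can reflect
                                        # model brittleness rather than
                                        # explanation quality
    p_after = model.predict_proba(x_masked.reshape(1, -1))[0][target]
    return p_before - p_after           # larger drop = higher apparent fidelity
```

In practice such a score would be averaged over a test set and used to rank competing explainers; the abstract's point is that this ranking is only trustworthy insofar as the masked inputs remain in-distribution, which is why the choice of `baseline_value` (zero, mean, or a sampled replacement) matters.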
Pages: 29
Related Papers
50 items in total
  • [21] Interpretable machine learning for weather and climate prediction: A review
    Yang, Ruyi
    Hu, Jingyu
    Li, Zihao
    Mu, Jianli
    Yu, Tingzhao
    Xia, Jiangjiang
    Li, Xuhong
    Dasgupta, Aritra
    Xiong, Haoyi
    ATMOSPHERIC ENVIRONMENT, 2024, 338
  • [22] Machine Learning and Deep Learning for Loan Prediction in Banking: Exploring Ensemble Methods and Data Balancing
    Sayed, Eslam Hussein
    Alabrah, Amerah
    Rahouma, Kamel Hussein
    Zohaib, Muhammad
    Badry, Rasha M.
    IEEE ACCESS, 2024, 12 : 193997 - 194019
  • [23] Survey and Evaluation of Hypertension Machine Learning Research
    du Toit, Clea
    Tran, Tran Quoc Bao
    Deo, Neha
    Aryal, Sachin
    Lip, Stefanie
    Sykes, Robert
    Manandhar, Ishan
    Sionakidis, Aristeidis
    Stevenson, Leah
    Pattnaik, Harsha
    Alsanosi, Safaa
    Kassi, Maria
    Le, Ngoc
    Rostron, Maggie
    Nichol, Sarah
    Aman, Alisha
    Nawaz, Faisal
    Mehta, Dhruven
    Tummala, Ramakumar
    McCallum, Linsay
    Reddy, Sandeep
    Visweswaran, Shyam
    Kashyap, Rahul
    Joe, Bina
    Padmanabhan, Sandosh
    JOURNAL OF THE AMERICAN HEART ASSOCIATION, 2023, 12 (09):
  • [24] Interpretable machine learning for materials design
    Dean, James
    Scheffler, Matthias
    Purcell, Thomas A. R.
    Barabash, Sergey V.
    Bhowmik, Rahul
    Bazhirov, Timur
    JOURNAL OF MATERIALS RESEARCH, 2023, 38 (20) : 4477 - 4496
  • [26] Study on Interpretable Surrogate Model for Power System Stability Evaluation Based on Machine Learning
    Han T.
    Chen J.
    Li Y.
    He G.
    Li H.
    Zhongguo Dianji Gongcheng Xuebao/Proceedings of the Chinese Society of Electrical Engineering, 2020, 40 (13) : 4122 - 4130
  • [27] Crime Prediction Methods Based on Machine Learning: A Survey
    Yin, Junxiang
    CMC-COMPUTERS MATERIALS & CONTINUA, 2023, 74 (02): : 4601 - 4629
  • [28] Thy-Wise: An interpretable machine learning model for the evaluation of thyroid nodules
    Jin, Zhe
    Pei, Shufang
    Ouyang, Lizhu
    Zhang, Lu
    Mo, Xiaokai
    Chen, Qiuying
    You, Jingjing
    Chen, Luyan
    Zhang, Bin
    Zhang, Shuixing
    INTERNATIONAL JOURNAL OF CANCER, 2022, 151 (12) : 2229 - 2243
  • [29] Survey of Machine Learning Methods for Big Data Applications
    Vinothini, A.
    Priya, S. Baghavathi
    2017 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE IN DATA SCIENCE (ICCIDS), 2017,
  • [30] Interpretable machine learning: Fundamental principles and 10 grand challenges
    Rudin, Cynthia
    Chen, Chaofan
    Chen, Zhi
    Huang, Haiyang
    Semenova, Lesia
    Zhong, Chudi
    STATISTICS SURVEYS, 2022, 16 : 1 - 85