A quantitative evaluation of explainable AI methods using the depth of decision tree

Cited by: 3
Authors
Ahmed, Nizar Abdulaziz Mahyoub [1 ]
Alpkocak, Adil [2 ]
Affiliations
[1] Dokuz Eylul Univ, Dept Comp Engn, Izmir, Turkey
[2] Izmir Bakircay Univ, Dept Comp Engn, Izmir, Turkey
Keywords
Explainable AI; medical multiclass classification; SHAP; LIME; decision tree; quantitative explainability evaluation
DOI
10.55730/1300-0632.3924
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Explainable models are needed to clarify how and why a medical classification model reaches a particular decision. Local post hoc explainable AI (XAI) techniques such as SHAP and LIME interpret a classifier's predictions by showing, locally, the most important features and rules behind each prediction. Comparing two or more XAI methods therefore requires evaluating them either qualitatively or quantitatively. This paper proposes quantitative XAI evaluation metrics that do not rely on biased, subjective human judgment; instead, they use the depth of a decision tree (DT) to measure the complexity of XAI methods automatically and effectively. Our study introduces a novel XAI evaluation strategy that measures the complexity of any XAI method by using a characteristic of another model as a proxy. In our proposal, the output of the XAI methods, specifically the feature importance scores produced by SHAP and LIME, is fed into a DT, which then grows a full tree from those feature importance scores. From this tree we derive two metrics that quantify the DT's complexity, and hence that of the associated XAI method: the total depth of the tree (TDT) and the average of the weighted class depth (ACD). The results show that SHAP outperforms LIME and is thus less complex; SHAP also scales better with the number of documents and features. These results indicate whether a given XAI method is suited to different document scales, and they show which features can be used to improve the performance of the black-box model, in this case a feedforward neural network (FNN).
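The pipeline sketched in the abstract (per-sample feature-importance scores from SHAP or LIME fed into a decision tree whose depth is then measured) can be approximated as follows. This is a minimal sketch in Python with scikit-learn, assuming `importances` is an (n_samples, n_features) array of attributions already computed by SHAP or LIME; the helper names and the exact ACD weighting are illustrative assumptions, not the paper's definitions.

# Minimal sketch of the depth-based evaluation idea, assuming `importances`
# holds per-sample feature-importance scores obtained from SHAP or LIME.
# The ACD weighting below is one plausible reading, not the paper's formula.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def fit_surrogate_tree(importances, y):
    # Grow an unrestricted decision tree on the XAI output.
    tree = DecisionTreeClassifier(random_state=0)
    tree.fit(importances, y)
    return tree

def total_tree_depth(tree):
    # TDT: depth of the fully grown surrogate tree.
    return tree.get_depth()

def average_class_depth(tree, importances, y):
    # ACD (assumed form): mean leaf depth per class, weighted by class frequency.
    path = tree.decision_path(importances)                    # (n_samples, n_nodes)
    sample_depth = np.asarray(path.sum(axis=1)).ravel() - 1   # edges on each path
    classes, counts = np.unique(y, return_counts=True)
    weights = counts / counts.sum()
    return float(sum(w * sample_depth[y == c].mean()
                     for c, w in zip(classes, weights)))

# Tiny usage example with random stand-in scores (3 classes, 20 features).
rng = np.random.default_rng(0)
importances = rng.normal(size=(300, 20))
y = rng.integers(0, 3, size=300)
tree = fit_surrogate_tree(importances, y)
print("TDT:", total_tree_depth(tree))
print("ACD:", round(average_class_depth(tree, importances, y), 2))

Under this reading, a shallower surrogate tree (smaller TDT and ACD) indicates a less complex, and therefore preferable, XAI method.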
Pages: 2054-2072
Page count: 20