How does the model make predictions? A systematic literature review on the explainability power of machine learning in healthcare

Cited by: 58
Authors
Allgaier, Johannes [1 ]
Mulansky, Lena [1 ]
Draelos, Rachel Lea [2 ]
Pryss, Ruediger [1 ]
Affiliations
[1] Julius Maximilians Univ Wurzburg JMU, Inst Clin Epidemiol & Biometry, Wurzburg, Germany
[2] Cydoc, Durham, NC USA
Keywords
Explainable artificial intelligence; XAI; Interpretable machine learning; PRISMA; Medicine; Healthcare; Review; Artificial intelligence; Skin cancer; Black box; Explanations
DOI
10.1016/j.artmed.2023.102616
CLC classification
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Background: Medical use cases for machine learning (ML) are growing exponentially, and the first hospitals already use ML systems as decision support in their daily routine. At the same time, most ML systems remain opaque, and it is not clear how these systems arrive at their predictions.
Methods: In this paper, we provide a brief overview of the taxonomy of explainability methods and review popular methods. In addition, we conduct a systematic literature search on PubMed to investigate which explainable artificial intelligence (XAI) methods are used in 450 specific medical supervised ML use cases, how the use of XAI methods has emerged recently, and how the precision of describing ML pipelines has evolved over the past 20 years.
Results: A large fraction of publications with ML use cases do not use XAI methods at all to explain ML predictions. When XAI methods are used, however, open-source and model-agnostic explanation methods are the most common, with SHapley Additive exPlanations (SHAP) and Gradient Class Activation Mapping (Grad-CAM) leading the way for tabular and image data, respectively. ML pipelines have been described in increasing detail and uniformity in recent years, but the willingness to share data and code has stagnated at about one quarter.
Conclusions: XAI methods are mainly used when their application requires little effort. The homogenization of reporting in ML use cases facilitates the comparability of work and should be advanced in the coming years. Owing to the high complexity of the domain, experts who can mediate between the worlds of informatics and medicine will be increasingly in demand when ML systems are used.
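The SHAP method named in the abstract is grounded in the Shapley value from cooperative game theory (Shapley, 1951, cited below). As an illustration only, not the reviewed authors' pipeline nor the `shap` library itself, the following sketch computes exact Shapley values for a tiny hand-written model by brute force over feature coalitions, substituting a baseline value for features absent from a coalition:

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley value of each feature of f at point x.
    Features missing from a coalition are replaced by `baseline`."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for S in combinations(others, k):
                # Classic Shapley weight for a coalition of size k
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                with_i = [x[j] if (j in S or j == i) else baseline[j] for j in range(n)]
                without_i = [x[j] if j in S else baseline[j] for j in range(n)]
                phi[i] += weight * (f(with_i) - f(without_i))
    return phi

# Toy linear model: for independent features, feature j's Shapley value
# reduces to w_j * (x_j - baseline_j).
w = [2.0, -1.0, 0.5]
f = lambda z: sum(wj * zj for wj, zj in zip(w, z))
phi = shapley_values(f, x=[1.0, 3.0, 2.0], baseline=[0.0, 0.0, 0.0])
# Efficiency property: phi sums to f(x) - f(baseline)
```

This exhaustive computation is exponential in the number of features; practical SHAP implementations approximate or exploit model structure (e.g. tree ensembles) instead.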
Pages: 13
Related references
83 items in total
[61] Ribeiro, Marco Tulio; Singh, Sameer; Guestrin, Carlos. "Why Should I Trust You?" Explaining the Predictions of Any Classifier. KDD'16: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016: 1135-1144.
[63] Sayres, Rory; Taly, Ankur; Rahimy, Ehsan; Blumer, Katy; Coz, David; Hammel, Naama; Krause, Jonathan; Narayanaswamy, Arunachalam; Rastegar, Zahra; Wu, Derek; Xu, Shawn; Barb, Scott; Joseph, Anthony; Shumski, Michael; Smith, Jesse; Sood, Arjun B.; Corrado, Greg S.; Peng, Lily; Webster, Dale R. Using a Deep Learning Algorithm and Integrated Gradients Explanation to Assist Grading for Diabetic Retinopathy. Ophthalmology, 2019, 126(4): 552-564.
[64] Selvaraju, R. R. International Journal of Computer Vision, 2020, 128: 336. DOI: 10.1007/s11263-019-01228-7; 10.1109/ICCV.2017.74.
[65] Shapley, L. S. Notes on the N-Person Game II: The Value of an N-Person Game. 1951. DOI: 10.7249/RM0670.
[66] Shrikumar, A. Proceedings of Machine Learning Research, 2017, Vol. 70.
[67] Simonyan, K. CoRR, 2014.
[68] Singh, Amitojdeep; Sengupta, Sourya; Lakshminarayanan, Vasudevan. Explainable Deep Learning Models in Medical Image Analysis. Journal of Imaging, 2020, 6(6).
[69] Siontis, Konstantinos C.; Noseworthy, Peter A.; Attia, Zachi I.; Friedman, Paul A. Artificial intelligence-enhanced electrocardiography in cardiovascular disease management. Nature Reviews Cardiology, 2021, 18(7): 465-478.
[70] Smilkov, D. arXiv, 2017.