On the interpretability of machine learning-based model for predicting hypertension

Cited by: 178

Authors:

Elshawi, Radwa [1]

Al-Mallah, Mouaz H. [2]

Sakr, Sherif [1]

Affiliations:

[1] Univ Tartu, Inst Comp Sci, Data Syst Grp, 2 J Liivi St, EE-50409 Tartu, Estonia

[2] Houston Methodist Ctr, Tartu, Estonia

Source:

BMC MEDICAL INFORMATICS AND DECISION MAKING | 2019 / Vol. 19 / Issue 1

Keywords:

Machine learning; Interpretability; Hypertension; BLOOD-PRESSURE; DECISION TREE; RULES; EXTRACTION; HISTORY

DOI:

10.1186/s12911-019-0874-0

Chinese Library Classification number:

R-058

Subject classification code:

Abstract:

Background: Although complex machine learning models commonly outperform traditional, simple, interpretable models, clinicians find it hard to understand and trust these complex models because their predictions lack intuition and explanation. The aim of this study is to demonstrate the utility of various model-agnostic explanation techniques for machine learning models, with a case study analyzing the outcomes of a random forest model for predicting individuals at risk of developing hypertension based on cardiorespiratory fitness data.

Methods: The dataset used in this study contains information on 23,095 patients who underwent clinician-referred exercise treadmill stress testing at Henry Ford Health Systems between 1991 and 2009 and had a complete 10-year follow-up. Five global interpretability techniques (Feature Importance, Partial Dependence Plot, Individual Conditional Expectation, Feature Interaction, Global Surrogate Models) and two local interpretability techniques (Local Surrogate Models, Shapley Value) were applied to show how interpretability techniques can help clinical staff better understand and trust the outcomes of machine learning-based predictions.

Results: Several experiments were conducted and reported. The results show that different interpretability techniques shed light on different aspects of model behavior: global interpretations enable clinicians to understand the entire conditional distribution modeled by the trained response function, whereas local interpretations promote understanding of small parts of the conditional distribution for specific instances.

Conclusions: Interpretability techniques can vary in their explanations of the behavior of a machine learning model. Global interpretability techniques have the advantage that they generalize over the entire population, while local interpretability techniques focus on giving explanations at the level of individual instances. Both can be equally valid, depending on the needs of the application. Both are effective for assisting clinicians in the medical decision process; however, clinicians will always have the final say on accepting or rejecting the outcome of the machine learning models and their explanations, based on their domain expertise.


Pages: 32

