Explainable machine learning for knee osteoarthritis diagnosis based on a novel fuzzy feature selection methodology

Cited by: 28
Authors
Kokkotis, Christos [1, 2]
Ntakolia, Charis [3, 4]
Moustakidis, Serafeim [5]
Giakas, Giannis [2]
Tsaopoulos, Dimitrios [1]
Affiliations
[1] Ctr Res & Technol Hellas, Inst Bioecon & Agritechnol, Volos 38333, Greece
[2] Univ Thessaly, Dept Phys Educ & Sport Sci, TEFAA, Trikala 42100, Greece
[3] Univ Mental Hlth Res Inst, Athens 11527, Greece
[4] Natl Tech Univ Athens, Sch Naval Architecture & Marine Engn, Athens 15772, Greece
[5] AIDEAS OU, Narva Mnt 5, EE-10117 Tallinn, Estonia
Funding
EU Horizon 2020
Keywords
KOA diagnosis; Machine learning; Clinical data; Explainability; Feature selection; RISK-FACTORS; HIP; IDENTIFICATION;
DOI
10.1007/s13246-022-01106-6
Chinese Library Classification (CLC) number
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Knee osteoarthritis (KOA) is a degenerative joint disease of the knee that results from the progressive loss of cartilage. Due to KOA's multifactorial nature and the poor understanding of its pathophysiology, there is a need for reliable tools that reduce diagnostic errors made by clinicians. The existence of public databases has facilitated the advent of advanced analytics in KOA research; however, the heterogeneity of the available data, along with the observed high feature dimensionality, makes this diagnosis task difficult. The objective of the present study is to provide a robust Feature Selection (FS) methodology that could: (i) handle the multidimensional nature of the available datasets and (ii) alleviate the shortcomings of existing feature selection techniques in identifying important risk factors that contribute to KOA diagnosis. To this end, we used multidimensional data obtained from the Osteoarthritis Initiative database for individuals with or without KOA. The proposed fuzzy ensemble feature selection methodology uses fuzzy logic to aggregate the results of several FS algorithms (filter, wrapper and embedded ones). The effectiveness of the proposed methodology was evaluated using an extensive experimental setup that involved multiple competing FS algorithms and several well-known ML models. The best-performing model (a Random Forest classifier) achieved a classification accuracy of 73.55% on a group of twenty-one selected risk factors. Finally, an explainability analysis was performed to quantify the impact of the selected features on the model's output, thus enhancing our understanding of the rationale behind the decision-making mechanism of the best model.
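The abstract does not spell out how the fuzzy aggregation works, so the following is only a minimal sketch of the general idea of combining filter, wrapper and embedded rankings. It assumes min-max normalisation as the membership function and the arithmetic mean as the aggregation operator; both are illustrative choices, not the authors' method. The function names, feature names and scores are all hypothetical.

```python
from statistics import mean


def min_max_membership(scores):
    """Map raw scores to [0, 1] membership degrees in the fuzzy set 'important'."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:  # all features tie, so give each a neutral degree
        return {name: 0.5 for name in scores}
    return {name: (value - lo) / (hi - lo) for name, value in scores.items()}


def fuzzy_ensemble_select(scores_by_method, top_k):
    """Average the per-method memberships and keep the top_k features.

    Assumption: every method scores the same set of features and a
    higher score means a more important feature.
    """
    memberships = [min_max_membership(s) for s in scores_by_method.values()]
    features = sorted(memberships[0])
    aggregated = {f: mean(m[f] for m in memberships) for f in features}
    # Sort by descending aggregated degree; break ties alphabetically.
    ranked = sorted(features, key=lambda f: (-aggregated[f], f))
    return ranked[:top_k], aggregated


# Toy importance scores (made up for illustration, not taken from the study).
toy_scores = {
    "filter": {"age": 0.30, "bmi": 0.50, "knee_pain": 0.90, "height": 0.10},
    "wrapper": {"age": 2.0, "bmi": 4.0, "knee_pain": 3.0, "height": 1.0},
    "embedded": {"age": 0.20, "bmi": 0.25, "knee_pain": 0.45, "height": 0.10},
}

selected, degrees = fuzzy_ensemble_select(toy_scores, top_k=2)
print(selected)
```

Normalising each method's scores first keeps a method with a large raw scale (here the hypothetical wrapper scores) from dominating the aggregate. Other fuzzy operators, such as a minimum or a weighted mean, could be swapped in at the aggregation step.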
Pages: 219-229
Page count: 11