Early Thyroid Risk Prediction by Data Mining and Ensemble Classifiers

Cited by: 9
Authors
Alshayeji, Mohammad H. [1]
Affiliations
[1] Kuwait Univ, Coll Engn & Petr, Dept Comp Engn, POB 5969, Safat 13060, Kuwait
Keywords
machine learning; thyroid; data mining; ensemble model; feature engineering; SMOTE
DOI
10.3390/make5030061
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Thyroid disease is among the most prevalent endocrinopathies worldwide. Because the thyroid gland controls human metabolism, thyroid illness is a matter of concern for human health. To save time and reduce error rates, an automatic, reliable, and accurate machine-learning (ML) system for identifying thyroid disease is essential. The proposed model aims to address limitations of existing work, such as the lack of detailed feature analysis and visualization and limited prediction accuracy and reliability. Here, a public thyroid illness dataset containing 29 clinical features from the University of California, Irvine ML repository was used. The clinical features helped us build an ML model that can predict thyroid illness by analyzing early symptoms, replacing the manual analysis of these attributes. Feature analysis and visualization facilitate an understanding of the role of features in thyroid prediction tasks. In addition, the overfitting problem was eliminated by 5-fold cross-validation and by data balancing using the synthetic minority oversampling technique (SMOTE). Ensemble learning ensures prediction model reliability owing to the involvement of multiple classifiers in the prediction decisions. The proposed model achieved 99.5% accuracy, 99.39% sensitivity, and 99.59% specificity with the boosting method, which is applicable to real-time computer-aided diagnosis (CAD) systems to ease diagnosis and promote early treatment.
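The data-balancing step named in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours. This is not the paper's implementation. The function name `smote_oversample`, its parameters, and the toy data are hypothetical and chosen only for illustration.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class with two features (values invented for this sketch).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote_oversample(minority, n_new=6, k=3)
balanced_minority = minority + new_points
print(len(balanced_minority))  # 4 original + 6 synthetic samples
```

When oversampling is combined with k-fold cross-validation, it is commonly applied to the training folds only, so that the held-out fold contains no synthetic samples. Whether the paper follows this order is not stated in the abstract.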
Pages: 1195-1213
Page count: 19