An Improved Framework for Detecting Thyroid Disease Using Filter-Based Feature Selection and Stacking Ensemble

Cited by: 13
Authors
Obaido, George [1 ,2 ]
Achilonu, Okechinyere [3 ]
Ogbuokiri, Blessing [4 ]
Amadi, Chimeremma Sandra [5 ]
Habeebullahi, Lawal [6 ]
Ohalloran, Tony [7 ]
Chukwu, Chidozie Williams [8 ]
Mienye, Ebikella Domor [9 ]
Aliyu, Mikail [10 ]
Fasawe, Olufunke [10 ]
Modupe, Ibukunola Abosede [11 ]
Omietimi, Erepamo Job [12 ]
Aruleba, Kehinde [13 ]
Affiliations
[1] Univ Calif Berkeley, Berkeley Inst Data Sci, Berkeley, CA 94720 USA
[2] Univ Calif Berkeley, Ctr Human Compatible Artificial Intelligence, Berkeley, CA 94720 USA
[3] Univ Witwatersrand Johannesburg, Sch Publ Hlth, ZA-2017 Johannesburg, South Africa
[4] Brock Univ, Dept Comp Sci, St Catharines, ON L2S 3A1, Canada
[5] Fed Univ Technol Owerri FUTO, Dept Informat Technol, Owerri 460113, Nigeria
[6] Summit Univ Offa, Dept Comp Sci, Offa 250101, Nigeria
[7] Natl Univ Ireland, Sch Comp Sci, Galway H91 TK33, Ireland
[8] Wake Forest Univ, Dept Math, Winston Salem, NC 27106 USA
[9] Univ Johannesburg, Coll Business & Econ, ZA-2006 Johannesburg, South Africa
[10] Univ Calif Berkeley, Sch Publ Hlth, Berkeley, CA 94704 USA
[11] Vaal Univ Technol, Dept Comp Sci, Vanderbijlpark, ZA-1900, South Africa
[12] Univ Pretoria, Dept Geol, ZA-0028 Pretoria, South Africa
[13] Univ Leicester, Sch Comp & Math Sci, Leicester LE1 7RH, England
Source
IEEE Access | 2024 / Vol. 12
Keywords
Thyroid; Diseases; Predictive models; Feature extraction; Accuracy; Thyroid cancer; Artificial intelligence; Medical services; Ensemble learning; Machine learning; healthcare; machine learning; filter-based stacking ensemble learning; thyroid disease; SUPPORT VECTOR MACHINE; PREDICTION; CHALLENGES; MANAGEMENT; ALGORITHM; CARCINOMA; PAPILLARY; PATTERNS; TUMORS;
DOI
10.1109/ACCESS.2024.3418974
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In recent years, machine learning (ML) has become a pivotal tool for predicting and diagnosing thyroid disease. While many studies have explored individual ML models for thyroid disease detection, the accuracy and robustness of these single-model approaches are often constrained by data imbalance and inherent model biases. This study introduces an ML framework that combines filter-based feature selection with a stacking ensemble, tailored specifically for thyroid disease detection. The framework capitalizes on the collective strengths of multiple base models by aggregating their predictions, aiming to surpass the predictive performance of individual models. Because only a few clinical attributes are used for diagnosis, the approach can also reduce screening time and costs. In extensive experiments on a clinical thyroid disease dataset, the filter-based feature selection approach and the ensemble learning method demonstrated superior discriminative ability, reflected by a receiver operating characteristic area under the curve (ROC-AUC) score of 99.9%. The proposed framework sheds light on the complementary strengths of different base models, fostering a deeper understanding of their joint predictive performance. Our findings underscore the potential of ensemble strategies to significantly improve the efficacy of ML-based detection of thyroid diseases, marking a shift from reliance on single models to more robust, collective approaches.
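The general pattern the abstract describes (a filter-based selector feeding a stacking ensemble, evaluated by ROC-AUC) can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the filter criterion, the base models, the meta-learner, or the number of retained attributes, so the ANOVA F-test filter, the two tree-based base learners, the logistic-regression meta-learner, and `K_FEATURES` are all hypothetical choices, and the data are synthetic rather than the clinical dataset used in the study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier

K_FEATURES = 5  # hypothetical number of attributes kept by the filter

# Synthetic, imbalanced stand-in for a clinical dataset (NOT the study's data).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=5, n_redundant=2,
    weights=[0.85, 0.15], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Stacking: base models are fit on cross-validation folds, and their
# out-of-fold predictions become the inputs of the final (meta) estimator.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("tree", DecisionTreeClassifier(max_depth=4, random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

# Filter step: score each feature independently of any classifier
# (ANOVA F-test here, an assumed criterion) and keep the top K_FEATURES.
model = Pipeline([
    ("filter", SelectKBest(score_func=f_classif, k=K_FEATURES)),
    ("stack", stack),
])
model.fit(X_train, y_train)

selected = model.named_steps["filter"].get_support(indices=True)
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print("selected feature indices:", selected.tolist())
print("held-out ROC-AUC:", round(auc, 3))
```

Placing the filter inside the `Pipeline` means feature scores are computed from the training split only, which avoids leaking test information into the selection step. ROC-AUC is used rather than accuracy because it is less sensitive to the class imbalance that the abstract identifies as a limitation of single-model approaches.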
Pages: 89098-89112
Page count: 15