An explainable non-invasive hybrid machine learning framework for accurate prediction of thyroid-stimulating hormone levels

Cited by: 0
Authors
Mohammed, Areej [1]
Alshraideh, Hussam [2,3]
Abu-Helalah, Munir [4]
Shamayleh, Abdulrahim [2]
Affiliations
[1] Department of Industrial Engineering, Engineering Systems Management Program, American University of Sharjah, Sharjah
[2] Department of Industrial Engineering, American University of Sharjah, Sharjah
[3] Industrial Engineering Department, Jordan University of Science and Technology, Irbid
[4] Department of Family and Community Medicine, School of Medicine, University of Jordan, Public Health Institute, Amman
Keywords
Hyperthyroidism; Hypothyroidism; LIME; Machine learning; SHAP; Thyroid-stimulating hormone; TSH
DOI
10.1016/j.compbiomed.2025.109974
Abstract
Machine learning (ML) models are increasingly used in healthcare for biomarker prediction, including the prediction of thyroid biomarkers. These models offer the potential to enhance disease diagnosis through data-driven approaches relying on non-invasive techniques. However, no studies have explored the application of fully non-invasive methods for predicting thyroid-stimulating hormone (TSH) levels. Consequently, this study introduces a novel, fully non-invasive framework for predicting TSH levels based on a hybrid ML model. Seven ML models were evaluated, and the best-performing models were integrated into a hybrid approach that balances performance, complexity, and interpretability. A dataset of 6190 instances from Jordan was used for model development. Non-invasive factors spanning four dimensions (demographics, symptoms, family history, and newly engineered symptom scores) were incorporated into the model. The hybrid model achieved an R² of 94.2% and an RMSE of 0.015, demonstrating superior predictive performance. LIME and SHAP explainers were used to interpret the model and confirmed the critical role of aggregated symptom scores in enhancing prediction accuracy. A robust feature selection technique was implemented, reducing model complexity and enhancing performance. Among the top ten features for predicting TSH levels were hypothyroidism and hyperthyroidism symptom scores, family history, cold intolerance, itchy-dry skin, sweating, hand tremors, and palpitations. The model can be employed to develop cost-effective diagnostic tools for thyroid disorders. It also offers a robust framework that can be generalized to predict other biomarkers and applied in diverse contexts. © 2025 Elsevier Ltd
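The abstract does not say how the symptom scores are aggregated or how the best-performing models are combined, so the following is only a minimal sketch under stated assumptions: the symptom score is taken to be a simple count of binary symptom indicators, the hybrid prediction is taken to be a weighted average of two models' outputs, and all names (`HYPO_SYMPTOMS`, `symptom_score`, `hybrid_predict`) are hypothetical. The `r2` and `rmse` functions follow the standard definitions of the two metrics reported above.

```python
from math import sqrt

# Hypothetical groupings of symptoms named in the abstract; the paper's
# actual questionnaire items and groupings are not given here.
HYPO_SYMPTOMS = ("cold_intolerance", "itchy_dry_skin")
HYPER_SYMPTOMS = ("sweating", "hand_tremors", "palpitations")


def symptom_score(record, symptoms):
    """Assumed aggregation: count the binary (0/1) symptoms that are present."""
    return sum(record.get(name, 0) for name in symptoms)


def hybrid_predict(preds_a, preds_b, weight=0.5):
    """Assumed combination rule: weighted average of two models' predictions."""
    return [weight * a + (1 - weight) * b for a, b in zip(preds_a, preds_b)]


def rmse(y_true, y_pred):
    """Root mean squared error."""
    n = len(y_true)
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot
```

An R² reported as 94.2% corresponds to `r2(...)` returning about 0.942 on the evaluation data.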