Utilizing machine learning for early screening of thyroid nodules: a dual-center cross-sectional study in China

Cited by: 0
Authors
Weng, Shuwei [1 ,2 ]
Ding, Chen [3 ]
Hu, Die [1 ,2 ]
Chen, Jin [1 ,2 ]
Liu, Yang [4 ]
Liu, Wenwu [1 ,2 ]
Chen, Yang [1 ,2 ]
Guo, Xin [1 ,2 ]
Cao, Chenghui [1 ,2 ]
Yi, Yuting [1 ,2 ]
Yang, Yanyi [5 ,6 ]
Peng, Daoquan [1 ,2 ]
Affiliations
[1] Cent South Univ, Xiangya Hosp 2, Dept Cardiol, Changsha, Hunan, Peoples R China
[2] Res Inst Blood Lipid & Atherosclerosis, Changsha, Hunan, Peoples R China
[3] Soochow Univ, Affiliated Hosp 4, Suzhou Dushu Lake Hosp, Dept Cardiol,Med Ctr, Suzhou, Jiangsu, Peoples R China
[4] Third Mil Med Univ, Xinqiao Hosp,Army Med Univ, Chongqing Clin Res Ctr Kidney & Urol Dis, Dept Nephrol,Key Lab Prevent & Treatment Chron Kid, Chongqing, Peoples R China
[5] Cent South Univ, Xiangya Hosp 2, Hlth Management Ctr, Changsha, Hunan, Peoples R China
[6] Hunan Prov Clin Med Res Ctr Intelligent Management, Changsha, Hunan, Peoples R China
Source
FRONTIERS IN ENDOCRINOLOGY | 2024 / Volume 15
Funding
National Natural Science Foundation of China;
Keywords
thyroid nodule; machine learning; early screening; urine iodine; ensemble learning methods; IODINE INTAKE; ASSOCIATION; MANAGEMENT; DIAGNOSIS; AGE;
DOI
10.3389/fendo.2024.1385167
Chinese Library Classification number
R5 [Internal Medicine];
Discipline classification code
1002 ; 100201 ;
Abstract
Background: Thyroid nodules, increasingly prevalent globally, pose a risk of malignant transformation. Early screening is crucial for management, yet current models focus mainly on ultrasound features. This study explores machine learning for screening using demographic and biochemical indicators.
Methods: Analyzing data from 6,102 individuals and 61 variables, we identified 17 key variables to construct models using six machine learning classifiers: Logistic Regression, SVM, Multilayer Perceptron, Random Forest, XGBoost, and LightGBM. Performance was evaluated by accuracy, precision, recall, F1 score, specificity, kappa statistic, and AUC, with internal and external validations assessing generalizability. Shapley values determined feature importance, and Decision Curve Analysis evaluated clinical benefits.
Results: Random Forest showed the highest internal validation accuracy (78.3%) and AUC (89.1%). LightGBM demonstrated robust external validation performance. Key factors included age, gender, and urinary iodine levels. Clinical benefits were observed across various risk thresholds, particularly in ensemble models.
Conclusion: Machine learning, particularly ensemble methods, accurately predicts thyroid nodule presence using demographic and biochemical data. This cost-effective strategy offers valuable insights for thyroid health management, aiding in early detection and potentially improving clinical outcomes. These findings enhance our understanding of the key predictors of thyroid nodules and underscore the potential of machine learning in public health applications for early disease screening and prevention.
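As an illustration of the threshold-based evaluation metrics named in the abstract, the following minimal sketch computes accuracy, precision, recall, specificity, F1 score, and Cohen's kappa from confusion-matrix counts, plus the net benefit used in Decision Curve Analysis. It is not the authors' code: the names `Confusion`, `screening_metrics`, and `net_benefit` are hypothetical, and the counts in the usage lines are invented for illustration, not taken from the study. AUC is omitted because it requires predicted scores rather than counts.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Binary confusion-matrix counts (nodule present = positive)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def screening_metrics(c: Confusion) -> dict:
    """Compute count-based metrics; assumes no denominator is zero."""
    n = c.tp + c.fp + c.tn + c.fn
    accuracy = (c.tp + c.tn) / n
    precision = c.tp / (c.tp + c.fp)
    recall = c.tp / (c.tp + c.fn)          # also called sensitivity
    specificity = c.tn / (c.tn + c.fp)
    f1 = 2 * precision * recall / (precision + recall)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_yes = ((c.tp + c.fp) / n) * ((c.tp + c.fn) / n)
    p_no = ((c.tn + c.fn) / n) * ((c.tn + c.fp) / n)
    p_chance = p_yes + p_no
    kappa = (accuracy - p_chance) / (1 - p_chance)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "kappa": kappa,
    }


def net_benefit(c: Confusion, threshold: float) -> float:
    """Net benefit at a risk threshold p_t (0 < p_t < 1):
    TP/n - (FP/n) * p_t / (1 - p_t)."""
    n = c.tp + c.fp + c.tn + c.fn
    return c.tp / n - (c.fp / n) * threshold / (1 - threshold)


# Illustrative counts only (not data from the study).
example = Confusion(tp=40, fp=10, tn=40, fn=10)
metrics = screening_metrics(example)
benefit = net_benefit(example, threshold=0.5)
```

In a decision curve, `net_benefit` would be evaluated over a range of thresholds, with the confusion counts recomputed at each threshold, and compared against the "screen everyone" and "screen no one" strategies.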
Pages: 10
Related papers
50 in total
  • [31] Health care costs of cardiovascular disease in China: a machine learning-based cross-sectional study
    Lu, Mengjie
    Gao, Hong
    Shi, Chenshu
    Xiao, Yuyin
    Li, Xiyang
    Li, Lihua
    Li, Yan
    Li, Guohong
    FRONTIERS IN PUBLIC HEALTH, 2023, 11
  • [32] Using machine learning approach to predict depression and anxiety among patients with epilepsy in China: A cross-sectional study
    Wei, Zihan
    Wang, Xinpei
    Ren, Lei
    Liu, Chang
    Liu, Chao
    Cao, Mi
    Feng, Yan
    Gan, Yanjing
    Li, Guoyan
    Liu, Xufeng
    Liu, Yonghong
    Yang, Lei
    Deng, Yanchun
    JOURNAL OF AFFECTIVE DISORDERS, 2023, 336 : 1 - 8
  • [33] A study of machine learning models for rapid intraoperative diagnosis of thyroid nodules for clinical practice in China
    Ma, Yan
    Zhang, Xiuming
    Yi, Zhongliang
    Ding, Liya
    Cai, Bojun
    Jiang, Zhinong
    Liu, Wangwang
    Zou, Hong
    Wang, Xiaomei
    Fu, Guoxiang
    CANCER MEDICINE, 2024, 13 (03):
  • [34] Determinants of gastric cancer screening attendance in Southeastern China: a cross-sectional study
    Huang, Zhiwen
    Hu, Zhijian
    Wong, Li Ping
    Lin, Yulan
    BMJ OPEN, 2023, 13 (07):
  • [35] Comparison of Different Ultrasound Classification Systems of Thyroid Nodules for Identifying Malignant Potential: A Cross-sectional Study
    Chen, Hua
    Ye, Jun
    Song, Jianming
    You, Yuguang
    Chen, Weihua
    Liu, Yanna
    CLINICS, 2021, 76 : 1 - 7
  • [36] Predicting Gestational Diabetes Mellitus in the first trimester using machine learning algorithms: a cross-sectional study at a hospital fertility health center in Iran
    Bigdeli, Somayeh Kianian
    Ghazisaedi, Marjan
    Ayyoubzadeh, Seyed Mohammad
    Hantoushzadeh, Sedigheh
    Ahmadi, Marjan
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2025, 25 (01)
  • [37] Prediction of Diabetes Using Data Mining and Machine Learning Algorithms: A Cross-Sectional Study
    Shojaee-Mend, Hassan
    Velayati, Farnia
    Tayefi, Batool
    Babaee, Ebrahim
    HEALTHCARE INFORMATICS RESEARCH, 2024, 30 (01) : 73 - 82
  • [38] Profiling the Physical Performance of Young Boxers with Unsupervised Machine Learning: A Cross-Sectional Study
    Merlo, Rodrigo
    Rodriguez-Chavez, Angel
    Gomez-Castaneda, Pedro E.
    Rojas-Jaramillo, Andres
    Petro, Jorge L.
    Kreider, Richard B.
    Bonilla, Diego A.
    SPORTS, 2023, 11 (07)
  • [39] Sarcopenia feature selection and risk prediction using machine learning: A cross-sectional study
    Kang, Yang-Jae
    Yoo, Jun-Il
    Ha, Yong-chan
    MEDICINE, 2019, 98 (43)
  • [40] Explainable Machine Learning for Atrial Fibrillation in the General Population Using a Generalized Additive Model ― A Cross-Sectional Study ―
    Kawakami, Masaki
    Karashima, Shigehiro
    Morita, Kento
    Tada, Hayato
    Okada, Hirofumi
    Aono, Daisuke
    Kometani, Mitsuhiro
    Nomura, Akihiro
    Demura, Masashi
    Furukawa, Kenji
    Yoneda, Takashi
    Nambo, Hidetaka
    Kawashiri, Masa-aki
    CIRCULATION REPORTS, 2022, 4 (02) : 73 - 82