PrOsteoporosis: predicting osteoporosis risk using NHANES data and machine learning approach

Cited by: 0
Authors
Si, Zebing [1, 2]
Zhang, Di [3]
Wang, Huajun [1]
Zheng, Xiaofei [1]
Affiliations
[1] Jinan Univ, Affiliated Hosp 1, Guangdong Prov Key Lab Speed Capabil, Dept Sports Med,Guangzhou Key Lab Precis Orthoped, Guangzhou 510630, Peoples R China
[2] Yuebei Peoples Hosp, Dept Orthoped, 133 Shaoguan Huimin South Ave, Shaoguan 512026, Peoples R China
[3] Shaoguan Univ, Country Coll Informat Sci & Engn, Shaoguan, Guangdong, Peoples R China
Keywords
Osteoporosis; Machine learning; Risk; Model
DOI
10.1186/s13104-025-07089-3
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
Objectives: Osteoporosis, prevalent among the elderly population, is primarily diagnosed through bone mineral density (BMD) testing, which has limitations in early detection. This study aims to develop and validate a machine learning approach for osteoporosis identification by integrating demographic, laboratory, and questionnaire data, offering a more practical and effective screening alternative.

Methods: Data from the National Health and Nutrition Examination Survey (NHANES) were analyzed to explore factors linked to osteoporosis. After cleaning, 8766 participants with 223 variables were studied. Minimum Redundancy Maximum Relevance (mRMR) and SelectKBest were employed to select the important features. Four machine learning algorithms (random forest (RF), neural network (NN), LightGBM, and XGBoost) were applied to identify osteoporosis, and their performance was compared. The data were balanced using SMOTE, and metrics such as F1 score and AUC were evaluated for each algorithm.

Results: The LightGBM model outperformed the others, with an F1 score of 0.914, a Matthews correlation coefficient (MCC) of 0.831, and an AUC of 0.970 on the training set. On the test set, it achieved an F1 score of 0.912, an MCC of 0.826, and an AUC of 0.972. The top predictors of osteoporosis were height, age, and sex.

Conclusions: This study demonstrates the potential of machine learning models in assessing an individual's risk of developing osteoporosis, a condition that significantly impacts quality of life and imposes substantial healthcare costs. The superior performance of the LightGBM model suggests a promising tool for early detection and personalized prevention strategies. Importantly, identifying height, age, and sex as top predictors offers critical insight into the demographic and physiological factors that clinicians should consider when evaluating patients' risk profiles.
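The three evaluation metrics named in the abstract (F1 score, MCC, and AUC) can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the function names are hypothetical, the labels and scores are made-up toy values rather than NHANES data, and the 0.5 decision threshold is an assumption chosen only for the example.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (the Mann-Whitney formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy labels and predicted probabilities (illustrative values only).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print("F1 :", f1_score(tp, fp, fn))
print("MCC:", matthews_corrcoef(tp, tn, fp, fn))
print("AUC:", roc_auc(y_true, scores))
```

F1 and MCC depend on the chosen threshold, whereas AUC is computed from the ranking of the scores alone, which is one reason studies such as this one typically report them together.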
Pages: 10