Prediction of preterm birth using machine learning: a comprehensive analysis based on large-scale preschool children survey data in Shenzhen of China

Citations: 0
Authors
Ding, Liwen [1]
Yin, Xiaona [2]
Wen, Guomin [2]
Sun, Dengli [2]
Xian, Danxia [2]
Zhao, Yafen [2]
Zhang, Maolin [1]
Yang, Weikang [2]
Chen, Weiqing [1, 3]
Affiliations
[1] Sun Yat Sen Univ, Sch Publ Hlth, Dept Epidemiol & Hlth Stat, Guangzhou 510080, Peoples R China
[2] Womens & Childrens Hosp Longhua Dist Shenzhen, Shenzhen 518109, Peoples R China
[3] Xinhua Coll Guangzhou, Sch Hlth Management, Guangzhou 510080, Peoples R China
Keywords
Preterm birth; Machine learning; Prediction model; SHAP; Multiple pregnancies; Threatened abortion; ADVANCED MATERNAL AGE; SYSTEMATIC ANALYSIS; RISK-FACTORS; THREATENED MISCARRIAGE; PREGNANCY; HEALTH; MORTALITY; EPIDEMIOLOGY; DISCOVERY; OUTCOMES
DOI
10.1186/s12884-024-06980-4
Chinese Library Classification number
R71 [Obstetrics and Gynecology]
Discipline classification code
100211
Abstract
Background: Preterm birth (PTB) is a significant cause of neonatal mortality and long-term health issues. Accurate prediction and timely prevention of PTB are essential for reducing the associated child mortality and morbidity. Traditional predictive methods face challenges due to heterogeneous risk factors and their interaction effects. This study aims to develop and evaluate six machine learning (ML) models to predict PTB using large-scale survey data on preschool children from Shenzhen, China, and to identify key predictors through Shapley Additive Explanations (SHAP) analysis.
Methods: Data from 84,050 mother-child pairs, collected in 2021 and 2022, were processed and divided into training, validation, and test sets. Six ML models were tested: L1-Regularised Logistic Regression, Light Gradient Boosting Machine (LightGBM), Naive Bayes, Random Forests, Support Vector Machine, and Extreme Gradient Boosting (XGBoost). Model performance was evaluated based on discrimination, calibration, and clinical utility. SHAP analysis was used to interpret the importance and impact of individual features on PTB prediction.
Results: The XGBoost model demonstrated the best overall performance, with area under the receiver operating characteristic curve (AUC) values of 0.752 and 0.757 in the validation and test sets, respectively, along with favorable calibration and clinical utility. The key predictors identified were multiple pregnancies, threatened abortion, and maternal age at conception. SHAP analysis indicated that multiple pregnancies and threatened abortion increased the predicted risk of PTB, whereas micronutrient supplementation decreased it.
Conclusion: Our study found that ML models, particularly XGBoost, show promise in accurately predicting PTB and identifying key risk factors. These findings highlight the potential of ML for enhancing clinical interventions, personalizing prenatal care, and informing public health initiatives.
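The discrimination metric reported in the abstract, the AUC, can be read as the probability that a randomly chosen preterm case receives a higher predicted risk than a randomly chosen term case. The following is a minimal sketch of that definition in plain Python, not the authors' code: the function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study data.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance that a random positive outranks a random negative.

    A tie between a positive and a negative counts as half a win
    (the Mann-Whitney U formulation of the AUC).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (assumption): 1 = preterm birth, 0 = term birth.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_scores = [0.81, 0.35, 0.62, 0.40, 0.70, 0.55, 0.20, 0.55]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

The pairwise loop is quadratic in the number of cases, so it is only suitable for illustration. For a data set of the size described above, a rank-based implementation from a standard library would be the more practical choice.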
Pages: 14
Related papers
90 in total
[1]   Clinical applications of machine learning in cardiovascular disease and its relevance to cardiac imaging [J].
Al'Aref, Subhi J. ;
Anchouche, Khalil ;
Singh, Gurpreet ;
Slomka, Piotr J. ;
Kolli, Kranthi K. ;
Kumar, Amit ;
Pandey, Mohit ;
Maliakal, Gabriel ;
van Rosendael, Alexander R. ;
Beecy, Ashley N. ;
Berman, Daniel S. ;
Leipsic, Jonathan ;
Nieman, Koen ;
Andreini, Daniele ;
Pontone, Gianluca ;
Schoepf, U. Joseph ;
Shaw, Leslee J. ;
Chang, Hyuk-Jae ;
Narula, Jagat ;
Bax, Jeroen J. ;
Guan, Yuanfang ;
Min, James K. .
EUROPEAN HEART JOURNAL, 2019, 40 (24) :1975-+
[2]   Multiple Imputation for Incomplete Data in Environmental Epidemiology Research [J].
Allotey, Prince Addo ;
Harel, Ofer .
CURRENT ENVIRONMENTAL HEALTH REPORTS, 2019, 6 (02) :62-71
[3]   Using a cohort study of diabetes and peripheral artery disease to compare logistic regression and machine learning via random forest modeling [J].
Austin, Andrea M. ;
Ramkumar, Niveditta ;
Gladders, Barbara ;
Barnes, Jonathan A. ;
Eid, Mark A. ;
Moore, Kayla O. ;
Feinberg, Mark W. ;
Creager, Mark A. ;
Bonaca, Marc ;
Goodney, Philip P. .
BMC MEDICAL RESEARCH METHODOLOGY, 2022, 22 (01)
[4]   Uterine distention as a factor in birth timing: retrospective nationwide cohort study in Sweden [J].
Bacelis, Jonas ;
Juodakis, Julius ;
Waldorf, Kristina M. Adams ;
Sengpiel, Verena ;
Muglia, Louis J. ;
Zhang, Ge ;
Jacobsson, Bo .
BMJ OPEN, 2018, 8 (10)
[5]  
[Баринов С. В. Barinov S. V.], 2020, [Медицинский совет, Medical Council, Meditsinskii sovet], P144, DOI 10.21518/2079-701X-2020-3-144-150
[6]   Clinical and biochemical markers of spontaneous preterm birth in singleton and multiple pregnancies [J].
Barinov, Sergey V. ;
Di Renzo, Gian Carlo ;
Belinina, Antonina A. ;
Koliado, Olga V. ;
Remneva, Olga V. .
JOURNAL OF MATERNAL-FETAL & NEONATAL MEDICINE, 2022, 35 (25) :5724-5729
[7]   Prediction of preterm birth in nulliparous women using logistic regression and machine learning [J].
Belaghi, Reza Arabi ;
Beyene, Joseph ;
McDonald, Sarah D. .
PLOS ONE, 2021, 16 (06)
[8]  
Bitar G, 2024, Am J Perinatol, V41, P3115
[9]   National, regional, and worldwide estimates of preterm birth rates in the year 2010 with time trends since 1990 for selected countries: a systematic analysis and implications [J].
Blencowe, Hannah ;
Cousens, Simon ;
Oestergaard, Mikkel Z. ;
Chou, Doris ;
Moller, Ann-Beth ;
Narwal, Rajesh ;
Adler, Alma ;
Garcia, Claudia Vera ;
Rohde, Sarah ;
Say, Lale ;
Lawn, Joy E. .
LANCET, 2012, 379 (9832) :2162-2172
[10]   Global, regional, and national estimates of levels of preterm birth in 2014: a systematic review and modelling analysis [J].
Chawanpaiboon, Saifon ;
Vogel, Joshua P. ;
Moller, Ann-Beth ;
Lumbiganon, Pisake ;
Petzold, Max ;
Hogan, Daniel ;
Landoulsi, Sihem ;
Jampathong, Nampet ;
Kongwattanakul, Kiattisak ;
Laopaiboon, Malinee ;
Lewis, Cameron ;
Rattanakanokchai, Siwanon ;
Teng, Ditza N. ;
Thinkhamrop, Jadsada ;
Watananirun, Kanokwaroon ;
Zhang, Jun ;
Zhou, Wei ;
Gulmezoglu, A. Metin .
LANCET GLOBAL HEALTH, 2019, 7 (01) :E37-E46