Machine Learning-Based Risk Prediction of Discharge Status for Sepsis

Times cited: 0
Authors
Cai, Kaida [1, 2]
Lou, Yuqing [2]
Wang, Zhengyan [2]
Yang, Xiaofang [2]
Zhao, Xin [2, 3]
Affiliations
[1] Southeast Univ, Sch Publ Hlth, Nanjing 210009, Peoples R China
[2] Southeast Univ, Sch Math, Nanjing 210009, Peoples R China
[3] Southeast Univ, Key Lab Measurement & Control Complex Syst Engn, Minist Educ, Nanjing 210096, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
machine learning; feature selection; information gain; missing data imputation; sepsis; INTERNATIONAL CONSENSUS DEFINITIONS; IMPUTATION; MORTALITY; SELECTION;
DOI
10.3390/e26080625
Chinese Library Classification number
O4 [Physics]
Discipline classification code
0702
Abstract
Sepsis is a severe inflammatory response syndrome whose unclear pathogenesis and unstable patient discharge status make outcomes difficult to predict. In this study, we develop a machine learning-based method for predicting the discharge status of sepsis patients, with the aim of improving treatment decisions. To make the analysis robust to outliers, we use the minimum covariance determinant technique. Missing data are imputed with the random forest imputation method. For feature selection, we employ Lasso penalized logistic regression, which identifies significant predictors and reduces model complexity before more complex predictive methods are applied. The predictive analysis covers random forest, support vector machine, and XGBoost, and we compare their prediction performance with that of Lasso penalized logistic regression to identify the most effective approach. Each method is evaluated through ten repetitions of 10-fold cross-validation. In this comparison, XGBoost outperforms the other models on the sepsis data.
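The evaluation protocol named in the abstract, ten repetitions of 10-fold cross-validation, can be illustrated with a minimal standard-library sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the function names, the use of accuracy as the metric, the synthetic labels, and the majority-class baseline that stands in for the actual models (random forest, support vector machine, XGBoost, Lasso penalized logistic regression).

```python
import random
from statistics import mean


def repeated_kfold_indices(n_samples, n_splits=10, n_repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh shuffle for every repetition
        # Fold f takes every n_splits-th index, so fold sizes differ by at most 1.
        folds = [order[f::n_splits] for f in range(n_splits)]
        for f, test_idx in enumerate(folds):
            train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
            yield repeat, f, train_idx, test_idx


def repeated_cv_score(fit, predict, X, y, n_splits=10, n_repeats=10, seed=0):
    """Return mean held-out accuracy and the list of per-fold accuracies."""
    scores = []
    for _, _, train_idx, test_idx in repeated_kfold_indices(
        len(X), n_splits, n_repeats, seed
    ):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return mean(scores), scores


# Hypothetical stand-in model: always predicts the most frequent training label.
def fit_majority(X_train, y_train):
    return max(set(y_train), key=y_train.count)


def predict_majority(model, x):
    return model


# Synthetic data (assumption): 100 samples, 70 labeled 1 and 30 labeled 0.
X = [[float(i)] for i in range(100)]
y = [1 if i >= 30 else 0 for i in range(100)]

mean_acc, fold_scores = repeated_cv_score(fit_majority, predict_majority, X, y)
print(f"folds evaluated: {len(fold_scores)}, mean accuracy: {mean_acc:.3f}")
```

Swapping `fit_majority` and `predict_majority` for wrappers around real classifiers would let each candidate model be scored on the same 100 held-out folds, which is what makes the resulting averages comparable across methods.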
Pages: 12