Predicting hospital length of stay using machine learning on a large open health dataset

Authors
Jain, Raunak [1]
Singh, Mrityunjai [1]
Rao, A. Ravishankar [2]
Garg, Rahul [1]
Affiliations
[1] Indian Inst Technol, Delhi, India
[2] Fairleigh Dickinson Univ, Teaneck, NJ 07666 USA
Keywords
Machine learning; Artificial intelligence; Health informatics; Open data; Open-source software; Healthcare analytics; Reproducibility
DOI
10.1186/s12913-024-11238-y
Chinese Library Classification number
R19 [Health care organizations and services (health services administration)]
Abstract
Background: Governments worldwide are facing growing pressure to increase transparency, as citizens demand greater insight into decision-making processes and public spending. An example is the release of open healthcare data to researchers, as healthcare is one of the top economic sectors. Significant information systems development and computational experimentation are required to extract meaning and value from these datasets. We use a large open health dataset provided by the New York State Statewide Planning and Research Cooperative System (SPARCS) containing 2.3 million de-identified patient records. One of the fields in these records is a patient's length of stay (LoS) in a hospital, which is crucial in estimating healthcare costs and planning hospital capacity for future needs. It would therefore be very beneficial for hospitals to be able to predict the LoS early. Machine learning offers a potential solution, which is the focus of the current paper.

Methods: We investigated multiple machine learning techniques, including feature engineering, regression, and classification trees, to predict the LoS for all the hospital procedures currently available in the dataset. Whereas many researchers focus on LoS prediction for a specific disease, a unique feature of our model is its ability to simultaneously handle 285 diagnosis codes from the Clinical Classification System (CCS). We focused on the interpretability and explainability of input features and the resulting models. We developed separate models for newborns and non-newborns.

Results: The study yields promising results, demonstrating the effectiveness of machine learning in predicting LoS. The best R² scores achieved are 0.82 for newborns using linear regression and 0.43 for non-newborns using CatBoost regression. Restricting the analysis to cardiovascular disease improves the R² score to 0.62. The models not only perform well but also provide understandable insights. For instance, birth weight is used for predicting LoS in newborns, while diagnosis-related group classification proves valuable for non-newborns.

Conclusion: Our study showcases the practical utility of machine learning models in predicting LoS at patient admission. The emphasis on interpretability ensures that the models can be easily comprehended and replicated by other researchers. Healthcare stakeholders, including providers, administrators, and patients, stand to benefit significantly. The findings offer valuable insights for cost estimation and capacity planning, contributing to the overall enhancement of healthcare management and delivery.
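
As a rough illustration of the R² metric reported in the Results, the sketch below fits a one-feature least-squares line on made-up birth-weight and length-of-stay values and scores it. The data, the function names (`fit_line`, `r2_score`), and the variable names are hypothetical and are not taken from the paper or from SPARCS; the authors' actual pipeline is not described in this record.

```python
# Illustrative sketch only: synthetic values, not SPARCS records.
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares with one feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - y_bar) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy values: birth weight in grams, length of stay in days.
birth_weight_g = [1500, 2000, 2500, 3000, 3500, 4000]
los_days = [20, 12, 6, 3, 2, 2]

slope, intercept = fit_line(birth_weight_g, los_days)
predicted = [slope * w + intercept for w in birth_weight_g]
score = r2_score(los_days, predicted)
print(f"slope={slope:.4f} days/g, R^2={score:.2f}")
```

An R² of 1 means the predictions reproduce the observed values exactly, while a model that always predicts the mean LoS scores 0, which is the scale on which the 0.82, 0.43, and 0.62 figures above should be read.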
Pages: 29