Predicting hospital length of stay using machine learning on a large open health dataset

Times cited: 0
Authors
Jain, Raunak [1]
Singh, Mrityunjai [1]
Rao, A. Ravishankar [2]
Garg, Rahul [1]
Affiliations
[1] Indian Inst Technol, Delhi, India
[2] Fairleigh Dickinson Univ, Teaneck, NJ 07666 USA
Keywords
Machine learning; Artificial intelligence; Health informatics; Open data; Open-source software; Healthcare analytics; Reproducibility
DOI
10.1186/s12913-024-11238-y
Chinese Library Classification (CLC) number
R19 [Health care organizations and services (health services administration)]
Abstract
Background: Governments worldwide are facing growing pressure to increase transparency, as citizens demand greater insight into decision-making processes and public spending. An example is the release of open healthcare data to researchers, as healthcare is one of the top economic sectors. Significant information systems development and computational experimentation are required to extract meaning and value from these datasets. We use a large open health dataset provided by the New York State Statewide Planning and Research Cooperative System (SPARCS) containing 2.3 million de-identified patient records. One of the fields in these records is a patient's length of stay (LoS) in a hospital, which is crucial in estimating healthcare costs and planning hospital capacity for future needs. Hence it would be very beneficial for hospitals to be able to predict the LoS early. The area of machine learning offers a potential solution, which is the focus of the current paper.

Methods: We investigated multiple machine learning techniques including feature engineering, regression, and classification trees to predict the length of stay (LoS) of all the hospital procedures currently available in the dataset. Whereas many researchers focus on LoS prediction for a specific disease, a unique feature of our model is its ability to simultaneously handle 285 diagnosis codes from the Clinical Classification System (CCS). We focused on the interpretability and explainability of input features and the resulting models. We developed separate models for newborns and non-newborns.

Results: The study yields promising results, demonstrating the effectiveness of machine learning in predicting LoS. The best R² scores achieved are noteworthy: 0.82 for newborns using linear regression and 0.43 for non-newborns using CatBoost regression. Focusing on cardiovascular disease refines the predictive capability, achieving an improved R² score of 0.62. The models not only demonstrate high performance but also provide understandable insights. For instance, birth-weight is employed for predicting LoS in newborns, while diagnostic-related group classification proves valuable for non-newborns.

Conclusion: Our study showcases the practical utility of machine learning models in predicting LoS during patient admittance. The emphasis on interpretability ensures that the models can be easily comprehended and replicated by other researchers. Healthcare stakeholders, including providers, administrators, and patients, stand to benefit significantly. The findings offer valuable insights for cost estimation and capacity planning, contributing to the overall enhancement of healthcare management and delivery.
Pages: 29