Building predictive models of healthcare costs with open healthcare data

Cited by: 2
Authors
Rao, A. Ravishankar [1]
Garai, Subrata
Dey, Soumyabrata [2]
Peng, Hang
Affiliations
[1] Fairleigh Dickinson Univ, Teaneck, NJ 07666 USA
[2] Clarkson Univ, Potsdam, NY USA
Source
2020 8TH IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS (ICHI 2020) | 2020
Keywords
open health data; predictive models; cost prediction
DOI
10.1109/ICHI48887.2020.9374348
CLC number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Due to rapidly rising healthcare costs worldwide, there is significant interest in controlling them. An important aspect concerns price transparency, as preliminary efforts have demonstrated that patients will shop for lower costs, driving efficiency. This requires both that the data be made available and that models exist that can predict healthcare costs for a wide range of patient demographics and conditions. We present an approach to this problem by developing a predictive model using machine-learning techniques. We analyzed de-identified patient data from the New York State SPARCS (Statewide Planning and Research Cooperative System), consisting of 2.3 million records from 2016. We built models to predict costs from patient diagnoses and demographics. We investigated two model classes: sparse regression and decision trees. We obtained the best performance with a decision tree of depth 10, which gave an R² value of 0.76, better than the values reported in the literature for similar problems.
Pages: 486-488
Page count: 3
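
Illustrative sketch
The abstract reports a depth-limited decision tree scored by R². The following is a minimal sketch of that idea in plain Python, not the authors' implementation: the features (`age_group`, `severity`), the cost formula, and all function names are hypothetical, and the toy data stands in for the SPARCS records. In practice a library implementation would likely be used instead of a hand-written tree.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def sse(values):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def fit_tree(rows, targets, max_depth):
    """Fit a regression tree; a leaf is a number (the mean target),
    an internal node is a tuple (feature, threshold, left, right)."""
    if max_depth == 0 or len(set(targets)) <= 1:
        return mean(targets)
    best = None
    for j in range(len(rows[0])):
        # Candidate thresholds: every distinct value except the largest,
        # so both sides of the split are non-empty.
        for threshold in sorted({r[j] for r in rows})[:-1]:
            left = [t for r, t in zip(rows, targets) if r[j] <= threshold]
            right = [t for r, t in zip(rows, targets) if r[j] > threshold]
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, j, threshold)
    if best is None:  # all rows identical: no split possible
        return mean(targets)
    _, j, threshold = best
    left_idx = [i for i, r in enumerate(rows) if r[j] <= threshold]
    right_idx = [i for i, r in enumerate(rows) if r[j] > threshold]
    return (
        j,
        threshold,
        fit_tree([rows[i] for i in left_idx],
                 [targets[i] for i in left_idx], max_depth - 1),
        fit_tree([rows[i] for i in right_idx],
                 [targets[i] for i in right_idx], max_depth - 1),
    )


def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's mean cost."""
    while isinstance(tree, tuple):
        j, threshold, left, right = tree
        tree = left if row[j] <= threshold else right
    return tree


# Hypothetical toy data: (age_group, severity) -> cost. Not SPARCS data.
rows = [(age, sev) for age in range(5) for sev in range(1, 5)]
costs = [1000.0 * sev + 200.0 * age for age, sev in rows]

tree = fit_tree(rows, costs, max_depth=10)
train_r2 = r_squared(costs, [predict(tree, r) for r in rows])
```

Here `train_r2` is computed on the training rows only, so it overstates how well such a tree would generalize; a figure comparable to the reported 0.76 would need to be measured on held-out records.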