Who determines United States Healthcare out-of-pocket costs? Factor ranking and selection using ensemble learning

Cited by: 3
Authors
Zhang, Chengcheng [1]
Ding, Yujia [1,2]
Peng, Qidi [1,2]
Affiliations
[1] Claremont Grad Univ, Dept Econ Sci, 150 E 10th St, Claremont, CA 91711 USA
[2] Claremont Grad Univ, Inst Math Sci, 150 E 10th St, Claremont, CA USA
Keywords
Out-of-pocket costs; Health insurance; Variable importance rankings; Ensemble learning; INSURANCE; CANCER; EXPENDITURES; CHILDREN; BURDEN; IMPACT;
DOI
10.1007/s13755-021-00153-9
Chinese Library Classification number
R-058 [];
Subject classification code
Abstract
Purpose: Healthcare out-of-pocket (OOP) costs are the annual expenses paid by individuals or families that are not reimbursed by insurance. In the U.S., the rapid increase in OOP costs is widening healthcare disparities. With a precise forecast of OOP costs, governments can design healthcare policies that better control them. This study designs a purely data-driven ensemble learning procedure to identify a collection of factors that best predicts OOP costs.
Methods: We propose a voting ensemble learning procedure to rank and select factors of OOP costs based on the Medical Expenditure Panel Survey dataset. The method uses votes from four base learners: forward subset selection, backward subset selection, random forest, and LASSO.
Results: The top-ranking factors selected by our proposed method are insurance type, age, asthma, family size, race, and number of physician office visits. Predictive models using these factors outperform models that employ the factors commonly considered in the literature, reducing the prediction error (test MSE of the OOP costs' log-odds) from 0.462 to 0.382.
Conclusion: Our results identify a set of factors that best explains OOP cost behavior using a purely data-driven approach. These findings contribute to the discussions regarding demand-side needs for containing rapidly rising OOP costs. Instead of estimating the impact of a single factor on OOP costs, our proposed method allows for the selection of a factor set of arbitrary size to best explain OOP costs.
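The voting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how votes are counted or how ties are broken, so the sketch assumes each base learner casts one vote per factor it selects and that ties are broken alphabetically. The helper names `rank_by_votes` and `select_top`, the `min_votes` threshold, and the per-learner factor sets are all hypothetical placeholders, not results from the paper.

```python
from collections import Counter

# Hypothetical factor sets, one per base learner named in the abstract.
# The memberships below are made up for illustration only; they are NOT
# the selections reported in the paper.
selections = {
    "forward_selection": {"insurance_type", "age", "family_size"},
    "backward_selection": {"insurance_type", "age", "race"},
    "random_forest": {"insurance_type", "age", "asthma", "office_visits"},
    "lasso": {"insurance_type", "asthma", "family_size"},
}


def rank_by_votes(selections):
    """Return (factor, votes) pairs, most votes first, ties broken by name."""
    votes = Counter()
    for chosen in selections.values():
        votes.update(chosen)  # one vote per learner per selected factor
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))


def select_top(selections, min_votes):
    """Keep the factors chosen by at least `min_votes` base learners."""
    return [f for f, v in rank_by_votes(selections) if v >= min_votes]


ranking = rank_by_votes(selections)
majority = select_top(selections, min_votes=2)

for factor, count in ranking:
    print(f"{factor}: {count} vote(s)")
```

Varying `min_votes` changes the size of the selected factor set, which is one simple way to obtain a set of arbitrary size from a single vote tally.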
Pages: 20
Related papers
43 entries in total
[1]  
Ashman J.J., 2019, Characteristics of office-based physician visits, 2016. NCHS Data Brief
[2]   A new correlation coefficient between categorical, ordinal and interval variables with Pearson characteristics [J].
Baak, M. ;
Koopman, R. ;
Snoek, H. ;
Klous, S. .
COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2020, 152
[3]   High Out-of-Pocket Medical Spending among the Poor and Elderly in Nine Developed Countries [J].
Baird, Katherine .
HEALTH SERVICES RESEARCH, 2016, 51 (04) :1467-1488
[4]  
Barnes Patricia M, 2008, Natl Health Stat Report, P1
[5]  
Breiman L., 2002, MANUAL SETTING USING, V1, P58
[6]  
Carrier E, 2014, AM J MANAG CARE, V20, P925
[7]   Primary language and receipt of recommended health care among Hispanics in the United States [J].
Cheng, Eric M. ;
Chen, Alex ;
Cunningham, William .
JOURNAL OF GENERAL INTERNAL MEDICINE, 2007, 22 (Suppl 2) :283-288
[8]   The Medical Expenditure Panel Survey A National Information Resource to Support Healthcare Cost Research and Inform Policy and Practice [J].
Cohen, Joel W. ;
Cohen, Steven B. ;
Banthin, Jessica S. .
MEDICAL CARE, 2009, 47 (07) :S44-S50
[9]   Pronounced Gender And Age Differences Are Evident In Personal Health Care Spending Per Person [J].
Cylus, Jonathan ;
Hartman, Micah ;
Washington, Benjamin ;
Andrews, Kimberly ;
Catlin, Aaron .
HEALTH AFFAIRS, 2011, 30 (01) :153-160
[10]   Immigrants and Health Care Access, Quality, and Cost [J].
Derose, Kathryn Pitkin ;
Bahney, Benjamin W. ;
Lurie, Nicole ;
Escarce, Jose J. .
MEDICAL CARE RESEARCH AND REVIEW, 2009, 66 (04) :355-408