Identifying Medicare Provider Fraud with Unsupervised Machine Learning

Cited by: 30
Authors
Bauder, Richard A. [1 ]
da Rosa, Raquel C. [1 ]
Khoshgoftaar, Taghi M. [1 ]
Affiliation
[1] Florida Atlantic Univ, Boca Raton, FL 33431 USA
Source
2018 IEEE INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION (IRI) | 2018
Keywords
Medicare Part B; Outlier Detection; LEIE; Medicare Fraud; Unsupervised Machine Learning
DOI
10.1109/IRI.2018.00051
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
With the growing number of people aged 65 and older, healthcare programs are increasingly relied on for quality, affordable care. Given these and other factors, healthcare spending continues to increase, particularly for the elderly. Medicare is one such program affected by the aging population. Fraud in the United States (U.S.) Medicare program is an ongoing issue that results in higher healthcare costs for beneficiaries. In this paper, we present an empirical study of several unsupervised machine learning methods for detecting outliers, which may indicate fraudulent medical providers, using the Medicare Part B big dataset. We employ two methods, Isolation Forest and Unsupervised Random Forest, which have not previously been used for the detection of Medicare fraud, along with more commonly used methods, including Local Outlier Factor, autoencoders, and k-Nearest Neighbors. To validate the fraud detection performance of each method, we use the List of Excluded Individuals/Entities (LEIE) database, which contains information on excluded providers. Moreover, we present details on processing the Part B data and incorporating the LEIE fraud labels. Our results indicate that Local Outlier Factor is the best outlier detection method, while k-Nearest Neighbors (with 5 neighbors) and autoencoders are the worst at detecting Medicare Part B fraud.
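The evaluation idea in the abstract (score every provider with an unsupervised outlier detector, then check the scores against held-out exclusion labels) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn, uses synthetic data in place of the Medicare Part B features, uses made-up labels in place of LEIE exclusions, and covers only three of the five methods (autoencoders and Unsupervised Random Forest are omitted). The use of ROC AUC as the comparison metric, and all names and parameter values, are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.metrics import roc_auc_score
from sklearn.neighbors import LocalOutlierFactor, NearestNeighbors

rng = np.random.default_rng(0)

# Synthetic stand-in for per-provider features (NOT Medicare Part B data).
normal = rng.normal(0.0, 1.0, size=(500, 4))
# Hypothetical anomalies: points pushed toward random corners far from the bulk.
signs = rng.choice([-1.0, 1.0], size=(10, 4))
anomalous = 5.0 * signs + rng.normal(0.0, 1.0, size=(10, 4))
X = np.vstack([normal, anomalous])

# Hypothetical labels standing in for LEIE exclusions: 1 = excluded provider.
# They are used only for evaluation, never for fitting.
y = np.concatenate([np.zeros(500, dtype=int), np.ones(10, dtype=int)])


def outlier_scores(X):
    """Return method name -> score array, where higher means more anomalous."""
    iso = IsolationForest(n_estimators=100, random_state=0).fit(X)
    lof = LocalOutlierFactor(n_neighbors=20).fit(X)
    # k-NN score: distance to the 5th nearest neighbor. Six neighbors are
    # requested because each training point is returned as its own neighbor.
    nn = NearestNeighbors(n_neighbors=6).fit(X)
    dist, _ = nn.kneighbors(X)
    return {
        # score_samples is lower for more abnormal points, so negate it.
        "isolation_forest": -iso.score_samples(X),
        # negative_outlier_factor_ is the negated LOF, so negate it back.
        "local_outlier_factor": -lof.negative_outlier_factor_,
        "knn_5": dist[:, -1],
    }


scores = outlier_scores(X)
# Compare methods by how well their scores rank the labeled providers.
auc = {name: roc_auc_score(y, s) for name, s in scores.items()}
for name, value in auc.items():
    print(f"{name}: AUC = {value:.3f}")
```

On well-separated synthetic data like this, all three detectors rank the injected anomalies highly, so the sketch says nothing about the relative ranking reported in the abstract; it only shows the shape of the score-then-validate procedure.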
Pages: 285-292
Number of pages: 8