Identifying Medicare Provider Fraud with Unsupervised Machine Learning

Cited by: 30
Authors
Bauder, Richard A. [1 ]
da Rosa, Raquel C. [1 ]
Khoshgoftaar, Taghi M. [1 ]
Affiliation
[1] Florida Atlantic Univ, Boca Raton, FL 33431 USA
Source
2018 IEEE INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION (IRI) | 2018
Keywords
Medicare Part B; Outlier Detection; LEIE; Medicare Fraud; Unsupervised Machine Learning
DOI
10.1109/IRI.2018.00051
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
With the growing number of people aged 65 and older, healthcare programs are increasingly relied on for quality, affordable care. Given these and other factors, healthcare spending continues to increase, particularly for the elderly. Medicare is one such program affected by the aging population. Fraud in the United States (U.S.) Medicare program is an ongoing issue that results in higher healthcare costs for beneficiaries. In this paper, we present an empirical study of several unsupervised machine learning methods for detecting outliers, which may indicate fraudulent medical providers, using the Medicare Part B big dataset. We employ two methods, Isolation Forest and Unsupervised Random Forest, which have not previously been used for the detection of Medicare fraud, along with more commonly used methods, including Local Outlier Factor, autoencoders, and k-Nearest Neighbors. To validate the fraud detection performance of each method, we use the List of Excluded Individuals/Entities (LEIE) database, which contains information on excluded providers. Moreover, we present details on processing the Part B data and incorporating the LEIE fraud labels. Our results indicate that Local Outlier Factor is the best outlier detection method, while k-Nearest Neighbors (with 5 neighbors) and autoencoders are the worst at detecting Medicare Part B fraud.
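The general workflow the abstract describes (score each provider without using labels, then check the scores against held-out exclusion labels) can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the two features are placeholders, the labels merely stand in for LEIE exclusions, and the use of ROC AUC as the evaluation metric is an assumption, since the abstract does not name one. All function names are hypothetical. Only the k-Nearest Neighbors distance score with 5 neighbors is shown, using the Python standard library.

```python
import math
import random


def knn_outlier_scores(points, k=5):
    """Score each point by the distance to its k-th nearest neighbor.

    Larger scores mean the point lies in a sparser region (more outlying).
    """
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Labels are 1 (positive) or 0 (negative).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def make_synthetic_providers(n_normal=300, n_excluded=5, seed=0):
    """Build toy data (an assumption, not Medicare Part B records).

    Normal providers cluster around the origin; the few "excluded"
    providers are scattered far from that cluster, mimicking the heavy
    class imbalance typical of fraud labels.
    """
    rng = random.Random(seed)
    points, labels = [], []
    for _ in range(n_normal):
        points.append((rng.gauss(0, 1), rng.gauss(0, 1)))
        labels.append(0)
    for _ in range(n_excluded):
        radius = rng.uniform(10, 14)
        angle = rng.uniform(0, 2 * math.pi)
        points.append((radius * math.cos(angle), radius * math.sin(angle)))
        labels.append(1)
    return points, labels


# Labels are used only for evaluation, never for scoring.
points, labels = make_synthetic_providers()
scores = knn_outlier_scores(points, k=5)
auc = roc_auc(labels, scores)
print(f"kNN (k=5) outlier-score AUC on synthetic data: {auc:.3f}")
```

Because the toy outliers are deliberately well separated, the score here says nothing about how k-Nearest Neighbors performs on real Part B data, where the abstract reports it among the weakest methods.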
Pages: 285-292
Page count: 8