Healthcare Fraud Detection using Machine Learning

Cited by: 1

Authors:

Prova, Nuzhat Noor Islam [1]

Affiliation:

[1] Pace Univ, Seidenberg Sch CSIS, New York, NY 10038 USA

Source:

2024 SECOND INTERNATIONAL CONFERENCE ON INTELLIGENT CYBER PHYSICAL SYSTEMS AND INTERNET OF THINGS, ICOICI 2024 | 2024

Keywords:

Healthcare Fraud Detection; Machine Learning; Deep Learning; Isolation Forest; Random Forest; SVM; XGBoost; Stacking Ensemble; Neural Network; Hyperparameter Tuning and Optimization; Model Interpretability; Real-Time Fraud Detection; Automated Model Retraining; PREDICTION;

DOI:

10.1109/ICOICI62503.2024.10696476

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Healthcare fraud in the United States represents a considerable illicit financial drain, with estimates suggesting annual losses of tens of billions of dollars. Such fraud encompasses a wide array of schemes, including but not limited to billing for services not rendered, upcoding to receive higher reimbursements, and unlawful kickback arrangements. Recognizing the criticality of this issue, this research analyzes the use of machine learning techniques for detecting healthcare fraud. Using a dataset that combines inpatient and outpatient claims data with beneficiary information across 558,211 records, this study illustrates the application of several ML models, including Random Forest, XGBoost, SVM, Isolation Forest, a Deep Learning Model, and a Stacking Ensemble approach. The models are evaluated on accuracy, precision, recall, F1 score, and ROC AUC score, with a particular focus on their applicability to healthcare fraud detection. Among the models evaluated, the Stacking Ensemble Model proved particularly effective, achieving an accuracy of 92.79% and a ROC AUC score of 96.95%. In addition to hyperparameter tuning, the study improves interpretability and decision-making through SHAP value analysis, which offers insight into model predictions and feature importances. It also introduces a real-time healthcare fraud detection pipeline and an automated model retraining framework, intended to keep the system effective against evolving fraud tactics by continuously adapting and improving.
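As an illustration of the five evaluation metrics named in the abstract, the following is a minimal pure-Python sketch of how accuracy, precision, recall, F1 score, and ROC AUC are computed for a binary fraud classifier. The labels, scores, the 0.5 decision threshold, and the function names are all hypothetical and chosen only for illustration; none of them come from the paper or its dataset.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels where 1 means 'fraud'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 computed from hard predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive outscores
    a random negative, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = fraudulent claim, 0 = legitimate claim.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.7]  # model-assigned fraud scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]    # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
metrics["roc_auc"] = roc_auc(y_true, scores)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The first four metrics depend on the chosen threshold, while ROC AUC is computed from the raw scores and is threshold-independent, which is one reason both kinds of measure are usually reported together for imbalanced problems such as fraud detection.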

Citation

Pages: 1119-1123

Page count: 5

22 references in total

[1] Agarwal S., 2023, Scholars J Eng Technol, V11, P191, DOI 10.36347/sjet.2023.v11i09.003

[2] Agrawal M., 2019, Learning, V6, P7, DOI 10.35940/ijrte.b3048.078219

[3] Medicare Fraud Detection using Machine Learning Methods [J].

Bauder, Richard A. ;

Khoshgoftaar, Taghi M. .

2017 16TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2017, :858-865

[4] Disease Prediction by Machine Learning Over Big Data From Healthcare Communities [J].

Chen, Min ;

Hao, Yixue ;

Hwang, Kai ;

Wang, Lu ;

Wang, Lin .

IEEE ACCESS, 2017, 5 :8869-8879

[5] Optimal feature extraction and classification-oriented medical insurance prediction model: machine learning integrated with the internet of things [J].

Chowdhury S. ;

Mayilvahanan P. ;

Govindaraj R. .

International Journal of Computers and Applications, 2022, 44 (03) :278-290

[6] Duman E., 2022, Techno-Science, V5, P69

[7] Data-Centric AI for Healthcare Fraud Detection [J].

Johnson J.M. ;

Khoshgoftaar T.M. .

SN Computer Science, 4 (4)

[8] Machine Learning-Based Regression Framework to Predict Health Insurance Premiums [J].

Kaushik, Keshav ;

Bhardwaj, Akashdeep ;

Dwivedi, Ashutosh Dhar ;

Singh, Rajani .

INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH, 2022, 19 (13)

[9] Liu Q, 2013, P 29 WORLD CONT AUD, P1

[10] A Framework For Fraud Detection in Government Supported National Healthcare Programs [J].

Matloob, Irum ;

Khan, Shoab .

PROCEEDINGS OF THE 11TH INTERNATIONAL CONFERENCE ON ELECTRONICS, COMPUTERS AND ARTIFICIAL INTELLIGENCE (ECAI-2019), 2019,
