Modeling Insurance Fraud Detection Using Imbalanced Data Classification

Cited by: 40
Authors
Hassan, Amira Kamil Ibrahim [1, 2]
Abraham, Ajith [1, 3]
Affiliations
[1] Sudan Univ Sci & Technol, Dept Comp Sci, Khartoum, Sudan
[2] MIR Labs, Auburn, WA USA
[3] VSB Tech Univ Ostrava, IT4Innovat, Ostrava, Czech Republic
Source
ADVANCES IN NATURE AND BIOLOGICALLY INSPIRED COMPUTING | 2016 / Vol. 419
Keywords
Insurance fraud detection; Imbalanced data; Decision tree; Support vector machine and artificial neural network; AUTOMOBILE INSURANCE; CLAIMS;
DOI
10.1007/978-3-319-27400-3_11
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes an innovative insurance fraud detection method that deals with imbalanced data distributions. The idea is to build insurance fraud detection models using a decision tree (DT), a support vector machine (SVM), and an artificial neural network (ANN) on data partitions derived by under-sampling the majority class (with and without replacement) and merging each sample with the minority class. Ten-fold cross-validation is used for testing throughout the paper. The originality of the approach lies in evaluating several partitioning under-sampling approaches and choosing the best one. Results on a publicly available automobile insurance fraud detection data set show that DT performs slightly better than the other algorithms, so the DT model was used to compare the different partitioning under-sampling approaches. Empirical results indicate that the proposed model gave better results.
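The partitioning step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `undersample_partition`, the toy data, and the 1:1 class ratio are assumptions made for illustration, since the abstract does not state the partition sizes or ratios that were compared. The sketch only shows the difference between drawing the majority class with and without replacement before merging it with the full minority class.

```python
import random


def undersample_partition(majority, minority, n_majority, with_replacement, seed=0):
    """Draw n_majority rows from the majority class and merge them with
    every minority-class row (hypothetical helper, for illustration only)."""
    rng = random.Random(seed)
    if with_replacement:
        # The same majority row may be drawn more than once.
        sampled = rng.choices(majority, k=n_majority)
    else:
        # Every drawn row is distinct; requires n_majority <= len(majority).
        sampled = rng.sample(majority, k=n_majority)
    partition = sampled + list(minority)
    rng.shuffle(partition)
    return partition


# Toy data (assumed): (claim_id, label), where 1 = fraud, 0 = legitimate.
majority = [(i, 0) for i in range(940)]
minority = [(i, 1) for i in range(940, 1000)]

without_repl = undersample_partition(
    majority, minority, n_majority=len(minority), with_replacement=False
)
with_repl = undersample_partition(
    majority, minority, n_majority=len(minority), with_replacement=True
)
```

Each resulting partition could then be passed to a classifier and scored with ten-fold cross-validation, as the abstract describes; that evaluation step is left out here to keep the sketch free of third-party dependencies.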
Pages: 117-127
Number of pages: 11