Modeling Insurance Fraud Detection Using Imbalanced Data Classification

Cited by: 40
Authors
Hassan, Amira Kamil Ibrahim [1 ,2 ]
Abraham, Ajith [1 ,3 ]
Affiliations
[1] Sudan Univ Sci & Technol, Dept Comp Sci, Khartoum, Sudan
[2] MIR Labs, Auburn, WA USA
[3] VSB Tech Univ Ostrava, IT4Innovat, Ostrava, Czech Republic
Source
ADVANCES IN NATURE AND BIOLOGICALLY INSPIRED COMPUTING | 2016 / Vol. 419
Keywords
Insurance fraud detection; Imbalanced data; Decision tree; Support vector machine and artificial neural network; AUTOMOBILE INSURANCE; CLAIMS;
DOI
10.1007/978-3-319-27400-3_11
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes an insurance fraud detection method designed to handle imbalanced data distributions. The approach builds fraud detection models using a decision tree (DT), a support vector machine (SVM), and an artificial neural network (ANN) on data partitions obtained by under-sampling the majority class (with and without replacement) and merging them with the minority class. Ten-fold cross-validation is used for testing throughout the paper. The originality of the work lies in evaluating several partitioning under-sampling approaches and choosing the best one. Results on a publicly available automobile insurance fraud detection data set show that DT performs slightly better than the other algorithms, so the DT model was used to compare the different partitioning under-sampling approaches. Empirical results indicate that the proposed model gave better results.
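The partitioning under-sampling step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `undersample_partitions`, its parameters, and the toy data are all hypothetical, and the abstract does not say how many partitions were used or how they were sized. Classifier training and ten-fold cross-validation are left out.

```python
import random

def undersample_partitions(majority, minority, n_partitions, replace=False, seed=0):
    """Return n_partitions training sets, each one slice of the majority
    class merged with the full minority class (hypothetical helper)."""
    rng = random.Random(seed)
    size = len(majority) // n_partitions
    if size == 0:
        raise ValueError("n_partitions exceeds the number of majority rows")
    if replace:
        # With replacement: every partition is an independent draw,
        # so a majority row may appear more than once.
        chunks = [rng.choices(majority, k=size) for _ in range(n_partitions)]
    else:
        # Without replacement: shuffle once, then cut into disjoint slices.
        shuffled = list(majority)
        rng.shuffle(shuffled)
        chunks = [shuffled[i * size:(i + 1) * size] for i in range(n_partitions)]
    return [chunk + list(minority) for chunk in chunks]

# Toy data (assumed): 90 legitimate claims (label 0), 10 fraudulent (label 1).
legit = [(i, 0) for i in range(90)]
fraud = [(i, 1) for i in range(90, 100)]

parts = undersample_partitions(legit, fraud, n_partitions=9)
parts_repl = undersample_partitions(legit, fraud, n_partitions=9, replace=True)
```

Each resulting partition could then be used to train one candidate model, with the partitions compared on a common validation metric to pick the best under-sampling variant.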
Pages: 117-127
Page count: 11