The application of data mining techniques in financial fraud detection: A classification framework and an academic review of literature

Cited by: 589
Authors
Ngai, E. W. T. [2]
Hu, Yong [1]
Wong, Y. H. [2]
Chen, Yijun [1]
Sun, Xin [1]
Affiliations
[1] Sun Yat Sen Univ, Guangdong Univ Foreign Studies, Dept E Commerce, Inst Business Intelligence & Knowledge Discovery, Guangzhou 510006, Guangdong, Peoples R China
[2] Hong Kong Polytech Univ, Dept Management & Mkt, Kowloon, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Financial fraud; Fraud detection; Literature review; Data mining; Business intelligence; AUTOMOBILE INSURANCE FRAUD; HEALTH-CARE FRAUD; CHOICE MODELS; CLAIMS; MANAGEMENT; AUDITORS; SYSTEM;
DOI
10.1016/j.dss.2010.08.006
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a review of, and classification scheme for, the literature on the application of data mining techniques for the detection of financial fraud. Although financial fraud detection (FFD) is an emerging topic of great importance, a comprehensive literature review of the subject has yet to be carried out. This paper thus represents the first systematic, identifiable and comprehensive academic literature review of the data mining techniques that have been applied to FFD. Forty-nine journal articles on the subject published between 1997 and 2008 were analyzed and classified into four categories of financial fraud (bank fraud, insurance fraud, securities and commodities fraud, and other related financial fraud) and six classes of data mining techniques (classification, regression, clustering, prediction, outlier detection, and visualization). The findings of this review clearly show that data mining techniques have been applied most extensively to the detection of insurance fraud, although corporate fraud and credit card fraud have also attracted a great deal of attention in recent years. In contrast, we find a distinct lack of research on mortgage fraud, money laundering, and securities and commodities fraud. The main data mining techniques used for FFD are logistic models, neural networks, the Bayesian belief network, and decision trees, all of which provide primary solutions to the problems inherent in the detection and classification of fraudulent data. This paper also addresses the gaps between FFD and the needs of the industry to encourage additional research on neglected topics, and concludes with several suggestions for further FFD research. Crown Copyright (C) 2010 Published by Elsevier B.V. All rights reserved.
Pages: 559-569
Number of pages: 11