Personal Loan Fraud Detection Based on Hybrid Supervised and Unsupervised Learning

Cited by: 0
Authors
Wen, Hanlin [1 ]
Huang, Fangming [2 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Dept Artificial Intelligence & Automat, Wuhan, Peoples R China
[2] Chinese Univ Hong Kong, Sch Sci & Engn, Shenzhen, Peoples R China
Source
2020 5TH IEEE INTERNATIONAL CONFERENCE ON BIG DATA ANALYTICS (IEEE ICBDA 2020) | 2020
Keywords
supervised learning; unsupervised learning; Extreme Gradient Boosting; principal component analysis;
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
In recent years, personal consumer lending has grown dramatically, driven by the rapid development of e-services such as e-commerce, e-finance, and mobile payments. The lack of effective grid verification and supervision inevitably leads to large-scale losses from credit loan fraud [1]. Because manually inspecting and verifying the large volume of credit card transactions is difficult, machine learning methods are commonly used to detect fraudulent transactions automatically. This article applies the Extreme Gradient Boosting (XGBoost) model for data mining and analysis, a choice motivated by its strong record in data mining contests. With growing public concern about privacy protection, one problem is how to apply data mining techniques while respecting privacy constraints. Additionally, according to current loan fraud detection studies, some features carry little information or are partly redundant, whereas others hold the critical information, which makes feature engineering harder. To filter out useless information and preserve useful information without knowing the meaning of the data, this paper combines Kernel Principal Component Analysis (Kernel PCA) with the XGBoost algorithm and proposes a new hybrid unsupervised and supervised learning model, KP-XGBoost. We use grid search to avoid over-fitting and compare the performance of XGBoost, P-XGBoost, and other classical machine learning methods. The results show that P-XGBoost outperforms XGBoost in fraud detection, which offers a new perspective on detecting fraudulent behaviour while protecting clients' privacy.
Pages: 339-343
Page count: 5
Related papers
18 in total
[1]   CARDWATCH: A neural network based database mining system for credit card fraud detection [J].
Aleskerov, E ;
Freisleben, B ;
Rao, B .
PROCEEDINGS OF THE IEEE/IAFE 1997 COMPUTATIONAL INTELLIGENCE FOR FINANCIAL ENGINEERING (CIFER), 1997, :220-226
[2]  
[Anonymous], 2007, 2007 INT C SERV SYST, DOI 10.1109/ICSSSM.2007.4280163
[3]   Data mining for credit card fraud: A comparative study [J].
Bhattacharyya, Siddhartha ;
Jha, Sanjeev ;
Tharakunnel, Kurian ;
Westland, J. Christopher .
DECISION SUPPORT SYSTEMS, 2011, 50 (03) :602-613
[4]  
Bolton RJ, 2002, STAT SCI, V17, P235
[5]   A data mining based system for credit-card fraud detection in e-tail [J].
Carneiro, Nuno ;
Figueira, Goncalo ;
Costa, Miguel .
DECISION SUPPORT SYSTEMS, 2017, 95 :91-101
[6]  
Chen T., 2014, INT C HIGH EN PHYS M
[7]   High-dimensional and large-scale anomaly detection using a linear one-class SVM with deep learning [J].
Erfani, Sarah M. ;
Rajasegarar, Sutharshan ;
Karunasekera, Shanika ;
Leckie, Christopher .
PATTERN RECOGNITION, 2016, 58 :121-134
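[8]   Validation method for multi-output models based on kernel principal component analysis [J].
Hu, Jiarui ;
Lyu, Zhenzhou .
JOURNAL OF BEIJING UNIVERSITY OF AERONAUTICS AND ASTRONAUTICS, 2017, (07) :1470-1480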
[9]   A hybrid model for plastic card fraud detection systems [J].
Krivko, M. .
EXPERT SYSTEMS WITH APPLICATIONS, 2010, 37 (08) :6070-6076
[10]  
Mika S., 1999, C ADV NEUR INF PROC