Personal Loan Fraud Detection Based on Hybrid Supervised and Unsupervised Learning

Cited by: 0
Authors
Wen, Hanlin [1]
Huang, Fangming [2]
Affiliations
[1] Huazhong Univ Sci & Technol, Dept Artificial Intelligence & Automat, Wuhan, Peoples R China
[2] Chinese Univ Hong Kong, Sch Sci & Engn, Shenzhen, Peoples R China
Source
2020 5TH IEEE INTERNATIONAL CONFERENCE ON BIG DATA ANALYTICS (IEEE ICBDA 2020) | 2020
Keywords
supervised learning; unsupervised learning; Extreme Gradient Boosting; principal component analysis;
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
In recent years, personal consumer loans have increased dramatically, driven by the rapid development of e-services, including e-commerce, e-finance and mobile payments. The lack of effective verification and supervision inevitably leads to large-scale losses caused by credit loan fraud [1]. Because manually inspecting and verifying the large volume of credit card transactions is difficult, machine learning methods are commonly used to detect fraudulent transactions automatically. This article applies the Extreme Gradient Boosting (XGBoost) model for data mining and analysis, motivated by its strong track record in data mining contests. With growing public concern about privacy protection, one problem is how to apply data mining techniques while respecting privacy terms. Additionally, according to current loan fraud detection studies, some features contain little information or are partly redundant, whereas others hold critical information, which makes feature engineering harder. To filter out useless information and preserve useful information without knowing the meaning of the data, this paper combines Principal Component Analysis (PCA) with the XGBoost algorithm and proposes a new hybrid unsupervised and supervised learning model, P-XGBoost. We use grid search to avoid over-fitting and compare the performance of XGBoost, P-XGBoost and other classical machine learning methods. P-XGBoost outperforms XGBoost in fraud detection, which provides a new perspective on detecting fraudulent behaviour while protecting clients' privacy.
Pages: 339-343
Page count: 5
Related papers
18 items in total
[11]   Credit card fraud detection: A fusion approach using Dempster-Shafer theory and Bayesian learning [J].
Panigrahi, Suvasini ;
Kundu, Amlan ;
Sural, Shamik ;
Majumdar, A. K. .
INFORMATION FUSION, 2009, 10 (04) :354-363
[12]   On the communal analysis suspicion scoring for identity crime in streaming credit applications [J].
Phua, Clifton ;
Gayler, Ross ;
Lee, Vincent ;
Smith-Miles, Kate .
EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2009, 195 (02) :595-612
[13]   Real-time credit card fraud detection using computational intelligence [J].
Quah, Jon T. S. ;
Sriganesh, M. .
EXPERT SYSTEMS WITH APPLICATIONS, 2008, 35 (04) :1721-1732
[14]  
Scholkopf B., 1995, KDD
[15]   A systematic analysis of performance measures for classification tasks [J].
Sokolova, Marina ;
Lapalme, Guy .
INFORMATION PROCESSING & MANAGEMENT, 2009, 45 (04) :427-437
[16]   Credit card fraud detection using hidden Markov model [J].
Srivastava, Abhinav ;
Kundu, Amlan ;
Sural, Shamik ;
Majumdar, Arun K. .
IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, 2008, 5 (01) :37-48
[17]   HOBA: A novel feature engineering methodology for credit card fraud detection with a deep learning architecture [J].
Zhang, Xinwei ;
Han, Yaoci ;
Xu, Wei ;
Wang, Qili .
INFORMATION SCIENCES, 2021, 557 :302-316
[18]   Ensemble boosted trees with synthetic features generation in application to bankruptcy prediction [J].
Zieba, Maciej ;
Tomczak, Sebastian K. ;
Tomczak, Jakub M. .
EXPERT SYSTEMS WITH APPLICATIONS, 2016, 58 :93-101