Towards automated feature engineering for credit card fraud detection using multi-perspective HMMs

Cited by: 61
Authors
Lucas, Yvan [1, 2]
Portier, Pierre-Edouard [1 ]
Laporte, Lea [1 ]
He-Guelton, Liyun [3 ]
Caelen, Olivier [3 ]
Granitzer, Michael [2 ]
Calabretto, Sylvie [1 ]
Affiliations
[1] INSA Lyon, Lyon, France
[2] Univ Passau, Passau, Germany
[3] Worldline Lyon, Lyon, France
Source
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE | 2020 / Volume 102
Keywords
Crime - Feature extraction - Classification (of information) - Data mining - Decision trees;
DOI
10.1016/j.future.2019.08.029
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Machine learning and data mining techniques have been used extensively to detect credit card fraud. However, most studies treat credit card transactions as isolated events rather than as a sequence of transactions. In this work, we model a sequence of credit card transactions from three different perspectives, namely (i) whether the sequence contains a fraud or not, (ii) whether the sequence is obtained by fixing the card-holder or the payment terminal, and (iii) whether it is a sequence of spent amounts or of elapsed times between the current and previous transactions. Combinations of the three binary perspectives give eight sets of sequences from the (training) set of transactions. Each set of sequences is modelled with a Hidden Markov Model (HMM). Each HMM associates a likelihood with a transaction given its sequence of previous transactions. These likelihoods are used as additional features in a Random Forest classifier for fraud detection. Our multiple-perspective HMM-based approach offers automated feature engineering to model temporal correlations, improving the effectiveness of the classification task, and increases the detection of fraudulent transactions when combined with the state-of-the-art expert-based feature engineering strategy for credit card fraud detection. Extending previous work, we show that this approach goes beyond e-commerce transactions and provides robust feature engineering across different datasets, hyperparameters and classifiers. Moreover, we compare strategies for dealing with structural missing values. © 2019 Elsevier B.V. All rights reserved.
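The feature construction described in the abstract can be illustrated with a minimal sketch. It enumerates the eight perspective combinations and computes the log-likelihood of one observation sequence under a given HMM using the scaled forward algorithm. Several things here are assumptions made for illustration and are not taken from the record: the names `PERSPECTIVES`, `perspective_combinations` and `forward_log_likelihood`, the use of discrete (pre-binned) observation symbols, and the toy parameters in the usage note. The abstract does not state the emission model, how the HMMs are trained, or how many previous transactions form a sequence.

```python
import itertools

import numpy as np

# The three binary perspectives named in the abstract (labels are illustrative).
PERSPECTIVES = {
    "label": ("genuine", "fraud"),
    "entity": ("card_holder", "terminal"),
    "signal": ("amount", "elapsed_time"),
}


def perspective_combinations():
    """Return the 2 x 2 x 2 combinations; each would get its own HMM."""
    return list(itertools.product(*PERSPECTIVES.values()))


def forward_log_likelihood(obs, start, trans, emit):
    """Log-likelihood of a discrete observation sequence under an HMM.

    obs   : sequence of integer symbol indices (assumed pre-binned)
    start : (n_states,) initial state probabilities
    trans : (n_states, n_states) transition matrix, rows sum to 1
    emit  : (n_states, n_symbols) emission matrix, rows sum to 1
    """
    # Forward variable for the first observation.
    alpha = start * emit[:, obs[0]]
    log_lik = 0.0
    for symbol in obs[1:]:
        # Rescale at every step to avoid numerical underflow,
        # accumulating the log of the scaling factors.
        scale = alpha.sum()
        log_lik += np.log(scale)
        alpha = ((alpha / scale) @ trans) * emit[:, symbol]
    return log_lik + np.log(alpha.sum())
```

As a hypothetical usage, calling `forward_log_likelihood` once per HMM on the sequence ending at a given transaction would yield eight values that could be appended to that transaction's feature vector before training a classifier such as a Random Forest.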
Pages: 393-402
Number of pages: 10