Learning Sequential Behavior Representations for Fraud Detection

Cited by: 27
Authors
Guo, Jia [1]
Liu, Guannan [1]
Zuo, Yuan [1]
Wu, Junjie [1]
Affiliations
[1] Beihang Univ, Sch Econ & Management, Beijing, Peoples R China
Source
2018 IEEE International Conference on Data Mining (ICDM) | 2018
Funding
National Natural Science Foundation of China
Keywords
Fraud Detection; Behavioral Sequence; LSTM; Attention; MODEL;
DOI
10.1109/ICDM.2018.00028
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Fraud detection is often likened to finding a needle in a haystack: it is challenging because fraudulent behaviors are buried among massive volumes of normal behavior. A fraudulent incident usually unfolds over consecutive time steps to gain illegal benefits, so examining a complete behavioral sequence, rather than a single snapshot of behavior, provides unique clues for detecting fraud. Fraudulent behaviors may also involve different parties, so the interaction pattern between sources and targets can help distinguish fraud from normal behavior. In this paper, we therefore model the attributed behavioral sequences generated from consecutive behaviors in order to capture their sequential patterns; sequences that deviate from these patterns can be regarded as fraudulent. Considering the characteristics of behavioral sequences, we propose a novel model, HAInt-LSTM, which augments the traditional LSTM with a modified forget gate that takes into account the time interval between consecutive time steps. Meanwhile, we employ a self-historical attention mechanism to capture long-term dependencies, which helps identify repeated or cyclical appearances. In addition, we encode the source information in an interaction module to enhance the learning of behavioral sequences. To validate the effectiveness of the learned sequential behavior representations, we experiment on a real-world telecommunication dataset under both supervised and unsupervised scenarios. Experimental results show that the learned representations can better identify fraudulent behaviors and are clearly separated from normal sequences in the lower-dimensional embedding space, as shown through visualization. Finally, we visualize the attention weights to provide a rational interpretation of the periodicity of human behavior.
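The abstract names two mechanisms, an interval-aware forget gate and attention over the sequence's own history, without giving their equations. The NumPy sketch below is only an illustration of how such components might fit together, not the authors' HAInt-LSTM formulation. The additive log-interval term in the forget gate, the dot-product attention, the parameter name `w_dt`, and all function names are assumptions made for this example; the interaction module for source information is omitted.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_params(input_dim, hidden_dim, seed=0):
    """Random parameters for one cell (hypothetical layout, not from the paper)."""
    rng = np.random.default_rng(seed)
    return {
        # Stacked weights for the input, forget, output and candidate blocks.
        "W": rng.normal(0.0, 0.1, size=(4 * hidden_dim, input_dim + hidden_dim)),
        "b": np.zeros(4 * hidden_dim),
        # Assumed per-unit weights that let the time interval shift the forget gate.
        "w_dt": rng.normal(0.0, 0.1, size=hidden_dim),
    }


def interval_lstm_step(params, x, dt, h_prev, c_prev):
    """One LSTM step whose forget gate also sees the elapsed time dt."""
    n = h_prev.shape[0]
    z = params["W"] @ np.concatenate([x, h_prev]) + params["b"]
    i = sigmoid(z[:n])
    # Assumption: the log-scaled interval enters the forget gate additively.
    f = sigmoid(z[n:2 * n] + params["w_dt"] * np.log1p(dt))
    o = sigmoid(z[2 * n:3 * n])
    g = np.tanh(z[3 * n:])
    c = f * c_prev + i * g
    h = o * np.tanh(c)
    return h, c


def attend_history(h, history):
    """Dot-product attention of the current state over earlier hidden states."""
    if not history:
        return h
    past = np.stack(history)              # shape: (steps so far, hidden_dim)
    scores = past @ h
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()              # softmax over earlier time steps
    return weights @ past


def encode_sequence(params, xs, dts, hidden_dim):
    """Encode a behavioral sequence as [last hidden state, attention context]."""
    h = np.zeros(hidden_dim)
    c = np.zeros(hidden_dim)
    context = np.zeros(hidden_dim)
    history = []
    for x, dt in zip(xs, dts):
        h, c = interval_lstm_step(params, x, dt, h, c)
        context = attend_history(h, history)
        history.append(h)
    return np.concatenate([h, context])
```

A call such as `encode_sequence(init_params(3, 4), xs, dts, 4)`, with `xs` an array of per-step attribute vectors and `dts` the gaps between consecutive steps, yields a fixed-length vector that a downstream classifier or anomaly detector could consume in the supervised or unsupervised settings the abstract mentions.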
Pages: 127-136
Page count: 10