A Time-Frequency Based Suspicious Activity Detection for Anti-Money Laundering

Cited by: 13
Authors
Ketenci, Utku Gorkem [1]
Kurt, Tolga [1]
Onal, Selim [2]
Erbil, Cenk [2]
Akturkoglu, Sinan [2]
Ilhan, Hande Serban [2]
Affiliations
[1] ITU Teknopk, H3M IO, TR-34467 Istanbul, Turkey
[2] Akbank TAS, Sabanci Ctr, TR-34330 Istanbul, Turkey
Keywords
Time-frequency analysis; Data models; Feature extraction; Machine learning; Customer relationship management; Hidden Markov models; Anomaly detection; Anti-money laundering; Compliance; Random forest algorithm; Transaction monitoring
DOI
10.1109/ACCESS.2021.3072114
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Money laundering is the key mechanism criminals use to inject the proceeds of crime into the financial system. Primary responsibility for detecting suspicious activity related to money laundering lies with financial institutions. Most of the current systems in these institutions are rule-based and ineffective (over 90% false positives). The available data science-based anti-money laundering (AML) models intended to replace the existing rule-based systems work on customer relationship management (CRM) features and the time characteristics of transaction behaviour. Because there are thousands of possible account features, customer features, and combinations of them, feature engineering that achieves reasonable accuracy is challenging. To improve the detection performance of suspicious transaction monitoring systems for AML, this article introduces a novel feature set based on time-frequency analysis that uses 2-D representations of financial transactions. Random forest is used as the machine learning method, and simulated annealing is adopted for hyperparameter tuning. The designed algorithm is tested on real banking data, demonstrating its efficacy in practically relevant environments. The time-frequency characteristics are shown to be discriminative features for separating suspicious from non-suspicious entities. These features therefore substantially improve the area under the curve (by over 1%) of existing data science-based transaction monitoring systems. Using time-frequency features alone, a false positive rate of 14.9% is achieved, with an F-score of 59.05%. When they are combined with transaction and CRM features, the false positive rate is 11.85% and the F-score improves to 74.06%.
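As a rough illustration of what a 2-D time-frequency representation of transactions can look like, the sketch below computes a short-time Fourier transform over a series of daily transaction totals and reduces it to a few summary features. The abstract does not specify the transform, window length, or features the authors use, so everything here (the Hann window, the `window` and `hop` values, the `non_dc_share` feature, and the two sample accounts) is an assumption made for illustration, not the paper's method. In a full pipeline, features of this kind would be passed to a classifier such as a random forest.

```python
import cmath
import math


def stft_magnitude(series, window=8, hop=4):
    """Return a 2-D list: rows are time frames, columns are frequency bins."""
    frames = []
    for start in range(0, len(series) - window + 1, hop):
        segment = series[start:start + window]
        # Hann taper (assumed choice) to reduce leakage at the frame edges.
        tapered = [
            x * 0.5 * (1 - math.cos(2 * math.pi * n / (window - 1)))
            for n, x in enumerate(segment)
        ]
        row = []
        # Only non-negative frequencies are needed for a real-valued input.
        for k in range(window // 2 + 1):
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * n / window)
                for n, x in enumerate(tapered)
            )
            row.append(abs(coeff))
        frames.append(row)
    return frames


def time_frequency_features(series, window=8, hop=4):
    """Summarise the spectrogram into a small, hypothetical feature set."""
    spec = stft_magnitude(series, window, hop)
    n_bins = len(spec[0])
    # Average magnitude of each frequency bin across all time frames.
    mean_per_bin = [
        sum(row[k] for row in spec) / len(spec) for k in range(n_bins)
    ]
    total = sum(mean_per_bin) or 1.0
    # Fraction of average magnitude that lies outside the zero-frequency bin.
    non_dc_share = sum(mean_per_bin[1:]) / total
    return {
        "n_frames": len(spec),
        "mean_per_bin": mean_per_bin,
        "non_dc_share": non_dc_share,
    }


# Hypothetical daily transaction totals for two accounts (made-up values).
steady = [100.0] * 32                 # same amount every day
bursty = [0.0, 0.0, 0.0, 400.0] * 8   # one large transfer every fourth day

steady_features = time_frequency_features(steady)
bursty_features = time_frequency_features(bursty)
```

Both sample accounts move the same average amount per day, so an amount-based summary alone would not separate them, while the share of spectral magnitude outside the zero-frequency bin is higher for the bursty account.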
Pages: 59957-59967
Page count: 11