A synthetic data set to benchmark anti-money laundering methods

Cited by: 7
Authors
Jensen, Rasmus Ingemann Tuffveson [1, 2]
Ferwerda, Joras [3]
Jorgensen, Kristian Sand [2]
Jensen, Erik Rathje [2]
Borg, Martin [2]
Krogh, Morten Persson [2]
Jensen, Jonas Brunholm [2]
Iosifidis, Alexandros [1]
Affiliations
[1] Aarhus Univ, Dept Elect & Comp Engn, DK-8200 Aarhus, Denmark
[2] Spar Nord Bank, DK-9100 Aalborg, Denmark
[3] Univ Utrecht, Sch Econ, NL-3584 EC Utrecht, Netherlands
DOI
10.1038/s41597-023-02569-2
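Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract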
Bank transactions are highly confidential. As a result, there are no real public data sets that can be used to investigate and compare anti-money laundering (AML) methods in banks. This severely limits research on important AML problems such as efficiency, effectiveness, class imbalance, concept drift, and interpretability. To address the issue, we present SynthAML: a synthetic data set to benchmark statistical and machine learning methods for AML. The data set builds on real data from Spar Nord, a systemically important Danish bank, and contains 20,000 AML alerts and over 16 million transactions. Experimental results indicate that performance on SynthAML can be transferred to the real world. As use cases, we present and discuss open problems in the AML literature.
Pages: 10