Detecting Anomalies in Financial Data Using Machine Learning Algorithms

Cited by: 25
Authors
Bakumenko, Alexander [1]
Elragal, Ahmed [1]
Affiliations
[1] Lulea Univ Technol, Dept Comp Sci Elect & Space Engn, SE-97187 Lulea, Sweden
Keywords
general ledger; accounting; auditing; anomaly detection; machine learning
DOI
10.3390/systems10050130
Chinese Library Classification number
C [Social Sciences, General]
Discipline classification code
03; 0303
Abstract
Bookkeeping data free of fraud and errors are a cornerstone of legitimate business operations. The highly complex and laborious work of financial auditors calls for new solutions and algorithms to ensure the correctness of financial statements. Both supervised and unsupervised machine learning (ML) techniques are now being successfully applied to detect fraud and anomalies in data. In accounting, detecting financial misstatements deemed anomalous in general ledger (GL) data is a long-established problem. Widely used techniques such as random sampling and manual assessment of bookkeeping rules are becoming challenging and unreliable due to increasing data volumes and unknown fraudulent patterns. To address the sampling risk and financial audit inefficiency, we applied seven supervised ML techniques, including deep learning, and two unsupervised ML techniques, isolation forest and autoencoders. We trained and evaluated our models on a real-life GL dataset and used data vectorization to resolve journal entry size variability. The evaluation results showed that the best trained supervised and unsupervised models have high potential in detecting predefined anomaly types as well as in efficiently sampling data to discern higher-risk journal entries. Based on our findings, we discussed possible practical implications of the resulting solutions in the accounting and auditing contexts.
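The abstract mentions data vectorization as the way journal entries of different sizes are brought to a common shape. The record does not describe the authors' encoding, so the following is only a minimal sketch of one plausible scheme: each entry, whatever its number of lines, is mapped to a fixed-length vector with one debit slot and one credit slot per account. The account numbers, the `ACCOUNTS` list, the sign convention, and the `vectorize_entry` helper are all hypothetical and chosen purely for illustration.

```python
# Hypothetical, tiny chart of accounts (assumption, not taken from the paper).
ACCOUNTS = ["1510", "1930", "2440", "2641", "4010"]


def vectorize_entry(lines, accounts=ACCOUNTS):
    """Map a journal entry with any number of lines to a fixed-length vector.

    `lines` is a list of (account, amount) pairs. By assumption, a positive
    amount is a debit and a negative amount is a credit. The result has two
    slots per account: [total debit, total credit].
    """
    index = {account: i for i, account in enumerate(accounts)}
    vector = [0.0] * (2 * len(accounts))
    for account, amount in lines:
        slot = 2 * index[account]
        if amount >= 0:
            vector[slot] += amount        # debit slot
        else:
            vector[slot + 1] += -amount   # credit slot
    return vector


# Two illustrative entries of different sizes (3 lines and 5 lines).
entry_small = [("4010", 1000.0), ("2641", 250.0), ("1930", -1250.0)]
entry_large = [
    ("1510", 500.0),
    ("1510", 300.0),
    ("2440", -200.0),
    ("1930", -400.0),
    ("1930", -200.0),
]

vec_small = vectorize_entry(entry_small)
vec_large = vectorize_entry(entry_large)

# Both vectors have the same length, so they can be stacked into one matrix
# and passed to a supervised or unsupervised model.
print(len(vec_small), len(vec_large))
```

With a fixed-length representation like this, entries of any size can be fed to models that expect uniform input, such as the isolation forest and autoencoders named in the abstract. A real general ledger would involve far more accounts and likely additional features.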
Pages: 29