Improving Tax Audit Efficiency Using Machine Learning: The Role of Taxpayer's Network Data in Fraud Detection

Cited by: 20
Authors
Baghdasaryan, Vardan [1]
Davtyan, Hrant [2]
Sarikyan, Arsine [3]
Navasardyan, Zaruhi [3]
Affiliations
[1] Amer Univ Armenia, Yerevan, Armenia
[2] Amer Univ Armenia, Coll Business & Econ, Yerevan, Armenia
[3] Amer Univ Armenia, Ctr Business Res & Dev, Yerevan, Armenia
Keywords
Compendex;
DOI
10.1080/08839514.2021.2012002
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Using the universe of Armenian business taxpayers operating under a standard tax regime, we develop a fraud prediction model based on machine learning tools, with gradient boosting as the primary choice. Having to deal with broadly defined fraud and heterogeneous taxpayers, as well as a relatively small sample, we successfully derive important features from tax returns with a minimum of additional information. Among the important fraud predictors, we identify historical fraud and audits, the share of administrative costs, and external economic activity. We see two main contributions with generalizable practical implications for auditing authorities. First, by focusing on the lift score of the top decile, we demonstrate that even moderately accurate models can improve upon the existing accuracy of rule-based approaches. Second, and more importantly, we demonstrate that the information contained in the supplier and buyer network of the taxpayer can be used whenever important predictors of fraud, such as historical audits and fraud, are not available. This is particularly important for newly established companies, which would otherwise be under-rated in terms of fraud probability.
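
The top-decile lift mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `top_decile_lift` and the toy labels and scores are hypothetical. It assumes the usual definition of lift, namely the fraud rate among the 10% of taxpayers with the highest predicted scores divided by the fraud rate in the whole population.

```python
def top_decile_lift(y_true, scores):
    """Fraud rate in the top 10% of scores divided by the overall fraud rate.

    y_true: list of 0/1 labels (1 = fraud found), hypothetical toy input.
    scores: list of predicted fraud scores, same length as y_true.
    """
    if not y_true or len(y_true) != len(scores):
        raise ValueError("y_true and scores must be non-empty and equally long")
    n = len(y_true)
    k = max(1, n // 10)  # size of the top decile, at least one case
    # Indices of cases ordered from highest to lowest predicted score.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    top_rate = sum(y_true[i] for i in order[:k]) / k
    base_rate = sum(y_true) / n
    if base_rate == 0:
        raise ValueError("lift is undefined when there are no positive cases")
    return top_rate / base_rate


# Toy data (assumed): 20 taxpayers, 4 fraudulent, two of them scored highest.
labels = [1, 1, 1, 1] + [0] * 16
scores = [0.9, 0.8, 0.3, 0.2] + [0.5] * 16
lift = top_decile_lift(labels, scores)
```

A lift above 1 means that auditing only the top-scored decile would uncover fraud at a higher rate than selecting taxpayers at random, which is why the metric suits an authority with limited audit capacity.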
Pages: 23