Predicting tax fraud using supervised machine learning approach

Cited by: 2
Authors
Murorunkwere, Belle Fille [1]
Haughton, Dominique [2]
Nzabanita, Joseph [3]
Kipkogei, Francis [4]
Kabano, Ignace [5]
Affiliations
[1] Univ Rwanda, African Ctr Excellence Data Sci, Rwanda Revenue Author, Kigali, Rwanda
[2] Univ Toulouse TSE R 1, Univ Paris 1 SAMM, Toulouse, France
[3] Univ Rwanda, Coll Sci & Technol, Sch Sci, Kigali, Rwanda
[4] Stepwise Inc, Zalda, Nairobi, Kenya
[5] Univ Rwanda, Coll Business & Econ, African Ctr Excellence Data Sci, Kigali, Rwanda
Keywords
tax fraud; fraud detection; features importance; supervised machine-learning models; evaluation metrics
DOI
10.1080/20421338.2023.2187930
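Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification codes
07; 0710; 09
Abstract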
With advances in technology, the tax base in Rwanda has broadened, and as a result, tax fraud is growing. Depending on the dataset used, fraud detection experts and researchers have used different methods to identify questionable cases. This paper aims to predict features of tax fraud using the most robust supervised machine-learning model. The research provides a context in which a fraud expert can use a machine-learning model, with the implemented model offering instant feedback to the expert. We evaluate supervised machine-learning models such as Artificial Neural Network, Logistic Regression, Decision Tree, Random Forest, GaussianNB and XGBoost. Based on different evaluation metrics, the Artificial Neural Network was the most robust model for predicting tax fraud. Findings reveal that the following are associated with greater susceptibility to tax fraud: the time of business (the time elapsed between when a business started and when it was audited), domestic businesses, taxpayers who import and export goods, those with no losses, those whose businesses are located in the Eastern Province, and those registered for withholding and Value Added Tax types. This study is among the few to evaluate the effectiveness of multiple supervised machine-learning models for identifying tax fraud factors on an accurate data set with numerous tax types. The evidence generated in the current study will serve as a valuable tool for both tax policymakers and auditors, and will help raise awareness of more robust methods for predicting tax fraud.
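The abstract ranks models "based on different evaluation metrics" without naming them in this record. As a rough illustration only, the sketch below computes four common binary-classification metrics (accuracy, precision, recall, F1) and picks the model with the highest F1. The choice of metrics, the selection rule, the names `binary_metrics`, `model_a` and `model_b`, and the toy labels are all assumptions made for this example; none of them come from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> BinaryMetrics:
    """Compute metrics treating label 1 (fraud) as the positive class."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    # Count the four cells of the confusion matrix.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return BinaryMetrics(accuracy, precision, recall, f1)


# Toy, made-up labels (1 = fraud, 0 = no fraud); NOT data from the paper.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]

# Hypothetical predictions from two unnamed models.
model_predictions = {
    "model_a": [1, 0, 1, 0, 0, 0, 1, 1],
    "model_b": [1, 1, 1, 1, 0, 1, 1, 0],
}

results = {name: binary_metrics(y_true, pred) for name, pred in model_predictions.items()}

# Assumed selection rule: prefer the model with the highest F1 score.
best = max(results, key=lambda name: results[name].f1)

for name, m in results.items():
    print(f"{name}: acc={m.accuracy:.3f} prec={m.precision:.3f} "
          f"rec={m.recall:.3f} f1={m.f1:.3f}")
print("best by F1:", best)
```

On these toy labels both hypothetical models have the same accuracy but differ in precision and recall, which is one reason a comparison might look at several metrics rather than accuracy alone.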
Pages: 731-742
Number of pages: 12
Related papers
50 in total
  • [21] Fraud detection in publicly traded U.S firms using Beetle Antennae Search: A machine learning approach
    Khan, Ameer Tamoor
    Cao, Xinwei
    Li, Shuai
    Katsikis, Vasilios N.
    Brajevic, Ivona
    Stanimirovic, Predrag S.
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2022, 191
  • [22] Fraud detection with machine learning: model comparison
    Pacheco J.
    Chela J.
    Salomé G.
    [J]. International Journal of Business Intelligence and Data Mining, 2023, 22 (04) : 434 - 450
  • [23] Combining unsupervised and supervised learning in credit card fraud detection
    Carcillo, Fabrizio
    Le Borgne, Yann-Ael
    Caelen, Olivier
    Kessaci, Yacine
    Oble, Frederic
    Bontempi, Gianluca
    [J]. INFORMATION SCIENCES, 2021, 557 : 317 - 331
  • [24] An application of supervised and unsupervised learning approaches to telecommunications fraud detection
    Hilas, Constantinos S.
    Mastorocostas, Paris As.
    [J]. KNOWLEDGE-BASED SYSTEMS, 2008, 21 (07) : 721 - 726
  • [25] Real-time Credit Card Fraud Detection Using Machine Learning
    Thennakoon, Anuruddha
    Bhagyani, Chee
    Premadasa, Sasitha
    Mihiranga, Shalitha
    Kuruwitaarachchi, Nuwan
    [J]. 2019 9TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING, DATA SCIENCE & ENGINEERING (CONFLUENCE 2019), 2019, : 488 - 493
  • [26] Credit Card Fraud Detection Using a New Hybrid Machine Learning Architecture
    Malik, Esraa Faisal
    Khaw, Khai Wah
    Belaton, Bahari
    Wong, Wai Peng
    Chew, XinYing
    [J]. MATHEMATICS, 2022, 10 (09)
  • [27] A HYBRID SEMI-SUPERVISED APPROACH FOR FINANCIAL FRAUD DETECTION
    Liu, Jin-Miao
    Tian, Jiang
    Cai, Zhu-Xi
    Zhou, Yue
    Luo, Ren-Hua
    Wang, Ran-Ran
    [J]. PROCEEDINGS OF 2017 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS (ICMLC), VOL 1, 2017, : 217 - 222
  • [28] Automobile Insurance Fraud Detection using Supervised Classifiers
    Prasasti, Iffa Maula Nur
    Dhini, Arian
    Laoh, Enrico
    [J]. 2020 5TH INTERNATIONAL WORKSHOP ON BIG DATA AND INFORMATION SECURITY (IWBIS 2020), 2020, : 49 - 53
  • [29] Supervised machine learning for understanding and predicting the status of bistable eukaryotic plankton community in urbanized rivers
    Shang, Jiahui
    Li, Yi
    Zhang, Wenlong
    Ma, Xin
    Yin, Haojie
    Niu, Lihua
    Wang, Longfei
    Zheng, Jinhai
    [J]. WATER RESEARCH, 2024, 266
  • [30] Credit Card Fraud Detection Using State-of-the-Art Machine Learning and Deep Learning Algorithms
    Alarfaj, Fawaz Khaled
    Malik, Iqra
    Khan, Hikmat Ullah
    Almusallam, Naif
    Ramzan, Muhammad
    Ahmed, Muzamil
    [J]. IEEE ACCESS, 2022, 10 : 39700 - 39715