Comparison of Tree-Based Machine Learning Algorithms to Predict Reporting Behavior of Electronic Billing Machines

Cited by: 7
Authors
Murorunkwere, Belle Fille [1]
Ihirwe, Jean Felicien [2]
Kayijuka, Idrissa [3]
Nzabanita, Joseph [4]
Haughton, Dominique [5, 6, 7]
Affiliations
[1] Univ Rwanda, African Ctr Excellence Data Sci, POB 4285, Kigali, Rwanda
[2] Univ L'Aquila, Dept Informat Engn Comp Sci & Math, I-56121 Pisa, Italy
[3] Univ Rwanda, Dept Appl Stat, POB 4285, Kigali, Rwanda
[4] Univ Rwanda, Coll Sci & Technol, Dept Math, POB 3900, Kigali, Rwanda
[5] Bentley Univ, Dept Math Sci & Global Studies, Waltham, MA 02452 USA
[6] Univ Paris 1 SAMM, Dept Math Sci & Global Studies, F-75634 Paris, France
[7] Univ Toulouse 1 TSE R, Dept Math Sci & Global Studies, F-31042 Toulouse, France
Keywords
tree-based machine learning algorithms; compliance; value added tax; machine learning; electronic billing machines; reporting behavior
DOI
10.3390/info14030140
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Tax fraud is a common problem for many tax administrations, costing billions of dollars. Tax administrations have considered several options to optimize revenue; one of these is the electronic billing machine (EBM), which aims to monitor all business transactions and, as a result, boost value added tax (VAT) revenue and compliance. Most current research has focused on the impact of EBMs on VAT revenue collection and compliance rather than on how EBM reporting behavior influences future compliance. The essential contribution of this study is that it leverages both the historical reporting behavior of EBMs and actual business characteristics to understand and predict their future reporting behavior. Tree-based machine learning algorithms, namely decision trees, random forest, gradient boosting, and XGBoost, are applied, tested, and compared on predictive performance. The results show the robustness of the random forest model relative to the others, with an accuracy of 92.3%. The paper sets out the contribution of our approach relative to existing approaches through well-defined research questions, analysis mechanisms, and constructive discussion. Once applied, we believe that our approach could help the tax-collecting agency conduct timely interventions on EBM compliance, which would support the EBM objective of improving VAT compliance.
Pages: 21
Related papers
35 in total
  • [1] Andrade J.P.A., 2021, P AN 18 ENC NAC INT
  • [2] [Anonymous], 2014, Int. J. Manag. Soc. Sci
  • [3] Finding Evidence of Fraudster Companies in the CEO's Letter to Shareholders with Sentiment Analysis
    Bel, Nuria
    Bracons, Gabriel
    Anderberg, Sophia
    [J]. INFORMATION, 2021, 12 (08)
  • [4] Mobile Money Fraud Prediction-A Cross-Case Analysis on the Efficiency of Support Vector Machines, Gradient Boosted Decision Trees, and Naive Bayes Algorithms
    Botchey, Francis Effirim
    Qin, Zhen
    Hughes-Lartey, Kwesi
    [J]. INFORMATION, 2020, 11 (08)
  • [5] Casey P., 2015, Electronic Fiscal Devices (EFDs) An empirical study of their impact on taxpayer compliance and administrative efficiency. No. 15-73
  • [6] Chege J.M., 2010, THESIS U NAIROBI NAI
  • [7] Cobham A., TAXATION POLICY DEV
  • [8] Cortes C., 2019, ADV NEUR IN
  • [9] Dangeti P., 2017, Statistics for machine learning
  • [10] Extreme Gradient Boosting Machine Learning Algorithm For Safe Auto Insurance Operations
    Dhieb, Najmeddine
    Ghazzai, Hakim
    Besbes, Hichem
    Massoud, Yehia
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE OF VEHICULAR ELECTRONICS AND SAFETY (ICVES 19), 2019