Machine Learning Applied to Open Government Data for the Detection of Improprieties in the Application of Public Resources

Authors
Vaqueiro, Ramon Dantas [1]
Vargas, Ana Caroline G. [1]
Escovedo, Tatiana [1]
Kalinowski, Marcos [1]
Affiliations
[1] Pontificia Univ Catolica Rio de Janeiro PUC Rio, Rio De Janeiro, RJ, Brazil
Source
PROCEEDINGS OF THE 19TH BRAZILIAN SYMPOSIUM ON INFORMATION SYSTEMS | 2023
Keywords
Public Purchases; Text Mining; Machine Learning;
DOI
10.1145/3592813.3592908
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Context: Making government data publicly available is an important mechanism of transparency and social control. To this end, numerous laws have made the disclosure of government procurement data mandatory. Problem: The large volume of unstructured textual information available on government portals is an obstacle to effective social control, making in-depth analyses of public spending difficult. Solution: Machine learning algorithms are used to perform text mining and to group items acquired by the public administration; public purchases are labeled and similar items are grouped to facilitate the detection of improprieties in government purchases. IS Theory: This work is associated with Computational Learning Theory, which aims to understand the fundamental principles of learning and to design better automated methods. Method: The article is a case study, evaluated with the support of specialists in the field. The results were analyzed using a quantitative approach. Summary of Results: The results observed in the evaluated cases were promising: the clusters produced by the solution were semantically coherent enough to support more complex analyses of government purchases. Contributions and Impact in the IS area: The results show that text mining and machine learning techniques can extract useful information from government purchase data and enable better analyses of public spending.
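The abstract does not name the specific algorithms used, so the following is only a minimal sketch of the general idea of grouping similar purchase descriptions through text mining. It assumes TF-IDF weighting and a greedy cosine-similarity threshold, which are stand-in techniques chosen for illustration and not the authors' method. The function names (`tokenize`, `tfidf_vectors`, `group_items`), the `threshold` value, and the sample item descriptions are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one L2-normalized sparse TF-IDF vector (a dict) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed inverse document frequency, so no term gets zero weight.
        vec = {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0)
               for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity of two already-normalized sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def group_items(docs, threshold=0.3):
    """Greedy grouping (assumed technique): each item joins the first group
    whose first member is at least `threshold` similar, else starts a new one.
    Returns a list of groups, each a list of document indices."""
    vectors = tfidf_vectors(docs)
    groups = []
    for i, vec in enumerate(vectors):
        for group in groups:
            if cosine(vec, vectors[group[0]]) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups


# Hypothetical purchase descriptions, invented for illustration only.
items = [
    "ballpoint pen blue",
    "blue ballpoint pen box",
    "diesel fuel liters",
    "diesel fuel for trucks",
]
print(group_items(items))
```

With groups like these, an analyst could compare unit prices within a group of similar items, which is the kind of follow-up analysis the abstract points to. A real pipeline would likely need language-specific preprocessing and a proper clustering algorithm rather than this greedy pass.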
Pages: 213 - 220
Page count: 8