Logging requirement for continuous auditing of responsible machine learning-based applications

Cited by: 0

Authors:

Patrick Loic Foalem [1]

Leuson Da Silva [1]

Foutse Khomh [1]

Heng Li [1]

Ettore Merlo [1]

Affiliations:

[1] Department of Computer Engineering and Software Engineering, Polytechnique Montreal, Montreal, QC

Source:

Empirical Software Engineering | 2025 / Volume 30 / Issue 4

Keywords:

Accountability; Auditing; Empirical; Fairness; GitHub repository; Logging; Machine learning; Responsible ML; Transparency

DOI:

10.1007/s10664-025-10656-8

Abstract:

Machine learning (ML) is increasingly used across various industries to automate decision-making processes. However, concerns about the ethical and legal compliance of ML models have arisen due to their lack of transparency, fairness, and accountability. Monitoring, particularly through logging, is a widely used technique in traditional software systems that could be leveraged to assist in auditing ML-based applications. Logs provide a record of an application’s behavior, which can be used for continuous auditing, debugging, and analyzing both the behavior and performance of the application. In this study, we investigate the logging practices of ML practitioners to capture responsible ML-related information in ML applications. We analyzed 85 ML projects hosted on GitHub, leveraging 20 responsible ML libraries that span principles such as privacy, transparency & explainability, fairness, and security & safety. Our analysis revealed important differences in the implementation of responsible AI principles. For example, out of 5,733 function calls analyzed, privacy accounted for 89.3% (5,120 calls), while fairness represented only 2.1% (118 calls), highlighting the uneven emphasis on these principles across projects. Furthermore, our manual analysis of 44,877 issue discussions revealed that only 8.1% of the sampled issues addressed responsible AI principles, with transparency & explainability being the most frequently discussed principle (32.2% of all issues related to responsible AI principles). Additionally, a survey conducted with ML practitioners provided direct insights into their perspectives, informing our exploration of ways to enhance logging practices for more effective, responsible ML auditing. We discovered that while privacy, model interpretability & explainability, fairness, and security & safety are commonly considered, there is a gap in how metrics associated with these principles are logged. Specifically, crucial fairness metrics like group and individual fairness, privacy metrics such as epsilon and delta, and explainability metrics like SHAP values are not considered in current logging practices. The insights from this study highlight the need for ML practitioners and logging tool developers to adopt enhanced logging strategies that incorporate a broader range of responsible AI metrics. This adjustment will facilitate the development of auditable and ethically responsible ML applications, ensuring they meet emerging regulatory and societal expectations. These specific insights offer actionable guidance for improving the accountability and trustworthiness of ML systems. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025.
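
As an illustration of the kind of logging the abstract calls for, the following is a minimal sketch, not the authors' method. It assumes a binary classifier, uses demographic parity difference as one possible group fairness metric, takes epsilon and delta as values already reported by a differentially private training run, and takes per-feature attribution values (for example SHAP values) as precomputed inputs. All function names, the logger name, and the record fields are hypothetical.

```python
import json
import logging

# Hypothetical logger name; any application logger would do.
logger = logging.getLogger("responsible_ml_audit")


def demographic_parity_difference(predictions, groups):
    """Gap between the highest and lowest positive-prediction rate across groups.

    `predictions` holds 0/1 labels and `groups` the sensitive attribute value
    for each prediction. This is one group fairness metric among many.
    """
    rates = {}
    for group in set(groups):
        selected = [p for p, g in zip(predictions, groups) if g == group]
        rates[group] = sum(selected) / len(selected)
    return max(rates.values()) - min(rates.values())


def build_audit_record(predictions, groups, epsilon, delta, attributions):
    """Assemble one JSON-serializable audit record.

    `epsilon` and `delta` are assumed to come from the privacy accountant of
    the training run. `attributions` maps a feature name to a list of
    per-sample attribution values (e.g. SHAP values computed elsewhere).
    """
    mean_abs = {
        name: sum(abs(v) for v in values) / len(values)
        for name, values in attributions.items()
    }
    return {
        "fairness": {
            "demographic_parity_difference":
                demographic_parity_difference(predictions, groups),
        },
        "privacy": {"epsilon": epsilon, "delta": delta},
        "explainability": {"mean_abs_attribution": mean_abs},
    }


def log_audit_record(record):
    """Emit the record as a single structured log line and return that line."""
    line = json.dumps(record, sort_keys=True)
    logger.info(line)
    return line


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    record = build_audit_record(
        predictions=[1, 0, 1, 1],
        groups=["a", "a", "b", "b"],
        epsilon=1.0,
        delta=1e-5,
        attributions={"age": [0.25, -0.75]},
    )
    log_audit_record(record)
```

Writing each record as one JSON line keeps the fairness, privacy, and explainability metrics together, so an auditor can parse the log with standard tooling instead of reconstructing the values from free-text messages.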
