Black-Box Testing of Deep Neural Networks through Test Case Diversity

Cited by: 24
Authors
Aghababaeyan, Zohreh [1 ]
Abdellatif, Manel [2 ,3 ]
Briand, Lionel [3 ,4 ]
Ramesh, S. [5 ]
Bagherzadeh, Mojtaba [3 ]
Affiliations
[1] Univ Ottawa, Sch EECS, Ottawa, ON K1N 6N5, Canada
[2] Ecole Technol Super, Software & Informat Technol Engn Dept, Montreal, PQ H3C 1K3, Canada
[3] Univ Ottawa, Sch EECS, Ottawa, ON K1N 6N5, Canada
[4] Univ Luxembourg, SnT Ctr Secur Reliabil & Trust, L-4365 Esch Sur Alzette, Luxembourg
[5] Gen Motors, Dept Res & Dev, Warren, MI 48092 USA
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Measurement; Testing; Feature extraction; Closed box; Fault detection; Neurons; Computational modeling; Coverage; deep neural network; diversity; faults; test; CLASSIFICATION; EFFICIENT; DISTANCE
DOI
10.1109/TSE.2023.3243522
Chinese Library Classification
TP31 [Computer Software]
Discipline Code
081202; 0835
Abstract
Deep Neural Networks (DNNs) have been extensively used in many areas including image processing, medical diagnostics and autonomous driving. However, DNNs can exhibit erroneous behaviours that may lead to critical errors, especially when used in safety-critical systems. Inspired by testing techniques for traditional software systems, researchers have proposed neuron coverage criteria, as an analogy to source code coverage, to guide the testing of DNNs. Despite very active research on DNN coverage, several recent studies have questioned the usefulness of such criteria in guiding DNN testing. Further, from a practical standpoint, these criteria are white-box as they require access to the internals or training data of DNNs, which is often not feasible or convenient. Measuring such coverage requires executing DNNs with candidate inputs to guide testing, which is not an option in many practical contexts. In this paper, we investigate diversity metrics as an alternative to white-box coverage criteria. For the previously mentioned reasons, we require such metrics to be black-box and not rely on the execution and outputs of DNNs under test. To this end, we first select and adapt three diversity metrics and study, in a controlled manner, their capacity to measure actual diversity in input sets. We then analyze their statistical association with fault detection using four datasets and five DNNs. We further compare diversity with state-of-the-art white-box coverage criteria. As a mechanism to enable such analysis, we also propose a novel way to estimate fault detection in DNNs. Our experiments show that relying on the diversity of image features embedded in test input sets is a more reliable indicator than coverage criteria to effectively guide DNN testing. Indeed, we found that one of our selected black-box diversity metrics far outperforms existing coverage criteria in terms of fault-revealing capability and computational time. Results also confirm the suspicions that state-of-the-art coverage criteria are not adequate to guide the construction of test input sets to detect as many faults as possible using natural inputs.
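The abstract describes measuring the diversity of image features embedded in a test input set, without executing the DNN under test. As an illustration only (the function name and the log-determinant formulation are assumptions for this sketch, not taken from the record above), a geometric-diversity-style score can be computed from pre-extracted feature vectors, e.g. embeddings from a pretrained CNN:

```python
import numpy as np

def geometric_diversity(features: np.ndarray) -> float:
    """Illustrative black-box diversity score for a test input set.

    `features` is an (n, d) matrix of feature vectors, one per test
    input. Each row is L2-normalized, and the score is the
    log-determinant of the resulting Gram matrix: near-duplicate
    inputs make the matrix close to singular, driving the score down,
    while spread-out inputs keep it near full rank.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    v = features / np.clip(norms, 1e-12, None)
    gram = v @ v.T
    # slogdet is numerically safer than det for near-singular matrices;
    # the small jitter keeps the determinant finite for duplicate rows.
    _, logdet = np.linalg.slogdet(gram + 1e-9 * np.eye(len(v)))
    return float(logdet)

# A spread-out set scores higher than a set of near-duplicates.
rng = np.random.default_rng(0)
diverse = rng.normal(size=(5, 10))
redundant = np.tile(rng.normal(size=(1, 10)), (5, 1)) \
    + 1e-3 * rng.normal(size=(5, 10))
assert geometric_diversity(diverse) > geometric_diversity(redundant)
```

Note that the score depends only on the inputs' features, not on any DNN outputs, which is the black-box property the abstract requires of its metrics.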
Pages: 3182-3204 (23 pages)
Related Papers
50 records
  • [21] Revizor: Testing Black-Box CPUs against Speculation Contracts
    Oleksenko, Oleksii
    Fetzer, Christof
    Kopf, Boris
    Silberstein, Mark
    ASPLOS '22: PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON ARCHITECTURAL SUPPORT FOR PROGRAMMING LANGUAGES AND OPERATING SYSTEMS, 2022, : 226 - 239
  • [22] Black-box Attacks Against Neural Binary Function Detection
    Bundt, Joshua
    Davinroy, Michael
    Agadakos, Ioannis
    Oprea, Alina
    Robertson, William
    PROCEEDINGS OF THE 26TH INTERNATIONAL SYMPOSIUM ON RESEARCH IN ATTACKS, INTRUSIONS AND DEFENSES, RAID 2023, 2023, : 1 - 16
  • [23] Rearranging Pixels is a Powerful Black-Box Attack for RGB and Infrared Deep Learning Models
    Pomponi, Jary
    Dantoni, Daniele
    Nicolosi, Alessandro
    Scardapane, Simone
    IEEE ACCESS, 2023, 11 : 11298 - 11306
  • [24] Analysis of Explainers of Black Box Deep Neural Networks for Computer Vision: A Survey
    Buhrmester, Vanessa
    Muench, David
    Arens, Michael
    MACHINE LEARNING AND KNOWLEDGE EXTRACTION, 2021, 3 (04): : 966 - 989
  • [25] Automatic Discovery of Web Services Based on Dynamic Black-Box Testing
    Park, Youngki
    Jung, Woosung
    Lee, Byungjeong
    Wu, Chisu
    2009 IEEE 33RD INTERNATIONAL COMPUTER SOFTWARE AND APPLICATIONS CONFERENCE, VOLS 1 AND 2, 2009, : 107 - +
  • [26] Ensemble adversarial black-box attacks against deep learning systems
    Hang, Jie
    Han, Keji
    Chen, Hui
    Li, Yun
    PATTERN RECOGNITION, 2020, 101
  • [27] Black-Box Adversarial Attack for Deep Learning Classifiers in IoT Applications
    Singh, Abhijit
    Sikdar, Biplab
    2022 IEEE 8TH WORLD FORUM ON INTERNET OF THINGS, WF-IOT, 2022,
  • [28] Testing for Multiple Faults in Deep Neural Networks
    Moussa, Dina A.
    Hefenbrock, Michael
    Tahoori, Mehdi
    IEEE DESIGN & TEST, 2024, 41 (03) : 47 - 53
  • [29] GZOO: Black-Box Node Injection Attack on Graph Neural Networks via Zeroth-Order Optimization
    Yu, Hao
    Liang, Ke
    Hu, Dayu
    Tu, Wenxuan
    Ma, Chuan
    Zhou, Sihang
    Liu, Xinwang
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2025, 37 (01) : 319 - 333
  • [30] Creating a Self-Service DevOps Platform for Black-Box Testing on Kubernetes
    Golis, Tomas
    Dakic, Pavle
    PROCEEDINGS OF NINTH INTERNATIONAL CONGRESS ON INFORMATION AND COMMUNICATION TECHNOLOGY, VOL 8, ICICT 2024, 2024, 1004 : 345 - 355