Black-Box Testing of Deep Neural Networks through Test Case Diversity

Cited by: 24
Authors
Aghababaeyan, Zohreh [1 ]
Abdellatif, Manel [2 ,3 ]
Briand, Lionel [3 ,4 ]
Ramesh, S. [5 ]
Bagherzadeh, Mojtaba [3 ]
Affiliations
[1] Univ Ottawa, Sch EECS, Ottawa, ON K1N 6N5, Canada
[2] Ecole Technol Super, Software & Informat Technol Engn Dept, Montreal, PQ H3C 1K3, Canada
[3] Univ Ottawa, Sch EECS, Ottawa, ON K1N 6N5, Canada
[4] Univ Luxembourg, SnT Ctr Secur Reliabil & Trust, L-4365 Esch Sur Alzette, Luxembourg
[5] Gen Motors, Dept Res & Dev, Warren, MI 48092 USA
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Measurement; Testing; Feature extraction; Closed box; Fault detection; Neurons; Computational modeling; Coverage; deep neural network; diversity; faults; test; CLASSIFICATION; EFFICIENT; DISTANCE
DOI
10.1109/TSE.2023.3243522
Chinese Library Classification
TP31 [Computer Software]
Discipline Code
081202; 0835
Abstract
Deep Neural Networks (DNNs) have been extensively used in many areas including image processing, medical diagnostics and autonomous driving. However, DNNs can exhibit erroneous behaviours that may lead to critical errors, especially when used in safety-critical systems. Inspired by testing techniques for traditional software systems, researchers have proposed neuron coverage criteria, as an analogy to source code coverage, to guide the testing of DNNs. Despite very active research on DNN coverage, several recent studies have questioned the usefulness of such criteria in guiding DNN testing. Further, from a practical standpoint, these criteria are white-box as they require access to the internals or training data of DNNs, which is often not feasible or convenient. Measuring such coverage requires executing DNNs with candidate inputs to guide testing, which is not an option in many practical contexts. In this paper, we investigate diversity metrics as an alternative to white-box coverage criteria. For the previously mentioned reasons, we require such metrics to be black-box and not rely on the execution and outputs of DNNs under test. To this end, we first select and adapt three diversity metrics and study, in a controlled manner, their capacity to measure actual diversity in input sets. We then analyze their statistical association with fault detection using four datasets and five DNNs. We further compare diversity with state-of-the-art white-box coverage criteria. As a mechanism to enable such analysis, we also propose a novel way to estimate fault detection in DNNs. Our experiments show that relying on the diversity of image features embedded in test input sets is a more reliable indicator than coverage criteria to effectively guide DNN testing. Indeed, we found that one of our selected black-box diversity metrics far outperforms existing coverage criteria in terms of fault-revealing capability and computational time. Results also confirm the suspicions that state-of-the-art coverage criteria are not adequate to guide the construction of test input sets to detect as many faults as possible using natural inputs.
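The abstract describes measuring the diversity of image features embedded in a test input set, without executing the DNN under test. As an illustration only (the function name and the log-determinant formulation are assumptions for this sketch, not taken from the record above), a geometric-diversity-style score can be computed from pre-extracted feature vectors, e.g. embeddings from a pretrained CNN:

```python
import numpy as np

def geometric_diversity(features: np.ndarray) -> float:
    """Illustrative black-box diversity score for a test input set.

    `features` is an (n, d) matrix of feature vectors, one per test
    input. Each row is L2-normalized, and the score is the
    log-determinant of the resulting Gram matrix: near-duplicate
    inputs make the matrix close to singular, driving the score down,
    while spread-out inputs keep it near full rank.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    v = features / np.clip(norms, 1e-12, None)
    gram = v @ v.T
    # slogdet is numerically safer than det for near-singular matrices;
    # the small jitter keeps the determinant finite for duplicate rows.
    _, logdet = np.linalg.slogdet(gram + 1e-9 * np.eye(len(v)))
    return float(logdet)

# A spread-out set scores higher than a set of near-duplicates.
rng = np.random.default_rng(0)
diverse = rng.normal(size=(5, 10))
redundant = np.tile(rng.normal(size=(1, 10)), (5, 1)) \
    + 1e-3 * rng.normal(size=(5, 10))
assert geometric_diversity(diverse) > geometric_diversity(redundant)
```

Note that the score depends only on the inputs' features, not on any DNN outputs, which is the black-box property the abstract requires of its metrics.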
Pages: 3182-3204 (23 pages)
Related Papers
50 records
  • [21] Revizor: Testing Black-Box CPUs against Speculation Contracts
    Oleksenko, Oleksii
    Fetzer, Christof
    Kopf, Boris
    Silberstein, Mark
    ASPLOS '22: PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON ARCHITECTURAL SUPPORT FOR PROGRAMMING LANGUAGES AND OPERATING SYSTEMS, 2022, : 226 - 239
  • [22] Black-box Attacks Against Neural Binary Function Detection
    Bundt, Joshua
    Davinroy, Michael
    Agadakos, Ioannis
    Oprea, Alina
    Robertson, William
    PROCEEDINGS OF THE 26TH INTERNATIONAL SYMPOSIUM ON RESEARCH IN ATTACKS, INTRUSIONS AND DEFENSES, RAID 2023, 2023, : 1 - 16
  • [23] Rearranging Pixels is a Powerful Black-Box Attack for RGB and Infrared Deep Learning Models
    Pomponi, Jary
    Dantoni, Daniele
    Nicolosi, Alessandro
    Scardapane, Simone
    IEEE ACCESS, 2023, 11 : 11298 - 11306
  • [24] Analysis of Explainers of Black Box Deep Neural Networks for Computer Vision: A Survey
    Buhrmester, Vanessa
    Muench, David
    Arens, Michael
    MACHINE LEARNING AND KNOWLEDGE EXTRACTION, 2021, 3 (04): : 966 - 989
  • [25] Automatic Discovery of Web Services Based on Dynamic Black-Box Testing
    Park, Youngki
    Jung, Woosung
    Lee, Byungjeong
    Wu, Chisu
    2009 IEEE 33RD INTERNATIONAL COMPUTER SOFTWARE AND APPLICATIONS CONFERENCE, VOLS 1 AND 2, 2009, : 107 - +
  • [26] Ensemble adversarial black-box attacks against deep learning systems
    Hang, Jie
    Han, Keji
    Chen, Hui
    Li, Yun
    PATTERN RECOGNITION, 2020, 101
  • [27] Black-Box Adversarial Attack for Deep Learning Classifiers in IoT Applications
    Singh, Abhijit
    Sikdar, Biplab
    2022 IEEE 8TH WORLD FORUM ON INTERNET OF THINGS, WF-IOT, 2022,
  • [28] Testing for Multiple Faults in Deep Neural Networks
    Moussa, Dina A.
    Hefenbrock, Michael
    Tahoori, Mehdi
    IEEE DESIGN & TEST, 2024, 41 (03) : 47 - 53
  • [29] GZOO: Black-Box Node Injection Attack on Graph Neural Networks via Zeroth-Order Optimization
    Yu, Hao
    Liang, Ke
    Hu, Dayu
    Tu, Wenxuan
    Ma, Chuan
    Zhou, Sihang
    Liu, Xinwang
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2025, 37 (01) : 319 - 333
  • [30] Creating a Self-Service DevOps Platform for Black-Box Testing on Kubernetes
    Golis, Tomas
    Dakic, Pavle
    PROCEEDINGS OF NINTH INTERNATIONAL CONGRESS ON INFORMATION AND COMMUNICATION TECHNOLOGY, VOL 8, ICICT 2024, 2024, 1004 : 345 - 355