Aries: Efficient Testing of Deep Neural Networks via Labeling-Free Accuracy Estimation

Cited by: 7
Authors
Hu, Qiang [1 ]
Guo, Yuejun [2 ]
Xie, Xiaofei [3 ]
Cordy, Maxime [1 ]
Papadakis, Mike [1 ]
Ma, Lei [4 ,5 ]
Le Traon, Yves [1 ]
Affiliations
[1] Univ Luxembourg, Luxembourg, Luxembourg
[2] Luxembourg Inst Sci & Technol, Luxembourg, Luxembourg
[3] Singapore Management Univ, Singapore, Singapore
[4] Univ Alberta, Edmonton, AB, Canada
[5] Univ Tokyo, Tokyo, Japan
Source
2023 IEEE/ACM 45TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ICSE | 2023
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
deep learning testing; performance estimation; distribution shift;
DOI
10.1109/ICSE48619.2023.00152
Chinese Library Classification
TP31 [Computer Software];
Discipline Code
081202; 0835;
Abstract
Deep learning (DL) plays an increasingly important role in daily life due to its competitive performance in industrial application domains. As the core of DL-enabled systems, deep neural networks (DNNs) need to be carefully evaluated to ensure that the produced models meet the expected requirements. In practice, the de facto industry standard for assessing the quality of DNNs is to check their performance (accuracy) on a collected set of labeled test data. However, preparing such labeled data is often difficult because labeling is labor-intensive, especially given the massive amount of new unlabeled data that arrives every day. Recent studies show that test selection for DNNs is a promising direction to tackle this issue by selecting a minimal set of representative data to label and using these data to assess the model. However, it still requires human effort and cannot be fully automated. In this paper, we propose a novel technique, named Aries, that estimates the performance of DNNs on new unlabeled data using only the information obtained from the original test data. The key insight behind our technique is that a model should have similar prediction accuracy on data that lie at similar distances from its decision boundary. We performed a large-scale evaluation of our technique on two widely used datasets, CIFAR-10 and Tiny-ImageNet, four widely studied DNN models including ResNet-101 and DenseNet-121, and 13 types of data transformation methods. Results show that the accuracy estimated by Aries is only 0.03% to 2.60% off the true accuracy. In addition, Aries outperforms the state-of-the-art labeling-free methods in 50 out of 52 cases and selection-labeling-based methods in 96 out of 128 cases.
Pages: 1776 - 1787
Number of pages: 12
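The abstract states only Aries' key insight (inputs at similar distances from the decision boundary should be classified with similar accuracy), not its implementation. The sketch below is a minimal illustration of that idea, not the authors' actual procedure: it assumes the gap between the top-1 and top-2 softmax probabilities can stand in for distance to the decision boundary, bins the labeled original test set by that margin, and re-weights the per-bin accuracies by how the unlabeled data populate the same bins. The function names (`margin`, `estimate_accuracy`) and the margin proxy are hypothetical choices for illustration only.

```python
import numpy as np


def margin(probs):
    """Gap between the top-1 and top-2 softmax probabilities.

    Used here as a crude proxy for distance to the decision boundary:
    a small margin means the input sits close to the boundary.
    """
    top2 = np.sort(probs, axis=1)[:, -2:]
    return top2[:, 1] - top2[:, 0]


def estimate_accuracy(probs_labeled, y_labeled, probs_unlabeled, n_bins=10):
    """Labeling-free accuracy estimate in the spirit of the paper's insight.

    1. Bin the labeled (original) test set by margin and record the
       accuracy observed in each bin.
    2. Bin the unlabeled data with the same bin edges.
    3. Combine the per-bin accuracies, weighted by how the unlabeled
       data are distributed over the bins.
    """
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx_lab = np.clip(np.digitize(margin(probs_labeled), edges) - 1, 0, n_bins - 1)
    idx_unl = np.clip(np.digitize(margin(probs_unlabeled), edges) - 1, 0, n_bins - 1)
    correct = probs_labeled.argmax(axis=1) == y_labeled

    estimate = 0.0
    for b in range(n_bins):
        weight = np.mean(idx_unl == b)  # share of unlabeled data in bin b
        if weight == 0.0:
            continue
        in_bin = idx_lab == b
        # Fall back to overall labeled accuracy if no labeled sample lands in bin b.
        bin_acc = correct[in_bin].mean() if in_bin.any() else correct.mean()
        estimate += weight * bin_acc
    return estimate


if __name__ == "__main__":
    rng = np.random.default_rng(0)

    def fake_softmax(n, k=10):
        # Toy stand-in for model softmax outputs on a batch of inputs.
        logits = rng.normal(size=(n, k))
        e = np.exp(logits - logits.max(axis=1, keepdims=True))
        return e / e.sum(axis=1, keepdims=True)

    probs_lab = fake_softmax(1000)
    y_lab = rng.integers(0, 10, size=1000)
    probs_unl = fake_softmax(500)
    print(f"Estimated accuracy: {estimate_accuracy(probs_lab, y_lab, probs_unl):.3f}")
```

Under the paper's insight, the per-bin accuracies measured on the original labeled test set are assumed to transfer to the shifted unlabeled set, so the weighted sum approximates the unknown accuracy without labeling any new data.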