QuoTe: Quality-oriented Testing for Deep Learning Systems

Cited by: 2
Authors
Chen, Jialuo [1]
Wang, Jingyi [1]
Ma, Xingjun [2]
Sun, Youcheng [3]
Sun, Jun [4]
Zhang, Peixin [1]
Cheng, Peng [1]
Affiliations
[1] Zhejiang Univ, Hangzhou 310027, Peoples R China
[2] Fudan Univ, Shanghai 200433, Peoples R China
[3] Univ Manchester, Manchester M13 9PL, Lancs, England
[4] Singapore Management Univ, Singapore 188065, Singapore
Funding
National Key Research and Development Program of China
Keywords
Deep learning; testing; robustness; fairness
DOI
10.1145/3582573
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Recently, there has been significant growth of interest in applying software engineering techniques to the quality assurance of deep learning (DL) systems. One popular direction is DL testing: given a property to test, defects of DL systems are found either by fuzzing or by guided search with the help of certain testing metrics. However, recent studies have revealed that neuron coverage metrics, which are used by most existing DL testing approaches, are not necessarily correlated with model quality (e.g., robustness, the most studied model property), and are also not an effective measure of confidence in model quality after testing. In this work, we address this gap by proposing a novel testing framework called QuoTe (i.e., Quality-oriented Testing). A key part of QuoTe is a quantitative measurement of (1) the value of each test case in enhancing the model property of interest (often via retraining) and (2) the convergence quality of the model property improvement. QuoTe uses the proposed metric to automatically select or generate valuable test cases for improving model quality. The proposed metric is also a lightweight yet strong indicator of how well the improvement has converged. Extensive experiments on both image and tabular datasets with a variety of model architectures confirm the effectiveness and efficiency of QuoTe in improving DL model quality, specifically robustness and fairness. As a generic quality-oriented testing framework, QuoTe can be adapted in the future to other domains (e.g., text) as well as other model properties.
Pages: 33