QuoTe: Quality-oriented Testing for Deep Learning Systems

Cited by: 2

Authors:

Chen, Jialuo [1]

Wang, Jingyi [1]

Ma, Xingjun [2]

Sun, Youcheng [3]

Sun, Jun [4]

Zhang, Peixin [1]

Cheng, Peng [1]

Affiliations:

[1] Zhejiang Univ, Hangzhou 310027, Peoples R China

[2] Fudan Univ, Shanghai 200433, Peoples R China

[3] Univ Manchester, Manchester M13 9PL, Lancs, England

[4] Singapore Management Univ, Singapore 188065, Singapore

Source:

ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY | 2023 / Volume 32 / Issue 05

Funding:

National Key Research and Development Program of China;

Keywords:

Deep learning; testing; robustness; fairness;

DOI:

10.1145/3582573

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Recently, there has been significant growth of interest in applying software engineering techniques for the quality assurance of deep learning (DL) systems. One popular direction is DL testing: given a property to test, defects of DL systems are found either by fuzzing or by guided search with the help of certain testing metrics. However, recent studies have revealed that the neuron coverage metrics commonly used by most existing DL testing approaches are not necessarily correlated with model quality (e.g., robustness, the most studied model property), and are also not an effective measure of confidence in the model quality after testing. In this work, we address this gap by proposing a novel testing framework called QuoTe (i.e., Quality-oriented Testing). A key part of QuoTe is a quantitative measurement of (1) the value of each test case in enhancing the model property of interest (often via retraining) and (2) the convergence quality of the model property improvement. QuoTe utilizes the proposed metric to automatically select or generate valuable test cases for improving model quality. The proposed metric is also a lightweight yet strong indicator of how well the improvement converged. Extensive experiments on both image and tabular datasets with a variety of model architectures confirm the effectiveness and efficiency of QuoTe in improving DL model quality, namely robustness and fairness. As a generic quality-oriented testing framework, QuoTe can be adapted in the future to other domains (e.g., text) as well as other model properties.


Pages: 33

50 items in total

[1] Quality-Oriented Federated Learning on the Fly
Wang, Fei
Li, Baochun
Li, Bo
IEEE NETWORK, 2022, 36 (05) : 152 - 159
[2] DeepCT: Tomographic Combinatorial Testing for Deep Learning Systems
Ma, Lei
Juefei-Xu, Felix
Xue, Minhui
Li, Bo
Li, Li
Liu, Yang
Zhao, Jianjun
2019 IEEE 26TH INTERNATIONAL CONFERENCE ON SOFTWARE ANALYSIS, EVOLUTION AND REENGINEERING (SANER), 2019 : 614 - 618
[3] Coverage Guided Differential Adversarial Testing of Deep Learning Systems
Guo, Jianmin
Zhao, Yue
Song, Houbing
Jiang, Yu
IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING, 2021, 8 (02) : 933 - 942
[4] DeepMutation: Mutation Testing of Deep Learning Systems
Ma, Lei
Zhang, Fuyuan
Sun, Jiyuan
Xue, Minhui
Li, Bo
Juefei-Xu, Felix
Xie, Chao
Li, Li
Liu, Yang
Zhao, Jianjun
Wang, Yadong
2018 29TH IEEE INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING (ISSRE), 2018 : 100 - 111
[5] DeepWeak: Weak Mutation Testing for Deep Learning Systems
Xue, Yinjie
Zhang, Zhiyi
Liu, Chen
Chen, Shuxian
Huang, Zhiqiu
2024 IEEE 24TH INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY AND SECURITY, QRS, 2024 : 49 - 60
[6] DLFuzz: Differential Fuzzing Testing of Deep Learning Systems
Guo, Jianmin
Jiang, Yu
Zhao, Yue
Chen, Quan
Sun, Jiaguang
ESEC/FSE'18: PROCEEDINGS OF THE 2018 26TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING, 2018 : 739 - 743
[7] DCT: Differential Combination Testing of Deep Learning Systems
Wang, Chunyan
Ge, Weimin
Li, Xiaohong
Feng, Zhiyong
ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2019: IMAGE PROCESSING, PT III, 2019, 11729 : 697 - 710
[8] DeepCon: Contribution Coverage Testing for Deep Learning Systems
Zhou, Zhiyang
Dou, Wensheng
Liu, Jie
Zhang, Chenxin
Wei, Jun
Ye, Dan
2021 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ANALYSIS, EVOLUTION AND REENGINEERING (SANER 2021), 2021 : 189 - 200
[9] Deep Learning for Combined Water Quality Testing and Crop Recommendation
Alkhudaydi, Tahani
Albalawi, Maram Qasem
Alanazi, Jamelah Sanad
Al-Anazi, Wejdan
Alfarshouti, Rahaf Mansour
INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2023, 14 (04) : 447 - 455
[10] Safety-Critical Oracles for Metamorphic Testing of Deep Learning LiDAR Point Cloud Object Detectors
Speth, Simon
Trien, Maximilian
Kufer, Dominik
Pretschner, Alexander
IEEE OPEN JOURNAL OF INTELLIGENT TRANSPORTATION SYSTEMS, 2025, 6 : 95 - 108
