RobOT: Robustness-Oriented Testing for Deep Learning Systems

Cited by: 49
Authors
Wang, Jingyi [1 ]
Chen, Jialuo [1 ]
Sun, Youcheng [2 ]
Ma, Xingjun [3 ]
Wang, Dongxia [1 ]
Sun, Jun [4 ]
Cheng, Peng [1 ]
Affiliations
[1] Zhejiang Univ, Hangzhou, Peoples R China
[2] Queens Univ Belfast, Belfast, Antrim, North Ireland
[3] Deakin Univ, Geelong, Vic, Australia
[4] Singapore Management Univ, Singapore, Singapore
Source
2021 IEEE/ACM 43RD INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2021) | 2021
Funding
National Research Foundation, Singapore; National Key R&D Program of China
DOI
10.1109/ICSE43902.2021.00038
Chinese Library Classification
TP31 [Computer Software]
Discipline Codes
081202; 0835
Abstract
Recently, there has been significant growth of interest in applying software engineering techniques to the quality assurance of deep learning (DL) systems. One popular direction is deep learning testing, where adversarial examples (a.k.a. bugs) of DL systems are found either by fuzzing or by guided search with the help of certain testing metrics. However, recent studies have revealed that the neuron coverage metrics commonly used by existing DL testing approaches are not correlated with model robustness, nor are they an effective measurement of confidence in the model's robustness after testing. In this work, we address this gap by proposing a novel testing framework called Robustness-Oriented Testing (RobOT). A key part of RobOT is a quantitative measurement of 1) the value of each test case in improving model robustness (often via retraining), and 2) the convergence quality of the model robustness improvement. RobOT utilizes the proposed metric to automatically generate test cases valuable for improving model robustness. The proposed metric is also a strong indicator of how well robustness improvement has converged through testing. Experiments on multiple benchmark datasets confirm the effectiveness and efficiency of RobOT in improving DL model robustness, with a 67.02% increase in adversarial robustness, which is 50.65% higher than the state-of-the-art work DeepGini.
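The abstract describes scoring each test case by its value for improving model robustness through retraining. A minimal toy sketch of one such gradient-based scoring idea is shown below; the logistic model, the loss-gradient-norm score, and all names here are illustrative assumptions, not the paper's actual metric or implementation.

```python
import numpy as np

# Toy sketch: rank candidate test cases by the norm of the loss gradient
# w.r.t. the input. Intuition: inputs where the loss changes fastest are
# more informative for robustness-oriented retraining. The model below is
# a hypothetical logistic classifier, used only for illustration.

rng = np.random.default_rng(0)
w = rng.normal(size=3)  # weights of a toy logistic model


def loss_grad_norm(x, y):
    """||dL/dx|| for logistic loss; a larger value marks a more 'valuable' test."""
    p = 1.0 / (1.0 + np.exp(-(w @ x)))  # model prediction in (0, 1)
    grad_x = (p - y) * w                # chain rule: gradient of loss w.r.t. x
    return float(np.linalg.norm(grad_x))


# Score a batch of candidate test cases and keep the highest-scoring ones.
tests = [(rng.normal(size=3), int(rng.integers(0, 2))) for _ in range(8)]
scores = [loss_grad_norm(x, y) for x, y in tests]
ranked = sorted(range(len(tests)), key=lambda i: -scores[i])
print(ranked[:3])  # indices of the highest-scoring test cases
```

In a testing loop, the top-ranked inputs would be added to the retraining set, after which the model is retrained and the scores recomputed.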
Pages: 300-311
Page count: 12
References
53 in total
[21]  
Khoury Marc, 2019, ARXIV PREPRINT ARXIV
[22]   Guiding Deep Learning System Testing Using Surprise Adequacy [J].
Kim, Jinhan ;
Feldt, Robert ;
Yoo, Shin .
2019 IEEE/ACM 41ST INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2019), 2019, :1039-1049
[23]  
LeCun Y., 2015, NATURE, V521, P436, DOI DOI 10.1038/NATURE14539
[24]  
Levinson J, 2011, IEEE INT VEH SYM, P163, DOI 10.1109/IVS.2011.5940562
[25]   Structural Coverage Criteria for Neural Networks Could Be Misleading [J].
Li, Zenan ;
Ma, Xiaoxing ;
Xu, Chang ;
Cao, Chun .
2019 IEEE/ACM 41ST INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING: NEW IDEAS AND EMERGING RESULTS (ICSE-NIER 2019), 2019, :89-92
[26]   DeepGauge: Multi-Granularity Testing Criteria for Deep Learning Systems [J].
Ma, Lei ;
Juefei-Xu, Felix ;
Zhang, Fuyuan ;
Sun, Jiyuan ;
Xue, Minhui ;
Li, Bo ;
Chen, Chunyang ;
Su, Ting ;
Li, Li ;
Liu, Yang ;
Zhao, Jianjun ;
Wang, Yadong .
PROCEEDINGS OF THE 2018 33RD IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING (ASE '18), 2018, :120-131
[27]  
Ma X., 2018, P ICLR
[28]   Understanding adversarial attacks on deep learning based medical image analysis systems [J].
Ma, Xingjun ;
Niu, Yuhao ;
Gu, Lin ;
Yisen, Wang ;
Zhao, Yitian ;
Bailey, James ;
Lu, Feng .
PATTERN RECOGNITION, 2021, 110
[29]  
Pacheco C, 2007, 22 ACM SIGPLAN C OBJ P OOPSLA COMPANION, P815
[30]   The Limitations of Deep Learning in Adversarial Settings [J].
Papernot, Nicolas ;
McDaniel, Patrick ;
Jha, Somesh ;
Fredrikson, Matt ;
Celik, Z. Berkay ;
Swami, Ananthram .
1ST IEEE EUROPEAN SYMPOSIUM ON SECURITY AND PRIVACY, 2016, :372-387