RobOT: Robustness-Oriented Testing for Deep Learning Systems

Cited by: 49
Authors
Wang, Jingyi [1 ]
Chen, Jialuo [1 ]
Sun, Youcheng [2 ]
Ma, Xingjun [3 ]
Wang, Dongxia [1 ]
Sun, Jun [4 ]
Cheng, Peng [1 ]
Affiliations
[1] Zhejiang Univ, Hangzhou, Peoples R China
[2] Queens Univ Belfast, Belfast, Antrim, Northern Ireland
[3] Deakin Univ, Geelong, Vic, Australia
[4] Singapore Management Univ, Singapore, Singapore
Source
2021 IEEE/ACM 43RD INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2021) | 2021
Funding
National Research Foundation, Singapore; National Key R&D Program of China;
DOI
10.1109/ICSE43902.2021.00038
Chinese Library Classification
TP31 [Computer Software];
Discipline Code
081202 ; 0835 ;
Abstract
Recently, there has been significant growth of interest in applying software engineering techniques for the quality assurance of deep learning (DL) systems. One popular direction is deep learning testing, where adversarial examples (a.k.a. bugs) of DL systems are found either by fuzzing or guided search with the help of certain testing metrics. However, recent studies have revealed that the neuron coverage metrics commonly used by existing DL testing approaches do not correlate with model robustness, nor are they an effective measure of confidence in the model's robustness after testing. In this work, we address this gap by proposing a novel testing framework called Robustness-Oriented Testing (RobOT). A key part of RobOT is a quantitative measurement of 1) the value of each test case in improving model robustness (often via retraining), and 2) the convergence quality of the model robustness improvement. RobOT utilizes the proposed metric to automatically generate test cases valuable for improving model robustness. The proposed metric is also a strong indicator of how well robustness improvement has converged through testing. Experiments on multiple benchmark datasets confirm the effectiveness and efficiency of RobOT in improving DL model robustness, with a 67.02% increase in adversarial robustness, which is 50.65% higher than the state-of-the-art work DeepGini.
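The abstract's core idea of scoring each test case by its value for improving robustness can be sketched with a first-order heuristic: rank candidate inputs by the norm of the loss gradient with respect to the input, since inputs where a small perturbation sharply changes the loss are the ones most likely to expose and then repair fragile behavior through retraining. The toy linear classifier, function names, and the gradient-norm proxy below are illustrative assumptions, not the paper's exact metric or implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(2, 3))  # toy linear classifier: 2 input features, 3 classes
                             # (hypothetical stand-in for a trained DL model)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def input_grad_norm(x, y):
    """Norm of the cross-entropy loss gradient w.r.t. input x for label y.

    A large value means a small perturbation of x can change the loss a lot,
    so x is treated as more valuable for robustness-oriented testing.
    """
    p = softmax(x @ W)        # predicted class probabilities
    dz = p.copy()
    dz[y] -= 1.0              # d(loss)/d(logits) for cross-entropy
    gx = W @ dz               # chain rule back to the input
    return float(np.linalg.norm(gx))

# Score a pool of candidate test inputs and rank them, highest value first.
tests = [(rng.normal(size=2), int(rng.integers(0, 3))) for _ in range(5)]
scores = [input_grad_norm(x, y) for x, y in tests]
ranked = np.argsort(scores)[::-1]
print(ranked)
```

Under this sketch, the top-ranked inputs would be the ones selected for retraining; RobOT's actual metric and generation procedure are defined in the paper itself.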
Pages: 300-311
Page count: 12