Adversarial Sample Detection for Deep Neural Network through Model Mutation Testing

Cited by: 139
Authors
Wang, Jingyi [1 ,2 ]
Dong, Guoliang [3 ]
Sun, Jun [2 ]
Wang, Xinyu [3 ]
Zhang, Peixin [3 ]
Affiliations
[1] Shenzhen Univ, Shenzhen, Guangdong, Peoples R China
[2] Singapore Univ Tech & Design, Singapore, Singapore
[3] Zhejiang Univ, Hangzhou, Zhejiang, Peoples R China
Source
2019 IEEE/ACM 41ST INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2019) | 2019
Keywords
adversarial sample; detection; deep neural network; mutation; testing; sensitivity; ROBUSTNESS;
DOI
10.1109/ICSE.2019.00126
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104 ; 0812 ; 0835 ; 1405
Abstract
Deep neural networks (DNNs) have been shown to be useful in a wide range of applications. However, they are also known to be vulnerable to adversarial samples: by transforming a normal sample with carefully crafted, human-imperceptible perturbations, even highly accurate DNNs can be made to produce wrong decisions. Multiple defense mechanisms have been proposed that aim to hinder the generation of such adversarial samples. However, recent work shows that most of them are ineffective. In this work, we propose an alternative approach that detects adversarial samples at runtime. Our main observation is that adversarial samples are much more sensitive than normal samples when random mutations are imposed on the DNN. We therefore first propose a measure of 'sensitivity' and show empirically that normal and adversarial samples have distinguishable sensitivity. We then integrate statistical hypothesis testing with model mutation testing to check, by measuring its sensitivity, whether an input sample is likely to be normal or adversarial at runtime. We evaluated our approach on the MNIST and CIFAR10 datasets. The results show that our approach detects adversarial samples generated by state-of-the-art attack methods efficiently and accurately.
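The detection scheme described in the abstract, measuring an input's label-change rate under random model mutations and deciding normal vs. adversarial with a sequential hypothesis test, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the linear stand-in "model", the mutation rate `sigma`, and the test thresholds (`thresh`, `alpha`, `beta`) are all assumed placeholders; the paper applies mutation operators (e.g., Gaussian fuzzing of weights) to trained DNNs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "model": a random linear classifier over 784-dim inputs
# (the paper mutates real trained DNNs instead).
W = rng.normal(size=(10, 784))

def predict(w, x):
    return int(np.argmax(w @ x))

def mutate(w, sigma=0.05, rate=0.01):
    # Gaussian-fuzzing-style mutation: perturb a small random
    # fraction of the weights with Gaussian noise.
    mask = rng.random(w.shape) < rate
    return w + mask * rng.normal(scale=sigma, size=w.shape)

def label_change_rate(x, n_mutants=100):
    """Sensitivity of input x: fraction of mutated models whose
    predicted label differs from the original model's label."""
    base = predict(W, x)
    changes = sum(predict(mutate(W), x) != base for _ in range(n_mutants))
    return changes / n_mutants

def sprt_detect(x, thresh=0.05, alpha=0.05, beta=0.05, max_mutants=500):
    """Sequential probability ratio test: flag x as adversarial if its
    label-change probability appears to exceed `thresh` (all bounds
    here are illustrative, not the paper's tuned values)."""
    p0, p1 = thresh * 0.8, thresh * 1.2       # indifference region
    accept_h0 = np.log(beta / (1 - alpha))    # lower stopping bound
    accept_h1 = np.log((1 - beta) / alpha)    # upper stopping bound
    llr, base = 0.0, predict(W, x)
    for _ in range(max_mutants):
        changed = predict(mutate(W), x) != base
        llr += np.log(p1 / p0) if changed else np.log((1 - p1) / (1 - p0))
        if llr <= accept_h0:
            return False   # decide: normal sample
        if llr >= accept_h1:
            return True    # decide: adversarial sample
    return llr > 0         # budget exhausted; fall back to the sign
```

The key design point mirrored here is that the decision is sequential: mutants are generated one at a time and the test stops as soon as the accumulated log-likelihood ratio crosses a bound, so easy inputs need far fewer mutated models than the worst case.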
Pages: 1245-1256 (12 pages)