Robustness Certification Research on Deep Learning Models: A Survey

Cited by: 0
Authors
Ji S.-L. [1]
Du T.-Y. [1]
Deng S.-G. [1]
Cheng P. [2]
Shi J. [3]
Yang M. [4]
Li B. [5]
Affiliations
[1] College of Computer Science and Technology, Zhejiang University, Hangzhou
[2] College of Control Science and Engineering, Zhejiang University, Hangzhou
[3] Huawei Singapore Research Center, Singapore
[4] School of Computer Science, Fudan University, Shanghai
[5] Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana-Champaign
Source
Jisuanji Xuebao/Chinese Journal of Computers | 2022, Vol. 45, No. 01
Funding
National Natural Science Foundation of China
Keywords
Adversarial example; Artificial intelligence security; Deep learning; Robustness certification;
DOI
10.11897/SP.J.1016.2022.00190
Abstract
In the era of big data, breakthroughs in the theories and technologies of deep learning have provided strong support for artificial intelligence at both the data and the algorithm level, and have driven the large-scale, industrialized deployment of deep learning in a wide range of tasks such as image classification, object detection, semantic segmentation, natural language processing, and speech recognition. However, although deep learning models perform excellently in many real-world applications, they still face many security threats. For instance, it is now known that deep neural networks are fundamentally vulnerable to malicious manipulations, such as adversarial examples that force target models to misbehave. In recent years, a plethora of work has focused on constructing adversarial examples in various domains. The phenomenon of adversarial examples demonstrates the inherent lack of robustness of deep neural networks, which limits their use in security-critical applications. To build safe and reliable deep learning systems and eliminate the potential security risks of deep learning models in real-world applications, the security of deep learning has attracted extensive attention from both academia and industry. Thus far, intensive research has been devoted to improving the robustness of DNNs against adversarial attacks. Unfortunately, most defenses are based on heuristics, lack any theoretical guarantee, and can often be defeated or circumvented by more powerful attacks. Therefore, defenses that only show empirical success against existing attacks cannot be concluded to be genuinely robust. Aiming to end the constant arms race between adversarial attacks and defenses, the concept of robustness certification has been proposed: by formally verifying whether a given region surrounding a data point admits any adversarial example, it provides guaranteed security for deep neural networks deployed in adversarial environments. Within the certified robustness bound, no possible perturbation can change the prediction of the deep neural network. Researchers have studied model robustness certification in depth from both the complete and the incomplete perspective and have proposed a series of certification methods. These methods can be generally categorized as exact certification methods and relaxed certification methods. Exact certification methods are mostly based on satisfiability modulo theories or mixed-integer linear programming solvers. Although these methods can certify the exact robustness bound, they are usually computationally expensive and therefore difficult to scale even to medium-sized networks. Relaxed certification methods include convex polytope methods, reachability analysis methods, and abstract interpretation methods, among others. These methods are usually efficient but cannot provide robustness bounds as precise as those of exact certification methods. Nevertheless, given their much lower computational cost, relaxed certification methods are more promising in practical applications, especially for large networks. In this survey, we review the current challenges of the model robustness certification problem, systematically summarize existing research, and clarify the advantages and disadvantages of current approaches. Finally, we explore future research directions for model robustness certification. © 2022, Science Press. All rights reserved.
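For concreteness, the local robustness property that certification methods verify can be written formally as below; the notation (classifier f with class scores f_i, input x_0 with true label y, and an l_p ball of radius epsilon) is generic and is an illustrative assumption rather than notation taken from the paper.

```latex
% Local robustness certification of classifier f at x_0 (true label y) within radius \epsilon:
\forall x \in \mathbb{B}_p(x_0, \epsilon) := \{\, x \;:\; \|x - x_0\|_p \le \epsilon \,\}, \qquad
\arg\max_i f_i(x) = y .
```

As a minimal sketch of one relaxed strategy named above, reachability analysis with interval bounds (equivalently, abstract interpretation in the interval domain), the following Python snippet propagates an l_inf perturbation budget through a toy fully connected ReLU network. The function names, toy architecture, and NumPy implementation are assumptions for illustration, not the implementation of any specific method surveyed in the paper.

```python
import numpy as np

def affine_bounds(W, b, lo, hi):
    """Propagate elementwise bounds [lo, hi] through the affine map x -> W @ x + b."""
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    return W_pos @ lo + W_neg @ hi + b, W_pos @ hi + W_neg @ lo + b

def certify_linf(weights, biases, x, eps, label):
    """Soundly check whether a fully connected ReLU network keeps `label` for every
    input in the l_inf ball of radius eps around x (interval bound propagation).
    True certifies robustness; False is inconclusive, as the relaxation may be too loose."""
    lo, hi = x - eps, x + eps
    for i, (W, b) in enumerate(zip(weights, biases)):
        lo, hi = affine_bounds(W, b, lo, hi)
        if i < len(weights) - 1:                      # ReLU on every hidden layer
            lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)
    # Certified iff the worst-case score of `label` beats the best case of every other class.
    return bool(all(lo[label] > hi[j] for j in range(len(hi)) if j != label))

# Toy usage on a randomly initialised 4 -> 8 -> 3 ReLU network.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(8, 4)), rng.normal(size=(3, 8))]
biases = [np.zeros(8), np.zeros(3)]
x0 = rng.normal(size=4)
clean_logits = weights[1] @ np.maximum(weights[0] @ x0 + biases[0], 0.0) + biases[1]
print(certify_linf(weights, biases, x0, eps=0.01, label=int(np.argmax(clean_logits))))
```

A True result guarantees that no perturbation within the ball changes the prediction; a False result is inconclusive because the interval relaxation over-approximates the reachable outputs, which reflects the precision versus scalability trade-off, relative to exact SMT or MILP solving, described in the abstract.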
Pages: 190-206
Number of pages: 16