Explaining Adversarial Robustness of Neural Networks from Clustering Effect Perspective

Cited by: 0
Authors
Jin, Yulin [1 ]
Zhang, Xiaoyu [1 ]
Lou, Jian [2 ]
Ma, Xu [3 ]
Wang, Zilong [1 ]
Chen, Xiaofeng [1 ]
Affiliations
[1] Xidian Univ, State Key Lab Integrated Serv Networks ISN, Xian, Peoples R China
[2] Zhejiang Univ, ZJU Hangzhou Global Sci & Technol Innovat Ctr, Hangzhou, Peoples R China
[3] Qufu Normal Univ, Sch Cyber Sci & Engn, Jinan, Peoples R China
Source
2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV) | 2023
Funding
National Natural Science Foundation of China;
DOI
10.1109/ICCV51070.2023.00417
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Adversarial training (AT) is the most commonly used mechanism for improving the robustness of deep neural networks. Recently, a novel adversarial attack against intermediate layers has exploited the extra fragility of adversarially trained networks to force incorrect predictions. This result implies that the perturbation search space explored during adversarial training is insufficient. To explain why the intermediate-layer attack is effective, we interpret forward propagation through the Clustering Effect: the intermediate-layer representations of same-label samples drawn i.i.d. from the training distribution are similar. We theoretically prove the existence of the Clustering Effect via the Information Bottleneck theory. We then observe that the intermediate-layer attack violates the clustering effect of the AT-trained model. Inspired by these observations, we propose a regularization method that extends the perturbation search space during training, named sufficient adversarial training (SAT). We derive a provable robustness bound for neural networks through rigorous mathematical analysis. Experimental evaluations demonstrate the superiority of SAT over other state-of-the-art AT mechanisms in defending against adversarial attacks on both output and intermediate layers. Our code and Appendix can be found at https://github.com/clustering-effect/SAT.
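To make the training procedure described in the abstract concrete, below is a minimal, hypothetical PyTorch sketch of adversarial training augmented with an intermediate-layer clustering regularizer in the spirit of SAT. The PGD attack is standard; the `clustering_penalty`, the `feature_layer` hook, and the `lambda_reg` weight are illustrative assumptions, not the authors' actual loss, which is defined in the paper and its repository.

```python
# Minimal, illustrative sketch (PyTorch) -- NOT the authors' SAT code.
# The clustering penalty, lambda_reg, and feature_layer hook below are
# hypothetical placeholders; the actual SAT loss is defined in the paper
# and at https://github.com/clustering-effect/SAT.
import torch
import torch.nn.functional as F


def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Standard L-infinity PGD: maximize cross-entropy within an eps-ball."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Ascend the loss, then project back into the eps-ball around x.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1).detach()
    return x_adv


def clustering_penalty(features, y):
    """Hypothetical regularizer: pull each sample's intermediate features
    toward its per-class batch mean, so that the clustering effect also
    holds on adversarial inputs."""
    feats = features.flatten(1)
    penalty = feats.new_zeros(())
    classes = y.unique()
    for c in classes:
        class_feats = feats[y == c]
        center = class_feats.mean(dim=0, keepdim=True)
        penalty = penalty + ((class_feats - center) ** 2).sum(dim=1).mean()
    return penalty / classes.numel()


def training_step(model, feature_layer, x, y, optimizer, lambda_reg=0.1):
    """One AT step: adversarial cross-entropy plus the clustering term."""
    x_adv = pgd_attack(model, x, y)
    feats = {}
    # Forward hook captures the chosen intermediate representation.
    hook = feature_layer.register_forward_hook(
        lambda mod, inp, out: feats.__setitem__("z", out))
    logits = model(x_adv)
    hook.remove()
    loss = F.cross_entropy(logits, y) \
        + lambda_reg * clustering_penalty(feats["z"], y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The forward hook is one simple way to read an intermediate representation without modifying the model; which layer(s) to regularize and how to weight the term would need to follow the paper's definitions.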
Pages: 4499-4508
Page count: 10