ENHANCING ADVERSARIAL ROBUSTNESS FOR IMAGE CLASSIFICATION BY REGULARIZING CLASS LEVEL FEATURE DISTRIBUTION

Cited: 4
Authors
Yu, Cheng [1]
Xue, Youze [1]
Chen, Jiansheng [1,2,3]
Wang, Yu [1]
Ma, Huimin [3]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Beijing, Peoples R China
[2] Beijing Natl Res Ctr Informat Sci & Technol, Beijing, Peoples R China
[3] Univ Sci & Technol Beijing, Beijing, Peoples R China
Source
2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP) | 2021
Funding
National Natural Science Foundation of China;
Keywords
Adversarial Training; Intra and Inter Class Feature Regularization; Robustness;
DOI
10.1109/ICIP42928.2021.9506383
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
Recent research has shown that deep neural networks (DNNs) are vulnerable to adversarial examples. Adversarial training is practically the most effective approach to improving the robustness of DNNs against adversarial examples. However, conventional adversarial training methods focus only on the classification results or on instance-level relationships in the feature representations of adversarial examples. Inspired by the fact that adversarial examples break the distinguishability of DNN feature representations across classes, we propose Intra and Inter Class Feature Regularization (I²FR) to make the feature distribution of adversarial examples retain the same classification properties as that of clean examples. On the one hand, the intra-class regularization restricts the feature distance between adversarial examples and both the corresponding clean data and samples of the same class. On the other hand, the inter-class regularization prevents the features of adversarial examples from approaching those of other classes. By adding I²FR to both the adversarial example generation and the model training steps of adversarial training, we obtain stronger and more diverse adversarial examples, and the neural network learns a more distinguishable and reasonable feature distribution. Experiments on various adversarial training frameworks demonstrate that I²FR adapts to multiple training frameworks and outperforms state-of-the-art methods in classifying both clean data and adversarial examples.
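The two regularization terms described in the abstract can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the function name, the use of squared Euclidean distance to clean-feature class centroids, and the hinge margin for the inter-class term are all assumptions.

```python
import numpy as np

def i2fr_loss(feat_adv, feat_clean, labels, margin=1.0):
    """Sketch of an intra/inter class feature regularization loss.

    Assumptions (not from the paper): squared Euclidean distances,
    class centroids from clean features, hinge margin for inter-class.

    feat_adv   : (N, D) features of adversarial examples
    feat_clean : (N, D) features of the corresponding clean examples
    labels     : (N,)   integer class labels
    """
    classes = np.unique(labels)
    # Class centroids computed from clean features.
    centroids = np.stack([feat_clean[labels == c].mean(axis=0) for c in classes])

    # Intra-class term: pull adversarial features toward both their own
    # clean counterparts and the centroid of their class.
    d_pair = np.sum((feat_adv - feat_clean) ** 2, axis=1)
    own = centroids[np.searchsorted(classes, labels)]
    d_centroid = np.sum((feat_adv - own) ** 2, axis=1)
    intra = np.mean(d_pair + d_centroid)

    # Inter-class term: penalize adversarial features that come within
    # `margin` (squared distance) of any other class's centroid.
    d_all = np.sum((feat_adv[:, None, :] - centroids[None, :, :]) ** 2, axis=2)
    other = classes[None, :] != labels[:, None]
    inter = np.mean(np.maximum(0.0, margin - d_all)[other])

    return intra + inter
```

In the paper this regularizer is used twice: added to the attack objective when generating adversarial examples (making them drift in feature space) and to the training loss when updating the model (keeping the class-level feature distribution intact).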
Pages: 494-498
Page count: 5