Regularizing Hard Examples Improves Adversarial Robustness

Cited by: 0
Authors
Lee, Hyungyu [1]
Lee, Saehyung [1]
Bae, Ho [2]
Yoon, Sungroh [3,4]
Affiliations
[1] Seoul Natl Univ, Elect & Comp Engn, Interdisciplinary Program Artificial Intelligence, Seoul 08826, South Korea
[2] Ewha Womans Univ, Dept Cyber Secur, Seoul 03760, South Korea
[3] Seoul Natl Univ, Interdisciplinary Program Artificial Intelligence, Elect & Comp Engn, AIIS,ASRI,INMC, Seoul 08826, South Korea
[4] Seoul Natl Univ, ISRC, Seoul 08826, South Korea
Keywords
adversarial robustness; adversarial examples; hard examples; memorization; robust overfitting;
DOI
Not available
CLC classification
TP (automation and computer technology);
Discipline code
0812;
Abstract
Recent studies have validated that pruning hard-to-learn examples from training improves the generalization performance of neural networks (NNs). In this study, we investigate this intriguing phenomenon, the negative effect of hard examples on generalization, in adversarial training. In particular, we theoretically demonstrate that adversarial training increases the difficulty of hard examples significantly more than that of easy examples. Furthermore, we verify that hard examples are fitted only through memorization of the label in adversarial training. We conduct both theoretical and empirical analyses of this memorization phenomenon, showing that pruning hard examples in adversarial training can enhance the model's robustness. However, finding the optimal threshold for removing the hard examples that degrade robustness remains a challenge. Based on these observations, we propose a new approach, difficulty-proportional label smoothing (DPLS), to adaptively mitigate the negative effect of hard examples, thereby improving the adversarial robustness of NNs. Notably, our experimental results indicate that our method can successfully leverage hard examples while circumventing their negative effect.
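The abstract describes DPLS only at a high level; the paper's exact difficulty measure and smoothing schedule are not given in this record. A minimal sketch of the idea, assuming a per-example difficulty score in [0, 1] and a hypothetical `max_smoothing` cap (both the function name and parameters below are illustrative assumptions, not the paper's specification):

```python
import numpy as np

def dpls_targets(labels, difficulty, num_classes, max_smoothing=0.5):
    """Difficulty-proportional label smoothing (illustrative sketch).

    labels:     (N,) integer class labels
    difficulty: (N,) per-example difficulty scores in [0, 1]
                (the paper's actual difficulty measure is not
                specified here; this is a placeholder input)
    Returns a (N, num_classes) array of soft targets: harder
    examples receive more smoothing, easier ones stay near one-hot.
    """
    # Per-example smoothing amount, proportional to difficulty.
    eps = max_smoothing * np.clip(difficulty, 0.0, 1.0)
    n = len(labels)
    # Spread eps uniformly over the non-true classes...
    targets = np.tile((eps / (num_classes - 1))[:, None], (1, num_classes))
    # ...and keep 1 - eps on the true class, so each row sums to 1.
    targets[np.arange(n), labels] = 1.0 - eps
    return targets
```

An easy example (difficulty 0) keeps its one-hot label, while a maximally hard example has up to `max_smoothing` of its mass moved to the other classes, softening the memorization pressure the abstract describes.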
Pages: 48