Robustness-Aware Word Embedding Improves Certified Robustness to Adversarial Word Substitutions

Cited by: 0

Authors:

Wang, Yibin [1]

Yang, Yichen [1]

He, Di [2]

He, Kun [1]

Affiliations:

[1] Huazhong Univ Sci & Technol, Sch Comp Sci & Technol, Wuhan, Peoples R China

[2] Peking Univ, Sch Intelligence Sci & Technol, Beijing, Peoples R China

Source:

FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023 | 2023

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Natural Language Processing (NLP) models have achieved great success on clean texts, but they are known to be vulnerable to adversarial examples, which are typically crafted by synonym substitutions. In this paper, we aim to solve this problem and find that word embedding is important to the certified robustness of NLP models. Given this finding, we propose the Embedding Interval Bound Constraint (EIBC) triplet loss to train robustness-aware word embeddings for better certified robustness. We optimize the EIBC triplet loss to reduce distances between synonyms in the embedding space, which is theoretically proven to make the verification boundary tighter. Meanwhile, we enlarge distances among non-synonyms, maintaining the semantic representation of word embeddings. Our method is conceptually simple and componentized. It can be easily combined with Interval Bound Propagation (IBP) training and improves the certified robust accuracy from 76.73% to 84.78% on the IMDB dataset. Experiments demonstrate that our method outperforms various state-of-the-art certified defense baselines and generalizes well to unseen substitutions. The code is available at https://github.com/JHL-HUST/EIBC-IBP/.
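The idea in the abstract of pulling synonyms together while pushing non-synonyms apart can be illustrated with a generic triplet margin loss. The sketch below is not the EIBC loss as defined in the paper, whose exact form is not given in this record. The function names, the `margin` parameter, and the use of the L-infinity distance (chosen here only because interval bounds are per-dimension) are all assumptions made for illustration.

```python
from typing import Sequence

Vector = Sequence[float]


def linf_distance(a: Vector, b: Vector) -> float:
    """Largest absolute per-dimension difference between two embeddings."""
    return max(abs(x - y) for x, y in zip(a, b))


def triplet_margin_loss(
    anchor: Vector,
    synonym: Vector,
    non_synonym: Vector,
    margin: float = 1.0,  # hypothetical hyperparameter, not from the paper
) -> float:
    """Generic triplet loss: zero once the non-synonym is at least
    `margin` farther from the anchor than the synonym is."""
    d_pos = linf_distance(anchor, synonym)      # distance to shrink
    d_neg = linf_distance(anchor, non_synonym)  # distance to enlarge
    return max(0.0, d_pos - d_neg + margin)


if __name__ == "__main__":
    word = [0.0, 0.0]
    close_synonym = [0.1, -0.2]
    # A far-away non-synonym already satisfies the margin.
    print(triplet_margin_loss(word, close_synonym, [3.0, 0.0]))
    # A nearby non-synonym is penalized.
    print(triplet_margin_loss(word, close_synonym, [0.5, 0.0]))
```

Minimizing a loss of this shape over many (word, synonym, non-synonym) triplets shrinks the region that a word's synonyms occupy in embedding space, which is the property the abstract links to tighter verification bounds.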

Pages: 673-687

Number of pages: 15
