Robustness-Aware Word Embedding Improves Certified Robustness to Adversarial Word Substitutions

Cited by: 0
Authors
Wang, Yibin [1]
Yang, Yichen [1]
He, Di [2]
He, Kun [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Comp Sci & Technol, Wuhan, Peoples R China
[2] Peking Univ, Sch Intelligence Sci & Technol, Beijing, Peoples R China
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Natural Language Processing (NLP) models have achieved great success on clean texts, but they are known to be vulnerable to adversarial examples, which are typically crafted by synonym substitutions. In this paper, we aim to address this problem and find that word embedding is important to the certified robustness of NLP models. Based on this finding, we propose the Embedding Interval Bound Constraint (EIBC) triplet loss to train robustness-aware word embeddings for better certified robustness. We optimize the EIBC triplet loss to reduce distances between synonyms in the embedding space, which is theoretically proven to tighten the verification boundary. Meanwhile, we enlarge distances among non-synonyms, preserving the semantic representation of the word embeddings. Our method is conceptually simple and modular. It can be easily combined with IBP training and improves the certified robust accuracy from 76.73% to 84.78% on the IMDB dataset. Experiments demonstrate that our method outperforms various state-of-the-art certified defense baselines and generalizes well to unseen substitutions. The code is available at https://github.com/JHL-HUST/EIBC-IBP/.
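The idea in the abstract (pull synonyms together, push non-synonyms apart, and thereby shrink the region a verifier must cover) can be illustrated with a generic triplet hinge loss. The sketch below is not the EIBC triplet loss from the paper, whose exact form this record does not give. The function names, the L-infinity distance, the margin value, and the toy 2-d vectors are all assumptions made for illustration, and the axis-aligned interval over a synonym set is assumed here as a stand-in for the kind of bound that interval bound propagation (IBP) works with.

```python
from typing import List, Sequence

Vector = Sequence[float]


def linf_distance(a: Vector, b: Vector) -> float:
    """Largest per-coordinate gap between two embeddings (assumed metric)."""
    return max(abs(x - y) for x, y in zip(a, b))


def triplet_hinge_loss(anchor: Vector, synonym: Vector,
                       non_synonym: Vector, margin: float = 1.0) -> float:
    """Generic triplet loss: zero once the synonym is closer to the anchor
    than the non-synonym by at least `margin` (hypothetical margin value)."""
    gap = linf_distance(anchor, synonym) - linf_distance(anchor, non_synonym)
    return max(0.0, gap + margin)


def interval_width(vectors: List[Vector]) -> float:
    """Widest side of the axis-aligned box enclosing a set of embeddings.
    A smaller value means a tighter interval for a verifier to propagate."""
    dims = len(vectors[0])
    return max(
        max(v[i] for v in vectors) - min(v[i] for v in vectors)
        for i in range(dims)
    )


# Toy 2-d embeddings (made-up values, not taken from the paper).
good = [1.0, 0.0]
great_near = [0.5, 0.5]   # synonym placed close to "good"
great_far = [-0.5, 0.0]   # same synonym placed far from "good"
bad = [-1.0, 0.0]         # non-synonym

loss_near = triplet_hinge_loss(good, great_near, bad)
loss_far = triplet_hinge_loss(good, great_far, bad)
width_near = interval_width([good, great_near])
width_far = interval_width([good, great_far])
```

In this toy setting the close synonym incurs no loss and yields a narrower enclosing interval than the distant one, which matches the intuition stated in the abstract; it says nothing about the certified accuracy figures reported there.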
Pages: 673-687
Page count: 15
Related papers
  • [1] Certified Robustness to Adversarial Word Substitutions
    Jia, Robin
    Raghunathan, Aditi
    Goksel, Kerem
    Liang, Percy
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 4129 - 4142
  • [2] Quantifying Robustness to Adversarial Word Substitutions
    Yang, Yuting
    Huang, Pei
    Cao, Juao
    Ma, Feifei
    Zhang, Jian
    Li, Jintao
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: RESEARCH TRACK, ECML PKDD 2023, PT I, 2023, 14169 : 95 - 112
  • [3] SAFER: A Structure-free Approach for Certified Robustness to Adversarial Word Substitutions
    Ye, Mao
    Gong, Chengyue
    Liu, Qiang
    58TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2020), 2020, : 3465 - 3475
  • [4] Efficient Adversarial Contrastive Learning via Robustness-Aware Coreset Selection
    Xu, Xilie
    Zhang, Jingfeng
    Liu, Feng
    Sugiyama, Masashi
    Kankanhalli, Mohan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [5] Certified Robustness to Word Substitution Attack with Differential Privacy
    Wang, Wenjie
    Tang, Pengfei
    Lou, Jian
    Xiong, Li
    2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 1102 - 1112
  • [6] MACROBERT: Maximizing Certified Region of BERT to Adversarial Word Substitutions
    Wang, Fali
    Lin, Zheng
    Liu, Zhengxiao
    Zheng, Mingyu
    Wang, Lei
    Zha, Daren
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS (DASFAA 2021), PT II, 2021, 12682 : 253 - 261
  • [7] ROBUSTNESS-AWARE FILTER PRUNING FOR ROBUST NEURAL NETWORKS AGAINST ADVERSARIAL ATTACKS
    Lim, Hyuntak
    Roh, Si-Dong
    Park, Sangki
    Chung, Ki-Seok
    2021 IEEE 31ST INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP), 2021,
  • [8] Certified Adversarial Robustness with Additive Noise
    Li, Bai
    Chen, Changyou
    Wang, Wenlin
    Carin, Lawrence
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [9] Certified Robustness to Adversarial Examples with Differential Privacy
    Lecuyer, Mathias
    Atlidakis, Vaggelis
    Geambasu, Roxana
    Hsu, Daniel
    Jana, Suman
    2019 IEEE SYMPOSIUM ON SECURITY AND PRIVACY (SP 2019), 2019, : 656 - +
  • [10] Certified Adversarial Robustness for Deep Reinforcement Learning
    Lutjen, Bjorn
    Everett, Michael
    How, Jonathan P.
    CONFERENCE ON ROBOT LEARNING, VOL 100, 2019, 100