Interpretable Adversarial Perturbation in Input Embedding Space for Text

Cited by: 0
Authors
Sato, Motoki [1, 3, 5]
Suzuki, Jun [2, 4, 6]
Shindo, Hiroyuki [3, 4]
Matsumoto, Yuji [3, 4]
Affiliations
[1] Preferred Networks Inc, Tokyo, Japan
[2] NTT Commun Sci Labs, Kyoto, Japan
[3] Nara Inst Sci & Technol, Ikoma, Nara, Japan
[4] RIKEN Ctr Adv Intelligence Project, Tokyo, Japan
[5] RIKEN AIP, Tokyo, Japan
[6] Tohoku Univ, Sendai, Miyagi, Japan
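Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract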
Following its great success in image processing, adversarial training has been applied to tasks in natural language processing (NLP). One promising approach applies adversarial training, as developed for image processing, directly to the input word embedding space rather than to the discrete input space of texts. However, this approach gives up interpretability, such as the ability to generate adversarial texts, in exchange for significantly improved performance on NLP tasks. This paper restores interpretability to such methods by restricting the directions of perturbations toward existing words in the input embedding space. As a result, each perturbed input can be straightforwardly reconstructed as an actual text by treating the perturbations as replacements of words in the sentence, while maintaining or even improving task performance.
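The idea of restricting perturbation directions toward existing words can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `word_directed_perturbation`, the toy vocabulary, the two-dimensional embeddings, the gradient values, and the choice to weight each direction by its alignment with the loss gradient (scaled to L2 norm `epsilon`) are all assumptions made for illustration.

```python
import math


def unit(v):
    """Scale v to unit L2 length; the zero vector is returned unchanged."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else [0.0] * len(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def word_directed_perturbation(embeddings, idx, grad, epsilon=1.0):
    """Build a perturbation for word `idx` from directions toward other words.

    Assumption (illustrative only): each direction is weighted by how well it
    aligns with the loss gradient, and the weights are scaled to norm epsilon.
    """
    base = embeddings[idx]
    # One unit direction per vocabulary word, pointing from the current word.
    directions = [unit([t - b for t, b in zip(vec, base)]) for vec in embeddings]
    scores = [dot(grad, d) for d in directions]
    norm = math.sqrt(sum(s * s for s in scores))
    weights = [epsilon * s / norm if norm > 0 else 0.0 for s in scores]
    # The perturbation is a weighted sum of word-directed unit vectors.
    perturbation = [
        sum(w * d[j] for w, d in zip(weights, directions))
        for j in range(len(base))
    ]
    return perturbation, weights


# Hypothetical toy data: three words with 2-dimensional embeddings.
vocab = ["good", "great", "bad"]
embeddings = [[1.0, 0.0], [1.0, 1.0], [-1.0, 0.0]]
grad = [-1.0, 0.0]  # hypothetical loss gradient w.r.t. the embedding of "good"

perturbation, weights = word_directed_perturbation(embeddings, 0, grad)
# The largest weight names the word the perturbation points toward.
replacement = vocab[max(range(len(vocab)), key=lambda k: weights[k])]
print(replacement, perturbation)
```

Because every direction ends at a real vocabulary word, the largest weight can be read as a proposed word replacement, which is what makes the perturbation interpretable as text in this sketch.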
Pages: 4323-4330
Number of pages: 8
Related papers
50 in total
  • [1] Chinese legal adversarial text generation based on interpretable perturbation strategies
    Zhang, Yunting
    Ye, Lin
    Li, Shang
    Wu, Yue
    Li, Baisong
    Zhang, Hongli
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2025, 28 (02):
  • [2] Interpretable adversarial neural pairwise ranking for academic network embedding
    Paul, Agyemang
    Wu, Zhefu
    Chen, Boyu
    Luo, Kai
    Fang, Luping
    KNOWLEDGE AND INFORMATION SYSTEMS, 2025, : 3293 - 3315
  • [3] Transparent Embedding Space for Interpretable Image Recognition
    Wang, Jiaqi
    Liu, Huafeng
    Jing, Liping
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (05) : 3204 - 3219
  • [4] Text Embedding Augmentation Based on Retraining With Pseudo-Labeled Adversarial Embedding
    Kim, Myeongsup
    Kang, Pilsung
    IEEE ACCESS, 2022, 10 : 8363 - 8376
  • [5] Interpretable Image Recognition by Constructing Transparent Embedding Space
    Wang, Jiaqi
    Liu, Huafeng
    Wang, Xinyue
    Jing, Liping
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 875 - 884
  • [6] Jacobian norm with Selective Input Gradient Regularization for interpretable adversarial defense
    Liu, Deyin
    Wu, Lin Yuanbo
    Li, Bo
    Boussaid, Farid
    Bennamoun, Mohammed
    Xie, Xianghua
    Liang, Chengwu
    PATTERN RECOGNITION, 2024, 145
  • [7] AMEN: Adversarial Multi-space Embedding Network for Text-Based Person Re-identification
    Wang, Zijie
    Xue, Jingyi
    Zhu, Aichun
    Li, Yifeng
    Zhang, Mingyi
    Zhong, Chongliang
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2021, PT II, 2021, 13020 : 462 - 473
  • [8] Reinforced Perturbation Generation for Adversarial Text-based CAPTCHA
    Cheng, Zhijun
    Wu, Zhuoting
    Yang, Zhuopan
    Yang, Zhenguo
    Li, Xiaoping
    Liu, Wenyin
    PROCEEDINGS OF THE 2024 27TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN, CSCWD 2024, 2024, : 2746 - 2751
  • [9] FontCode: Embedding Information in Text Documents Using Glyph Perturbation
    Xiao, Chang
    Zhang, Cheng
    Zheng, Changxi
    ACM TRANSACTIONS ON GRAPHICS, 2018, 37 (02):
  • [10] Joint Character-Level Word Embedding and Adversarial Stability Training to Defend Adversarial Text
    Liu, Hui
    Zhang, Yongzheng
    Wang, Yipeng
    Lin, Zheng
    Chen, Yige
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 8384 - 8391