Combating Multi-level Adversarial Text with Pruning based Adversarial Training

Citations: 3
|
Authors
Ke, Jianpeng [1 ]
Wang, Lina [1 ]
Ye, Aoshuang [1 ]
Fu, Jie [1 ]
Affiliations
[1] Wuhan Univ, Key Lab Aerosp Informat Secur & Trusted Comp, Minist Educ, Sch Cyber Sci & Engn, Wuhan, Peoples R China
Source
2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2022
Funding
National Natural Science Foundation of China;
Keywords
Adversarial training; Adversarial text; Model pruning; Deep neural network;
DOI
10.1109/IJCNN55064.2022.9892314
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Despite significant advances in deep learning-based models for natural language processing (NLP) tasks, prior work has shown that many models, including deep neural networks (DNNs), suffer from moderate to severe performance degradation on adversarial examples. An adversary crafts malicious text by adding, deleting, or modifying characters, words, and sentences to fool DNN models. Adversarial training and model enhancement methods have therefore been proposed to combat such attacks; however, both lack generalization due to the intrinsic overfitting of neural networks. In this paper, we propose a novel framework to combat adversarial text, namely DisPAT, which consists of an adversarial text discriminator and a robust pruned text classifier. First, we examine the distributions of adversarial and benign examples in the embedding space, which indicates the feasibility of a DNN-based discriminator. To obtain multi-level adversarial texts, we deploy a generator, together with a discriminator that identifies adversarial perturbations. Notably, at inference time, our pipeline places the well-trained discriminator in front of the text classifier to filter character-level adversarial text. Finally, we apply neuron-salience-based pruning to specifically improve the classifier's performance on adversarial text. Experimental results show that our approach outperforms state-of-the-art baselines in combating both character-level and word-level adversarial text. Moreover, DisPAT achieves accuracy very close to, or even higher than, that of the standard model.
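The abstract does not spell out how neuron-salience-based pruning is computed. The sketch below is only an illustrative reading, not the authors' implementation: it assumes a PyTorch-style classifier in which per-unit salience is estimated with a first-order (activation x gradient) score and the least salient hidden units are masked out. The names PrunableClassifier and prune_by_salience, and the specific salience criterion, are hypothetical.

# Hypothetical sketch of neuron-salience-based pruning (not the paper's code).
# Salience of each hidden unit is estimated as the mean |activation * gradient|
# over a batch; the least salient units are then masked (pruned).
import torch
import torch.nn as nn

class PrunableClassifier(nn.Module):
    def __init__(self, embed_dim=128, hidden_dim=256, num_classes=2):
        super().__init__()
        self.encoder = nn.Linear(embed_dim, hidden_dim)
        # Binary mask over hidden units; pruned units are set to zero.
        self.mask = nn.Parameter(torch.ones(hidden_dim), requires_grad=False)
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, x):
        h = torch.relu(self.encoder(x)) * self.mask  # pruned units contribute zero
        return self.head(h)

def prune_by_salience(model, inputs, labels, prune_ratio=0.2):
    """Zero out the prune_ratio fraction of hidden units with lowest salience."""
    model.zero_grad()
    h = torch.relu(model.encoder(inputs))
    h.retain_grad()                                   # keep gradients of activations
    logits = model.head(h * model.mask)
    loss = nn.functional.cross_entropy(logits, labels)
    loss.backward()
    salience = (h * h.grad).abs().mean(dim=0)         # first-order Taylor-style salience
    k = int(prune_ratio * salience.numel())
    _, low = torch.topk(salience, k, largest=False)   # indices of least salient units
    model.mask.data[low] = 0.0

# Usage on random data (embeddings stand in for sentence representations):
model = PrunableClassifier()
x, y = torch.randn(32, 128), torch.randint(0, 2, (32,))
prune_by_salience(model, x, y, prune_ratio=0.2)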
Pages: 8