WordChange: Adversarial Examples Generation Approach for Chinese Text Classification

Cited by: 20
Authors
Nuo, Cheng [1]
Chang, Guo-Qin [1]
Gao, Haichang [2]
Pei, Ge [2]
Zhang, Yang [2]
Affiliations
[1] Xidian Univ, Sch Cyber Engn, Xian 710071, Peoples R China
[2] Xidian Univ, Sch Comp Sci & Technol, Xian 710071, Peoples R China
Keywords
Task analysis; Predictive models; Semantics; Perturbation methods; Machine learning; Security; Neural networks; Adversarial examples; deep learning; Chinese character modification strategies; black box; sentence filtering
DOI
10.1109/ACCESS.2020.2988786
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
As an important carrier for disseminating information in the Internet age, text contains a large amount of information. In recent years, adversarial example attacks in the discrete text domain have received widespread attention: adding small perturbations to text data can cause a deep neural network (DNN) to produce the opposite prediction. In this paper, we present "WordChange", an adversarial example generation approach for Chinese text classification based on multiple modification strategies, and we evaluate its effectiveness on a sentiment analysis dataset and a spam dataset. The method effectively locates important word positions using a keyword contribution algorithm. We are the first to propose a "word-split" strategy for substituting keywords, designed around the structural and semantic properties of Chinese text. We are also the first to apply "swap" and "insert" strategies to Chinese text to generate adversarial examples. We further discuss the influence of different Chinese word segmentation tools and different text lengths on the proposed method, as well as the diversification of Chinese text modification strategies. Finally, adversarial texts generated against a long short-term memory (LSTM) network transfer successfully to other text classifiers and real-world applications.
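The pipeline described in the abstract (score each word's contribution, then modify the most important words) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the paper's algorithm: the leave-one-out scoring stands in for the keyword contribution algorithm, the functions `swap_chars`, `insert_char`, and `split_word` are hypothetical readings of the "swap", "insert", and "word-split" strategies, `SPLIT_TABLE` is a one-entry toy decomposition table, and `toy_predict` is a made-up black-box classifier.

```python
from typing import Callable, Dict, List, Sequence

# A black-box classifier: tokens in, probability of the "positive" class out.
Predict = Callable[[Sequence[str]], float]

# Hypothetical one-entry table mapping a character to its visual components.
SPLIT_TABLE: Dict[str, str] = {"好": "女子"}


def keyword_contributions(tokens: Sequence[str], predict: Predict) -> List[float]:
    """Leave-one-out score: how much the output drops when a token is removed."""
    base = predict(tokens)
    scores = []
    for i in range(len(tokens)):
        reduced = list(tokens[:i]) + list(tokens[i + 1:])
        scores.append(base - predict(reduced))
    return scores


def swap_chars(word: str) -> str:
    """Assumed 'swap': exchange the first two characters of the word."""
    return word if len(word) < 2 else word[1] + word[0] + word[2:]


def insert_char(word: str, filler: str = "的") -> str:
    """Assumed 'insert': place a filler character after the first character."""
    return word[:1] + filler + word[1:]


def split_word(word: str) -> str:
    """Assumed 'word-split': replace characters by their listed components."""
    return "".join(SPLIT_TABLE.get(ch, ch) for ch in word)


def attack(tokens: Sequence[str], predict: Predict,
           strategies: Sequence[Callable[[str], str]],
           threshold: float = 0.5, max_edits: int = 2) -> List[str]:
    """Modify the highest-contribution tokens until the score falls below threshold."""
    out = list(tokens)
    scores = keyword_contributions(out, predict)
    order = sorted(range(len(out)), key=lambda i: scores[i], reverse=True)
    for i in order[:max_edits]:
        # Keep the candidate that lowers the classifier score the most.
        out[i] = min((s(out[i]) for s in strategies),
                     key=lambda w: predict(out[:i] + [w] + out[i + 1:]))
        if predict(out) < threshold:
            break
    return out


def toy_predict(tokens: Sequence[str]) -> float:
    """Made-up classifier: sums fixed weights of known sentiment words."""
    weights = {"喜欢": 0.6, "好": 0.3}
    return min(1.0, sum(weights.get(t, 0.0) for t in tokens))


if __name__ == "__main__":
    sentence = ["我", "喜欢", "这部", "电影"]
    adversarial = attack(sentence, toy_predict, [swap_chars, insert_char, split_word])
    print(adversarial, toy_predict(adversarial))
```

The sketch only queries the classifier's output, which matches the black-box setting named in the keywords; a real attack would also need the sentence filtering step that the keywords mention, which is omitted here.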
Pages: 79561-79572
Number of pages: 12