An adversarial-example generation method for Chinese sentiment tendency classification based on audiovisual confusion and contextual association

Cited by: 0
Authors
Hongxu Ou
Long Yu
Shengwei Tian
Xin Chen
Chen Shi
Bo Wang
Tiejun Zhou
Affiliations
[1] Xinjiang University, Software College
[2] Xinjiang University, The Key Laboratory of Software Engineering
[3] Xinjiang University, Network Center
[4] Xinjiang University, Mathematics and Systems Science
[5] Internet Information Center
Source
Knowledge and Information Systems | 2023 / Vol. 65
Keywords
Adversarial examples; Audio visual deception; Contextualized generation; Fluency
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
Adversarial-example generation has been explored far more on English data than on Chinese, and research on Chinese adversarial examples remains very limited. Existing Chinese adversarial attack methods also tend to rely on a single form of generation with limited variety of expression, and their attack effectiveness still has room for improvement. This paper therefore proposes SentiAttack, a method that introduces six perturbations from two perspectives according to the characteristics of Chinese. Four perturbations come from audiovisual deception (words with similar sound, Chinese characters with similar form, horizontal splitting of Chinese characters, and reversed order of adjacent Chinese characters within a word), and two come from contextualized generation: word generation with WoBERT-MLM (Su in Wobert: Word-based chinese bert model - zhuiyiai. Technical report, 2020. https://github.com/ZhuiyiTechnology/WoBERT) and sentence-piece generation with LongLM (Guan et al. in Trans Assoc Comput Linguist 10:434–451, 2022. https://doi.org/10.1162/tacl_a_00469). In addition, a “fluency” metric is added to further measure the quality of the adversarial examples. We conducted experiments on five datasets (CH-SIMS 3, ChnSentiCorp, online shopping, waimai, and weibo8). Under constraints on semantic similarity, expression fluency, and perturbation, we obtained accuracy decreases of 74.40%, 49.10%, 42.90%, 39.90%, and 66.20%, respectively.
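To make two of the audiovisual perturbations named in the abstract concrete, the sketch below shows one possible reading of "reverse order of adjacent Chinese characters within word" and "horizontal splitting of Chinese character". It is an illustration under stated assumptions, not the SentiAttack implementation: the function names `swap_adjacent` and `split_horizontally` and the three-entry `SPLIT_TABLE` are hypothetical, and the record does not say how the paper selects target words or builds its component dictionary.

```python
# Illustrative sketch only; names and data here are assumptions,
# not taken from the SentiAttack paper.

# Hypothetical, hand-picked table mapping a character to its
# left and right components (a real system would need a far larger one).
SPLIT_TABLE = {
    "好": "女子",
    "明": "日月",
    "林": "木木",
}


def swap_adjacent(word: str, i: int = 0) -> str:
    """Swap the characters at positions i and i + 1 inside a word.

    Returns the word unchanged if it is too short or i is out of range.
    """
    if len(word) < 2 or not 0 <= i < len(word) - 1:
        return word
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def split_horizontally(word: str, table: dict = SPLIT_TABLE) -> str:
    """Replace each character found in the table by its side-by-side parts.

    Characters without a table entry are kept as they are.
    """
    return "".join(table.get(ch, ch) for ch in word)
```

For example, `swap_adjacent("喜欢")` yields `"欢喜"` and `split_horizontally("很好")` yields `"很女子"`. Both outputs remain readable to a human, which is the intuition behind perturbations of this kind.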
Pages: 5231-5258
Page count: 27