Generating traceable adversarial text examples by watermarking in the semantic space

Cited by: 1
Authors
Li, Mingjie [1]
Wu, Hanzhou [1]
Zhang, Xinpeng [1]
Affiliations
[1] Shanghai Univ, Sch Commun & Informat Engn, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
adversarial examples; natural language; watermarking; text quality; deep learning; DIGITAL WATERMARKING;
DOI
10.1117/1.JEI.31.6.063034
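Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes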
0808 ; 0809 ;
Abstract
Adversarial examples have been shown to expose the vulnerability of deep neural network (DNN) models, and they can be used to evaluate a model's performance and further improve its robustness. Because text data is discrete, generating adversarial examples is more difficult in the natural language processing (NLP) domain than in the image domain. One challenge is that the generated adversarial text examples should remain grammatically correct and semantically similar to the original texts. In this paper, we propose an end-to-end adversarial text generation model that generates high-quality adversarial text examples. Moreover, the adversarial text examples generated by the proposed model are embedded with watermarks, which can mark and trace the source of the generated examples and prevent the model from being used maliciously or illegally. Experimental results show that the attack success rate of the proposed model still exceeds 88% even on the AG's News dataset, where generating adversarial text examples is more difficult. The quality of the adversarial text examples generated by the proposed model is also higher than that of the baseline models. At the same time, because the generated adversarial text examples are embedded with strongly robust watermarks, the model can be better protected. © 2022 SPIE and IS&T
Pages: 17