HQA-Attack: Toward High Quality Black-Box Hard-Label Adversarial Attack on Text

Citations: 0
Authors
Liu, Han [1]
Xu, Zhi [1]
Zhang, Xiaotong [1]
Zhang, Feng [2]
Ma, Fenglong [3]
Chen, Hongyang [4]
Yu, Hong [1]
Zhang, Xianchao [1]
Affiliations
[1] Dalian Univ Technol, Dalian, Peoples R China
[2] Peking Univ, Beijing, Peoples R China
[3] Penn State Univ, University Pk, PA 16802 USA
[4] Zhejiang Lab, Hangzhou, Peoples R China
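Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023) | 2023
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract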
Black-box hard-label adversarial attack on text is a practical and challenging task, because the text data space is inherently discrete and non-differentiable and only the predicted label is accessible. Research on this problem is still at an early stage, and only a few methods are available. Existing methods rely on complex heuristic algorithms or unreliable gradient estimation strategies, which are prone to falling into local optima and inevitably consume numerous queries; as a result, they struggle to craft satisfactory adversarial examples with high semantic similarity and a low perturbation rate within a limited query budget. To alleviate these issues, we propose a simple yet effective framework, named HQA-Attack, for generating high-quality textual adversarial examples in black-box hard-label attack scenarios. Specifically, after randomly initializing an adversarial example, HQA-Attack first substitutes back as many original words as possible, thereby shrinking the perturbation rate. It then leverages the synonym sets of the remaining changed words to further optimize the adversarial example in a direction that improves semantic similarity while still satisfying the adversarial condition. In addition, during the optimization procedure, it searches for a transition synonym word for each changed word, which avoids traversing the whole synonym set and reduces the number of queries to some extent. Extensive experimental results on five text classification datasets, three natural language inference datasets, and two real-world APIs show that HQA-Attack significantly outperforms other strong baselines.
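The first stage described in the abstract, substituting original words back while the example stays adversarial, can be illustrated with a minimal sketch. This is not the authors' implementation: the greedy left-to-right order, the function names `restore_original_words` and `perturbation_rate`, and the keyword-based `toy_predict_label` classifier are all assumptions made for illustration. The only property taken from the abstract is that the attacker sees nothing but the predicted label.

```python
from typing import Callable, List


def restore_original_words(
    original: List[str],
    adversarial: List[str],
    predict_label: Callable[[List[str]], int],
    true_label: int,
) -> List[str]:
    """Greedily put original words back while the text stays adversarial.

    Assumption: a simple left-to-right greedy order; the paper may choose
    the substitution order differently.
    """
    if len(original) != len(adversarial):
        raise ValueError("sketch assumes word-level substitutions only")
    current = list(adversarial)
    changed = True
    while changed:
        changed = False
        for i in range(len(original)):
            if current[i] == original[i]:
                continue
            candidate = current[:i] + [original[i]] + current[i + 1:]
            # One hard-label query: keep the restoration only if the
            # model is still fooled (prediction differs from true label).
            if predict_label(candidate) != true_label:
                current = candidate
                changed = True
    return current


def perturbation_rate(original: List[str], adversarial: List[str]) -> float:
    """Fraction of word positions that differ from the original text."""
    differing = sum(o != a for o, a in zip(original, adversarial))
    return differing / len(original)


def toy_predict_label(tokens: List[str]) -> int:
    """Hypothetical hard-label classifier: 1 if 'good' appears, else 0."""
    return 1 if "good" in tokens else 0


original = ["the", "movie", "was", "good"]
adversarial = ["a", "film", "seemed", "fine"]  # stand-in for a random initialization
result = restore_original_words(original, adversarial, toy_predict_label, true_label=1)
print(result, perturbation_rate(original, result))
```

In this toy setting only the word that the classifier depends on stays substituted, so the perturbation rate drops from 1.0 to 0.25 while the prediction remains wrong. The later synonym-based optimization stage is not covered by this sketch.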
Pages: 12