Adversarial Attacks on Language Models: WordPiece Filtration and ChatGPT Synonyms

Authors
T. Ter-Hovhannisyan [1]
H. Aleksanyan [1]
K. Avetisyan [1]
Affiliations
[1] Russian-Armenian University
[2] ISP RAS
DOI
10.1007/s10958-024-07427-z
Abstract
Adversarial attacks on text have gained significant attention in recent years because of their potential to undermine the reliability of NLP models. We present novel black-box character-level and word-level approaches to adversarial example generation that are applicable to BERT-based models. The character-level approach adds natural typos to a word according to its WordPiece tokenization. At the word level, we present three techniques that use synonymous substitute words generated by ChatGPT and post-corrected into the appropriate grammatical form for the given context. Additionally, we try to minimize the perturbation rate by taking into account the damage that each perturbation does to the model. By combining the character-level approaches, the word-level approaches, and the perturbation rate minimization technique, we achieve a state-of-the-art attack rate. Our best approach works 30–65% faster than the previously best method, Tampers, and has a comparable perturbation rate. At the same time, the proposed perturbations retain the semantic similarity between the original and adversarial examples and achieve a relatively low Levenshtein distance.
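The abstract does not spell out how typos are filtered against the WordPiece tokenization, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes a tiny hypothetical vocabulary (`VOCAB`), uses adjacent-character swaps as the only typo type, and keeps a typo only if it splits the word into more word pieces without collapsing to `[UNK]`. The helper names (`wordpiece`, `swap_typos`, `filtered_typos`, `levenshtein`, `perturbation_rate`) are illustrative and do not come from the paper.

```python
from typing import List, Set

# Hypothetical toy vocabulary (assumption); a real BERT vocabulary is far larger.
VOCAB: Set[str] = {"good", "go", "##od", "##o", "##d", "movie", "a"}


def wordpiece(word: str, vocab: Set[str] = VOCAB) -> List[str]:
    """Greedy longest-match-first WordPiece tokenization of a single word."""
    pieces: List[str] = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation-piece prefix
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return ["[UNK]"]  # no piece matches: the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces


def swap_typos(word: str) -> List[str]:
    """Variants obtained by swapping two adjacent, different characters."""
    return [
        word[:i] + word[i + 1] + word[i] + word[i + 2:]
        for i in range(len(word) - 1)
        if word[i] != word[i + 1]
    ]


def filtered_typos(word: str) -> List[str]:
    """Assumed filter: keep typos that yield more pieces and are not [UNK]."""
    base = wordpiece(word)
    kept = []
    for typo in swap_typos(word):
        pieces = wordpiece(typo)
        if pieces != ["[UNK]"] and len(pieces) > len(base):
            kept.append(typo)
    return kept


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion, and substitution."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def perturbation_rate(original: List[str], adversarial: List[str]) -> float:
    """Fraction of word positions changed (assumes equal-length word lists)."""
    changed = sum(o != a for o, a in zip(original, adversarial))
    return changed / len(original)
```

With this toy vocabulary, `filtered_typos("good")` keeps only `"godo"`, which tokenizes as `go`, `##d`, `##o`, while the swap `"ogod"` is discarded because it maps to `[UNK]`. A real attack would query the victim model to rank candidates; that step is omitted here.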
Pages: 210-220
Number of pages: 10