Analyzing Robustness of Automatic Scientific Claim Verification Tools against Adversarial Rephrasing Attacks

Cited by: 0
Authors
Layne, Janet [1 ]
Ratul, Qudrat e. alahy [1 ]
Serra, Edoardo [1 ]
Jajodia, Sushil [2 ]
Affiliations
[1] Boise State Univ, Dept Comp Sci, Boise, ID USA
[2] George Mason Univ, Ctr Secure Informat Syst, Fairfax, VA 22030 USA
Funding
U.S. National Science Foundation
Keywords
Neural networks; adversarial attack; scientific claim verification
DOI
10.1145/3663481
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification
081104; 0812; 0835; 1405
Abstract
The coronavirus pandemic has fostered an explosion of misinformation about the disease, including the risk and effectiveness of vaccination. AI tools for automatic Scientific Claim Verification (SCV) can be crucial to defeating misinformation campaigns that spread through social media channels. However, in recent years, many concerns have been raised about the robustness of AI to adversarial attacks, and the field of automatic SCV is not exempt. The risk is that such SCV tools may reinforce and legitimize the spread of fake scientific claims rather than refute them. This article investigates the problem of generating adversarial attacks for SCV tools and shows that it is far more difficult than the generic NLP adversarial attack problem. Current NLP adversarial attack generators, when applied to SCV, often generate modified claims whose meaning differs entirely from the original. Even when the meaning is preserved, the modification is too simplistic (only a single word is changed), leaving many weaknesses of the SCV tools undiscovered. We propose T5-ParEvo, an iterative evolutionary attack generator that produces more complex and creative attacks while better preserving the semantics of the original claim. Through detailed quantitative and qualitative analyses, we demonstrate the efficacy of T5-ParEvo in comparison with existing attack generators.
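For intuition, the sketch below implements a generic version of the iterative evolutionary rephrasing loop the abstract describes: paraphrase the claim, filter candidates by semantic similarity to the original, and select the rephrasings the verifier handles worst. The model names ("t5-base", "victim-scv-model", "all-MiniLM-L6-v2"), the similarity threshold, and the fitness function are illustrative assumptions, not the authors' actual T5-ParEvo implementation.

# Minimal sketch of an iterative evolutionary rephrasing attack against an
# SCV classifier. All model names and hyperparameters are placeholders.
from sentence_transformers import SentenceTransformer, util
from transformers import pipeline

# ASSUMPTIONS: "t5-base" stands in for a fine-tuned paraphraser, and
# "victim-scv-model" is a hypothetical placeholder for the SCV tool under attack.
paraphraser = pipeline("text2text-generation", model="t5-base")
verifier = pipeline("text-classification", model="victim-scv-model")
encoder = SentenceTransformer("all-MiniLM-L6-v2")  # semantic-preservation filter

def evolve_attack(claim, generations=5, beam=8, sim_min=0.85):
    orig_label = verifier(claim)[0]["label"]
    orig_emb = encoder.encode(claim, convert_to_tensor=True)
    population = [claim]
    for _ in range(generations):
        # Mutate: paraphrase every member of the current population.
        candidates = []
        for parent in population:
            outs = paraphraser("paraphrase: " + parent,
                               num_beams=beam, num_return_sequences=beam,
                               max_length=64)
            candidates.extend(o["generated_text"] for o in outs)
        # Constrain: keep only rephrasings that preserve the claim's meaning.
        embs = encoder.encode(candidates, convert_to_tensor=True)
        sims = util.cos_sim(orig_emb, embs)[0]
        kept = [c for c, s in zip(candidates, sims) if s >= sim_min]
        if not kept:
            break
        # Select: fitness is how unsure the verifier is (an illustrative proxy).
        kept.sort(key=lambda c: verifier(c)[0]["score"])
        best = kept[0]
        if verifier(best)[0]["label"] != orig_label:
            return best  # label flipped while meaning preserved: attack found
        population = kept[:beam]  # survivors seed the next generation
    return None  # no successful attack within the budget

The evolutionary selection is what distinguishes this scheme from single-shot word-substitution attacks: surviving rephrasings compound over generations, so the final attack can differ from the original claim in more than one word while still passing the similarity filter.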
Pages: 32