Analyzing Robustness of Automatic Scientific Claim Verification Tools against Adversarial Rephrasing Attacks

Cited by: 0
Authors
Layne, Janet [1 ]
Ratul, Qudrat e. alahy [1 ]
Serra, Edoardo [1 ]
Jajodia, Sushil [2 ]
Affiliations
[1] Boise State Univ, Dept Comp Sci, Boise, ID USA
[2] George Mason Univ, Ctr Secure Informat Syst, Fairfax, VA 22030 USA
Funding
U.S. National Science Foundation
Keywords
Neural networks; adversarial attack; scientific claim verification
DOI
10.1145/3663481
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification
081104; 0812; 0835; 1405
Abstract
The coronavirus pandemic has fostered an explosion of misinformation about the disease, including the risk and effectiveness of vaccination. AI tools for automatic Scientific Claim Verification (SCV) can be crucial to defeating misinformation campaigns that spread through social media channels. However, in recent years, many concerns have been raised about the robustness of AI to adversarial attacks, and the field of automatic SCV is not exempt. The risk is that such SCV tools may reinforce and legitimize the spread of fake scientific claims rather than refute them. This article investigates the problem of generating adversarial attacks for SCV tools and shows that it is far more difficult than the generic NLP adversarial attack problem. Current NLP adversarial attack generators, when applied to SCV, often generate modified claims whose meaning differs entirely from the original. Even when the meaning is preserved, the modification is too simplistic (only a single word is changed), leaving many weaknesses of the SCV tools undiscovered. We propose T5-ParEvo, an iterative evolutionary attack generator that produces more complex and creative attacks while better preserving the semantics of the original claim. Through detailed quantitative and qualitative analyses, we demonstrate the efficacy of T5-ParEvo in comparison with existing attack generators.
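For intuition, the sketch below implements a generic version of the iterative evolutionary rephrasing loop the abstract describes: paraphrase the claim, filter candidates by semantic similarity to the original, and select the rephrasings the verifier handles worst. The model names ("t5-base", "victim-scv-model", "all-MiniLM-L6-v2"), the similarity threshold, and the fitness function are illustrative assumptions, not the authors' actual T5-ParEvo implementation.

# Minimal sketch of an iterative evolutionary rephrasing attack against an
# SCV classifier. All model names and hyperparameters are placeholders.
from sentence_transformers import SentenceTransformer, util
from transformers import pipeline

# ASSUMPTIONS: "t5-base" stands in for a fine-tuned paraphraser, and
# "victim-scv-model" is a hypothetical placeholder for the SCV tool under attack.
paraphraser = pipeline("text2text-generation", model="t5-base")
verifier = pipeline("text-classification", model="victim-scv-model")
encoder = SentenceTransformer("all-MiniLM-L6-v2")  # semantic-preservation filter

def evolve_attack(claim, generations=5, beam=8, sim_min=0.85):
    orig_label = verifier(claim)[0]["label"]
    orig_emb = encoder.encode(claim, convert_to_tensor=True)
    population = [claim]
    for _ in range(generations):
        # Mutate: paraphrase every member of the current population.
        candidates = []
        for parent in population:
            outs = paraphraser("paraphrase: " + parent,
                               num_beams=beam, num_return_sequences=beam,
                               max_length=64)
            candidates.extend(o["generated_text"] for o in outs)
        # Constrain: keep only rephrasings that preserve the claim's meaning.
        embs = encoder.encode(candidates, convert_to_tensor=True)
        sims = util.cos_sim(orig_emb, embs)[0]
        kept = [c for c, s in zip(candidates, sims) if s >= sim_min]
        if not kept:
            break
        # Select: fitness is how unsure the verifier is (an illustrative proxy).
        kept.sort(key=lambda c: verifier(c)[0]["score"])
        best = kept[0]
        if verifier(best)[0]["label"] != orig_label:
            return best  # label flipped while meaning preserved: attack found
        population = kept[:beam]  # survivors seed the next generation
    return None  # no successful attack within the budget

The evolutionary selection is what distinguishes this scheme from single-shot word-substitution attacks: surviving rephrasings compound over generations, so the final attack can differ from the original claim in more than one word while still passing the similarity filter.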
Pages: 32