Fake News in Sheep's Clothing: Robust Fake News Detection Against LLM-Empowered Style Attacks

Cited by: 12
Authors
Wu, Jiaying [1]
Guo, Jiafeng [2]
Hooi, Bryan [1]
Affiliations
[1] Natl Univ Singapore, Singapore, Singapore
[2] Univ Chinese Acad Sci, Inst Comp Technol CAS, Beijing, Peoples R China
Source
PROCEEDINGS OF THE 30TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2024 | 2024
Funding
National Research Foundation, Singapore
Keywords
Fake News; Large Language Models; Adversarial Robustness
DOI
10.1145/3637528.3671977
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
It is commonly perceived that fake news and real news exhibit distinct writing styles, such as the use of sensationalist versus objective language. However, we emphasize that style-related features can also be exploited for style-based attacks. Notably, the advent of powerful Large Language Models (LLMs) has empowered malicious actors to mimic the style of trustworthy news sources, doing so swiftly, cost-effectively, and at scale. Our analysis reveals that LLM-camouflaged fake news content significantly undermines the effectiveness of state-of-the-art text-based detectors (up to 38% decrease in F1 Score), implying a severe vulnerability to stylistic variations. To address this, we introduce SheepDog, a style-robust fake news detector that prioritizes content over style in determining news veracity. SheepDog achieves this resilience through (1) LLM-empowered news reframings that inject style diversity into the training process by customizing articles to match different styles; (2) a style-agnostic training scheme that ensures consistent veracity predictions across style-diverse reframings; and (3) content-focused veracity attributions that distill content-centric guidelines from LLMs for debunking fake news, offering supplementary cues and potential interpretability that assist veracity prediction. Extensive experiments on three real-world benchmarks demonstrate SheepDog's style robustness and adaptability to various backbones.
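The style-agnostic training idea in point (2) of the abstract can be illustrated with a small sketch. This is not the paper's implementation: the abstract only says that predictions should stay consistent across reframings, so the use of a KL-divergence consistency term, the function names, and the `weight` parameter are all assumptions made for illustration. The logits stand in for the output of any text classifier applied to an article and to its style-reframed versions.

```python
import math

def softmax(logits):
    """Convert raw classifier scores into probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def cross_entropy(probs, label, eps=1e-12):
    """Negative log-likelihood of the true veracity label."""
    return -math.log(probs[label] + eps)

def style_agnostic_loss(original_logits, reframed_logits_list, label, weight=1.0):
    """Hypothetical objective: supervised loss on the original article plus a
    penalty when predictions on style-reframed versions drift away from it.
    The KL-based penalty and `weight` are illustrative assumptions."""
    if not reframed_logits_list:
        raise ValueError("at least one reframing is required")
    p_orig = softmax(original_logits)
    supervised = cross_entropy(p_orig, label)
    consistency = sum(
        kl_divergence(p_orig, softmax(r)) for r in reframed_logits_list
    ) / len(reframed_logits_list)
    return supervised + weight * consistency

# Label 0 = real, 1 = fake (an arbitrary convention for this sketch).
same = style_agnostic_loss([2.0, 0.0], [[2.0, 0.0]], label=0)
drifted = style_agnostic_loss([2.0, 0.0], [[0.0, 2.0]], label=0)
```

In this sketch, a reframing that flips the prediction (`drifted`) incurs a larger loss than one that leaves it unchanged (`same`), which is the behaviour a style-robust detector would be trained toward.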
Pages: 3367-3378
Page count: 12