Fake News in Sheep's Clothing: Robust Fake News Detection Against LLM-Empowered Style Attacks

Cited by: 12

Authors:

Wu, Jiaying [1]

Guo, Jiafeng [2]

Hooi, Bryan [1]

Affiliations:

[1] Natl Univ Singapore, Singapore, Singapore

[2] Univ Chinese Acad Sci, Inst Comp Technol CAS, Beijing, Peoples R China

Source:

PROCEEDINGS OF THE 30TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2024 | 2024

Funding:

National Research Foundation, Singapore;

Keywords:

Fake News; Large Language Models; Adversarial Robustness;

DOI:

10.1145/3637528.3671977

Chinese Library Classification (CLC):

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

It is commonly perceived that fake news and real news exhibit distinct writing styles, such as the use of sensationalist versus objective language. However, we emphasize that style-related features can also be exploited for style-based attacks. Notably, the advent of powerful Large Language Models (LLMs) has empowered malicious actors to mimic the style of trustworthy news sources, doing so swiftly, cost-effectively, and at scale. Our analysis reveals that LLM-camouflaged fake news content significantly undermines the effectiveness of state-of-the-art text-based detectors (up to 38% decrease in F1 score), implying a severe vulnerability to stylistic variations. To address this, we introduce SheepDog, a style-robust fake news detector that prioritizes content over style in determining news veracity. SheepDog achieves this resilience through (1) LLM-empowered news reframings that inject style diversity into the training process by customizing articles to match different styles; (2) a style-agnostic training scheme that ensures consistent veracity predictions across style-diverse reframings; and (3) content-focused veracity attributions that distill content-centric guidelines from LLMs for debunking fake news, offering supplementary cues and potential interpretability that assist veracity prediction. Extensive experiments on three real-world benchmarks demonstrate SheepDog's style robustness and adaptability to various backbones.
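
The abstract does not state the exact training objective, so the following is only an illustrative sketch of what "consistent veracity predictions across style-diverse reframings" could look like in code. It assumes a two-class detector that outputs logits, and it combines a supervised cross-entropy term with a KL-divergence consistency term between the prediction on the original article and the predictions on its reframings. The function names, the choice of KL divergence, and the `weight` parameter are all assumptions made for illustration, not the published SheepDog loss.

```python
import math


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def cross_entropy(probs, label, eps=1e-12):
    """Negative log-likelihood of the true class."""
    return -math.log(probs[label] + eps)


def style_agnostic_loss(original_logits, reframed_logits_list, label, weight=1.0):
    """Hypothetical objective: supervised loss on the original article plus
    a penalty when predictions on style-reframed versions drift away from it.
    This is an assumed formulation, not the authors' stated loss."""
    p_orig = softmax(original_logits)
    supervised = cross_entropy(p_orig, label)
    if not reframed_logits_list:
        return supervised
    consistency = sum(
        kl_divergence(p_orig, softmax(r)) for r in reframed_logits_list
    ) / len(reframed_logits_list)
    return supervised + weight * consistency


# Logits for one article (class 0 = real, class 1 = fake; an assumed convention).
stable = style_agnostic_loss([2.0, -1.0], [[2.0, -1.0], [2.0, -1.0]], label=0)
unstable = style_agnostic_loss([2.0, -1.0], [[-1.0, 2.0]], label=0)
print(stable < unstable)
```

Under this sketch, a detector whose prediction flips when only the writing style changes incurs a larger loss than one whose prediction stays the same, which is the behavior a style-agnostic scheme is meant to encourage.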


Pages: 3367-3378

Page count: 12

