RumorLLM: A Rumor Large Language Model-Based Fake-News-Detection Data-Augmentation Approach

Cited by: 13

Authors:

Lai, Jianqiao [1]

Yang, Xinran [1]

Luo, Wenyue [1]

Zhou, Linjiang [1]

Li, Langchen [1]

Wang, Yongqi [1]

Shi, Xiaochuan [1]

Affiliations:

[1] Wuhan University, School of Cyber Science and Engineering, Wuhan 430072, People's Republic of China

Source:

APPLIED SCIENCES-BASEL | 2024 / Volume 14 / Issue 08

Keywords:

fake-news detection; large language models; rumor generation; category imbalance; data augmentation;

DOI:

10.3390/app14083532

Chinese Library Classification (CLC) number:

O6 [Chemistry];

Discipline classification code:

0703 ;

Abstract:

With the rapid development of the Internet and social media, false information, rumors, and misleading content have become pervasive, posing significant threats to public opinion and social stability, and even causing serious societal harm. This paper introduces a novel solution to address the challenges of fake-news detection, presenting the "Rumor Large Language Model" (RumorLLM), a large language model fine-tuned on rumor writing styles and content. The key contributions include the development of RumorLLM and a data-augmentation method for small categories, effectively mitigating the issue of category imbalance in real-world fake-news datasets. Experimental results on the BuzzFeed and PolitiFact datasets demonstrate the superiority of the proposed model over baseline methods, particularly in F1 score and AUC-ROC. The model's robust performance highlights its effectiveness in handling imbalanced datasets and provides a promising solution to the pressing issue of false-information proliferation.


Pages: 16

References: 45 in total

