Limitations of Large Language Models in Propaganda Detection Task

Cited by: 0

Authors:

Szwoch, Joanna [1]

Staszkow, Mateusz [2]

Rzepka, Rafal [3]

Araki, Kenji [3]

Affiliations:

[1] Hokkaido Univ, Grad Sch Informat Sci & Technol, Sapporo 0600808, Japan

[2] Mateusz Staszkow Software Dev, PL-01234 Warsaw, Poland

[3] Hokkaido Univ, Fac Informat Sci & Technol, Sapporo 0600808, Japan

Source:

APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 10

Keywords:

propaganda detection; media bias; online news analysis; propaganda in online news; propaganda techniques; FAKE NEWS; MEDIA;

DOI:

10.3390/app14104330

Chinese Library Classification (CLC):

O6 [Chemistry];

Discipline classification code:

0703 ;

Abstract:

Propaganda in the digital era is often associated with online news. In this study, we focused on the use of large language models for detecting propaganda techniques in the electronic press, to investigate whether they are a noteworthy replacement for human annotators. We prepared prompts for generative pre-trained transformer models to find spans in news articles where propaganda techniques appear and to name them. Our study was divided into three experiments on different datasets: two based on the annotated SemEval2020 Task 11 corpus and one on an unannotated subset of the Polish Online News Corpus, which we claim to be an even bigger challenge as an example of an under-resourced language. Reproduction of the first experiment yielded a recall of 64.53%, higher than in the original run, and the highest precision of 81.82% was achieved for gpt-4-1106-preview CoT. None of our attempts outperformed the baseline F1 score. One of the attempts with gpt-4-0125-preview on the original SemEval2020 Task 11 achieved an F1 score of almost 20%, but it was below the baseline, which oscillated around 50%. The part of our work dedicated to Polish articles showed that gpt-4-0125-preview had a 74% accuracy in the binary detection of propaganda techniques and 69% in propaganda technique classification. The results for SemEval2020 show that the outputs of generative models tend to be unpredictable and are hardly reproducible for propaganda detection. For the time being, these are unreliable methods for this task, but we believe they can help to generate more training data.


Pages: 22

