Limitations of Large Language Models in Propaganda Detection Task

Times cited: 0
Authors
Szwoch, Joanna [1]
Staszkow, Mateusz [2]
Rzepka, Rafal [3]
Araki, Kenji [3]
Affiliations
[1] Hokkaido Univ, Grad Sch Informat Sci & Technol, Sapporo 0600808, Japan
[2] Mateusz Staszkow Software Dev, PL-01234 Warsaw, Poland
[3] Hokkaido Univ, Fac Informat Sci & Technol, Sapporo 0600808, Japan
Source
APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 10
Keywords
propaganda detection; media bias; online news analysis; propaganda in online news; propaganda techniques; FAKE NEWS; MEDIA;
DOI
10.3390/app14104330
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Propaganda in the digital era is often associated with online news. In this study, we focused on large language models and their ability to detect propaganda techniques in the electronic press, to investigate whether they are a viable replacement for human annotators. We prepared prompts for generative pre-trained transformer models to find spans in news articles where propaganda techniques appear and to name those techniques. Our study was divided into three experiments on different datasets: two based on the annotated SemEval2020 Task 11 corpus and one on an unannotated subset of the Polish Online News Corpus, which we claim to be an even bigger challenge as an example of an under-resourced language. Reproducing the first experiment yielded a recall of 64.53%, higher than in the original run, and the highest precision, 81.82%, was achieved by gpt-4-1106-preview CoT. None of our attempts outperformed the baseline F1 score. One of the attempts with gpt-4-0125-preview on the original SemEval2020 Task 11 dataset achieved an F1 score of almost 20%, which was below the baseline of around 50%. The part of our work dedicated to Polish articles showed that gpt-4-0125-preview had 74% accuracy in the binary detection of propaganda techniques and 69% in propaganda technique classification. The results for SemEval2020 show that the outputs of generative models tend to be unpredictable and are hardly reproducible for propaganda detection. For the time being, these methods are unreliable for this task, but we believe they can help to generate more training data.
Pages: 22