Enhancing Misinformation Detection in Spanish Language with Deep Learning: BERT and RoBERTa Transformer Models

Cited by: 0
Authors
Blanco-Fernandez, Yolanda [1]
Otero-Vizoso, Javier [2]
Gil-Solla, Alberto [1]
Garcia-Duque, Jorge [2]
Affiliations
[1] Univ Vigo, AtlanTTic Res Ctr Telecommun Technol, Vigo 36310, Spain
[2] Univ Vigo, Escuela Ingn Telecomunicac, Vigo, Spain
Source
APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / No. 21
Keywords
fake news; Spanish; curated synthetic dataset; fine-tuning; Transformer-based models; BERT; RoBERTa
DOI
10.3390/app14219729
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
This paper presents an approach to identifying political fake news in Spanish using Transformer architectures. Current methodologies often overlook political news due to the lack of quality datasets, especially in Spanish. To address this, we created a synthetic dataset of 57,231 Spanish political news articles, gathered via automated web scraping and enhanced with generative large language models. This dataset is used for fine-tuning and benchmarking Transformer models like BERT and RoBERTa for fake news detection. Our fine-tuned models showed outstanding performance on this dataset, with accuracy ranging from 97.4% to 98.6%. However, testing with a smaller, independent hand-curated dataset, including statements from political leaders during Spain's July 2023 electoral debates, revealed a performance drop to 71%. Although this suggests that the model needs additional refinements to handle the complexity and variability of real-world political discourse, achieving over 70% accuracy seems a promising result in the under-explored domain of Spanish political fake news detection.
Pages: 27