Enhancing Misinformation Detection in Spanish Language with Deep Learning: BERT and RoBERTa Transformer Models

Authors
Blanco-Fernandez, Yolanda [1]
Otero-Vizoso, Javier [2]
Gil-Solla, Alberto [1]
Garcia-Duque, Jorge [2]
Affiliations
[1] Univ Vigo, AtlanTTic Res Ctr Telecommun Technol, Vigo 36310, Spain
[2] Univ Vigo, Escuela Ingn Telecomunicac, Vigo, Spain
Source
APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 21
Keywords
fake news; Spanish; curated synthetic dataset; fine-tuning; Transformer-based models; BERT; RoBERTa
DOI
10.3390/app14219729
Chinese Library Classification
O6 [Chemistry];
Subject classification code
0703;
Abstract
This paper presents an approach to identifying political fake news in Spanish using Transformer architectures. Current methodologies often overlook political news due to the lack of quality datasets, especially in Spanish. To address this, we created a synthetic dataset of 57,231 Spanish political news articles, gathered via automated web scraping and enhanced with generative large language models. This dataset is used for fine-tuning and benchmarking Transformer models like BERT and RoBERTa for fake news detection. Our fine-tuned models showed outstanding performance on this dataset, with accuracy ranging from 97.4% to 98.6%. However, testing with a smaller, independent hand-curated dataset, including statements from political leaders during Spain's July 2023 electoral debates, revealed a performance drop to 71%. Although this suggests that the model needs additional refinements to handle the complexity and variability of real-world political discourse, achieving over 70% accuracy seems a promising result in the under-explored domain of Spanish political fake news detection.
Pages: 27