Improvement of Misleading and Fake News Classification for Flective Languages by Morphological Group Analysis

Cited by: 17
Authors
Kapusta, Jozef [1 ]
Obonya, Juraj [1 ]
Affiliations
[1] Constantine Philosopher Univ Nitra, Dept Informat, SK-94974 Nitra, Slovakia
Source
INFORMATICS-BASEL | 2020 / Vol. 7 / Issue 01
Keywords
fake news identification; text mining; natural language processing; part-of-speech tagging; morphological analysis
DOI
10.3390/informatics7010004
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
The constant evolution of social media and the growing variety of information sources expose us to many kinds of fake news and misinformation. We are currently working on a project to identify methods suitable for detecting fake news in flective languages. In the presented research, we explored different approaches to fake news detection based on morphological analysis, one of the basic components of natural language processing. The aim of the article is to find out whether dataset preparation methods can be improved by morphological analysis. We collected our own unique dataset, consisting of articles from verified publishers and articles from news portals known to publish fake and misleading news. The articles were in Slovak, a flective language. We explored different approaches to dataset preparation based on morphological analysis. The prepared datasets served as input for building a classifier of fake and real news, for which we selected decision trees. The two preparation methods were evaluated by the success of the resulting classifier. Using morphological group analysis, we found a suitable dataset pre-processing technique that could be used to improve fake news classification.
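The pipeline outlined in the abstract (morphological tags, then a prepared dataset, then a decision tree) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the group names in GROUPS, the helper functions, and the toy articles are all hypothetical, the articles are assumed to be already tagged by some morphological tagger, and a depth-1 tree (a decision stump) stands in for a full decision tree.

```python
from collections import Counter

# Hypothetical coarse morphological groups (an assumption, not the paper's tag set).
GROUPS = ["NOUN", "VERB", "ADJ", "ADV", "PRON", "OTHER"]


def group_profile(tags):
    """Relative frequency of each morphological group in one tagged article."""
    counts = Counter(t if t in GROUPS else "OTHER" for t in tags)
    total = sum(counts.values()) or 1
    return [counts[g] / total for g in GROUPS]


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def majority(labels):
    """Most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(X, y):
    """Fit a depth-1 decision tree by minimising the weighted Gini impurity."""
    best = None
    for f in range(len(X[0])):
        values = sorted({row[f] for row in X})
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [lab for row, lab in zip(X, y) if row[f] <= thr]
            right = [lab for row, lab in zip(X, y) if row[f] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, thr, majority(left), majority(right))
    if best is None:
        raise ValueError("no split possible: all feature vectors are identical")
    _, f, thr, left_label, right_label = best
    return f, thr, left_label, right_label


def predict(stump, row):
    """Classify one feature vector with a fitted stump."""
    f, thr, left_label, right_label = stump
    return left_label if row[f] <= thr else right_label


# Toy, invented data for illustration only; it says nothing about real news.
tagged_articles = [
    (["NOUN", "VERB", "NOUN", "ADJ"], "real"),
    (["NOUN", "NOUN", "VERB", "NOUN"], "real"),
    (["ADJ", "ADJ", "ADV", "NOUN"], "fake"),
    (["ADV", "ADJ", "ADJ", "VERB"], "fake"),
]
X = [group_profile(tags) for tags, _ in tagged_articles]
y = [label for _, label in tagged_articles]
stump = fit_stump(X, y)
```

Calling predict(stump, group_profile(tags)) then classifies a newly tagged article. In practice a library decision-tree learner and a real morphological tagger for Slovak would replace these hand-written pieces; the sketch only shows how the share of each morphological group can become a numeric feature for a tree.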
Pages: 10
Related papers
18 in total
  • [1] Not all Fake News is Written: A Dataset and Analysis of Misleading Video Headlines
    Sung, Yoo Yeon
    Boyd-Graber, Jordan
    Hassan, Naeemul
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023
  • [2] An Arabic Corpus of Fake News: Collection, Analysis and Classification
    Alkhair, Maysoon
    Meftouh, Karima
    Smaili, Kamel
    Othman, Nouha
    ARABIC LANGUAGE PROCESSING: FROM THEORY TO PRACTICE, ICALP 2019, 2019, 1108 : 292 - 302
  • [3] Using of n-grams from morphological tags for fake news classification
    Kapusta, Jozef
    Drlik, Martin
    Munk, Michal
    PEERJ COMPUTER SCIENCE, 2021, 7
  • [4] Using of n-grams from morphological tags for fake news classification
    Kapusta J.
    Drlik M.
    Munk M.
    PeerJ Computer Science, 2021, 7 : 1 - 27
  • [5] Analysis and Classification of Fake News Using Sequential Pattern Mining
    Nawaz, M. Zohaib
    Nawaz, M. Saqib
    Fournier-Viger, Philippe
    He, Yulin
    BIG DATA MINING AND ANALYTICS, 2024, 7 (03) : 942 - 963
  • [6] Fake News Classification Using Vectorized Semantic and Syntactical Analysis
    Kumar, Sanjay
    Dhingra, Payas
    Jaiswal, Pushkar
    Bharti, Rohit
    ADVANCES IN DATA AND INFORMATION SCIENCES, 2022, 318 : 539 - 550
  • [7] Fake News as Discursive Integration: An Analysis of Sites That Publish False, Misleading, Hyperpartisan and Sensational Information
    Mourao, Rachel R.
    Robertson, Craig T.
    JOURNALISM STUDIES, 2019, 20 (14) : 2077 - 2095
  • [8] Feature extraction from unstructured texts as a combination of the morphological and the syntactic analysis and its usage in fake news classification tasks
    Kitti Szabó Nagy
    Jozef Kapusta
    Michal Munk
    Neural Computing and Applications, 2023, 35 : 22055 - 22067
  • [9] Feature extraction from unstructured texts as a combination of the morphological and the syntactic analysis and its usage in fake news classification tasks
    Szabo Nagy, Kitti
    Kapusta, Jozef
    Munk, Michal
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (29) : 22055 - 22067
  • [10] Analysis of Fake News Classification for Insight into the Roles of Different Data Types
    Ferreira, Victor C.
    Kundu, Sandip
    Franca, Felipe M. G.
    16TH IEEE INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC 2022), 2022 : 75 - 82