Event Graph-Based News Clustering: The Role of Named Entity-Centered Subgraphs

Authors
Komecoglu, Basak Buluz [1]
Yilmaz, Burcu [1]
Affiliations
[1] Gebze Tech Univ, Inst Informat Technol, TR-41400 Gebze, Kocaeli, Turkiye
Source
IEEE ACCESS | 2024 / Volume 12
Keywords
Task analysis; Clustering algorithms; Vectors; Context modeling; Computational modeling; Analytical models; Semantics; Natural language processing; Text processing; Frequent subgraph mining; low-resource language; text clustering; TOPIC DETECTION; SIMILARITY
DOI
10.1109/ACCESS.2024.3435343
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In an era of exponential growth in online news sources, intelligent digital solutions that can efficiently analyze and organize large volumes of news content have become crucial. This paper presents a graph-based methodology designed to enhance Topic Detection and Tracking (TDT) tasks in natural language processing by efficiently clustering news events into coherent stories. The proposed approach leverages a novel event graph model that captures not only the characteristics of individual news events but also their collective narrative context. Using Named Entity-Centered Frequent Subgraphs, the model identifies recurring patterns of events and thus provides a framework for learning a robust, language-independent, structured representation of news stories, which represents a significant advance in the refinement of traditional clustering algorithms. Empirical experiments on a multilingual benchmark dataset, the News Clustering Dataset, show that the approach outperforms state-of-the-art monolingual document clustering techniques in English and achieves competitive results in Spanish. To demonstrate the adaptability of the methodology to low-resource languages, a Turkish 'Story-Based News Dataset' was developed specifically for this study; it also promises to serve as an important resource for a wide range of natural language processing tasks.
Pages: 105613 - 105632
Page count: 20