Formal concept analysis for topic detection: A clustering quality experimental analysis

Cited by: 36
Authors
Castellanos, A. [1]
Cigarran, J. [1]
Garcia-Serrano, A. [1]
Affiliation
[1] ETSI Informt UNED, C Juan Rosal 16, Madrid, Spain
Keywords
Formal concept analysis; Topic detection; Clustering quality analysis; Hierarchical agglomerative clustering; Latent Dirichlet allocation; STABILITY; MODELS
DOI
10.1016/j.is.2017.01.008
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The Topic Detection task focuses on discovering the main topics addressed by a series of documents (e.g., news reports, e-mails, tweets). Topics, defined in this way, are expected to be thematically similar, cohesive and self-contained. This task has been broadly studied from the point of view of clustering and probabilistic techniques. In this work, we propose for this task the application of Formal Concept Analysis (FCA), an exploratory technique for data analysis and organization. In particular, we propose an extension of the FCA-based methods for topic detection applied in the literature, using the stability concept for topic selection. The hypothesis is that FCA will enable a better organization of the data, and stability a better selection of topics based on this organization, thus better fulfilling the task requirements by improving the quality and accuracy of the topic detection process. In addition, the proposed FCA-based methodology is able to cope with some well-known drawbacks of clustering and probabilistic methodologies, such as the need to set a predefined number of clusters or the difficulty in dealing with topics with complex generalization-specialization relationships. To test this hypothesis, the FCA operation is compared to two established techniques: Hierarchical Agglomerative Clustering (HAC) and Latent Dirichlet Allocation (LDA). To allow this comparison, these approaches have been implemented by the authors in a novel experimental framework. The quality of the topics detected by the different approaches, in terms of their suitability for the topic detection task, is evaluated by means of internal clustering validity metrics. This evaluation demonstrates that FCA generates cohesive clusters, which are less subject to changes in cluster granularity. Driven by the quality of the detected topics, FCA achieves the best general outcome, improving the experimental results for the Topic Detection Task at the RepLab 2013 campaign. © 2017 Elsevier Ltd. All rights reserved.
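The two ideas the abstract combines, formal concepts and concept stability, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give their stability formula or selection threshold, so the sketch assumes the commonly used intensional stability (the fraction of subsets of a concept's extent that still yield the same intent). The toy document-term context and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy context: documents (objects) mapped to their terms (attributes).
CONTEXT = {
    "d1": {"bank", "loan"},
    "d2": {"bank", "loan", "rate"},
    "d3": {"bank", "rate"},
    "d4": {"football"},
}
ALL_TERMS = frozenset().union(*CONTEXT.values())


def common_terms(docs):
    """Terms shared by every document in `docs` (all terms if `docs` is empty)."""
    shared = set(ALL_TERMS)
    for doc in docs:
        shared &= CONTEXT[doc]
    return frozenset(shared)


def matching_docs(terms):
    """Documents that contain every term in `terms`."""
    return frozenset(d for d, doc_terms in CONTEXT.items() if terms <= doc_terms)


def formal_concepts():
    """Enumerate all (extent, intent) pairs by closing every subset of documents.

    Brute force and exponential in the number of documents; only for toy data.
    """
    docs = sorted(CONTEXT)
    concepts = set()
    for size in range(len(docs) + 1):
        for subset in combinations(docs, size):
            intent = common_terms(subset)
            concepts.add((matching_docs(intent), intent))
    return concepts


def stability(extent, intent):
    """Assumed intensional stability: share of subsets of the extent whose
    common terms are exactly the concept's intent."""
    members = sorted(extent)
    hits = sum(
        1
        for size in range(len(members) + 1)
        for subset in combinations(members, size)
        if common_terms(subset) == intent
    )
    return hits / 2 ** len(members)


if __name__ == "__main__":
    # Rank candidate topics (concepts) from most to least stable.
    ranked = sorted(formal_concepts(), key=lambda c: stability(*c), reverse=True)
    for extent, intent in ranked:
        print(sorted(extent), sorted(intent), stability(extent, intent))
```

Each concept pairs a set of documents with the exact set of terms they share, so no number of clusters has to be fixed in advance, and more general concepts (for example, documents sharing only "bank") sit above more specific ones. Stability then gives one possible criterion for choosing which concepts to report as topics.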
Pages: 24-42
Number of pages: 19