Text clustering with local semantic kernels

Cited by: 10
Authors
AlSumait, Loulwah [1]
Domeniconi, Carlotta [1]
Affiliations
[1] George Mason Univ, Dept Comp Sci, Fairfax, VA 22030 USA
DOI
10.1007/978-1-84800-046-9_5
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
Document clustering is a fundamental task of text mining, by which efficient organization, navigation, summarization, and retrieval of documents can be achieved. The clustering of documents presents difficult challenges due to the sparsity and the high dimensionality of text data, and to the complex semantics of natural language. Subspace clustering is an extension of traditional clustering that is designed to capture local feature relevance, and to group documents with respect to the features (or words) that matter the most. This chapter presents a subspace clustering technique based on a locally adaptive clustering (LAC) algorithm. To improve the subspace clustering of documents and the identification of keywords achieved by LAC, kernel methods and semantic distances are deployed. The basic idea is to define a local kernel for each cluster by which semantic distances between pairs of words are computed to derive the clustering and local term weightings. The proposed approach, called semantic LAC, is evaluated using benchmark datasets. Our experiments show that semantic LAC is capable of improving the clustering quality.
Pages: 87-105
Page count: 19
Related papers
50 in total
  • [1] Text clustering with string kernels in R
    Karatzoglou, Alexandros
    Feinerer, Ingo
    ADVANCES IN DATA ANALYSIS, 2007, : 91 - +
  • [2] Semantic smoothing for text clustering
    Nasir, Jamal A.
    Varlamis, Iraklis
    Karim, Asim
    Tsatsaronis, George
    KNOWLEDGE-BASED SYSTEMS, 2013, 54 : 216 - 229
  • [3] A Framework for Semantic Text Clustering
    Fatimi, Soukaina
    EL Saili, Chama
    Alaoui, Larbi
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2020, 11 (06) : 451 - 459
  • [4] Semantic Evaluation of Text Clustering
    Sinh Hoa Nguyen
    Swieboda, Wojciech
    Hung Son Nguyen
    ADVANCED COMPUTATIONAL METHODS FOR KNOWLEDGE ENGINEERING, 2014, 282 : 269 - 280
  • [5] Combined syntactic and semantic kernels for text classification
    Bloehdorn, Stephan
    Moschitti, Alessandro
    ADVANCES IN INFORMATION RETRIEVAL, 2007, 4425 : 307 - +
  • [6] On Semantic Evaluation of Text Clustering Algorithms
    Nguyen, Sinh Hoa
    Swieboda, Wojciech
    Nguyen, Hung Son
    2014 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING (GRC), 2014, : 224 - 229
  • [7] Semantic Enriched Short Text Clustering
    Kozlowski, Marek
    Rybinski, Henryk
    FOUNDATIONS OF INTELLIGENT SYSTEMS, ISMIS 2017, 2017, 10352 : 435 - 445
  • [8] Scalable text semantic clustering around topics
    Brena, Ramon
    Ramirez, Eduardo
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2019, 36 (05) : 4645 - 4657
  • [9] Semantic correlation network based text clustering
    Song, SX
    Li, CP
    AI 2005: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2005, 3809 : 604 - 613
  • [10] Semantic Clustering for a Functional Text Classification Task
    Lippincott, Thomas
    Passonneau, Rebecca
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, 2009, 5449 : 509 - +