Clustering of Rough Set Related Documents with Use of Knowledge from DBpedia

Cited by: 0
Authors
Szczuka, Marcin [1]
Janusz, Andrzej [1]
Herba, Kamil [1]
Affiliations
[1] Univ Warsaw, Fac Math Informat & Mech, PL-02097 Warsaw, Poland
Source
ROUGH SETS AND KNOWLEDGE TECHNOLOGY | 2011 / Vol. 6954
Keywords
Text mining; semantic clustering; DBpedia; document grouping; rough sets;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A case study of semantic clustering of scientific articles related to Rough Sets is presented. The proposed method groups the documents on the basis of their content, with the assistance of the DBpedia knowledge base. The text corpus is first processed with Natural Language Processing tools to produce vector representations of the content, which are then matched against a collection of concepts retrieved from DBpedia. As a result, a new representation is constructed that better reflects the semantics of the texts. With this new representation, the documents are hierarchically clustered to form a partition of papers that share semantic relatedness. The steps in textual data preparation, utilization of DBpedia, and clustering are explained and illustrated with results of experiments performed on a corpus of scientific documents about rough sets.
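The three stages named in the abstract (vector representation, matching against concepts, hierarchical clustering) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the weighting scheme, the matching procedure, or the linkage, so term-frequency vectors, cosine similarity, and average linkage are assumptions chosen for brevity. The `concepts` dictionary is a hand-written stand-in for concept descriptions that would be retrieved from DBpedia, and the four `docs` strings are invented toy titles.

```python
import math
import re
from collections import Counter

def tf_vector(text):
    """Bag-of-words term-frequency vector (a simple stand-in for the NLP step)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def concept_representation(doc, concepts):
    """Re-express a document as its similarity to each concept description."""
    doc_vec = tf_vector(doc)
    return {name: cosine(doc_vec, tf_vector(desc)) for name, desc in concepts.items()}

def agglomerative(vectors, n_clusters):
    """Average-linkage agglomerative clustering on cosine distance."""
    clusters = [[i] for i in range(len(vectors))]

    def dist(c1, c2):
        pairs = [(i, j) for i in c1 for j in c2]
        return sum(1.0 - cosine(vectors[i], vectors[j]) for i, j in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        # Find the two closest clusters and merge them.
        candidates = ((x, y) for x in range(len(clusters))
                      for y in range(x + 1, len(clusters)))
        a, b = min(candidates, key=lambda p: dist(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical concept descriptions; real ones would come from DBpedia.
concepts = {
    "Rough set": "rough set approximation lower upper boundary attribute reduct",
    "Cluster analysis": "cluster clustering grouping partition similarity hierarchical",
}

# Invented toy documents, not taken from the paper's corpus.
docs = [
    "Attribute reduct computation with lower and upper approximation",
    "Boundary regions of a rough set approximation",
    "Hierarchical clustering and partition quality",
    "Grouping documents by similarity with clustering",
]

reps = [concept_representation(d, concepts) for d in docs]
result = agglomerative(reps, n_clusters=2)
print(result)
```

In this toy run, documents 0 and 1 share no clustering vocabulary and documents 2 and 3 share no rough-set vocabulary, so in the concept space each pair lies on a single axis and the two pairs end up in separate clusters. Stopping the merge loop at a chosen number of clusters is one way to cut a hierarchy into a partition; a full dendrogram could be kept instead.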
Pages: 394 - 403
Number of pages: 10