Clustering of Rough Set Related Documents with Use of Knowledge from DBpedia

Authors
Szczuka, Marcin [1 ]
Janusz, Andrzej [1 ]
Herba, Kamil [1 ]
Affiliation
[1] Univ Warsaw, Fac Math Informat & Mech, PL-02097 Warsaw, Poland
Source
ROUGH SETS AND KNOWLEDGE TECHNOLOGY | 2011 / Vol. 6954
Keywords
Text mining; semantic clustering; DBpedia; document grouping; rough sets
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A case study of semantic clustering of scientific articles related to rough sets is presented. The proposed method groups the documents on the basis of their content, with the assistance of the DBpedia knowledge base. The text corpus is first processed with Natural Language Processing tools to produce vector representations of the content, and is then matched against a collection of concepts retrieved from DBpedia. As a result, a new representation is constructed that better reflects the semantics of the texts. With this new representation, the documents are hierarchically clustered to form a partition of papers that share semantic relatedness. The steps of textual data preparation, utilization of DBpedia, and clustering are explained and illustrated with the results of experiments performed on a corpus of scientific documents about rough sets.
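
The pipeline outlined in the abstract (term vectors, matching against concepts, hierarchical clustering) can be illustrated with a minimal sketch. It is not the authors' implementation: the `CONCEPTS` dictionary is a hypothetical, hard-coded stand-in for concepts retrieved from DBpedia, whitespace tokenisation replaces the NLP tools, and average-linkage agglomeration over cosine distance is one assumed choice of hierarchical clustering. All function names and sample documents are illustrative.

```python
import math
from collections import Counter

# Hypothetical stand-in for concepts retrieved from DBpedia:
# concept name -> terms assumed to describe it.
CONCEPTS = {
    "Rough_set": ["rough", "approximation", "reduct", "indiscernibility"],
    "Cluster_analysis": ["clustering", "cluster", "partition", "grouping"],
    "Fuzzy_set": ["fuzzy", "membership", "degree"],
}

def bag_of_words(text):
    """Lower-cased, whitespace-tokenised term counts (crude stand-in for NLP tools)."""
    return Counter(text.lower().split())

def concept_vector(text):
    """Re-express a document as one weight per concept: the summed counts of its terms."""
    bow = bag_of_words(text)
    return [sum(bow[t] for t in terms) for terms in CONCEPTS.values()]

def cosine_distance(u, v):
    """1 - cosine similarity; defined as 1.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 if norm == 0 else 1.0 - dot / norm

def agglomerate(vectors, n_clusters):
    """Naive average-linkage agglomerative clustering; returns lists of document indices."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        total = sum(cosine_distance(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge it.
        pairs = [(linkage(clusters[x], clusters[y]), x, y)
                 for x in range(len(clusters))
                 for y in range(x + 1, len(clusters))]
        _, x, y = min(pairs)
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters

# Illustrative documents (invented for this sketch, not taken from the corpus).
docs = [
    "rough approximation and reduct computation",
    "indiscernibility and rough reduct",
    "fuzzy membership degree estimation",
    "fuzzy membership functions",
]
vectors = [concept_vector(d) for d in docs]
clusters = agglomerate(vectors, n_clusters=2)
print(clusters)
```

Stopping the merge loop at a chosen number of clusters corresponds to cutting the hierarchy at one level to obtain a partition; a real system would more likely use a library routine and a concept set obtained by querying DBpedia.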
Pages: 394 - 403
Number of pages: 10