A clustering approach for XML linked documents

Authors
Catania, B [1 ]
Maddalena, A [1 ]
Affiliations
[1] Univ Genoa, Genoa, Italy
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Clustering algorithms for hypertext documents consider not only the document content but also the links existing between documents. All the similarity functions proposed in the literature assume that just one type of link exists between documents, with a unique semantic meaning. With the rapid diffusion of XML documents, a specific language, called XLink, has been proposed to specify different types of links inside XML documents. Each type of link forces a different degree of similarity between the documents on which it is defined; we therefore claim that it must influence the computation of distance values in a different way. In this paper, after presenting a graph-based formalization of the hypertexts we consider, we introduce a distance function based on both the number and the type of the links connecting documents. Some preliminary experimental results on clustering algorithms based on the proposed function conclude the paper.
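The abstract does not give the distance function itself, so the following is only a minimal sketch of the general idea: a distance that shrinks as documents share more links, with each link weighted by its type. The link-type names, the weights in `LINK_WEIGHTS`, the helper names, and the formula `1 / (1 + strength)` are all assumptions made for illustration and are not taken from the paper.

```python
# Hypothetical link types and weights, chosen only for illustration.
LINK_WEIGHTS = {"simple": 1.0, "extended": 2.0}


def link_strength(links, a, b, weights=LINK_WEIGHTS):
    """Sum the type weights of all links connecting a and b (either direction)."""
    return sum(
        weights[link_type]
        for (src, dst, link_type) in links
        if {src, dst} == {a, b}
    )


def distance(links, a, b, weights=LINK_WEIGHTS):
    """Assumed distance: 0 for identical documents, otherwise 1 / (1 + strength).

    Unlinked documents get the maximum distance 1.0; more links, or links of
    a heavier type, bring two documents closer together.
    """
    if a == b:
        return 0.0
    return 1.0 / (1.0 + link_strength(links, a, b, weights))


# A small hypertext modelled as a list of typed edges (source, target, type).
links = [
    ("d1", "d2", "simple"),
    ("d1", "d2", "extended"),
    ("d2", "d3", "simple"),
]

print(distance(links, "d1", "d2"))  # two links of different types
print(distance(links, "d2", "d3"))  # one simple link
print(distance(links, "d1", "d3"))  # no direct link
```

A pairwise distance of this kind could be fed to any clustering algorithm that accepts a precomputed distance matrix; which algorithms the authors actually evaluated is not stated in the abstract.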
Pages: 121-125
Page count: 5