A clustering approach for XML linked documents

Cited by: 0
Authors
Catania, B [1]
Maddalena, A [1]
Affiliations
[1] Univ Genoa, Genoa, Italy
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Clustering algorithms for hypertext documents consider not only the document content but also the links existing between the documents. All the similarity functions proposed in the literature assume that just one type of link exists between documents, with a unique semantic meaning. With the rapid diffusion of XML documents, a specific language, called XLink, has been proposed to specify different types of links inside XML documents. Each type of link forces a different degree of similarity between the documents on which it is defined; thus, we claim it must influence the computation of distance values in a different way. In this paper, after presenting a graph-based formalization of the hypertexts we consider, we introduce a distance function based on both the number and the type of the links connecting documents. Some preliminary experimental results on clustering algorithms based on the proposed function conclude the paper.
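The abstract does not give the distance function itself, so the following is only a minimal sketch of the general idea: a distance that depends on both the number and the type of links between two documents. The link-type labels, the weights in `LINK_WEIGHTS`, the helper names `build_graph` and `distance`, and the formula `1 / (1 + strength)` are all illustrative assumptions, not the authors' definitions.

```python
from collections import defaultdict

# Hypothetical weights per link type (not taken from the paper): a larger
# weight means that kind of link pulls two documents closer together.
LINK_WEIGHTS = {"embed": 1.0, "replace": 0.5, "new": 0.2}


def build_graph(links):
    """Collect typed links into an undirected multigraph.

    `links` is an iterable of (source, target, link_type) triples. The
    result maps each unordered document pair to a per-type link count.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for source, target, link_type in links:
        graph[frozenset((source, target))][link_type] += 1
    return graph


def distance(graph, doc_a, doc_b, weights=LINK_WEIGHTS):
    """Return 0 for identical documents, otherwise a value in (0, 1].

    More links, and links of a more heavily weighted type, give a
    smaller distance. Unlinked documents are at distance 1.
    """
    if doc_a == doc_b:
        return 0.0
    counts = graph.get(frozenset((doc_a, doc_b)), {})
    # Weighted link count: unknown link types contribute nothing.
    strength = sum(weights.get(t, 0.0) * n for t, n in counts.items())
    return 1.0 / (1.0 + strength)


# Example input: three documents connected by links of different types.
links = [("a", "b", "embed"), ("a", "b", "replace"), ("a", "c", "new")]
graph = build_graph(links)
```

Under these assumed weights, documents `a` and `b` (one "embed" and one "replace" link) have a link strength of 1.5 and a distance of 1 / 2.5 = 0.4, while the unlinked pair `b` and `c` stays at the maximum distance of 1. A matrix of such pairwise distances could then be passed to any distance-based clustering algorithm.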
Pages: 121-125
Page count: 5
Related papers
50 in total
  • [21] Clustering XML Documents Using Closed Frequent Subtrees: A Structural Similarity Approach
    Kutty, Sangeetha
    Tran, Tien
    Nayak, Richi
    Li, Yuefeng
    FOCUSED ACCESS TO XML DOCUMENTS, 2008, 4862 : 183 - 194
  • [22] Overview of the INEX 2008 XML Mining Track Categorization and Clustering of XML Documents in a Graph of Documents
    Denoyer, Ludovic
    Gallinari, Patrick
    ADVANCES IN FOCUSED RETRIEVAL, 2009, 5631 : 401 - 411
  • [23] Using structural similarity for clustering XML documents
    Aitelhadj, Ali
    Boughanem, Mohand
    Mezghiche, Mohamed
    Souam, Fatiha
    KNOWLEDGE AND INFORMATION SYSTEMS, 2012, 32 (01) : 109 - 139
  • [24] Clustering XML Documents Using Frequent Subtrees
    Kutty, Sangeetha
    Tran, Tien
    Nayak, Richi
    Li, Yuefeng
    ADVANCES IN FOCUSED RETRIEVAL, 2009, 5631 : 436 - 445
  • [25] Clustering XML documents using structural summaries
    Dalamagas, T
    Cheng, T
    Winkel, KJ
    Sellis, T
    CURRENT TRENDS IN DATABASE TECHNOLOGY - EDBT 2004 WORKSHOPS, PROCEEDINGS, 2004, 3268 : 547 - 556
  • [26] Novel mixed clustering method for XML documents
    College of Information and Communications Engineering, Harbin Engineering University, Harbin 150001, China
    Unknown
    Harbin Gongcheng Daxue Xuebao, 2007, 6 (697-701):
  • [27] Structure and Content Similarity for Clustering XML Documents
    Zhang, Lijun
    Li, Zhanhuai
    Chen, Qun
    Li, Ning
    WEB-AGE INFORMATION MANAGEMENT, 2010, 6185 : 116 - 124
  • [28] XML Documents Clustering based on Representative Path
    Kim, Woosaeng
    PROCEEDINGS OF THE 13TH WSEAS INTERNATIONAL CONFERENCE ON COMPUTERS, 2009, : 108 - +
  • [29] Clustering XML Documents by Combining Content and Structure
    Guo Yongming
    Chen Dehua
    Le Jiajin
    ISISE 2008: INTERNATIONAL SYMPOSIUM ON INFORMATION SCIENCE AND ENGINEERING, VOL 1, 2008, : 583 - 587
  • [30] Using structural similarity for clustering XML documents
    Ali Aïtelhadj
    Mohand Boughanem
    Mohamed Mezghiche
    Fatiha Souam
    Knowledge and Information Systems, 2012, 32 : 109 - 139