A matching algorithm for measuring the structural similarity between an XML document and a DTD and its applications

Cited by: 61
Authors
Bertino, E
Guerrini, G
Mesiti, M
Affiliations
[1] Univ Milan, Dipartimento Informat & Comunicaz, I-20135 Milan, Italy
[2] Univ Pisa, Dipartimento Informat, I-56127 Pisa, Italy
Keywords
structural similarity; document classification; structure evolution; structural queries; selective dissemination of documents; document protection
DOI
10.1016/S0306-4379(03)00031-0
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper we propose a matching algorithm for measuring the structural similarity between an XML document and a DTD. By comparing the document structure against the one the DTD requires, the matching algorithm is able to identify commonalities and differences. Differences can be due to the presence of extra elements with respect to those the DTD requires and to the absence of required elements. The evaluation of commonalities and differences gives rise to a numerical rank of the structural similarity. The paper also discusses some applications of the matching algorithm. Specifically, the matching algorithm is exploited for the classification of XML documents against a set of DTDs, the evolution of the DTD structure, the evaluation of structural queries, the selective dissemination of XML documents, and the protection of XML document contents. (C) 2003 Elsevier Science Ltd. All rights reserved.
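The idea of ranking a document by its commonalities, extra elements, and missing elements can be illustrated with a minimal sketch. This is not the algorithm from the paper: the `Node` class, the dictionary-based stand-in for a DTD (each tag mapped to the child tags it requires, with no operators or ordering), the equal weighting of the three counts, and the ratio used as the score are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An XML element reduced to its tag and child elements (hypothetical)."""
    tag: str
    children: list = field(default_factory=list)


def subtree_size(node):
    """Number of elements in the subtree rooted at node."""
    return 1 + sum(subtree_size(child) for child in node.children)


def count_matches(node, dtd):
    """Return (common, plus, minus) for the subtree rooted at node.

    common: elements the simplified DTD allows at their position
    plus:   extra elements the simplified DTD does not mention
    minus:  required child tags that are absent
    The node itself is assumed to match and counts as common.
    """
    required = dtd.get(node.tag, [])
    common, plus, minus = 1, 0, 0
    seen = set()
    for child in node.children:
        if child.tag in required:
            seen.add(child.tag)
            c, p, m = count_matches(child, dtd)
            common, plus, minus = common + c, plus + p, minus + m
        else:
            # An unexpected child makes its whole subtree extra.
            plus += subtree_size(child)
    minus += sum(1 for tag in required if tag not in seen)
    return common, plus, minus


def similarity(node, dtd):
    """Assumed score in (0, 1]: share of common elements among all counted."""
    common, plus, minus = count_matches(node, dtd)
    return common / (common + plus + minus)


# Simplified DTD: a book requires a title and an author.
dtd = {"book": ["title", "author"], "title": [], "author": []}
# Document with an extra <price> element and no <author>.
doc = Node("book", [Node("title"), Node("price")])
print(count_matches(doc, dtd), similarity(doc, dtd))
```

In this sketch a document that contains exactly the required elements scores 1.0, and each extra or missing element lowers the score. A ranking like this could then be compared across several DTDs to pick the closest one, which is the classification use mentioned in the abstract.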
Pages: 23-46
Page count: 24