Compacting XML documents

Cited by: 1
Authors
Kálmán, M [1]
Havasi, F [1]
Gyimóthy, T [1]
Affiliation
[1] Dept Software Engn, H-6720 Szeged, Hungary
Keywords
XML; SRML; XML compaction; XML semantics
DOI
10.1016/j.infsof.2005.03.001
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
XML is one of the most common formats for storing information. Its biggest drawback is that XML documents are large relative to the information they store. XML documents may contain redundant attributes, that is, attributes whose values can be calculated from other attributes. These redundant attributes can be deleted from the original document provided that the calculation rules are stored. Attribute grammars offer an analogous description for such rules: semantic rules. To apply this technique to XML, we defined a new metalanguage called SRML and developed a method that uses SRML to compact XML documents. After compaction, an XML compressor can be applied to make the compacted document much smaller. With this combined approach we achieved a significant size reduction compared with the output of XML-specific compressors alone. This article extends the previously published method with automatic rule generation based on machine learning techniques, which can find relationships between attributes that the user might not have noticed beforehand. (c) 2005 Elsevier B.V. All rights reserved.
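The compaction idea in the abstract can be illustrated with a minimal sketch. It does not use SRML syntax, which the abstract does not show; the `RULES` table, the `item`, `price`, `qty`, and `total` names, and both functions are hypothetical and chosen only for illustration. An attribute is dropped when a stored rule reproduces its value exactly, and restored from the same rule on decompaction.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table (not SRML): maps (element tag, attribute name)
# to a function that computes the attribute from the element's other attributes.
RULES = {
    ("item", "total"): lambda e: str(int(e.get("price")) * int(e.get("qty"))),
}

def compact(root):
    """Delete each attribute whose stored value equals the value its rule computes."""
    for elem in root.iter():
        for (tag, attr), rule in RULES.items():
            if elem.tag == tag and attr in elem.attrib and elem.get(attr) == rule(elem):
                del elem.attrib[attr]
    return root

def decompact(root):
    """Recompute every rule-covered attribute that is missing.

    Simplification: this assumes a missing attribute was removed by compact(),
    so documents that never had the attribute would gain it here.
    """
    for elem in root.iter():
        for (tag, attr), rule in RULES.items():
            if elem.tag == tag and attr not in elem.attrib:
                elem.set(attr, rule(elem))
    return root

# Example document: the first total is derivable, the second is not.
SOURCE = (
    '<order>'
    '<item price="3" qty="4" total="12"/>'
    '<item price="2" qty="5" total="99"/>'
    '</order>'
)

if __name__ == "__main__":
    tree = ET.fromstring(SOURCE)
    compact(tree)
    print(ET.tostring(tree, encoding="unicode"))
    decompact(tree)
    print(ET.tostring(tree, encoding="unicode"))
```

In this sketch an attribute that disagrees with its rule is kept as-is, so the round trip preserves it. The compacted output could then be passed to any general-purpose or XML-specific compressor, as the abstract describes.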
Pages: 90-106
Number of pages: 17