Compacting XML documents

Cited by: 1
Authors
Kálmán, M [1]
Havasi, F [1]
Gyimóthy, T [1]
Affiliations
[1] Dept Software Engn, H-6720 Szeged, Hungary
Keywords
XML; SRML; XML compaction; XML semantics
DOI
10.1016/j.infsof.2005.03.001
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
XML is now one of the most common formats for storing information. Its biggest drawback is that XML documents are rather large compared to the information they store. XML documents may contain redundant attributes that can be calculated from other attributes. These redundant attributes can be deleted from the original XML document, provided the calculation rules are stored. In an attribute grammar environment there is an analogous description for such rules: semantic rules. To use this technique in an XML environment, we defined a new metalanguage called SRML. We have developed a method that uses SRML to compact XML documents. After compaction, XML compressors can be applied to make the compacted document much smaller. With this combined approach we achieved a significant size reduction compared to the compressed size produced by XML-specific compressors. This article extends the previously published method with automatic rule generation using machine learning techniques, which can find relationships between attributes that the user might not have noticed beforehand. (c) 2005 Elsevier B.V. All rights reserved.
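The compaction idea in the abstract can be illustrated with a minimal sketch. This is not SRML and not the authors' implementation: the rule table `RULES`, the element name `item`, and the attributes `price`, `quantity`, and `total` are all hypothetical, chosen only to show an attribute being dropped when a stored rule can recompute it and restored afterwards.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table (an assumption for illustration, not SRML syntax):
# (element tag, derived attribute) -> function computing it from the element's other attributes.
RULES = {
    ("item", "total"): lambda attrs: str(int(attrs["price"]) * int(attrs["quantity"])),
}

def compact(root):
    """Delete every attribute whose stored value equals what its rule computes."""
    for elem in root.iter():
        for (tag, attr), rule in RULES.items():
            if elem.tag != tag or attr not in elem.attrib:
                continue
            try:
                computed = rule(elem.attrib)
            except (KeyError, ValueError):
                continue  # rule inputs missing or not numeric: keep the attribute
            if computed == elem.attrib[attr]:
                del elem.attrib[attr]
    return root

def decompact(root):
    """Recompute attributes that are absent, using the same rule table."""
    for elem in root.iter():
        for (tag, attr), rule in RULES.items():
            if elem.tag != tag or attr in elem.attrib:
                continue
            try:
                elem.set(attr, rule(elem.attrib))
            except (KeyError, ValueError):
                pass  # cannot recompute: leave the element unchanged
    return root

# The second item's total does not match the rule, so it must be kept as stored.
SAMPLE = (
    '<order>'
    '<item price="4" quantity="3" total="12"/>'
    '<item price="5" quantity="2" total="99"/>'
    '</order>'
)

if __name__ == "__main__":
    tree = compact(ET.fromstring(SAMPLE))
    print(ET.tostring(tree, encoding="unicode"))
    print(ET.tostring(decompact(tree), encoding="unicode"))
```

The round trip in this sketch is only lossless if the derived attribute is present on every matching element of the original document; an element that never had `total` would gain one on decompaction. The compacted output could then be passed to a general-purpose or XML-specific compressor, as the abstract describes.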
Pages: 90-106
Page count: 17
Related papers
50 in total
  • [1] Clustering of XML documents
    Guillaume, D
    Murtagh, F
    COMPUTER PHYSICS COMMUNICATIONS, 2000, 127 (2-3) : 215 - 227
  • [2] Slicing XML Documents
    Silva, Josep
    ELECTRONIC NOTES IN THEORETICAL COMPUTER SCIENCE, 2006, 157 (02) : 187 - 192
  • [3] Clustering XML documents by structure
    Dalamagas, T
    Cheng, T
    Winkel, KJ
    Sellis, T
    METHODS AND APPLICATIONS OF ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2004, 3025 : 112 - 121
  • [4] Clustering XML Documents by Structure
    Lesniewska, Anna
    ADVANCES IN DATABASES AND INFORMATION SYSTEMS, 2010, 5968 : 238 - 246
  • [5] Warehousing dynamic XML documents
    Rusu, Laura Irina
    Rahayu, Wenny
    Taniar, David
    DATA WAREHOUSING AND KNOWLEDGE DISCOVERY, PROCEEDINGS, 2006, 4081 : 175 - 184
  • [6] Parallel processing XML documents
    Lü, K
    Zhu, YL
    Sun, WJ
    Lin, SX
    Fan, JP
    IDEAS 2002: INTERNATIONAL DATABASE ENGINEERING AND APPLICATIONS SYMPOSIUM, PROCEEDINGS, 2002, : 96 - 105
  • [7] Clustering XML documents by patterns
    Piernik, Maciej
    Brzezinski, Dariusz
    Morzy, Tadeusz
    KNOWLEDGE AND INFORMATION SYSTEMS, 2016, 46 (01) : 185 - 212
  • [8] A view model for XML documents
    Baril, X
    Bellahsène, Z
    OOIS 2000: 6TH INTERNATIONAL CONFERENCE ON OBJECT ORIENTED INFORMATION SYSTEMS, PROCEEDINGS, 2001, : 429 - 441
  • [9] Collaborative Clustering of XML Documents
    Greco, Sergio
    Gullo, Francesco
    Ponti, Giovanni
    Tagarelli, Andrea
    2009 INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING WORKSHOPS (ICPPW 2009), 2009, : 579 - 586
  • [10] Semantic Search for XML Documents
    Song Ling
    Lv Qiangi
    Tang Xiaobing
    MEASURING TECHNOLOGY AND MECHATRONICS AUTOMATION, PTS 1 AND 2, 2011, 48-49 : 1028 - +