Lempel-Ziv compression of structured text

Cited by: 0
Authors
Adiego, J [1]
Navarro, G [1]
de la Fuente, P [1]
Affiliations
[1] Univ Valladolid, Dept Informat, Valladolid, Spain
Keywords
Ziv-Lempel; XML data; text compression;
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
We describe a novel Lempel-Ziv approach suitable for compressing structured documents, called LZCS, which takes advantage of redundant information that can appear in the structure. The main idea is that frequently repeated subtrees may exist and these can be replaced by a backward reference to their first occurrence. The main advantage is that compressed documents generated by LZCS are easy to display, access at random, and navigate. In a second stage, processed documents can be further compressed using a semiadaptive technique, so that random access and navigability remain possible. LZCS is especially efficient at compressing collections of highly structured data, such as XML forms, invoices, e-commerce and web-service exchange documents. The comparison against structure-based and standard compressors shows that LZCS is a competitive choice for this type of document, while the others are not well suited to supporting navigation or random access.
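The main idea stated in the abstract, replacing a repeated subtree with a backward reference to its first occurrence, can be illustrated with a minimal Python sketch. This is not the LZCS encoding itself: the tuple layout and the helper names `canonical`, `compress`, and `decompress` are hypothetical, chosen only for illustration. The sketch assumes well-formed XML, ignores text that follows a closing tag (tail text), and omits the second-stage semiadaptive coding.

```python
import copy
import xml.etree.ElementTree as ET


def canonical(node):
    """Hashable description of a subtree (tag, attributes, text, children)."""
    return (
        node.tag,
        tuple(sorted(node.attrib.items())),
        (node.text or "").strip(),
        tuple(canonical(child) for child in node),
    )


def compress(root):
    """Replace each repeated subtree with ("ref", id of its first occurrence)."""
    first_seen = {}  # canonical subtree -> id assigned at first occurrence
    counter = 0

    def visit(node):
        nonlocal counter
        key = canonical(node)
        if key in first_seen:
            return ("ref", first_seen[key])
        node_id = counter
        counter += 1
        first_seen[key] = node_id
        children = [visit(child) for child in node]
        text = (node.text or "").strip()
        return ("node", node_id, node.tag, dict(node.attrib), text, children)

    return visit(root)


def decompress(packed):
    """Rebuild the element tree by copying the subtree each reference points to."""
    by_id = {}

    def build(item):
        if item[0] == "ref":
            return copy.deepcopy(by_id[item[1]])
        _, node_id, tag, attrib, text, children = item
        elem = ET.Element(tag, attrib)
        elem.text = text or None
        by_id[node_id] = elem
        for child in children:
            elem.append(build(child))
        return elem

    return build(packed)


# Hypothetical invoice-like document with one repeated <item> subtree.
doc = (
    "<order>"
    "<item><name>pen</name><qty>1</qty></item>"
    "<item><name>pen</name><qty>1</qty></item>"
    "<item><name>ink</name><qty>1</qty></item>"
    "</order>"
)
root = ET.fromstring(doc)
packed = compress(root)
restored = decompress(packed)
```

In this sketch a reference always points to a subtree that appears earlier in document order, so a reader can resolve it without looking ahead. That loosely mirrors the abstract's point that the output stays navigable.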
Pages: 112-121
Page count: 10