Efficient universal lossless data compression algorithms based on a greedy sequential grammar transform - Part two: With context models

Cited by: 13
Authors
Yang, EH [1 ]
He, DK [1 ]
Affiliation
[1] Univ Waterloo, Dept Elect & Comp Engn, Waterloo, ON N2L 3G1, Canada
Funding
Canada Foundation for Innovation; Natural Sciences and Engineering Research Council of Canada;
Keywords
arithmetic coding; context models; entropy; grammar-based coding; redundancy; string and pattern matching; universal data compression;
DOI
10.1109/TIT.2003.818411
CLC classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
The concept of context-free grammar (CFG)-based coding is extended to the case of countable-context models, yielding context-dependent grammar (CDG)-based coding. Given a countable-context model, a greedy CDG transform is proposed. Based on this greedy CDG transform, two universal lossless data compression algorithms, an improved sequential context-dependent algorithm and a hierarchical context-dependent algorithm, are then developed. It is shown that these algorithms are all universal in the sense that they can achieve asymptotically the entropy rate of any stationary, ergodic source with a finite alphabet. Moreover, it is proved that these algorithms' worst-case redundancies among all individual sequences of length n from a finite alphabet are upper-bounded by d log log n / log n, as long as the number of distinct contexts grows with the sequence length n in the order of O(n^a), where 0 < a < 1 and d are positive constants. It is further shown that for some nonstationary sources, the proposed context-dependent algorithms can achieve better expected redundancies than any existing CFG-based codes, including the Lempel-Ziv algorithm, the multilevel pattern matching algorithm, and the context-free algorithms in Part I of this series of papers.
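To give a feel for the greedy sequential parsing that underlies grammar transforms of this kind, the following is a minimal Python sketch. It is not the paper's CDG transform: the growing phrase dictionary here is a simplified, LZ78-style stand-in for the grammar variables, and the function name and interface are illustrative assumptions. At each step the parser greedily takes the longest phrase it already knows, emits its index, and learns the phrase extended by one symbol; the paper's context-dependent version would, roughly, keep a separate such structure per context.

```python
def greedy_parse(s, alphabet):
    """Greedy longest-match parse of s (illustrative LZ78-style sketch,
    standing in for the variables of a sequential grammar transform)."""
    # Dictionary of known phrases; starts with the single symbols.
    phrases = {c: i for i, c in enumerate(sorted(alphabet))}
    out = []
    i = 0
    while i < len(s):
        # Greedily extend to the longest known phrase starting at i.
        j = i + 1
        best = s[i:j]
        while j < len(s) and s[i:j + 1] in phrases:
            j += 1
            best = s[i:j]
        out.append(phrases[best])
        # Grow the dictionary: the parsed phrase plus the next symbol.
        if j < len(s):
            phrases[best + s[j]] = len(phrases)
        i = j
    return out

# Repetitive input yields progressively longer matched phrases.
print(greedy_parse("abababab", {"a", "b"}))  # → [0, 1, 2, 4, 1]
```

The shrinking number of emitted indices relative to the input length on repetitive strings is what the arithmetic-coding stage of the actual algorithms exploits to approach the entropy rate.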
Pages: 2874-2894
Page count: 21