Efficient universal lossless data compression algorithms based on a greedy sequential grammar transform - Part two: With context models

Cited by: 13
Authors
Yang, EH [1 ]
He, DK [1 ]
Affiliation
[1] Univ Waterloo, Dept Elect & Comp Engn, Waterloo, ON N2L 3G1, Canada
Funding
Canada Foundation for Innovation; Natural Sciences and Engineering Research Council of Canada;
Keywords
arithmetic coding; context models; entropy; grammar-based coding; redundancy; string and pattern matching; universal data compression;
DOI
10.1109/TIT.2003.818411
CLC classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
The concept of context-free grammar (CFG)-based coding is extended to the case of countable-context models, yielding context-dependent grammar (CDG)-based coding. Given a countable-context model, a greedy CDG transform is proposed. Based on this greedy CDG transform, two universal lossless data compression algorithms, an improved sequential context-dependent algorithm and a hierarchical context-dependent algorithm, are then developed. It is shown that these algorithms are all universal in the sense that they can achieve asymptotically the entropy rate of any stationary, ergodic source with a finite alphabet. Moreover, it is proved that these algorithms' worst-case redundancies among all individual sequences of length n from a finite alphabet are upper-bounded by d log log n / log n, as long as the number of distinct contexts grows with the sequence length n in the order of O(n^a), where 0 < a < 1 and d are positive constants. It is further shown that for some nonstationary sources, the proposed context-dependent algorithms can achieve better expected redundancies than any existing CFG-based codes, including the Lempel-Ziv algorithm, the multilevel pattern matching algorithm, and the context-free algorithms in Part I of this series of papers.
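To give a feel for the greedy sequential parsing that underlies grammar transforms of this kind, the following is a minimal Python sketch. It is not the paper's CDG transform: the growing phrase dictionary here is a simplified, LZ78-style stand-in for the grammar variables, and the function name and interface are illustrative assumptions. At each step the parser greedily takes the longest phrase it already knows, emits its index, and learns the phrase extended by one symbol; the paper's context-dependent version would, roughly, keep a separate such structure per context.

```python
def greedy_parse(s, alphabet):
    """Greedy longest-match parse of s (illustrative LZ78-style sketch,
    standing in for the variables of a sequential grammar transform)."""
    # Dictionary of known phrases; starts with the single symbols.
    phrases = {c: i for i, c in enumerate(sorted(alphabet))}
    out = []
    i = 0
    while i < len(s):
        # Greedily extend to the longest known phrase starting at i.
        j = i + 1
        best = s[i:j]
        while j < len(s) and s[i:j + 1] in phrases:
            j += 1
            best = s[i:j]
        out.append(phrases[best])
        # Grow the dictionary: the parsed phrase plus the next symbol.
        if j < len(s):
            phrases[best + s[j]] = len(phrases)
        i = j
    return out

# Repetitive input yields progressively longer matched phrases.
print(greedy_parse("abababab", {"a", "b"}))  # → [0, 1, 2, 4, 1]
```

The shrinking number of emitted indices relative to the input length on repetitive strings is what the arithmetic-coding stage of the actual algorithms exploits to approach the entropy rate.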
Pages: 2874-2894
Page count: 21