Incremental frequency count - a post BWT-stage for the Burrows-Wheeler compression algorithm

被引:13
作者
Abel, Juergen [1 ]
机构
[1] Ingenieurburo Dr Abel GmbH, D-41469 Neuss, Germany
关键词
compression; Burrows-Wheeler transform; BWT;
D O I
10.1002/spe.763
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
The silage after the Burrows-Wheeler transform (BWT) has a key function inside the Burrows-Wheeler compression algorithm as it transforms the BWT output from a local context into a global context. This paper presents the Incremental Frequency Count stage, a post-BWT stage. The new stage is paired with a run length encoding stage between the BWT and the entropy coding stage of the algorithm. It offers high throughput similar to a Move To Front stage, and at the same time good compression rates like the strong but slow Weighted Frequency Count stage. The properties of the Incremental Frequency Count stage are compared to the Move To Front and Weighted Frequency Count stages by their compression rates and speeds on the Calgary and large Canterbury corpora. Copyright (C) 2006 John Wiley & Sons, Ltd.
引用
收藏
页码:247 / 265
页数:19
相关论文
共 28 条
[1]  
Abel J, 2005, IEEE DATA COMPR CONF, P449
[2]  
ABEL J, 2003, ADV BLOCKSORTING COM
[3]   Block sorting and compression [J].
Arnavut, Z ;
Magliveras, SS .
DCC '97 : DATA COMPRESSION CONFERENCE, PROCEEDINGS, 1997, :181-190
[4]   Modifications of the Burrows and Wheeler data compression algorithm [J].
Balkenhol, B ;
Kurtz, S ;
Shtarkov, YM .
DCC '99 - DATA COMPRESSION CONFERENCE, PROCEEDINGS, 1999, :188-197
[5]  
BALKENHOL B, 1999, SFB343 DISCRETE STRU
[6]   A LOCALLY ADAPTIVE DATA-COMPRESSION SCHEME [J].
BENTLEY, JL ;
SLEATOR, DD ;
TARJAN, RE ;
WEI, VK .
COMMUNICATIONS OF THE ACM, 1986, 29 (04) :320-330
[7]  
Binder E., 2000, DISTANCE CODER
[8]  
Burrows M., 1994, BLOCK SORTING LOSSLE, DOI 10.1.1.37.6774
[9]  
CHAPIN B, 2001, THESIS U N TEXAS
[10]   DATA-COMPRESSION USING ADAPTIVE CODING AND PARTIAL STRING MATCHING [J].
CLEARY, JG ;
WITTEN, IH .
IEEE TRANSACTIONS ON COMMUNICATIONS, 1984, 32 (04) :396-402