An analysis of the Burrows-Wheeler Transform

被引:212
作者
Manzini, G [1 ]
机构
[1] Univ Piemonte Orientale, Dipartimento Sci & Tecnol Avanzate, I-15100 Alessandria, Italy
关键词
algorithms; performance; block sorting; Burrows-Wheeler Transform; move-to-front encoding; worst-case analysis of compression;
D O I
10.1145/382780.382782
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
The Burrows-Wheeler Transform (also known as Block-Sorting) is at the base of compression algorithms that are the state of the art in lossless data compression. In this paper, we analyze two algorithms that use this technique. The first one is the original algorithm described by Burrows and Wheeler, which, despite its simplicity. outperforms the Gzip compressor. The second one uses an additional run-length encoding step to improve compression. We prove that the compression ratio of both algorithms can be bounded in terms of the kth order empirical entropy of the input string for any k greater than or equal to 0. We make no assumptions on the input and we obtain bounds which hold in the worst case, that is, for every possible input string. All previous results for Block-Sorting algorithms were concerned with the average compression ratio and have been established assuming that the input comes from a finite-order Markov source.
引用
收藏
页码:407 / 430
页数:24
相关论文
共 50 条
  • [41] Clustering of Expressed Sequence Tags with Distance Measure Based on Burrows-Wheeler Transform
    Keng-Hoong Ng
    Phon-Amnuaisuk, Somnuk
    Ho, Chin-Kuan
    2010 3RD INTERNATIONAL CONFERENCE ON BIOMEDICAL ENGINEERING AND INFORMATICS (BMEI 2010), VOLS 1-7, 2010, : 2183 - 2187
  • [42] Efficient Maximal Repeat Finding Using the Burrows-Wheeler Transform and Wavelet Tree
    Kulekci, M. Oguzhan
    Vitter, Jeffrey Scott
    Xu, Bojian
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2012, 9 (02) : 421 - 429
  • [44] Algorithms to compute the Burrows-Wheeler Similarity Distribution
    Louza, Felipe A.
    Telles, Guilherme P.
    Gog, Simon
    Zhao, Liang
    THEORETICAL COMPUTER SCIENCE, 2019, 782 : 145 - 156
  • [45] A New String Matching Algorithm Based on Compressed Suffix Array and Burrows-Wheeler Transform
    Zhang, Qiaoxia
    Wu, Zhijie
    Xu, Yongkang
    Mao, Xin
    Yu, Ran
    Lu, Songfeng
    2012 INTERNATIONAL CONFERENCE ON FUTURE INFORMATION TECHNOLOGY AND MANAGEMENT SCIENCE & ENGINEERING (FITMSE 2012), 2012, 14 : 37 - 40
  • [46] BWtrs: A tool for searching for tandem repeats in DNA sequences based on the Burrows-Wheeler transform
    Pokrzywa, Rafal
    Polanski, Andrzej
    GENOMICS, 2010, 96 (05) : 316 - 321
  • [47] A Natural Encoding of Genetic Variation in a Burrows-Wheeler Transform to Enable Mapping and Genome Inference
    Maciuca, Sorina
    del Ojo Elias, Carlos
    McVean, Gil
    Iqbal, Zamin
    ALGORITHMS IN BIOINFORMATICS, 2016, 9838 : 222 - 233
  • [48] An efficient and secure compression technique for data protection using burrows-wheeler transform algorithm
    Begum, M. Baritha
    Deepa, N.
    Uddin, Mueen
    Kaluri, Rajesh
    Abdelhaq, Maha
    Alsaqour, Raed
    HELIYON, 2023, 9 (06)
  • [49] Heuristics for the run-length encoded Burrows-Wheeler transform alphabet ordering problem
    Major, Lily
    Clare, Amanda
    Daykin, Jacqueline W.
    Mora, Benjamin
    Zarges, Christine
    JOURNAL OF HEURISTICS, 2025, 31 (01)
  • [50] BowMapCL: Burrows-Wheeler Mapping on Multiple Heterogeneous Accelerators
    Nogueira, David
    Tomas, Pedro
    Roma, Nuno
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2016, 13 (05) : 926 - 938