An analysis of the Burrows-Wheeler Transform

被引:212
作者
Manzini, G [1 ]
机构
[1] Univ Piemonte Orientale, Dipartimento Sci & Tecnol Avanzate, I-15100 Alessandria, Italy
关键词
algorithms; performance; block sorting; Burrows-Wheeler Transform; move-to-front encoding; worst-case analysis of compression;
D O I
10.1145/382780.382782
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
The Burrows-Wheeler Transform (also known as Block-Sorting) is at the base of compression algorithms that are the state of the art in lossless data compression. In this paper, we analyze two algorithms that use this technique. The first one is the original algorithm described by Burrows and Wheeler, which, despite its simplicity. outperforms the Gzip compressor. The second one uses an additional run-length encoding step to improve compression. We prove that the compression ratio of both algorithms can be bounded in terms of the kth order empirical entropy of the input string for any k greater than or equal to 0. We make no assumptions on the input and we obtain bounds which hold in the worst case, that is, for every possible input string. All previous results for Block-Sorting algorithms were concerned with the average compression ratio and have been established assuming that the input comes from a finite-order Markov source.
引用
收藏
页码:407 / 430
页数:24
相关论文
共 50 条
  • [31] A four-stage algorithm for updating a Burrows-Wheeler transform
    Salson, M.
    Lecroq, T.
    Leonard, M.
    Mouchard, L.
    THEORETICAL COMPUTER SCIENCE, 2009, 410 (43) : 4350 - 4359
  • [32] Genomic Phylogeny Using the MaxwellTM Classifier Based on Burrows-Wheeler Transform
    Demongeot, Jacques
    Gardes, Joel
    Maldivi, Christophe
    Boisset, Denis
    Boufama, Kenza
    Touzouti, Imene
    COMPUTATION, 2023, 11 (08)
  • [33] Analysis of genomic rearrangements by using the Burrows-Wheeler transform of short-read data
    Kouichi Kimura
    Asako Koike
    BMC Bioinformatics, 16
  • [34] Parallel architecture for DNA sequence inexact matching with Burrows-Wheeler Transform
    Xin, Yao
    Liu, Benben
    Min, Biao
    Li, Will X. Y.
    Cheung, Ray C. C.
    Fong, Anthony S.
    Chan, Ting Fung
    MICROELECTRONICS JOURNAL, 2013, 44 (08) : 670 - 682
  • [35] Multithread Multistring Burrows-Wheeler Transform and Longest Common Prefix Array
    Bonizzoni, Paola
    Della Vedova, Gianluca
    Pirola, Yuri
    Previtali, Marco
    Rizzi, Raffaella
    JOURNAL OF COMPUTATIONAL BIOLOGY, 2019, 26 (09) : 948 - 961
  • [36] A bijective variant of the Burrows-Wheeler Transform using V-order
    Daykin, Jacqueline W.
    Smyth, W. F.
    THEORETICAL COMPUTER SCIENCE, 2014, 531 : 77 - 89
  • [37] Computing the longest common prefix array based on the Burrows-Wheeler transform
    Beller, Timo
    Gog, Simon
    Ohlebusch, Enno
    Schnattinger, Thomas
    JOURNAL OF DISCRETE ALGORITHMS, 2013, 18 : 22 - 31
  • [38] Burrows-Wheeler compression: Principles and reflections
    Fenwick, Peter
    THEORETICAL COMPUTER SCIENCE, 2007, 387 (03) : 200 - 219
  • [39] O(N) Memory-Free Hardware Architecture for Burrows-Wheeler Transform
    Ghosh, Surajeet
    Ray, Sanchita Saha
    IEEE TRANSACTIONS ON COMPUTERS, 2023, 72 (07) : 2080 - 2093
  • [40] Optimizing Burrows-Wheeler Transform-Based Sequence Alignment on Multicore Architectures
    Zhang, Jing
    Lin, Heshan
    Balaji, Pavan
    Feng, Wu-chun
    PROCEEDINGS OF THE 2013 13TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING (CCGRID 2013), 2013, : 377 - 384