Computing the Burrows-Wheeler transform in place and in small space

被引:9
作者
Crochemore, Maxime [1 ]
Grossi, Roberto [2 ]
Karkkainen, Juha [3 ]
Landau, Gad M. [4 ,5 ]
机构
[1] Kings Coll London, London WC2R 2LS, England
[2] Univ Pisa, Dipartimento Informat, I-56100 Pisa, Italy
[3] Univ Helsinki, Dept Comp Sci, FIN-00014 Helsinki, Finland
[4] Univ Haifa, Dept Comp Sci, IL-31999 Haifa, Israel
[5] NYU Poly, Dept Comp Sci & Engn, Brooklyn, NY USA
基金
美国国家科学基金会; 芬兰科学院; 以色列科学基金会;
关键词
Burrows-Wheeler transform; In-place algorithms; String algorithms; Suffix sorting;
D O I
10.1016/j.jda.2015.01.004
中图分类号
O29 [应用数学];
学科分类号
070104 ;
摘要
We introduce the problem of computing the Burrows-Wheeler Transform (BWT) using small additional space. Our in-place algorithm does not need the explicit storage for the suffix sort array and the output array, as typically required in previous work. It relies on the combinatorial properties of the BWT, and runs in O(n(2)) time in the comparison model using O(1) extra memory cells, apart from the array of n cells storing the n characters of the input text. We then discuss the time-space trade-off when O(k.sigma(k)) extra memory cells are allowed with sigma(k) distinct characters, providing an O((n(2)/k + n) log k)-time algorithm to obtain (and invert) the BWT. For example in real systems where the alphabet size is a constant, for any arbitrarily small c > 0, the BWT of a text of n bytes can be computed in O(n epsilon(-1) log n) time using just epsilon n extra bytes. (C) 2015 Elsevier B.V. All rights reserved.
引用
收藏
页码:44 / 52
页数:9
相关论文
共 50 条
  • [41] Optimizing Burrows-Wheeler Transform-Based Sequence Alignment on Multicore Architectures
    Zhang, Jing
    Lin, Heshan
    Balaji, Pavan
    Feng, Wu-chun
    PROCEEDINGS OF THE 2013 13TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING (CCGRID 2013), 2013, : 377 - 384
  • [42] Computation of the suffix array, Burrows-Wheeler transform and FM-index in V-order
    Daykin, Jacqueline W.
    Mhaskar, Neerja
    Smyth, W. F.
    THEORETICAL COMPUTER SCIENCE, 2021, 880 : 82 - 96
  • [43] Time- and Space-efficient Maximal Repeat Finding Using the Burrows-Wheeler Transform and Wavelet Trees
    Kulekci, M. Oguzhan
    Vitter, Jeffrey Scott
    Xu, Bojian
    2010 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE, 2010, : 622 - 625
  • [44] Clustering of Expressed Sequence Tags with Distance Measure Based on Burrows-Wheeler Transform
    Keng-Hoong Ng
    Phon-Amnuaisuk, Somnuk
    Ho, Chin-Kuan
    2010 3RD INTERNATIONAL CONFERENCE ON BIOMEDICAL ENGINEERING AND INFORMATICS (BMEI 2010), VOLS 1-7, 2010, : 2183 - 2187
  • [45] Efficient Maximal Repeat Finding Using the Burrows-Wheeler Transform and Wavelet Tree
    Kulekci, M. Oguzhan
    Vitter, Jeffrey Scott
    Xu, Bojian
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2012, 9 (02) : 421 - 429
  • [46] Algorithms to compute the Burrows-Wheeler Similarity Distribution
    Louza, Felipe A.
    Telles, Guilherme P.
    Gog, Simon
    Zhao, Liang
    THEORETICAL COMPUTER SCIENCE, 2019, 782 : 145 - 156
  • [47] A New String Matching Algorithm Based on Compressed Suffix Array and Burrows-Wheeler Transform
    Zhang, Qiaoxia
    Wu, Zhijie
    Xu, Yongkang
    Mao, Xin
    Yu, Ran
    Lu, Songfeng
    2012 INTERNATIONAL CONFERENCE ON FUTURE INFORMATION TECHNOLOGY AND MANAGEMENT SCIENCE & ENGINEERING (FITMSE 2012), 2012, 14 : 37 - 40
  • [48] Parallel computation of the Burrows Wheeler Transform in compact space
    Fuentes-Sepulveda, Jose
    Navarro, Gonzalo
    Nekrich, Yakov
    THEORETICAL COMPUTER SCIENCE, 2020, 812 : 123 - 136
  • [49] A Natural Encoding of Genetic Variation in a Burrows-Wheeler Transform to Enable Mapping and Genome Inference
    Maciuca, Sorina
    del Ojo Elias, Carlos
    McVean, Gil
    Iqbal, Zamin
    ALGORITHMS IN BIOINFORMATICS, 2016, 9838 : 222 - 233
  • [50] An efficient and secure compression technique for data protection using burrows-wheeler transform algorithm
    Begum, M. Baritha
    Deepa, N.
    Uddin, Mueen
    Kaluri, Rajesh
    Abdelhaq, Maha
    Alsaqour, Raed
    HELIYON, 2023, 9 (06)