Compressed Text Indexing with Wildcards

被引:0
|
作者
Hon, Wing-Kai [1 ]
Ku, Tsung-Han [1 ]
Shah, Rahul [2 ]
Thankachan, Sharma V. [2 ]
Vitter, Jeffrey Scott [3 ]
机构
[1] Natl Tsing Hua Univ, Hsinchu, Taiwan
[2] Louisiana State Univ, Baton Rouge, LA 70803 USA
[3] Univ Kansas, Lawrence, KS 66045 USA
关键词
SUFFIX ARRAYS;
D O I
暂无
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Let T = T-1 phi T-k1(2)phi(k2) ... phi T-kd(d+1) be a text of total length n, where characters of each T-i are chosen from an alphabet Sigma of size sigma, and phi denotes a wildcard symbol. The text indexing with wildcards problem is to index. T such that when we are given a query pattern P we can locate the occurrences of P in T efficiently. This problem has been applied in indexing genomic sequences that contain single-nucleotide polymorphisms (SNP) because SNP can be modeled as wildcards. Recently Tam et al. (2009) and Thachuk (2011) have proposed succinct indexes for this problem. In this paper, we present the first compressed index for this problem, which takes only nH(h) + o(n log sigma) + O(d log n) bits space, where H-h is the hth-order empirical entropy (h = o(log(sigma) n)) of T.
引用
收藏
页码:267 / +
页数:3
相关论文
共 50 条
  • [21] Geometric BWT: Compressed Text Indexing via Sparse Suffixes and Range Searching
    Chien, Yu-Feng
    Hon, Wing-Kai
    Shah, Rahul
    Thankachan, Sharma V.
    Vitter, Jeffrey Scott
    ALGORITHMICA, 2015, 71 (02) : 258 - 278
  • [22] Practical High-Order Entropy-Compressed Text Self-Indexing
    Huo, Hongwei
    Long, Peng
    Vitter, Jeffrey Scott
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (03) : 2943 - 2960
  • [23] Compressed domain indexing of losslessly compressed images
    Schaefer, G
    STORAGE AND RETRIEVAL FOR MEDIA DATABASES 2002, 2002, 4676 : 79 - 85
  • [24] String Indexing with Compressed Patterns
    Bille, Philip
    Gortz, Inge Li
    Steiner, Teresa Anna
    ACM TRANSACTIONS ON ALGORITHMS, 2023, 19 (04)
  • [25] Compressed Indexing with Signature Grammars
    Christiansen, Anders Roy
    Ettienne, Mikko Berggren
    LATIN 2018: THEORETICAL INFORMATICS, 2018, 10807 : 331 - 345
  • [26] String Indexing with Compressed Patterns
    Bille, Philip
    Gortz, Inge Li
    Steiner, Teresa Anna
    37TH INTERNATIONAL SYMPOSIUM ON THEORETICAL ASPECTS OF COMPUTER SCIENCE (STACS 2020), 2020, 154
  • [27] Indexing of compressed video sequences
    Idris, F
    Panchanathan, S
    STORAGE AND RETRIEVAL FOR STILL IMAGE AND VIDEO DATABASES IV, 1996, 2670 : 247 - 253
  • [28] Text indexing with errors
    Maass, MG
    Nowak, J
    COMBINATORIAL PATTERN MATCHING, PROCEEDINGS, 2005, 3537 : 21 - 32
  • [29] Text indexing with errors
    Maass, Moritz G.
    Nowak, Johannes
    JOURNAL OF DISCRETE ALGORITHMS, 2007, 5 (04) : 662 - 681
  • [30] SEMANTIC TEXT INDEXING
    Kaleta, Zbigniew
    COMPUTER SCIENCE-AGH, 2014, 15 (01): : 19 - 34