Periodicity in Data Streams with Wildcards

被引:1
|
作者
Ergun, Funda [1 ]
Grigorescu, Elena [2 ]
Azer, Erfan Sadeqi [1 ]
Zhou, Samson [1 ,2 ]
机构
[1] Indiana Univ, Sch Informat & Comp, Bloomington, IN 47405 USA
[2] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
基金
美国国家科学基金会;
关键词
Streaming algorithms; Periodicity; Wildcards; Randomized algorithms; String algorithms;
D O I
10.1007/s00224-019-09950-y
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
We investigate the problem of detecting periodic trends within a string S of length n, arriving in the streaming model, containing at most k wildcard characters, where k = o(n). A wildcard character is a special character that can be assigned any other character. We say that S has wildcard-period p if there exists an assignment to each of the wildcard characters so that in the resulting stream the prefix of length n - p equals the suffix of length n - p. We present a two-pass streaming algorithm that computes wildcard-periods of S using O(k3 polylog n) bits of space, while we also show that this problem cannot be solved in sublinear space in one pass. We also give a one-pass randomized streaming algorithm that computes all wildcard-periods p of S withp < n2 and no wildcard characters appearing in the last p symbols of S, using O(k3 log9 n) space.
引用
收藏
页码:177 / 197
页数:21
相关论文
共 50 条
  • [41] Document-Specific Keyphrase Extraction Using Sequential Patterns with Wildcards
    Xie, Fei
    Wu, Xindong
    Zhu, Xingquan
    2014 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2014, : 1055 - 1060
  • [42] Multiple pattern matching with wildcards and one-off conditon
    Guo, D. (guodan@hfut.edu.cn), 1600, Binary Information Press, P.O. Box 162, Bethel, CT 06801-0162, United States (09): : 5543 - 5552
  • [43] Fast Rotation Kernel Density Estimation over Data Streams
    Lei, Runze
    Wang, Pinghui
    Li, Rundong
    Jia, Peng
    Zhao, Junzhou
    Guan, Xiaohong
    Deng, Chao
    KDD '21: PROCEEDINGS OF THE 27TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2021, : 892 - 902
  • [44] Taming the Wildcards: Combining Definition- and Use-Site Variance
    Altidor, John
    Huang, Shan Shan
    Smaragdakis, Yannis
    PLDI 11: PROCEEDINGS OF THE 2011 ACM CONFERENCE ON PROGRAMMING LANGUAGE DESIGN AND IMPLEMENTATION, 2011, : 602 - 613
  • [45] Taming the Wildcards: Combining Definition- and Use-Site Variance
    Altidor, John
    Huang, Shan Shan
    Smaragdakis, Yannis
    ACM SIGPLAN NOTICES, 2011, 46 (06) : 602 - 613
  • [46] Finding Favourite Tuples on Data Streams with Provably Few Comparisons
    Zhang, Guangyi
    Tatti, Nikolaj
    Gionis, Aristides
    PROCEEDINGS OF THE 29TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2023, 2023, : 3229 - 3238
  • [47] Suffix array for multi-pattern matching with variable length wildcards
    Liu, Na
    Xie, Fei
    Wu, Xindong
    INTELLIGENT DATA ANALYSIS, 2021, 25 (02) : 283 - 303
  • [48] Testing Periodicity
    Lachish, Oded
    Newman, Ilan
    ALGORITHMICA, 2011, 60 (02) : 401 - 420
  • [49] Surface periodicity index (SPI): A measure of periodicity of surface topography
    Gupta, Rohit
    Vadali, Madhu
    PRECISION ENGINEERING-JOURNAL OF THE INTERNATIONAL SOCIETIES FOR PRECISION ENGINEERING AND NANOTECHNOLOGY, 2023, 79 : 200 - 209
  • [50] A novel optimal multi-pattern matching method with wildcards for DNA sequence
    Wang, Xinlu
    Saif, Ahmed A. F.
    Liu, Dayou
    Zhu, Yungang
    Benediktsson, Jon Atli
    TECHNOLOGY AND HEALTH CARE, 2021, 29 : S115 - S124