Text indexing with errors

被引:0
|
作者
Maass, MG [1 ]
Nowak, J [1 ]
机构
[1] Tech Univ Munich, Fak Informat, D-85748 Garching, Germany
关键词
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
In this paper we address the problem of constructing an index for a text document or a collection of documents to answer various questions about the occurrences of a pattern when allowing a constant number of errors. In particular, our index can be built to report all occurrences, all positions, or all documents where a pattern occurs in time linear in the size of the query string and the number of results. This improves over previous work where the lookup time is not linear or depends upon the size of the document corpus. Our data structure has size O(n log(k) n) on average and with high probability for input size n and queries with up to k errors. Additionally, we present a trade-off between query time and index complexity that achieves worst-case bounded index size and preprocessing time with linear lookup time on average.
引用
收藏
页码:21 / 32
页数:12
相关论文
共 50 条
  • [41] Retrieval experiments: Full text versus human indexing versus automatic indexing
    Lancaster, FW
    JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE, 1998, 49 (05): : 484 - 484
  • [42] Impact of indexing errors on spur gear dynamics
    Inalpolat, M.
    Handschuh, M.
    Kahraman, A.
    INTERNATIONAL GEAR CONFERENCE 2014, 2014, : 751 - +
  • [43] Random Indexing and Modified Random Indexing based approach for extractive text summarization
    Chatterjee, Niladri
    Sahoo, Pramod Kumar
    COMPUTER SPEECH AND LANGUAGE, 2015, 29 (01): : 32 - 44
  • [44] A novel full-text indexing model for Chinese text retrieval
    Zhou, SG
    Hu, YF
    Hu, JT
    DATABASE AND EXPERT SYSTEMS APPLICATIONS, 2001, 2113 : 370 - 379
  • [45] DyST: Dynamic and scalable temporal text indexing
    Norvag, Kjetil
    Nybo, Albert Overskeid
    TIME 2006: THIRTEENTH INTERNATIONAL SYMPOSIUM ON TEMPORAL REPRESENTATION AND REASONING, PROCEEDINGS, 2006, : 204 - +
  • [46] Indexing text events in digital video databases
    Gargi, U
    Antani, S
    Kasturi, R
    FOURTEENTH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOLS 1 AND 2, 1998, : 916 - 918
  • [47] Indexing text and visual features for WWW images
    Shen, HT
    Zhou, XF
    Cui, B
    WEB TECHNOLOGIES RESEARCH AND DEVELOPMENT - APWEB 2005, 2005, 3399 : 885 - 899
  • [48] Dynamic text indexing under string updates
    Ferragina, P
    JOURNAL OF ALGORITHMS-COGNITION INFORMATICS AND LOGIC, 1997, 22 (02): : 296 - 328
  • [49] Metric indexing for the vector model in Text Retrieval
    Skopal, T
    Moravec, P
    Pokorny, J
    Snásel, V
    STRING PROCESSING AND INFORMATION RETRIEVAL, PROCEEDINGS, 2004, 3246 : 183 - 195
  • [50] The impact of indexing approaches on Arabic text classification
    Al-Badarneh, Amer
    Al-Shawakfa, Emad
    Bani-Ismail, Basel
    Al-Rababah, Khaleel
    Shatnawi, Safwan
    JOURNAL OF INFORMATION SCIENCE, 2017, 43 (02) : 159 - 173