RE-STORE: A system for compressing, browsing, and searching large documents

Cited by: 9
Authors
Moffat, A [1]
Wan, R [1]
Affiliation
[1] Univ Melbourne, Dept Comp Sci & Software Engn, Parkville, Vic 3010, Australia
Source
EIGHTH SYMPOSIUM ON STRING PROCESSING AND INFORMATION RETRIEVAL, PROCEEDINGS | 2001
DOI
10.1109/SPIRE.2001.989752
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
We describe a software system for managing text files of up to several hundred megabytes that combines a number of useful facilities. First, the text is stored compressed using a variant of the RE-PAIR mechanism described by Larsson and Moffat, with space savings comparable to those obtained by other widely used general-purpose compression systems. Second, we provide, as a byproduct of the compression process, a phrase-based browsing tool that allows users to explore the contents of the source text in a natural and useful manner. And third, once a set of desired phrases has been determined through the use of the browsing tool, the compressed text can be searched to determine locations at which those phrases appear, without decompressing the whole of the stored text, and without use of an additional index. That is, we show how the RE-PAIR compression regime can be extended to allow phrase-based browsing and fast interactive searching.
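The pair-replacement idea behind RE-PAIR, and the way a phrase list falls out of it as a byproduct, can be illustrated with a minimal Python sketch. This is a naive quadratic-time illustration, not the variant used in RE-STORE and not the efficient algorithm of Larsson and Moffat; the function names (repair_compress, expand, repair_decompress) and the ("R", n) symbol encoding are assumptions made for this example only.

```python
from collections import Counter


def repair_compress(text):
    """Naive Re-Pair sketch: repeatedly replace the most frequent adjacent pair.

    Returns the reduced sequence and a rule table mapping each new symbol
    to the pair it stands for. New symbols are tuples ("R", n), so they can
    never collide with the single-character strings of the input.
    """
    seq = list(text)
    rules = {}
    next_id = 0
    while True:
        # Count adjacent pairs. Overlapping occurrences (as in "aaa") are
        # counted too, so a chosen pair may be replaced fewer times than
        # its count suggests; that costs compression, not correctness.
        counts = Counter(zip(seq, seq[1:]))
        if not counts:
            break
        pair, freq = counts.most_common(1)[0]
        if freq < 2:
            break
        sym = ("R", next_id)
        next_id += 1
        rules[sym] = pair
        # Left-to-right scan, replacing non-overlapping occurrences.
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(sym)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def expand(sym, rules):
    """Recursively expand a symbol into the text phrase it represents."""
    if sym not in rules:
        return sym  # a primitive character
    left, right = rules[sym]
    return expand(left, rules) + expand(right, rules)


def repair_decompress(seq, rules):
    """Rebuild the original text from the reduced sequence and the rules."""
    return "".join(expand(s, rules) for s in seq)


if __name__ == "__main__":
    text = "abracadabra abracadabra"
    seq, rules = repair_compress(text)
    assert repair_decompress(seq, rules) == text
    # The rule table doubles as a phrase list that a browser could display.
    for sym in rules:
        print(sym, repr(expand(sym, rules)))
```

Each rule expands to a phrase that occurs in the source text, which is the property a phrase-based browsing tool can build on. A practical system would also need a compact encoding of the rule table and the reduced sequence, which this sketch omits.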
Pages: 162-174
Page count: 13