Index compression using 64-bit words

Cited by: 64
Authors
Anh, Vo Ngoc [1]
Moffat, Alistair [1]
Affiliation
[1] Univ Melbourne, Dept Comp Sci & Software Engn, Melbourne, Vic 3010, Australia
Source
SOFTWARE-PRACTICE & EXPERIENCE | 2010 / Vol. 40 / Issue 02
Funding
Australian Research Council;
Keywords
performance; measurement; index compression; information retrieval; TEXT RETRIEVAL; INFORMATION-RETRIEVAL; INVERTED FILES; SYSTEMS;
DOI
10.1002/spe.948
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
Modern computers typically make use of 64-bit words as the fundamental unit of data access. However, the decade-long migration from 32-bit architectures has not been reflected in compression technology, because of a widespread assumption that effective compression techniques operate in terms of bits or bytes, rather than words. Here we demonstrate that the use of 64-bit access units, especially in connection with word-bounded codes, does indeed provide the opportunity for improving compression performance. In particular, we extend several 32-bit word-bounded coding schemes to 64-bit operation and explore their use in information retrieval applications. Our results show that the Simple-8b approach, a 64-bit word-bounded code, is an excellent self-skipping code, and has a clear advantage over its competitors in supporting fast query evaluation when the data being compressed represents the inverted index for a large text collection. The advantages of the new code also accrue on 32-bit architectures, and for all of Boolean, ranked, and phrase queries, which means that it can be used in any situation. Copyright (C) 2010 John Wiley & Sons, Ltd.
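To illustrate what a 64-bit word-bounded code looks like in practice, the following is a minimal Python sketch in the spirit of Simple-8b. It is not the authors' implementation. The layout (a 4-bit selector in the top bits, 60 data bits below), the `SELECTORS` table, and the function names `encode` and `decode` are assumptions of this sketch and are not taken from the abstract; any special handling of long runs is omitted.

```python
# Assumed layout (not from the abstract): top 4 bits hold a selector,
# the low 60 bits hold `count` values of `bits` bits each.
SELECTORS = [(60, 1), (30, 2), (20, 3), (15, 4), (12, 5), (10, 6), (8, 7),
             (7, 8), (6, 10), (5, 12), (4, 15), (3, 20), (2, 30), (1, 60)]


def encode(values):
    """Greedily pack non-negative integers into 64-bit words."""
    words = []
    i = 0
    while i < len(values):
        # Try the densest selector first; fall back to wider fields.
        for sel, (count, bits) in enumerate(SELECTORS):
            chunk = values[i:i + count]
            if len(chunk) == count and all(0 <= v < (1 << bits) for v in chunk):
                word = sel << 60
                for k, v in enumerate(chunk):
                    word |= v << (k * bits)
                words.append(word)
                i += count
                break
        else:
            raise ValueError("value outside the range [0, 2**60)")
    return words


def decode(words):
    """Unpack every value from a list of 64-bit words."""
    out = []
    for word in words:
        count, bits = SELECTORS[word >> 60]
        mask = (1 << bits) - 1
        for k in range(count):
            out.append((word >> (k * bits)) & mask)
    return out


# Hypothetical document-gap list from an inverted index posting list.
gaps = [1] * 60 + [3, 2, 1000, 5] + [2 ** 40]
packed = encode(gaps)
restored = decode(packed)
```

Because each word carries its own selector, a decoder can tell how many values a word holds without unpacking them, which is one plausible reading of the "self-skipping" property mentioned in the abstract.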
Pages: 131-147
Page count: 17
Related papers
50 in total
  • [21] 64-bit servers: bits & pieces?
    Data Commun, 10 (85-88, 90):
  • [22] 64-bit server cooling requirements
    Copeland, D
    TWENTY-FIRST ANNUAL IEEE SEMICONDUCTOR THERMAL MEASUREMENT AND MANAGEMENT SYMPOSIUM, PROCEEDINGS 2005, 2005, : 94 - 98
  • [23] 64-bit computing & JVM performance
    Kyrylkov, S
    DR DOBBS JOURNAL, 2005, 30 (03): : 24 - 27
  • [24] 64-BIT PROGRAMMING IN A 32-BIT WORLD
    NICHOLSON, A
    DR DOBBS JOURNAL, 1993, 18 (01): : 34 - &
  • [25] Exploiting 64-bit parallelism - Responds
    Gutman, R
    DR DOBBS JOURNAL, 2000, 25 (12): : 10 - 10
  • [26] Implementation of a 64-bit Jackson Adder
    McAuley, Tynan
    Koven, William
    Carter, Andrew
    Ning, Paula
    Harris, David Money
    2013 ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS, 2013, : 1149 - 1154
  • [27] An Efficient Representation for Lazy Constructors using 64-bit Pointers
    Fourtounis, Georgios
    Papaspyrou, Nikolaos
    FHPC'14: PROCEEDINGS OF THE 2014 ACM SIGPLAN WORKSHOP ON FUNCTIONAL HIGH-PERFORMANCE COMPUTING, 2014, : 23 - 30
  • [28] Design and Implementation of a 64-bit RISC Processor using VHDL
    Sharma, Rohit
    Sehgal, Vivek Kumar
    Nitin, Nitin
    Bhasker, Pranav
    Verma, Ishita
    UKSIM 2009: ELEVENTH INTERNATIONAL CONFERENCE ON COMPUTER MODELLING AND SIMULATION, 2009, : 568 - +
  • [29] Making sense of 64-bit processors
    Brownstein, Mark
    Network Magazine, 2004, 19 (10): : 53 - 57
  • [30] Unix leads the 64-bit charge
    Lachal, L
    BYTE, 1996, 21 (11): : 139 - &