Better bitmap performance with Roaring bitmaps

被引:84
作者
Chambi, Samy [1 ]
Lemire, Daniel [2 ]
Kaser, Owen [3 ]
Godin, Robert [2 ]
机构
[1] Univ Quebec Montreal, Dept Informat, Montreal, PQ, Canada
[2] TELUQ, LICEF Res Ctr, Montreal, PQ, Canada
[3] UNB St John, Comp Sci & Appl Stat, St John, NB, Canada
关键词
performance; measurement; index compression; bitmap index;
D O I
10.1002/spe.2325
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
Bitmap indexes are commonly used in databases and search engines. By exploiting bit-level parallelism, they can significantly accelerate queries. However, they can use much memory, and thus, we might prefer compressed bitmap indexes. Following Oracle's lead, bitmaps are often compressed using run-length encoding (RLE). Building on prior work, we introduce the Roaring compressed bitmap format: it uses packed arrays for compression instead of RLE. We compare it to two high-performance RLE-based bitmap encoding techniques: Word Aligned Hybrid compression scheme and Compressed n' Composable Integer Set. On synthetic and real data, we find that Roaring bitmaps (1) often compress significantly better (e.g., 2x) and (2) are faster than the compressed alternatives (up to 900x faster for intersections). Our results challenge the view that RLE-based bitmap compression is best. Copyright (c) 2015 John Wiley & Sons, Ltd.
引用
收藏
页码:709 / 719
页数:11
相关论文
共 22 条
[1]  
[Anonymous], 2013, HACKERS DELIGHT
[2]  
Antoshenkov G., 1995, BYT ALIGNED BITMAP C, P476
[3]  
Beyer K, 1999, SIGMOD RECORD, VOL 28, NO 2 - JUNE 1999, P359, DOI 10.1145/304181.304214
[4]   CONCISE: Compressed 'n' Composable Integer Set [J].
Colantonio, Alessandro ;
Di Pietro, Roberto .
INFORMATION PROCESSING LETTERS, 2010, 110 (16) :644-650
[5]  
Corrales Fabian, 2011, Database and Expert Systems Applications. Proceedings 22nd International Conference, DEXA 2011, P381, DOI 10.1007/978-3-642-23091-2_32
[6]   Efficient Set Intersection for Inverted Indexing [J].
Culpepper, J. Shane ;
Moffat, Alistair .
ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2010, 29 (01)
[7]   NET-FLi: On-the-fly Compression, Archiving and Indexing of Streaming Network Traffic [J].
Fusco, Francesco ;
Stoecklin, Marc Ph. ;
Vlachos, Michail .
PROCEEDINGS OF THE VLDB ENDOWMENT, 2010, 3 (02) :1382-1393
[8]  
Grand A, 2014, LUCENE5983
[9]  
Guzun G, 2014, PROC INT CONF DATA, P484, DOI 10.1109/ICDE.2014.6816675
[10]  
Inoue H, 2014, P VLDB ENDOWMENT, V8