Scalable and queryable compressed storage structure for raster data

被引:27
作者
Ladra, Susana [1 ]
Parama, Jose R. [1 ]
Silva-Coira, Fernando [1 ]
机构
[1] Univ A Coruna, Fac Informat, CITIC, Campus Elvina S-N, La Coruna 15071, Spain
关键词
Geographic information systems; Raster datasets; Data compression; Indexing; Query processing; REPRESENTATIONS; ACCESS;
D O I
10.1016/j.is.2017.10.007
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Compact data structures are storage structures that combine a compressed representation of the data and the access mechanisms for retrieving individual data without the need of decompressing from the beginning. The target is to be able to keep the data always compressed, even in main memory, given that the data can be processed directly in that form. With this approach, we obtain several benefits: we can load larger datasets in main memory, we can make a better usage of the memory hierarchy, and we can obtain bandwidth savings in a distributed computational scenario, without wasting time in compressing and decompressing data during data exchanges. In this work, we follow a compact data structure approach to design a storage structure for raster data, which is commonly used to represent attributes of the space (temperatures, pressure, elevation measures, etc.) in geographical information systems. As it is common in compact data structures, our new technique is not only able to store and directly access compressed data, but also indexes its content, thereby accelerating the execution of queries. Previous compact data structures designed to store raster data work well when the raster dataset has few different values. Nevertheless, when the number of different values in the raster increases, their space consumption and search performance degrade. Our experiments show that our storage structure improves previous approaches in all aspects, especially when the number of different values is large, which is critical when applying over real datasets. Compared with classical methods for storing rasters, namely netCDF, our method competes in space and excels in access and query times. (C) 2017 Elsevier Ltd. All rights reserved.
引用
收藏
页码:179 / 204
页数:26
相关论文
共 69 条
[1]   A DATA STRUCTURE AND ALGORITHM BASED ON A LINEAR KEY FOR A RECTANGLE RETRIEVAL PROBLEM [J].
ABEL, DJ ;
SMITH, JL .
COMPUTER VISION GRAPHICS AND IMAGE PROCESSING, 1983, 24 (01) :1-13
[2]  
Alvarez S., 2010, Proceedings of the eighth workshop on mining and learning with graphs, P18
[3]   Compressed vertical partitioning for efficient RDF management [J].
Alvarez-Garcia, Sandra ;
Brisaboa, Nieves ;
Fernandez, Javier D. ;
Martinez-Prieto, Miguel A. ;
Navarro, Gonzalo .
KNOWLEDGE AND INFORMATION SYSTEMS, 2015, 44 (02) :439-474
[4]   Inverted index compression using word-aligned binary codes [J].
Anh, VN ;
Moffat, A .
INFORMATION RETRIEVAL, 2005, 8 (01) :151-166
[5]  
[Anonymous], 2009, Database Systems: The Complete Book
[6]  
[Anonymous], 2006, ICDE 59, DOI DOI 10.1109/ICDE.2006.150
[7]   BF-Tree: Approximate Tree Indexing [J].
Athanassoulis, Manos ;
Ailamaki, Anastasia .
PROCEEDINGS OF THE VLDB ENDOWMENT, 2014, 7 (14) :1881-1892
[8]   Representing trees of higher degree [J].
Benoit, D ;
Demaine, ED ;
Munro, JI ;
Raman, R ;
Raman, V ;
Rao, SS .
ALGORITHMICA, 2005, 43 (04) :275-292
[9]   GraCT: A Grammar Based Compressed Representation of Trajectories [J].
Brisaboa, Nieves R. ;
Gomez-Brandon, Adrian ;
Navarro, Gonzalo ;
Parama, Jose R. .
STRING PROCESSING AND INFORMATION RETRIEVAL, SPIRE 2016, 2016, 9954 :218-230
[10]   Compact representation of Web graphs with extended functionality [J].
Brisaboa, Nieves R. ;
Ladra, Susana ;
Navarro, Gonzalo .
INFORMATION SYSTEMS, 2014, 39 :152-174