Improving access to multi-dimensional self-describing scientific datasets

Cited by: 4
Authors
Nam, B [1]
Sussman, A [1]
Affiliation
[1] Univ Maryland, UMIACS, College Pk, MD 20742 USA
Source
CCGRID 2003: 3RD IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER COMPUTING AND THE GRID, PROCEEDINGS | 2003
DOI
10.1109/CCGRID.2003.1199366
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Applications that query very large multi-dimensional datasets are becoming more common. Many self-describing scientific data file formats have also emerged, which carry structural metadata that helps navigate the multi-dimensional arrays stored in the files. The files may also contain application-specific semantic metadata. In this paper we discuss efficient methods for searching for subsets of multi-dimensional data objects, using semantic information to build multi-dimensional indexes and grouping data items into properly sized chunks to maximize disk I/O bandwidth. This work is the first step in the design and implementation of a generic indexing library that will work with various high-dimensional scientific data file formats containing semantic information about the stored data. To validate the approach, we have implemented indexing structures for NASA remote sensing data stored in the HDF format with a specific schema (HDF-EOS), and we show the performance improvements gained from indexing the datasets, compared with using the existing HDF library to access the data.
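The general idea in the abstract (group array elements into chunks, then index each chunk by the range of its semantic coordinates) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the names `Chunk`, `build_chunk_index`, and `range_query` are hypothetical, the per-element latitude/longitude arrays are an assumed form of semantic metadata, and a linear scan over bounding boxes stands in for a real multi-dimensional index structure.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Bounding box as (min_lat, min_lon, max_lat, max_lon); an assumed layout.
Box = Tuple[float, float, float, float]


@dataclass
class Chunk:
    """A rectangular block of array elements plus its coordinate bounding box."""
    row_start: int
    row_end: int  # exclusive
    col_start: int
    col_end: int  # exclusive
    bbox: Box


def build_chunk_index(lat: Sequence[Sequence[float]],
                      lon: Sequence[Sequence[float]],
                      chunk_rows: int,
                      chunk_cols: int) -> List[Chunk]:
    """Tile a 2-D array into chunks and record each chunk's lat/lon extent."""
    n_rows, n_cols = len(lat), len(lat[0])
    index: List[Chunk] = []
    for r in range(0, n_rows, chunk_rows):
        for c in range(0, n_cols, chunk_cols):
            r_end = min(r + chunk_rows, n_rows)
            c_end = min(c + chunk_cols, n_cols)
            lats = [lat[i][j] for i in range(r, r_end) for j in range(c, c_end)]
            lons = [lon[i][j] for i in range(r, r_end) for j in range(c, c_end)]
            bbox = (min(lats), min(lons), max(lats), max(lons))
            index.append(Chunk(r, r_end, c, c_end, bbox))
    return index


def overlaps(a: Box, b: Box) -> bool:
    """True if two closed bounding boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def range_query(index: List[Chunk], query: Box) -> List[Chunk]:
    """Return the chunks whose bounding box intersects the query box.

    A linear scan is used for brevity; a tree-based spatial index would
    replace it for large numbers of chunks.
    """
    return [chunk for chunk in index if overlaps(chunk.bbox, query)]


# Toy data: a 4x4 array where element (i, j) sits at latitude i, longitude j.
lat = [[float(i) for _ in range(4)] for i in range(4)]
lon = [[float(j) for j in range(4)] for _ in range(4)]

index = build_chunk_index(lat, lon, chunk_rows=2, chunk_cols=2)
hits = range_query(index, (0.0, 0.0, 1.5, 1.5))
```

Only the chunks returned by `range_query` would need to be read from disk, so the chunk size trades index granularity against the size of each I/O request.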
Pages: 172-179
Page count: 8