Efficient retrieval of multidimensional datasets through parallel I/O

Cited by: 4
Authors
Prabhakar, S [1]
Abdel-Ghaffar, K [1]
Agrawal, D [1]
El Abbadi, A [1]
Affiliation
[1] Purdue Univ, W Lafayette, IN 47907 USA
Source
FIFTH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING, PROCEEDINGS | 1998
DOI
10.1109/HIPC.1998.738011
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
Many scientific and engineering applications process large multidimensional datasets. An important access pattern for these applications is the retrieval of data corresponding to ranges of values in multiple dimensions. Performance is limited largely by high disk latencies. Tiling the data and distributing the tiles across multiple disks is an effective technique for improving performance through parallel I/O, and the way the tiles are distributed across the disks is an important factor in the gains achieved. Several schemes for declustering multidimensional data to improve the performance of range queries have been proposed in the literature. We extend the class of Cyclic schemes, developed earlier for two-dimensional data, to multiple dimensions. We establish important properties of Cyclic schemes and use them to reduce the search space for finding good declustering schemes within this class. Through experimental evaluation, we establish that the Cyclic schemes are superior to other declustering schemes, including the state of the art, in terms of both degree of parallelism and robustness.
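The abstract does not state the allocation rule itself. As a rough illustration only, the sketch below assumes the commonly described form of a cyclic allocation, in which the tile with coordinates (x_1, ..., x_d) is placed on disk (H_1*x_1 + ... + H_d*x_d) mod M for M disks and a vector of skip values H. The function names, the skip values, and the cost measure (the largest number of tiles any single disk must serve for a range query) are assumptions made for this example, not details taken from the paper.

```python
from collections import Counter
from itertools import product
from math import ceil, prod


def cyclic_disk(tile, skips, num_disks):
    """Disk for one tile under the assumed rule (sum of H_k * x_k) mod M."""
    return sum(h * x for h, x in zip(skips, tile)) % num_disks


def query_cost(lo, hi, skips, num_disks):
    """Largest number of tiles on any one disk for the box lo[k] <= x_k < hi[k]."""
    ranges = [range(a, b) for a, b in zip(lo, hi)]
    load = Counter(cyclic_disk(t, skips, num_disks) for t in product(*ranges))
    return max(load.values(), default=0)


def optimal_cost(lo, hi, num_disks):
    """Lower bound: tiles in the box spread perfectly evenly over the disks."""
    return ceil(prod(b - a for a, b in zip(lo, hi)) / num_disks)


if __name__ == "__main__":
    # Hypothetical setup: 5 disks, 2-D tiles, skip vector (1, 2).
    lo, hi = (0, 0), (2, 3)  # a 2 x 3 range query covering 6 tiles
    print(query_cost(lo, hi, (1, 2), 5), optimal_cost(lo, hi, 5))
```

With these hypothetical values the six tiles land on disks 0, 2, 4, 1, 3, 0, so the busiest disk serves two tiles, which matches the lower bound of ceil(6 / 5) = 2. Comparing `query_cost` against `optimal_cost` over many query shapes is one simple way to judge a candidate skip vector.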
Pages: 375-382
Number of pages: 8