Efficient retrieval of multidimensional datasets through parallel I/O

Cited by: 4
Authors
Prabhakar, S [1]
Abdel-Ghaffar, K [1]
Agrawal, D [1]
El Abbadi, A [1]
Affiliations
[1] Purdue Univ, W Lafayette, IN 47907 USA
Source
FIFTH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING, PROCEEDINGS | 1998
DOI
10.1109/HIPC.1998.738011
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Many scientific and engineering applications process large multidimensional datasets. An important access pattern for these applications is the retrieval of data corresponding to ranges of values in multiple dimensions. Performance is limited by the disks, largely because of high disk latencies. Tiling the data and distributing the tiles across multiple disks is an effective technique for improving performance through parallel I/O. How the tiles are distributed across the disks is an important factor in achieving gains. Several schemes for declustering multidimensional data to improve the performance of range queries have been proposed in the literature. We extend the class of Cyclic schemes, developed earlier for two-dimensional data, to multiple dimensions. We establish important properties of Cyclic schemes and use them to reduce the search space for determining good declustering schemes within the class. Through experimental evaluation, we establish that the Cyclic schemes are superior to other declustering schemes, including the state of the art, in terms of both degree of parallelism and robustness.
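The abstract does not state the allocation rule itself. As a minimal sketch, assuming the commonly used formulation in which a Cyclic scheme places tile (i_1, ..., i_d) on disk (H_1*i_1 + ... + H_d*i_d) mod M, the following shows how such a mapping and the cost of a range query could be computed. The names `cyclic_disk`, `query_cost`, and `skips`, and the skip values used, are illustrative assumptions and are not taken from the paper.

```python
from itertools import product


def cyclic_disk(tile, skips, num_disks):
    """Assumed Cyclic rule: disk = (sum of skip_k * index_k) mod num_disks."""
    return sum(h * i for h, i in zip(skips, tile)) % num_disks


def query_cost(lo, hi, skips, num_disks):
    """Largest number of tiles any one disk serves for the range query
    covering tile coordinates lo..hi (inclusive) in every dimension."""
    load = [0] * num_disks
    ranges = (range(a, b + 1) for a, b in zip(lo, hi))
    for tile in product(*ranges):
        load[cyclic_disk(tile, skips, num_disks)] += 1
    return max(load)


# Hypothetical setup: 5 disks, 2-D grid, skip values (1, 2).
M = 5
skips = (1, 2)

# A 2 x 5 query touches 10 tiles; the best possible cost is ceil(10 / 5).
cost = query_cost((0, 0), (1, 4), skips, M)
optimal = -(-10 // M)
print(cost, optimal)
```

In this sketch the cost of a query is the number of tiles on its most heavily loaded disk, so a scheme is better for that query the closer the cost is to the ceiling of (tiles in query) / M. Searching over candidate `skips` tuples for the one with the lowest cost across a query workload is one way to picture the search space the abstract refers to.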
Pages: 375-382
Page count: 8