PIDX: Efficient Parallel I/O for Multi-resolution Multi-dimensional Scientific Datasets

Cited by: 11
Authors
Kumar, Sidharth [1]
Vishwanath, Venkatram [2]
Carns, Philip [2]
Summa, Brian [1]
Scorzelli, Giorgio [1]
Pascucci, Valerio [1]
Ross, Robert [2]
Chen, Jacqueline [3]
Kolla, Hemanth [3]
Grout, Ray
Affiliations
[1] Univ Utah, SCI Inst, Salt Lake City, UT 84112 USA
[2] Argonne Natl Lab, Argonne, IL USA
[3] Sandia Natl Labs, Livermore, CA USA
Source
2011 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER) | 2011
DOI
10.1109/CLUSTER.2011.19
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
The IDX data format provides efficient, cache-oblivious, and progressive access to large-scale scientific datasets by storing the data in a hierarchical Z (HZ) order. Data stored in the IDX format can be visualized in an interactive environment, allowing meaningful exploration with minimal resources. This technology enables real-time, interactive visualization and analysis of large datasets on a variety of systems, ranging from desktop and laptop computers to portable devices such as iPhones and iPads, and over the web. While the existing ViSUS API for writing IDX data is serial, there are obvious advantages to applying the IDX format to the output of large-scale scientific simulations. We have therefore developed PIDX, a parallel API for writing data in the IDX format. With PIDX it is now possible to generate IDX datasets directly from large-scale scientific simulations, with the added advantage of real-time monitoring and visualization of the generated data. In this paper, we provide an overview of the IDX file format and how it is generated using PIDX. We then present a data model description and a novel aggregation strategy to enhance the scalability of the PIDX library. The S3D combustion application is used as an example to demonstrate the efficacy of PIDX for a real-world scientific simulation. S3D is used for fundamental studies of turbulent combustion that require exceptionally high-fidelity simulations. When writing S3D data in the IDX format, PIDX achieves up to 18 GiB/s of I/O throughput at 8,192 processes. This allows for interactive analysis and visualization of S3D data, thus enabling in situ analysis of the S3D simulation.
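To make the idea of a hierarchical Z (HZ) order concrete, the following is a minimal sketch, not the PIDX or ViSUS implementation. It assumes a square 2D grid whose side is a power of two, a bit interleaving in which x supplies the lower bit of each pair, and a sentinel-bit formulation for turning a Z index into a hierarchical one. The function names `interleave_bits`, `z_to_hz`, and `hz_order` are hypothetical and chosen only for illustration.

```python
def interleave_bits(x: int, y: int, bits: int) -> int:
    """Z-order (Morton) index: bit i of x goes to bit 2i, bit i of y to bit 2i+1.

    The choice of which axis supplies the lower bit is an assumption here.
    """
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


def z_to_hz(z: int, total_bits: int) -> int:
    """Map a Z index to a hierarchical index (assumed sentinel-bit formulation).

    Z indices with more trailing zero bits belong to coarser levels and
    receive smaller hierarchical indices, so coarse samples are stored first.
    """
    z |= 1 << total_bits   # sentinel bit just above the most significant bit
    z //= z & -z           # strip trailing zeros (divide by the lowest set bit)
    return z >> 1          # drop the lowest set bit


def hz_order(bits: int) -> list:
    """Return all (x, y) cells of a 2^bits x 2^bits grid sorted by HZ index."""
    side = 1 << bits
    cells = [(x, y) for y in range(side) for x in range(side)]
    return sorted(
        cells,
        key=lambda c: z_to_hz(interleave_bits(c[0], c[1], bits), 2 * bits),
    )


if __name__ == "__main__":
    # Print the traversal of a 4x4 grid, coarsest samples first.
    for rank, cell in enumerate(hz_order(2)):
        print(rank, cell)
```

Reading a prefix of such an ordering yields a coarse subsampling of the whole domain, which is the property that progressive, multi-resolution access relies on.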
Pages: 103-111
Page count: 9