Input/Output APIs and Data Organization for High Performance Scientific Computing

Cited by: 0
Authors
Lofstead, Jay [1]
Zheng, Fang [1]
Klasky, Scott [2]
Schwan, Karsten [1]
Affiliations
[1] Georgia Inst Technol, Coll Comp, Atlanta, GA 30332 USA
[2] Oak Ridge Natl Lab, Oak Ridge, TN 37831 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Scientific data management has become essential to the productivity of scientists who use ever larger machines and run applications that produce ever more data. Several specific issues arise when running on petascale (and beyond) machines. The first is the need for massively parallel data output, which depends in part on the data formats and semantics being used. Here, the inhibition of parallelism by file system notions of strict and immediate consistency can be addressed with 'delayed data consistency' methods. Such methods can also remove the runtime coordination steps required for immediate consistency from machine resources such as Blue Gene's separate networks for barrier calls and its dedicated IO nodes, freeing those resources to perform alternate tasks that enhance data output performance and/or richness. Second, once data is generated, it must be efficiently accessible, which implies the need for rapid data characterization and indexing. This can be achieved by adding small amounts of metadata during the output process, permitting scientists to quickly make informed decisions about which files to process from large-scale science runs. Third, failure probabilities increase with the number of nodes, which suggests organizing output data to be resilient to failures in which the output from a single node or a small number of nodes is lost or corrupted. This paper demonstrates the utility of delayed consistency methods for data output from the compute nodes of petascale machines. It also demonstrates the advantages of resilient data organization coupled with lightweight methods for data indexing. These techniques are implemented in ADIOS, the Adaptable IO System, and its BP intermediate file format. The implementation is designed to be compatible with existing, well-known file formats like HDF-5 and NetCDF, thereby permitting end users to exploit the rich tool chains for these formats. Initial performance evaluations of the approach show substantial performance advantages over native parallel HDF-5 in the Chimera supernova code.
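The abstract's idea of adding small amounts of metadata at output time, so that files can be characterized without reading their payload, can be illustrated with a minimal sketch. This is not the ADIOS API and not the BP format: the file layout (raw float blocks, a JSON footer, an 8-byte footer length) and the names `write_block_file`, `read_index`, and `files_with_value_above` are all hypothetical, chosen only to show the general technique.

```python
import json
import struct


def write_block_file(path, blocks):
    """Write named blocks of floats, then a small footer index.

    `blocks` maps a variable name to a non-empty list of floats.
    Hypothetical layout: raw little-endian doubles, a JSON index,
    and an 8-byte little-endian length of that index.
    """
    index = {}
    with open(path, "wb") as f:
        for name, values in blocks.items():
            offset = f.tell()
            f.write(struct.pack(f"<{len(values)}d", *values))
            # Lightweight characterization: location plus min/max.
            index[name] = {
                "offset": offset,
                "count": len(values),
                "min": min(values),
                "max": max(values),
            }
        footer = json.dumps(index).encode("utf-8")
        f.write(footer)
        f.write(struct.pack("<Q", len(footer)))


def read_index(path):
    """Read only the footer index, never the data blocks."""
    with open(path, "rb") as f:
        f.seek(-8, 2)  # the last 8 bytes hold the footer length
        (size,) = struct.unpack("<Q", f.read(8))
        f.seek(-(8 + size), 2)
        return json.loads(f.read(size).decode("utf-8"))


def files_with_value_above(paths, name, threshold):
    """Select files whose recorded maximum for `name` exceeds `threshold`."""
    selected = []
    for p in paths:
        entry = read_index(p).get(name)
        if entry is not None and entry["max"] > threshold:
            selected.append(p)
    return selected
```

Because the index sits at the end of the file, a writer in this sketch can append it after its data without revisiting earlier bytes, and a reader can decide whether a file is worth processing by reading only the footer.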
Pages: 1 / +
Page count: 3
Related papers
50 in total
  • [31] INPUT AND OUTPUT ORGANIZATION OF THE SUPPLEMENTARY MOTOR AREA
    WIESENDANGER, M
    HUMMELSHEIM, H
    BIANCHETTI, M
    CHEN, DF
    HYLAND, B
    MAIER, V
    WIESENDANGER, R
    CIBA FOUNDATION SYMPOSIA, 1987, 132 : 40 - 62
  • [32] The subjective organization of input and output events in memory
    Koriat, A
    Pearlman-Avnion, S
    Ben-Zur, H
    PSYCHOLOGICAL RESEARCH-PSYCHOLOGISCHE FORSCHUNG, 1998, 61 (04): : 295 - 307
  • [33] Input-output organization of the mouse claustrum
    Zingg, Brian
    Dong, Hong-Wei
    Tao, Huizhong Whit
    Zhang, Li I.
    JOURNAL OF COMPARATIVE NEUROLOGY, 2018, 526 (15) : 2428 - 2443
  • [34] The subjective organization of input and output events in memory
    Asher Koriat
    Shiri Pearlman-Avnion
    Hasida Ben-Zur
    Psychological Research, 1998, 61 : 295 - 307
  • [35] Grid computing: The future of distributed computing for high performance scientific and business applications
    Mukherjee, S
    Mustafi, J
    Chaudhuri, A
    DISTRIBUTED COMPUTING, PROCEEDINGS: MOBILE AND WIRELESS COMPUTING, 2002, 2571 : 339 - 342
  • [36] ADIOS 2: The Adaptable Input Output System. A framework for high-performance data management
    Godoy, William F.
    Podhorszki, Norbert
    Wang, Ruonan
    Atkins, Chuck
    Eisenhauer, Greg
    Gu, Junmin
    Davis, Philip
    Choi, Jong
    Germaschewski, Kai
    Huck, Kevin
    Huebl, Axel
    Kim, Mark
    Kress, James
    Kurc, Tahsin
    Liu, Qing
    Logan, Jeremy
    Mehta, Kshitij
    Ostrouchov, George
    Parashar, Manish
    Poeschel, Franz
    Pugmire, David
    Suchyta, Eric
    Takahashi, Keichi
    Thompson, Nick
    Tsutsumi, Seiji
    Wan, Lipeng
    Wolf, Matthew
    Wu, Kesheng
    Klasky, Scott
    SOFTWAREX, 2020, 12
  • [37] ViennaIPD - An Input Control Language for Scientific Computing
    Weinbub, Josef
    Rupp, Karl
    Selberherr, Siegfried
    8TH INTERNATIONAL INDUSTRIAL SIMULATION CONFERENCE 2010, ISC 2010, 2010, : 34 - 38
  • [38] Griffon - GPU Programming APIs for Scientific and General Purpose Computing
    Makpaisit, Pisit
    Marurngsith, Worawan
    INTERNATIONAL SYMPOSIUM ON DISTRIBUTED COMPUTING AND ARTIFICIAL INTELLIGENCE, 2011, 91 : 175 - 182
  • [39] INPUT SPACES AND OUTPUT PERFORMANCE
    ZAKIAN, V
    INTERNATIONAL JOURNAL OF CONTROL, 1987, 46 (01) : 185 - 191
  • [40] Achievable performance of sampled-data controllers with input and output delays
    Osburn, SL
    Bernstein, DS
    PROCEEDINGS OF THE 1996 IEEE INTERNATIONAL CONFERENCE ON CONTROL APPLICATIONS, 1996, : 904 - 909