A Fault Tolerant Implementation for a Massively Parallel Seismic Framework

Cited by: 2
Authors
Kayum, Suha N. [1]
Alsalim, Hussain [1]
Tonellot, Thierry-Laurent [1]
Momin, Ali [1]
Affiliations
[1] Saudi Aramco, Dhahran, Saudi Arabia
Keywords
parallel seismic applications; fault tolerance; high-performance computing
DOI
10.1109/hpec43674.2020.9286143
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
Growth in the volume of acquired seismic data has led to seismic processing applications that run for weeks or months on large supercomputers. A fault during processing would jeopardize the fidelity and quality of the results, so the application must be resilient. GeoDRIVE is a High-Performance Computing (HPC) software framework tailored to massive seismic applications and supercomputers. A fault-tolerance mechanism that builds on Boost.Asio for network communication is presented and evaluated both quantitatively and qualitatively by simulating faults through fault injection. Resource provisioning is also illustrated by adding resources to a job while the simulation is running. Finally, a large-scale job of 2,500 seismic experiments and 358 billion grid elements is executed on 32,000 cores. Subsets of nodes are killed at different times, validating the resilience of the mechanism at large scale. Although the implementation is demonstrated in a seismic application context, it can be tailored to any HPC application with embarrassingly parallel properties.
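The general idea of tolerating node loss in an embarrassingly parallel job can be sketched in a few lines. The Python simulation below is only an illustration under stated assumptions, not the GeoDRIVE implementation: it assumes a coordinator/worker model in which each independent experiment takes one time step, it does no networking (so Boost.Asio is not involved), and every name in it (`run_with_fault_injection`, `kill_at`, `add_at`, the worker ids) is hypothetical.

```python
from collections import deque


def run_with_fault_injection(num_tasks, workers, kill_at=None, add_at=None):
    """Simulate a coordinator that requeues work lost to killed workers.

    kill_at: dict mapping a step number to the worker ids killed at that step
             (the injected faults).
    add_at:  dict mapping a step number to worker ids added at that step
             (resources provisioned while the job is running).
    Returns (sorted list of completed task ids, number of requeued tasks).
    """
    kill_at = kill_at or {}
    add_at = add_at or {}
    pending = deque(range(num_tasks))  # independent tasks, e.g. experiments
    alive = list(workers)
    in_flight = {}                     # worker id -> task it is running
    done = set()
    requeues = 0
    step = 0

    while len(done) < num_tasks:
        # Resource provisioning: new workers join the running job.
        alive.extend(add_at.get(step, []))

        # Fault injection: a killed worker's unfinished task is requeued.
        for worker in kill_at.get(step, ()):
            if worker in alive:
                alive.remove(worker)
                lost = in_flight.pop(worker, None)
                if lost is not None:
                    pending.appendleft(lost)
                    requeues += 1

        if not alive:
            raise RuntimeError("no workers left to finish the job")

        # Surviving workers complete the task they were running.
        for worker in alive:
            if worker in in_flight:
                done.add(in_flight.pop(worker))

        # Idle workers pick up the next pending task, if any.
        for worker in alive:
            if pending:
                in_flight[worker] = pending.popleft()

        step += 1

    return sorted(done), requeues


if __name__ == "__main__":
    finished, requeued = run_with_fault_injection(
        num_tasks=10,
        workers=["w0", "w1", "w2", "w3"],
        kill_at={1: {"w1", "w2"}},  # two workers die mid-task
        add_at={2: ["w4"]},         # one replacement joins later
    )
    print(len(finished), "tasks completed;", requeued, "requeued")
```

Because the tasks are independent, losing a worker costs only the task it was running. That independence is why this style of recovery applies to embarrassingly parallel workloads in general.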
Pages: 8