Task-based programming for Seismic Imaging: Preliminary Results

Cited by: 4
Authors
Boillot, Lionel [1]
Bosilca, George [2]
Agullo, Emmanuel [3]
Calandra, Henri [4]
Affiliations
[1] INRIA Bordeaux Sud Ouest, Mag Project Team 3D, Ave Univ,BP 1155, F-64013 Pau, France
[2] Univ Tennessee, Dept EECS, ICL, Knoxville, TN 37996 USA
[3] INRIA Bordeaux Sud Ouest, HiePACS Project Team, F-33405 Talence, France
[4] TOTAL EP, Depth Imaging & High Performance Comp, Houston, TX 77057 USA
Source
2014 IEEE INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS, 2014 IEEE 6TH INTL SYMP ON CYBERSPACE SAFETY AND SECURITY, 2014 IEEE 11TH INTL CONF ON EMBEDDED SOFTWARE AND SYST (HPCC,CSS,ICESS) | 2014
Keywords
WAVE-PROPAGATION;
DOI
10.1109/HPCC.2014.205
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
The level of hardware complexity of current supercomputers is forcing the High Performance Computing (HPC) community to reconsider parallel programming paradigms and standards. The high level of hardware abstraction provided by task-based paradigms makes them excellent candidates for writing portable codes that can consistently deliver high performance across a wide range of platforms. While this paradigm has proved efficient for achieving such goals for dense and sparse linear solvers, it is yet to be demonstrated that industrial parallel codes, which rely on the classical Message Passing Interface (MPI) standard and accumulate decades of expertise (and countless lines of code), can be revisited and turned into efficient task-based programs. In this paper, we study the applicability of task-based programming in the case of a Reverse Time Migration (RTM) application for Seismic Imaging. The initial MPI-based application is turned into a task-based code executed on top of the PaRSEC runtime system. Preliminary results show that the approach is competitive with (and even potentially superior to) the original MPI code on a homogeneous multicore node, and can more efficiently exploit complex hardware such as a cache-coherent Non-Uniform Memory Access (ccNUMA) node or an Intel Xeon Phi accelerator.
Pages: 1259-1266
Number of pages: 8