A PGAS Execution Model for Efficient Stencil Computation on Many-Core Processors

Cited by: 0
Authors
Ikei, Mitsuru [1]
Sato, Mitsuhisa [2]
Affiliations
[1] Univ Tsukuba, Dept Comp Sci, Intel KK, Intel Architecture Technol Grp, Tokyo, Japan
[2] Univ Tsukuba, Dept Comp Sci, Tsukuba, Ibaraki, Japan
Source
2014 14TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING (CCGRID) | 2014
Keywords
many integrated core; parallel; PGAS; stencil code
DOI
10.1109/CCGrid.2014.20
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
An efficient PGAS execution model for stencil computation on many-core processors is proposed and implemented. We use XcalableMP as the base language and modify its runtime to fit many-core processors well. The runtime uses processes for parallel execution, and the global arrays of the stencil codes are broken into blocked sub-arrays placed in shared memory. We evaluated its performance using two stencil codes, Laplace and Himeno. The evaluation shows that (1) blocking improves the locality of memory access during computation and therefore improves total CPU execution time, and (2) direct data access through shared memory can relieve the communication burden of sub-array halo exchanges.
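The abstract gives no implementation detail, so the following is only an illustrative sketch of the two ideas it names: a global array split into blocked sub-arrays, and neighbor (halo) values read directly from the owning block instead of being copied in an exchange step. It is written in plain Python rather than XcalableMP, it runs serially, and every name in it (`make_blocks`, `read`, `jacobi_blocked`, `gather`) is hypothetical and not part of the authors' runtime. A Python dict stands in for the shared-memory region.

```python
# Illustrative sketch only; not the authors' runtime and not XcalableMP API.

def jacobi_reference(grid):
    """One Jacobi sweep of the 2D Laplace stencil on an unblocked grid."""
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # boundary values are kept unchanged
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            out[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return out


def make_blocks(grid, bs):
    """Split the global array into sub-arrays of at most bs x bs elements."""
    n, m = len(grid), len(grid[0])
    blocks = {}
    for bi in range(0, n, bs):
        for bj in range(0, m, bs):
            blocks[(bi // bs, bj // bs)] = [row[bj:bj + bs]
                                            for row in grid[bi:bi + bs]]
    return blocks


def read(blocks, bs, i, j):
    """Read global element (i, j) directly from the block that owns it."""
    return blocks[(i // bs, j // bs)][i % bs][j % bs]


def jacobi_blocked(blocks, bs, n, m):
    """One Jacobi sweep, block by block, with no halo copy step.

    Values owned by a neighboring block are fetched through read(),
    which models direct access to another block in shared memory.
    """
    new = {key: [row[:] for row in blk] for key, blk in blocks.items()}
    for (bi, bj), blk in blocks.items():
        for li in range(len(blk)):
            for lj in range(len(blk[0])):
                i, j = bi * bs + li, bj * bs + lj
                if 0 < i < n - 1 and 0 < j < m - 1:
                    new[(bi, bj)][li][lj] = 0.25 * (
                        read(blocks, bs, i - 1, j) + read(blocks, bs, i + 1, j)
                        + read(blocks, bs, i, j - 1) + read(blocks, bs, i, j + 1))
    return new


def gather(blocks, bs, n, m):
    """Reassemble the blocked sub-arrays into one global array."""
    return [[read(blocks, bs, i, j) for j in range(m)] for i in range(n)]


if __name__ == "__main__":
    n, m, bs = 6, 6, 2
    grid = [[float(i * j) for j in range(m)] for i in range(n)]
    blocked = gather(jacobi_blocked(make_blocks(grid, bs), bs, n, m), bs, n, m)
    print(blocked == jacobi_reference(grid))
```

The blocked sweep produces the same values as the unblocked one; what changes is the memory layout and the absence of a separate halo-copy phase. Any locality or performance benefit reported in the abstract depends on the real hardware and runtime and is not reproduced by this sketch.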
Pages: 305 - 314
Page count: 10
Related papers
50 in total
  • [1] WorkQ: A Many-Core Producer/Consumer Execution Model Applied to PGAS Computations
    Ozog, David
    Malony, Allen
    Hammond, Jeff R.
    Balaji, Pavan
    2014 20TH IEEE INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS (ICPADS), 2014, : 632 - 639
  • [2] Low-level PGAS computing on many-core processors with TSHMEM
    Lam, Bryant C.
    George, Alan D.
    Lam, Herman
    Aggarwal, Vikas
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2015, 27 (17): : 5288 - 5310
  • [3] Automatic Code Generation and Optimization of Large-scale Stencil Computation on Many-core Processors
    Li, Mingzhen
    Liu, Yi
    Yang, Hailong
    Hu, Yongmin
    Sun, Qingxiao
    Chen, Bangduo
    You, Xin
    Liu, Xiaoyan
    Luan, Zhongzhi
    Qian, Depei
    50TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, 2021,
  • [4] Multi-level spatial and temporal tiling for efficient HPC stencil computation on many-core processors with large shared caches
    Yount, Charles
    Duran, Alejandro
    Tobin, Josh
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2019, 92 : 903 - 919
  • [5] Efficient Fault Simulation on Many-Core Processors
    Kochte, Michael A.
    Schaal, Marcel
    Wunderlich, Hans-Joachim
    Zoellin, Christian G.
    PROCEEDINGS OF THE 47TH DESIGN AUTOMATION CONFERENCE, 2010, : 380 - 385
  • [6] Efficient backprojection-based synthetic aperture radar computation with many-core processors
    Park, Jongsoo
    Tang, Ping Tak Peter
    Smelyanskiy, Mikhail
    Kim, Daehyun
    Benson, Thomas
    SCIENTIFIC PROGRAMMING, 2013, 21 (3-4) : 165 - 179
  • [7] Efficient Backprojection-based Synthetic Aperture Radar Computation with Many-core Processors
    Park, Jongsoo
    Tang, Ping Tak Peter
    Smelyanskiy, Mikhail
    Kim, Daehyun
    Benson, Thomas
    2012 INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SC), 2012,
  • [8] Towards efficient tile low-rank GEMM computation on sunway many-core processors
    Han, Qingchang
    Yang, Hailong
    Dun, Ming
    Luan, Zhongzhi
    Gan, Lin
    Yang, Guangwen
    Qian, Depei
    JOURNAL OF SUPERCOMPUTING, 2021, 77 (05): : 4533 - 4564
  • [9] Towards efficient tile low-rank GEMM computation on sunway many-core processors
    Qingchang Han
    Hailong Yang
    Ming Dun
    Zhongzhi Luan
    Lin Gan
    Guangwen Yang
    Depei Qian
    The Journal of Supercomputing, 2021, 77 : 4533 - 4564
  • [10] Multi-tasking Execution in PGAS Language XcalableMP and Communication Optimization on Many-core Clusters
    Tsugane, Keisuke
    Lee, Jinpil
    Murai, Hitoshi
    Sato, Mitsuhisa
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING IN ASIA-PACIFIC REGION (HPC ASIA 2018), 2018, : 75 - 85