Utilizing Multiple Xeon Phi Coprocessors on One Compute Node

Cited by: 0
Authors
Dong, Xinnan [1 ]
Chai, Jun [1 ]
Yang, Jing [1 ]
Wen, Mei [1 ]
Wu, Nan [1 ]
Cai, Xing [2 ,3 ]
Zhang, Chunyuan [1 ]
Chen, Zhaoyun [1 ]
Affiliations
[1] Natl Univ Def Technol, Sch Comp Sci, Changsha 410073, Hunan, Peoples R China
[2] Simula Res Lab, NO-1325 Lysaker, Norway
[3] Univ Oslo, Dept Informat, NO-0316 Oslo, Norway
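Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract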
Future exascale systems are expected to adopt compute nodes that incorporate many accelerators. This paper therefore investigates how to program multiple Xeon Phi coprocessors that reside inside one compute node. Besides a standard MPI-OpenMP programming approach, which belongs to the symmetric usage mode, two offload-mode programming approaches are considered. The first offload approach is conventional and uses compiler pragmas, whereas the second is new and combines Intel's APIs for the coprocessor offload infrastructure (COI) and the symmetric communication interface (SCIF) for low-latency communication. While the pragma-based approach allows simpler programming, the COI-SCIF approach has three advantages: (1) lower overhead associated with launching offloaded code, (2) higher data transfer bandwidths, and (3) more advanced asynchrony between computation and data movement. The low-level COI-SCIF approach is also shown to have benefits over its MPI-OpenMP counterpart. All the programming approaches are tested with a real-world 3D application, for which the COI-SCIF approach delivers the best performance on a Tianhe-2 compute node with three Xeon Phi coprocessors.
Pages: 68-81
Page count: 14