Utilizing Multiple Xeon Phi Coprocessors on One Compute Node

Cited by: 0

Authors:

Dong, Xinnan [1]

Chai, Jun [1]

Yang, Jing [1]

Wen, Mei [1]

Wu, Nan [1]

Cai, Xing [2,3]

Zhang, Chunyuan [1]

Chen, Zhaoyun [1]

Affiliations:

[1] Natl Univ Def Technol, Sch Comp Sci, Changsha 410073, Hunan, Peoples R China

[2] Simula Res Lab, NO-1325 Lysaker, Norway

[3] Univ Oslo, Dept Informat, NO-0316 Oslo, Norway

Source:

ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2014, PT II | 2014, Vol. 8631

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Future exascale systems are expected to adopt compute nodes that incorporate many accelerators. This paper thus investigates the topic of programming multiple Xeon Phi coprocessors that lie inside one compute node. Besides a standard MPI-OpenMP programming approach, which belongs to the symmetric usage mode, two offload-mode programming approaches are considered. The first offload approach is conventional and uses compiler pragmas, whereas the second one is new and combines Intel's APIs of coprocessor offload infrastructure (COI) and symmetric communication interface (SCIF) for low-latency communication. While the pragma-based approach allows simpler programming, the COI-SCIF approach has three advantages: (1) lower overhead associated with launching offloaded code, (2) higher data transfer bandwidths, and (3) more advanced asynchrony between computation and data movement. The low-level COI-SCIF approach is also shown to have benefits over the MPI-OpenMP counterpart. All the programming approaches are tested on a real-world 3D application, for which the COI-SCIF approach holds a performance advantage on a Tianhe-2 compute node with three Xeon Phi coprocessors.

Pages: 68-81

Number of pages: 14
