Can MIC find its place in the field of PDES? An Early Performance Evaluation of PDES Simulator on Intel Many Integrated Cores Coprocessor

Cited by: 2
Authors
Chen, Huilong [1]
Yao, Yiping [1, 2]
Tang, Wenjie [1]
Meng, Dong [1]
Zhu, Feng [1]
Fu, Yuewen [1]
Affiliations
[1] Natl Univ Def Technol, Coll Informat Syst & Management, Changsha, Hunan, Peoples R China
[2] Natl Univ Def Technol, State Key Lab High Performance Comp, Changsha, Hunan, Peoples R China
Source
2015 IEEE/ACM 19TH INTERNATIONAL SYMPOSIUM ON DISTRIBUTED SIMULATION AND REAL TIME APPLICATIONS (DS-RT) | 2015
Keywords
performance evaluation; PDES simulator; MIC; Many Integrated Core; many-core coprocessor;
DOI
10.1109/DS-RT.2015.23
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
The widespread adoption of many-core processors offers a good opportunity for Parallel Discrete Event Simulation (PDES) to achieve better execution performance. As one of the newly introduced many-core processors, the Intel Xeon Phi coprocessor, based on the Many Integrated Core (MIC) architecture, integrates about 60 optimized x86 cores on a single PCB and reaches a peak performance of 1.0 TFLOPS. Furthermore, because it uses x86 cores, the MIC coprocessor is compatible with almost all programs designed for general-purpose CPUs, which makes it easy to run simulation programs on MIC. There has been much work on performance evaluation and optimization of PDES simulators using Graphics Processing Units (GPUs), Tilera, or other many-core processors, yet almost no related work on the Phi has been published so far. In this article, an early performance evaluation of the well-known PDES simulator ROSS and its POSIX thread version ROSS-MT is conducted on a computing node composed of two Intel Xeon multi-core CPUs and one Phi coprocessor, using the classical PDES benchmark PHOLD and an extended version of it that adds different event granularities. Experimental results show that the pure MPI-based ROSS performs poorly on the MIC coprocessor, indicating that it would not be feasible for common PDES applications. Although ROSS-MT performs much better on MIC, the computational potential of MIC is still far from fully exploited. Furthermore, as the event granularity becomes larger, the performance of this benchmark "falls off a cliff", and the benchmark turns into a computation-dominant application. However, the overall performance on the MIC coprocessor is still worse than that on the host. After analyzing the problems, we vectorized the event handler code to make better use of the Vector Processing Unit (VPU) of the MIC coprocessor, which yields a peak speedup of 9.7X, showing that the MIC coprocessor is able to find its place in the PDES field. Finally, based on our evaluation, we provide some advice on further exploiting the power of the MIC coprocessor for PDES applications.
Pages: 41-49
Number of pages: 9