Characterizing and optimizing Java-based HPC applications on Intel many-core architecture

Cited by: 0
Authors
Yang YU [1, 2]
Tianyang LEI [2]
Haibo CHEN [2]
Binyu ZANG [2]
Affiliations
[1] School of Computer Science, Fudan University
[2] Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University
Keywords
many-core; Java; Xeon Phi; HPC; prefetching
DOI
Not available
Chinese Library Classification number
TP38 [Other computers]
Discipline classification code
081201
Abstract
The increasing demand for performance has stimulated the wide adoption of many-core accelerators such as the Intel® Xeon Phi™ coprocessor, which is based on Intel's Many Integrated Core architecture. While many HPC applications running in native mode have been tuned to run efficiently on Xeon Phi, it is still unclear how a managed runtime like the JVM performs on such an architecture. In this paper, we present the first measurement study of a set of Java HPC applications on Xeon Phi under the JVM. One key obstacle to the study is that there is currently little Java support for Xeon Phi. This paper presents results based on the first port of the OpenJDK platform to Xeon Phi, in which the HotSpot virtual machine acts as the kernel execution engine. The main difficulty is the incompatibility between the Xeon Phi ISA and the assembly library of the HotSpot VM. By evaluating the multithreaded Java Grande benchmark suite and our ported Java Phoenix benchmarks, we quantitatively study the performance and scalability issues of the JVM on Xeon Phi and draw several conclusions from the study. To fully utilize the vector computing capability and hide the significant memory access latency on the coprocessor, we present a semi-automatic vectorization scheme and a software prefetching model in HotSpot. With 60 physical cores and tuning, our optimized JVM achieves average speedups of 2.7x and 3.5x over a Xeon CPU by using vectorization and prefetching, respectively. Our study also indicates that it is viable and potentially performance-beneficial to run applications written for a managed runtime like the JVM on Xeon Phi.
Pages: 207-223
Number of pages: 17