Preliminary performance evaluations of the determinant quantum Monte Carlo simulations for multi-core CPU and many-core GPU

Cited by: 0
Authors
Kao, Quey-Liang [1]
Lee, Che-Rung [1]
Affiliations
[1] Natl Tsing Hua Univ, Dept Comp Sci, 101,Kuang Fu Rd,Sec 2, Hsinchu 30013, Taiwan
Keywords
preliminary performance evaluation; PPE; multi-core CPU; graphics processing unit; GPU; quantum Monte Carlo simulation; numerical simulation
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Subject classification code
081203; 0835
Abstract
The diversity of architectural designs and programming styles of emerging computational hardware has created a wide search space for performance optimisation in the development of next-generation high-performance software. Preliminary performance evaluations (PPE) on various computational platforms are essential for guiding software design choices. In this paper, we study the performance of the numerical kernels of determinant quantum Monte Carlo (DQMC) simulations on two popular computing processors: the multi-core CPU and the GPU. Two algorithms, Loh's method and the SOF algorithm, are tested with different implementations and problem configurations to explore hardware characteristics such as scalability and processor utilisation. The results of this PPE show the favoured algorithms and applicable parameter ranges on the two platforms, and can provide useful technical information not only for this particular computation but also for all applications that use similar computation kernels.
Pages: 34-43
Page count: 10