Scalable Fast Multipole Method for Large-Scale Electromagnetic Scattering Problems on Heterogeneous CPU-GPU Clusters
Cited by: 4
Authors:
Vinh Dang [1]
Tran, Nghia [1]
Kilic, Ozlem [1]
Affiliations:
[1] Catholic Univ Amer, Dept Elect Engn & Comp Sci, Washington, DC 20064 USA
Source:
IEEE ANTENNAS AND WIRELESS PROPAGATION LETTERS, 2016, Vol. 15
Keywords:
Fast multipole method;
graphics processing unit;
heterogeneous clusters;
message passing interface (MPI);
portable operating system interface (POSIX);
ALGORITHM;
DOI:
10.1109/LAWP.2016.2537779
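Chinese Library Classification (CLC):
TM [Electrical Engineering];
TN [Electronic Technology, Communication Technology];
Discipline classification codes:
0808;
0809;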
Abstract:
This letter investigates a hybrid framework for the solution of large-scale electromagnetic scattering problems by using the fast multipole method (FMM) on a heterogeneous CPU-GPU system. Enabling the use of both CPU and GPU resources available in the cluster allows for solving significantly larger problem sizes than using only CPU or GPU resources. The performance is evaluated on a 13-node cluster equipped with NVIDIA Tesla M2090 GPUs. The experimental results demonstrate that our FMM implementation on CPU-GPU is up to 72.3x faster than that of the 12-core eight-node CPU implementation. The scalability of the CPU-GPU implementation is very close to the theoretical expectations.