A High Performance Parallel and Heterogeneous Approach to Narrowband Beamforming

Cited by: 1
Authors
Sarofeen, Christian [1]
Gillett, Philip [2]
Affiliations
[1] Naval Surface Warfare Ctr, Computat Anal & Design, Carderock Div, West Bethesda, MD 20817 USA
[2] Naval Surface Warfare Ctr, Hydroacoust & Propulsor Dev, Carderock Div, West Bethesda, MD 20817 USA
Keywords
Beamforming; delay-sum beamforming; distributed computing; heterogeneous computing; hybrid parallel programming; ALGORITHM; FPGA; CUDA
DOI
10.1109/TPDS.2015.2494038
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
This paper describes a high-performing, hybrid parallel, and heterogeneous algorithmic approach to narrowband Delay-Sum Beamforming (DSB) in the frequency domain using a Just-In-Time Asynchronous Data Method (JIT-ADM) parallel pattern. JIT-ADM is a novel asynchronous parallel programming pattern that unifies the various levels of asynchronous concurrency available in distributed heterogeneous computing. The computational performance of this DSB algorithm was analyzed on a 50-node Cray XC30 with a single 10-core Intel Xeon E5-2670 v2 and an NVIDIA Tesla K20X general-purpose Graphics Processing Unit (GPU) on each node. The algorithm exhibits well-behaved weak scalability, with 92.7 percent parallel efficiency at 50 nodes relative to the maximum performance observed. It is also shown that the algorithm efficiently utilizes a large portion of the available hardware. During beamforming, the GPU is utilized at 51.8 percent of its maximum double-precision floating-point throughput, whereas a comparable Central Processing Unit (CPU) version utilizes 60.0 percent of its maximum expected floating-point throughput. Across the weak scalability study, processing on GPUs achieves a 2-5x performance gain over processing on CPUs. A brief derivation and validation of the implemented DSB is also presented.
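For readers unfamiliar with the technique named in the abstract, the following is a minimal, serial sketch of narrowband delay-sum beamforming in the frequency domain. It is not the authors' implementation and includes none of the JIT-ADM, distributed, or GPU machinery. The line-array geometry, the plane-wave model, the sign convention, the sound speed, and the function names `steering_weights` and `delay_sum_power` are all assumptions made for illustration.

```python
import cmath
import math

SOUND_SPEED = 1500.0  # m/s; assumed nominal value, not taken from the paper


def steering_weights(positions, angle_rad, freq_hz, c=SOUND_SPEED):
    """Phase factors of a plane wave arriving at angle_rad (measured from
    broadside) on a line array with sensors at positions (metres)."""
    k = 2.0 * math.pi * freq_hz / c  # wavenumber at the analysis frequency
    return [cmath.exp(1j * k * x * math.sin(angle_rad)) for x in positions]


def delay_sum_power(snapshot, positions, angles_rad, freq_hz, c=SOUND_SPEED):
    """Beam power for each steering angle.

    snapshot holds one complex Fourier coefficient per sensor, all taken
    from the same frequency bin. In the frequency domain the 'delay' is a
    phase rotation, so each beam is a phase-aligned average of the sensors.
    """
    n = len(positions)
    powers = []
    for theta in angles_rad:
        w = steering_weights(positions, theta, freq_hz, c)
        beam = sum(wi.conjugate() * xi for wi, xi in zip(w, snapshot)) / n
        powers.append(abs(beam) ** 2)
    return powers


if __name__ == "__main__":
    # Hypothetical 16-element array at half-wavelength spacing for 1500 Hz.
    positions = [0.5 * i for i in range(16)]
    angles = [math.radians(a) for a in range(-90, 91)]
    # Noise-free synthetic snapshot: a unit plane wave from 20 degrees.
    snapshot = steering_weights(positions, math.radians(20), 1500.0)
    powers = delay_sum_power(snapshot, positions, angles, 1500.0)
    peak = max(range(len(powers)), key=powers.__getitem__)
    print("peak steering angle (deg):", peak - 90)
```

Each steering angle is independent of the others, which is what makes this kind of computation a natural fit for the data-parallel hardware the abstract describes.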
Pages: 2196-2207
Number of pages: 12