Efficient GPU-accelerated parallel cross-correlation

Cited by: 0
Authors
Madera, Karel [1 ]
Smelko, Adam [1 ]
Krulis, Martin [1 ]
Affiliations
[1] Charles Univ Prague, Dept Distributed & Dependable Syst, Prague, Czech Republic
Keywords
Cross-correlation; GPU; CUDA; Parallel; Algorithm; Caching; Optimizations;
DOI
10.1016/j.jpdc.2025.105054
Chinese Library Classification (CLC) number
TP301 [Theory, methods];
Subject classification code
081202;
Abstract
Cross-correlation is a data analysis method widely used in signal processing and similarity-search applications. Our objective is to design a highly optimized GPU-accelerated implementation that speeds up these applications and also improves energy efficiency, since GPUs are more efficient than CPUs in data-parallel tasks. There are two basic ways to compute cross-correlation: a definition-based algorithm that tries all possible overlaps, and an algorithm based on the Fourier transform, which is much more complex but has better asymptotic time complexity. We focused mainly on the definition-based approach, which is better suited to smaller input data, and implemented multiple CUDA-enabled algorithms with multiple optimization options. The algorithms were evaluated on various scenarios, including the most typical types of multi-signal correlations, and we provide empirically verified optimal solutions for each of the studied scenarios.
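For readers unfamiliar with the definition-based approach named in the abstract, the following is a minimal CPU sketch in plain Python, assuming one-dimensional real-valued inputs and a "full" output (every shift with at least one overlapping element). It is not the authors' CUDA implementation; the function name `cross_correlate` and the shift convention are illustrative assumptions.

```python
def cross_correlate(a, b):
    """Definition-based 1D cross-correlation (illustrative sketch).

    For every shift s at which the two sequences overlap, sum the
    products of the overlapping elements: out = sum_i a[i + s] * b[i].
    Returns len(a) + len(b) - 1 values, ordered from the most negative
    shift, s = -(len(b) - 1), to the most positive, s = len(a) - 1.
    """
    n, m = len(a), len(b)
    out = []
    for s in range(-(m - 1), n):
        # Restrict i so that both b[i] and a[i + s] are in bounds.
        lo = max(0, -s)
        hi = min(m, n - s)
        total = 0
        for i in range(lo, hi):
            total += a[i + s] * b[i]
        out.append(total)
    return out


if __name__ == "__main__":
    print(cross_correlate([1, 2, 3], [0, 1, 2]))  # [2, 5, 8, 3, 0]
```

Each iteration of the outer loop is independent of the others, which is the property that makes this formulation data-parallel. The cost is proportional to the product of the two input lengths, which is why the Fourier-based alternative has better asymptotic complexity for large inputs.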
Pages: 16