Multithreaded implicitly dealiased convolutions

Cited by: 5
Authors
Roberts, Malcolm [1]
Bowman, John C. [2]
Affiliations
[1] Comp Modelling Grp Ltd, 3710 33 St NW, Calgary, AB T2L 2M1, Canada
[2] Univ Alberta, Dept Math & Stat Sci, Edmonton, AB T6G 2G1, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Convolution; Implicit dealiasing; Fast Fourier transform; Multithreading; Parallelization; Pseudospectral method;
DOI
10.1016/j.jcp.2017.11.026
Chinese Library Classification
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Implicit dealiasing is a method for computing in-place linear convolutions via fast Fourier transforms that decouples work memory from input data. It offers easier memory management and, for long one-dimensional input sequences, greater efficiency than conventional zero-padding. Furthermore, for convolutions of multidimensional data, the segregation of data and work buffers can be exploited to reduce memory usage and execution time significantly. This is accomplished by processing and discarding data as it is generated, allowing work memory to be reused, for greater data locality and performance. A multithreaded implementation of implicit dealiasing that accepts an arbitrary number of input and output vectors and a general multiplication operator is presented, along with an improved one-dimensional Hermitian convolution that avoids the loop dependency inherent in previous work. An alternate data format that can accommodate a Nyquist mode and enhance cache efficiency is also proposed. (c) 2017 Elsevier Inc. All rights reserved.
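The core idea in the abstract can be illustrated with a minimal sketch, which is not the authors' multithreaded implementation. It assumes complex (non-Hermitian) one-dimensional inputs of equal power-of-two length m and computes the first m terms of their linear convolution without ever allocating a zero-padded array of length 2m: the even- and odd-indexed modes of the padded transform are obtained from two separate length-m transforms. The function names (`fft`, `ifft`, `implicit_convolution`, `direct_convolution`) are hypothetical, and the small recursive FFT is included only to keep the example self-contained.

```python
import cmath


def fft(x, sign=-1):
    """Unnormalized recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], sign)
    odd = fft(x[1::2], sign)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(x):
    """Inverse FFT, normalized by 1/n."""
    n = len(x)
    return [v / n for v in fft(x, sign=+1)]


def direct_convolution(f, g):
    """Reference O(m^2) sum: h[j] = sum_{p=0}^{j} f[p] * g[j-p]."""
    return [sum(f[p] * g[j - p] for p in range(j + 1)) for j in range(len(f))]


def implicit_convolution(f, g):
    """Dealiased convolution of two length-m sequences via length-m FFTs only.

    The zero padding to length 2m is implicit: even modes of the padded
    transform equal FFT_m(f); odd modes equal FFT_m(f[j] * zeta**j),
    where zeta = exp(-i*pi/m).
    """
    m = len(f)
    zeta = [cmath.exp(-1j * cmath.pi * j / m) for j in range(m)]
    # Forward transforms: even and odd modes of the (never built) padded arrays.
    f_even, g_even = fft(f), fft(g)
    f_odd = fft([a * z for a, z in zip(f, zeta)])
    g_odd = fft([b * z for b, z in zip(g, zeta)])
    # Pointwise multiplication in Fourier space, then inverse transforms.
    h_even = ifft([a * b for a, b in zip(f_even, g_even)])
    h_odd = ifft([a * b for a, b in zip(f_odd, g_odd)])
    # Recombine, undoing the twiddle factor on the odd contribution.
    return [0.5 * (e + o / z) for e, o, z in zip(h_even, h_odd, zeta)]
```

For example, `implicit_convolution([1, 2, 3, 4], [5, 6, 7, 8])` agrees with `direct_convolution` on the same inputs up to rounding error. The sketch allocates fresh lists at every step; the in-place operation, work-buffer reuse, and multithreading described in the abstract are not attempted here.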
Pages: 98-114
Page count: 17
Related papers
50 in total
  • [31] Active replication of multithreaded applications
    Basile, C
    Kalbarczyk, Z
    Iyer, RK
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2006, 17 (05) : 448 - 465
  • [32] WEIGHTED INEQUALITIES FOR CONVOLUTIONS
    ANDERSEN, KF
    PROCEEDINGS OF THE AMERICAN MATHEMATICAL SOCIETY, 1995, 123 (04) : 1129 - 1136
  • [33] Performance evaluation of a multithreaded fast Fourier transform algorithm for derivative pricing
    Thulasiram, RK
    Thulasiraman, P
    JOURNAL OF SUPERCOMPUTING, 2003, 26 (01) : 43 - 58
  • [34] Performance Evaluation of a Multithreaded Fast Fourier Transform Algorithm for Derivative Pricing
    Ruppa K. Thulasiram
    Parimala Thulasiraman
    The Journal of Supercomputing, 2003, 26 : 43 - 58
  • [35] Efficient Partitioning of Algorithms for Long Convolutions and their Mapping onto Architectures
    Laurens Bierens
    Ed Deprettere
    Journal of VLSI signal processing systems for signal, image and video technology, 1998, 18 : 51 - 64
  • [36] EXPONENTIAL DEFICIENCY OF CONVOLUTIONS OF DENSITIES
    Pinelis, Iosif
    ESAIM-PROBABILITY AND STATISTICS, 2012, 16 : 86 - 96
  • [37] Seeing Multithreaded Behavior Using TSGL
    Adams, Joel C.
    Crain, Patrick A.
    Dilley, Christopher P.
    2016 IEEE 30TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2016, : 972 - +
  • [38] Analyzing Lock Contention in Multithreaded Applications
    Tallent, Nathan R.
    Mellor-Crummey, John M.
    Porterfield, Allan
    ACM SIGPLAN NOTICES, 2010, 45 (05) : 269 - 279
  • [39] Parallel Multithreaded Medical Images Filtering
    Gancheva, Veska
    2021 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND COMPUTATIONAL INTELLIGENCE (CSCI 2021), 2021, : 1788 - 1793
  • [40] The Coming Wave of Multithreaded Chip Multiprocessors
    James Laudon
    Lawrence Spracklen
    International Journal of Parallel Programming, 2007, 35 : 299 - 330