The 2D wavelet transform on emerging architectures: GPUs and multicores

Cited by: 21
Authors
Franco, Joaquin [1, 3]
Bernabe, Gregorio [1]
Fernandez, Juan [1]
Ujaldon, Manuel [2, 4]
Affiliations
[1] Univ Murcia, Dept Comp Engn, Murcia, Spain
[2] Univ Malaga, Comp Architecture Dept, E-29071 Malaga, Spain
[3] Univ Murcia, Dept Ingn & Tecnol Comp DITEC, Murcia, Spain
[4] Univ Malaga, Sch Comp Engn, E-29071 Malaga, Spain
Keywords
2D fast wavelet transform; Parallel programming; CUDA; Graphics processors; Multicore CPU; High-performance computing
DOI
10.1007/s11554-011-0224-7
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Because of their computational power, today's GPUs are increasingly being harnessed to assist CPUs in high-performance computing. In addition, a growing number of state-of-the-art supercomputers include commodity GPUs to deliver unprecedented levels of performance in terms of raw GFLOPS and GFLOPS/cost. In this work, we present a GPU implementation of an image processing application of growing popularity: the 2D fast wavelet transform (2D-FWT). Based on a pair of Quadrature Mirror Filters, a complete set of application-specific optimizations is developed from a CUDA perspective to achieve outstanding speedup factors over a highly optimized version of the 2D-FWT run on the CPU. An alternative approach based on the Lifting Scheme is also described in Franco et al. (Acceleration of the 2D wavelet transform for CUDA-enabled Devices, 2010). Then, we investigate hardware improvements like multicores on the CPU side, and exploit them through thread-level parallelism using the OpenMP API and pthreads. Overall, the GPU exhibits better scalability and parallel performance on large-scale images, making it a solid alternative for computing the 2D-FWT to those thread-level methods run on emerging multicore architectures.
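For readers unfamiliar with the transform named in the abstract, the following is a minimal single-level sketch of a separable 2D-FWT in plain Python. It is not the authors' CUDA, OpenMP, or pthreads code. The Haar filter pair is an assumption chosen because it is the simplest Quadrature Mirror Filter pair (the abstract does not say which filters the paper uses), and the names `fwt_1d` and `fwt_2d` are hypothetical.

```python
import math

# Haar analysis filters: the simplest quadrature mirror filter (QMF) pair.
# Assumption: used for illustration only; the abstract does not name the filters.
S = 1.0 / math.sqrt(2.0)
LOW = (S, S)     # low-pass filter, produces approximation coefficients
HIGH = (S, -S)   # high-pass filter, produces detail coefficients


def fwt_1d(signal):
    """One level of the 1D transform: filter, then downsample by 2.

    Returns the approximation half followed by the detail half.
    The signal length must be even.
    """
    half = len(signal) // 2
    approx = [LOW[0] * signal[2 * i] + LOW[1] * signal[2 * i + 1] for i in range(half)]
    detail = [HIGH[0] * signal[2 * i] + HIGH[1] * signal[2 * i + 1] for i in range(half)]
    return approx + detail


def fwt_2d(image):
    """One level of the separable 2D transform: all rows first, then all columns."""
    rows = [fwt_1d(row) for row in image]
    cols = [fwt_1d(list(col)) for col in zip(*rows)]  # transform each column
    return [list(row) for row in zip(*cols)]          # transpose back to row-major


if __name__ == "__main__":
    for row in fwt_2d([[1.0, 2.0], [3.0, 4.0]]):
        print(row)
```

Each row in the first pass, and each column in the second, is processed independently of the others. That independence is the kind of data parallelism a GPU or a multithreaded CPU implementation can exploit.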
Pages: 145-152
Page count: 8
Related papers
18 entries in total
[1] [Anonymous], 2007, 307776002US INT
[2] [Anonymous], 2003, P ACM SIGGRAPHEUROGR, DOI 10.2312/EGGH.EGGH03.112-119
[3] [Anonymous], 1992, CBMSNSF REGIONAL C S
[4] [Anonymous], 2016, Programming massively parallel processors: a hands-on approach
[5] Bernabe G., 2000, IEEE EMBS INT C INF
[6] Franco J., 2010, 10 PARA 2010 STAT AR
[7] Franco J., 2010, 10 INT C COMP SCI 2
[8] Govindaraju N., 2008, P SUP 2008 AUST TX U
[9] Mallat, S. G. A theory for multiresolution signal decomposition: the wavelet representation [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1989, 11(07): 674-693
[10] Meerwald, P.; Norcen, R.; Uhl, A. Cache issues with JPEG-2000 wavelet lifting [J]. Visual Communications and Image Processing 2002, Pts 1 and 2, 2002, 4671: 626-634