Efficient GPU-Implementation of H-P Sort Based on Improved Histogram Computation

被引:0
作者
Takase, Kaito [1 ]
Hagihara, Takumi [2 ]
Fujimoto, Noriyuki [2 ]
Wada, Koichi [1 ]
机构
[1] Hosei Univ, Chiyoda City, Japan
[2] Osaka Metropolitan Univ, Osaka, Japan
来源
THE PROCEEDINGS OF INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING IN ASIA-PACIFIC REGION, HPC ASIA 2024 | 2024年
关键词
H-P sort; Integer sorting; parallel sorting; Histogram; Prefix sum; GPU;
D O I
10.1145/3635035.3635051
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
We present an enhanced GPU implementation of the H-P sort algorithm, which is a widely used method for integer sorting based on histogram computation and prefix sum calculation. This work extends a previous high-performance GPU version of the algorithm, presented at the 2021 ICPP international conference. Through algorithmic refinements and optimizations, we demonstrate significant performance improvements compared to the conventional H-P sort. It achieves a maximum speedup of 3.45x over conventional H-P Sort. Furthermore, in scenarios where the H-P sort had previously lagged behind the fastest available library, our improved implementation now surpasses it in terms of execution speed. This advancement in GPU-based sorting techniques contributes to more efficient data processing in various computational applications, emphasizing the practical significance of this work.
引用
收藏
页码:134 / 144
页数:11
相关论文
共 16 条
[1]  
[Anonymous], 2010, Introduction to Algorithms
[2]  
[Anonymous], 1998, Sorting and Searching
[3]  
Blelloch Guy E, 1990, Prefix sums and their applications
[4]   O(log*n) algorithms on a Sum-CRCW PRAM [J].
Eisenstat, S. C. .
COMPUTING, 2007, 79 (01) :93-97
[5]  
Faujdar N., 2016, INDIAN J SCI TECHNOL, V9, P1, DOI DOI 10.17485/ijst/2016/v9i15/80080
[6]   An optimized approach to histogram computation on GPU [J].
Gomez-Luna, Juan ;
Maria Gonzalez-Linares, Jose ;
Ignacio Benavides, Jose ;
Guil, Nicolas .
MACHINE VISION AND APPLICATIONS, 2013, 24 (05) :899-908
[7]   Compiling Generalized Histograms for GPU [J].
Henriksen, Troels ;
Hellfritzsch, Sune ;
Sadayappan, Ponnuswamy ;
Oancea, Cosmin .
PROCEEDINGS OF SC20: THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SC20), 2020,
[8]   Design and implementation of an efficient integer count sort in CUDA GPUs [J].
Kolonias, Vasileios ;
Voyiatzis, Artemios G. ;
Goulas, George ;
Housos, Efthymios .
CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2011, 23 (18) :2365-2381
[9]  
Koppaka S, 2010, Arxiv, DOI arXiv:1011.0235
[10]   Efficient GPU-Implementation for Integer Sorting Based on Histogram and Prefix-Sums [J].
Kozakai, Seiya ;
Fujimoto, Noriyuki ;
Wada, Koichi .
50TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, 2021,