Optimizing image processing on multi-core CPUs with Intel parallel programming technologies

Cited by: 21
Authors
Kim, Cheong Ghil [1]
Kim, Jeom Goo [1]
Lee, Do Hyeon [2]
Affiliations
[1] Namseoul Univ, Dept Comp Sci, Cheonan 331707, Choongnam, South Korea
[2] Namseoul Univ, IT Convergence Technol Res & Educ Ctr, Cheonan 331707, Choongnam, South Korea
Keywords
Multi-core; Streaming SIMD extension; Threading building block; Sobel operator; Sub-word parallelism; Task-level parallelism; Multimedia; EXTENSION
DOI
10.1007/s11042-011-0906-y
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Rapid advances in computer hardware and the popularity of multimedia applications have made multi-core processors with sub-word parallelism instructions a dominant market trend in desktop PCs as well as high-end mobile devices. This paper presents an efficient parallel implementation of the 2D convolution algorithm, which demands high-performance computing power, on multi-core desktop PCs. 2D convolution is a representative computation-intensive algorithm in image and signal processing applications and is accompanied by heavy memory access; on the other hand, its computational complexity is relatively low. The purpose of this study is to explore the effectiveness of exploiting the Streaming SIMD (Single Instruction Multiple Data) Extensions (SSE) technology and the TBB (Threading Building Block) run-time library on Intel multi-core processors. Doing so makes it possible to use the hardware features of a multi-core processor concurrently for data- and task-level parallelism. For the performance evaluation, we implemented a 3x3 kernel-based convolution algorithm using SSE2 and TBB in different combinations and compared their processing speeds. The experimental results show that both technologies have a significant effect on performance and that processing speed improves greatly when the two are used together; for example, speedups of 6.2, 6.1, and 1.4 times over implementations using only one of them are reported for the 256x256, 512x512, and 1024x1024 data sets, respectively.
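The combination described in the abstract (SSE2 for data-level parallelism inside a 3x3 kernel, plus a row-wise split for task-level parallelism) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names sobel_pixel, sobel_rows, and sobel_banded are hypothetical, the Sobel magnitude is approximated as |Gx| + |Gy| clamped to 255, and the row bands are run serially here as a stand-in for the independent tasks that a scheduler such as TBB's tbb::parallel_for would distribute across cores.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <vector>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // SSE2 intrinsics
#define HAVE_SSE2 1
#endif

// Scalar reference for one interior pixel: min(|Gx| + |Gy|, 255).
std::uint8_t sobel_pixel(const std::uint8_t* src, int width, int x, int y) {
    const std::uint8_t* r0 = src + (y - 1) * width + x;  // row above
    const std::uint8_t* r1 = r0 + width;                 // current row
    const std::uint8_t* r2 = r1 + width;                 // row below
    int gx = (r0[1] + 2 * r1[1] + r2[1]) - (r0[-1] + 2 * r1[-1] + r2[-1]);
    int gy = (r2[-1] + 2 * r2[0] + r2[1]) - (r0[-1] + 2 * r0[0] + r0[1]);
    int mag = std::abs(gx) + std::abs(gy);
    return static_cast<std::uint8_t>(mag > 255 ? 255 : mag);
}

#ifdef HAVE_SSE2
// Load 8 unsigned bytes and widen them to eight 16-bit lanes.
static inline __m128i load8_u16(const std::uint8_t* p) {
    __m128i bytes = _mm_loadl_epi64(reinterpret_cast<const __m128i*>(p));
    return _mm_unpacklo_epi8(bytes, _mm_setzero_si128());
}
// SSE2 has no 16-bit abs instruction, so use max(v, -v).
static inline __m128i abs16(__m128i v) {
    return _mm_max_epi16(v, _mm_sub_epi16(_mm_setzero_si128(), v));
}
#endif

// Process interior rows [y_begin, y_end). Each band touches only its own
// output rows, so bands are independent units of work (assumption: this is
// the granularity a task scheduler would be given).
void sobel_rows(const std::uint8_t* src, std::uint8_t* dst,
                int width, int height, int y_begin, int y_end) {
    assert(width >= 3 && height >= 3 && y_begin >= 1 && y_end <= height - 1);
    for (int y = y_begin; y < y_end; ++y) {
        int x = 1;
#ifdef HAVE_SSE2
        const std::uint8_t* r0 = src + (y - 1) * width;
        const std::uint8_t* r1 = r0 + width;
        const std::uint8_t* r2 = r1 + width;
        // Eight output pixels per iteration; the last load reads column x + 8.
        for (; x + 8 <= width - 1; x += 8) {
            __m128i a = load8_u16(r0 + x - 1), b = load8_u16(r0 + x), c = load8_u16(r0 + x + 1);
            __m128i d = load8_u16(r1 + x - 1), f = load8_u16(r1 + x + 1);
            __m128i g = load8_u16(r2 + x - 1), h = load8_u16(r2 + x), i = load8_u16(r2 + x + 1);
            __m128i right  = _mm_add_epi16(_mm_add_epi16(c, i), _mm_slli_epi16(f, 1));
            __m128i left   = _mm_add_epi16(_mm_add_epi16(a, g), _mm_slli_epi16(d, 1));
            __m128i bottom = _mm_add_epi16(_mm_add_epi16(g, i), _mm_slli_epi16(h, 1));
            __m128i top    = _mm_add_epi16(_mm_add_epi16(a, c), _mm_slli_epi16(b, 1));
            __m128i mag = _mm_add_epi16(abs16(_mm_sub_epi16(right, left)),
                                        abs16(_mm_sub_epi16(bottom, top)));
            // Saturating pack clamps each lane to 0..255; store the low 8 bytes.
            _mm_storel_epi64(reinterpret_cast<__m128i*>(dst + y * width + x),
                             _mm_packus_epi16(mag, mag));
        }
#endif
        // Scalar tail (and full fallback when SSE2 is unavailable).
        for (; x < width - 1; ++x) dst[y * width + x] = sobel_pixel(src, width, x, y);
    }
}

// Split the interior rows into `bands` chunks and process each one.
// The loop is serial in this sketch; the bands are what would run in parallel.
std::vector<std::uint8_t> sobel_banded(const std::vector<std::uint8_t>& src,
                                       int width, int height, int bands) {
    std::vector<std::uint8_t> dst(src.size(), 0);  // border pixels stay 0
    const int rows = height - 2;
    for (int b = 0; b < bands; ++b) {
        int y0 = 1 + rows * b / bands;
        int y1 = 1 + rows * (b + 1) / bands;
        sobel_rows(src.data(), dst.data(), width, height, y0, y1);
    }
    return dst;
}
```

Calling sobel_banded(image, width, height, n) on an 8-bit grayscale buffer returns an edge map of the same size. Widening to 16-bit lanes keeps the intermediate sums (at most 4 x 255 per gradient) within range, which is one common way to apply sub-word parallelism to a 3x3 kernel; the paper's own data layout and task granularity may differ.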
Pages: 237-251
Number of pages: 15