Optimizing image processing on multi-core CPUs with Intel parallel programming technologies

Cited by: 0
Authors
Cheong Ghil Kim
Jeom Goo Kim
Do Hyeon Lee
Affiliations
[1] Namseoul University, Department of Computer Science
[2] Namseoul University, IT Convergence Technology Research & Education Center
Source
Multimedia Tools and Applications | 2014 / Vol. 68
Keywords
Multi-core; Streaming SIMD extension; Threading building block; Sobel operator; Sub-word parallelism; Task-level parallelism; Multimedia
DOI
Not available
Abstract
Rapid advances in computer hardware and the popularity of multimedia applications have made multi-core processors with sub-word parallelism instructions the dominant market trend in desktop PCs as well as high-end mobile devices. This paper presents an efficient parallel implementation of the 2D convolution algorithm, which demands high-performance computing power, on multi-core desktop PCs. 2D convolution is a representative computation-intensive algorithm in image and signal processing applications and is accompanied by heavy memory access; its computational complexity, on the other hand, is relatively low. The purpose of this study is to explore the effectiveness of the Streaming SIMD (Single Instruction, Multiple Data) Extensions (SSE) technology and the TBB (Threading Building Blocks) run-time library on Intel multi-core processors. Used together, they exploit the hardware features of a multi-core processor concurrently for data- and task-level parallelism. For the performance evaluation, we implemented a 3 × 3 kernel-based convolution algorithm using SSE2 and TBB in different combinations and compared their processing speeds. The experimental results show that both technologies have a significant effect on performance and that processing speed improves greatly when the two are used at the same time; for example, speedups of 6.2, 6.1, and 1.4 times over implementations using only one of them are reported for the 256 × 256, 512 × 512, and 1024 × 1024 data sets, respectively.
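The combination described in the abstract, sub-word (SIMD) parallelism inside each row plus task-level parallelism across rows, can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the kernel is taken to be the horizontal Sobel operator (suggested only by the keywords), pixels are assumed to be stored as 16-bit samples, `std::thread` stands in for TBB (where `tbb::parallel_for` over a row range would be the closer analogue), and the names `sobel_row` and `sobel_x` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics
#endif

// Horizontal Sobel response for one interior row y of a w-pixel-wide image.
// Assumption: samples are small enough (e.g. 0..255) that sums fit in int16_t.
static void sobel_row(const int16_t* src, int16_t* dst, int w, int y) {
    const int16_t* r0 = src + static_cast<std::size_t>(y - 1) * w;
    const int16_t* r1 = src + static_cast<std::size_t>(y) * w;
    const int16_t* r2 = src + static_cast<std::size_t>(y + 1) * w;
    int16_t* out = dst + static_cast<std::size_t>(y) * w;
    int x = 1;
#if defined(__SSE2__)
    // Sub-word parallelism: eight 16-bit pixels per 128-bit register.
    auto ld = [](const int16_t* p) {
        return _mm_loadu_si128(reinterpret_cast<const __m128i*>(p));
    };
    for (; x + 8 <= w - 1; x += 8) {
        __m128i right = _mm_add_epi16(
            _mm_add_epi16(ld(r0 + x + 1), ld(r2 + x + 1)),
            _mm_slli_epi16(ld(r1 + x + 1), 1));  // middle row weighted by 2
        __m128i left = _mm_add_epi16(
            _mm_add_epi16(ld(r0 + x - 1), ld(r2 + x - 1)),
            _mm_slli_epi16(ld(r1 + x - 1), 1));
        _mm_storeu_si128(reinterpret_cast<__m128i*>(out + x),
                         _mm_sub_epi16(right, left));
    }
#endif
    // Scalar tail (and full fallback when SSE2 is unavailable).
    for (; x < w - 1; ++x) {
        int right = r0[x + 1] + 2 * r1[x + 1] + r2[x + 1];
        int left = r0[x - 1] + 2 * r1[x - 1] + r2[x - 1];
        out[x] = static_cast<int16_t>(right - left);
    }
}

// Task-level parallelism: interior rows are split into contiguous chunks,
// one per worker thread. Border pixels of dst are left untouched.
void sobel_x(const std::vector<int16_t>& src, std::vector<int16_t>& dst,
             int w, int h, unsigned nthreads) {
    assert(w >= 3 && h >= 3);
    assert(src.size() == static_cast<std::size_t>(w) * h);
    assert(dst.size() == src.size());
    nthreads = std::max(1u, nthreads);
    const long long rows = h - 2;
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < nthreads; ++t) {
        const int y0 = 1 + static_cast<int>(rows * t / nthreads);
        const int y1 = 1 + static_cast<int>(rows * (t + 1) / nthreads);
        pool.emplace_back([=, &src, &dst] {
            for (int y = y0; y < y1; ++y)
                sobel_row(src.data(), dst.data(), w, y);
        });
    }
    for (auto& th : pool) th.join();
}
```

Each output row depends only on three read-only input rows, so the row chunks can run without synchronization. A call such as `sobel_x(src, dst, width, height, std::thread::hardware_concurrency())` would use one worker per hardware thread.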
Pages: 237-251
Number of pages: 14