Optimizing image processing on multi-core CPUs with Intel parallel programming technologies

Authors
Cheong Ghil Kim
Jeom Goo Kim
Do Hyeon Lee
Affiliations
[1] Namseoul University, Department of Computer Science
[2] Namseoul University, IT Convergence Technology Research & Education Center
Source
Multimedia Tools and Applications | 2014 / Vol. 68
Keywords
Multi-core; Streaming SIMD extension; Threading building block; Sobel operator; Sub-word parallelism; Task-level parallelism; Multimedia
DOI
Not available
Abstract
Rapid advances in computer hardware and the popularity of multimedia applications have made multi-core processors with sub-word parallelism instructions a dominant market trend in desktop PCs as well as high-end mobile devices. This paper presents an efficient parallel implementation of the 2D convolution algorithm, which demands high-performance computing power, on multi-core desktop PCs. 2D convolution is a representative computation-intensive algorithm in image and signal processing applications: it is accompanied by heavy memory access, although its computational complexity is relatively low. The purpose of this study is to explore the effectiveness of exploiting the Streaming SIMD (Single Instruction, Multiple Data) Extensions (SSE) technology and the TBB (Threading Building Blocks) run-time library on Intel multi-core processors. Doing so takes advantage of all the hardware features of a multi-core processor concurrently, for both data- and task-level parallelism. For the performance evaluation, we implemented a 3 × 3 kernel-based convolution algorithm using SSE2 and TBB in different combinations and compared their processing speeds. The experimental results show that both technologies have a significant effect on performance and that the processing speed can be greatly improved when the two technologies are used together; for example, speedups of 6.2, 6.1, and 1.4 times over the implementation using either technology alone are reported for the 256 × 256, 512 × 512, and 1024 × 1024 data sets, respectively.
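The combination described in the abstract, sub-word (SSE2) parallelism within a row plus task-level parallelism across rows, can be illustrated with a small sketch. This is not the authors' implementation: the function names (`sobel_pixel`, `sobel_rows`, `sobel_parallel`), the |Gx| + |Gy| magnitude, the zeroed border, and the use of `std::thread` as a stand-in for the TBB run-time are all assumptions made for illustration. A scalar path is used where SSE2 is not available.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <thread>
#include <vector>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // SSE2 intrinsics
#define HAVE_SSE2 1
#endif

// Scalar reference for one interior pixel: min(|Gx| + |Gy|, 255).
static inline uint8_t sobel_pixel(const uint8_t* img, int w, int x, int y) {
    const uint8_t* p = img + static_cast<std::ptrdiff_t>(y - 1) * w + x;  // row above
    const uint8_t* c = p + w;                                             // current row
    const uint8_t* n = c + w;                                             // row below
    int gx = (p[1] + 2 * c[1] + n[1]) - (p[-1] + 2 * c[-1] + n[-1]);
    int gy = (n[-1] + 2 * n[0] + n[1]) - (p[-1] + 2 * p[0] + p[1]);
    return static_cast<uint8_t>(std::min(std::abs(gx) + std::abs(gy), 255));
}

// Processes interior pixels of rows [y0, y1). With SSE2, eight pixels are
// handled per iteration in 16-bit lanes; leftovers use the scalar path.
static void sobel_rows(const uint8_t* src, uint8_t* dst, int w, int y0, int y1) {
    for (int y = y0; y < y1; ++y) {
        int x = 1;
#ifdef HAVE_SSE2
        const __m128i zero = _mm_setzero_si128();
        auto load8 = [&](const uint8_t* q) {  // 8 bytes -> 8 x 16-bit lanes
            return _mm_unpacklo_epi8(
                _mm_loadl_epi64(reinterpret_cast<const __m128i*>(q)), zero);
        };
        auto abs16 = [&](__m128i v) {  // SSE2 has no abs: max(v, -v)
            return _mm_max_epi16(v, _mm_sub_epi16(zero, v));
        };
        for (; x + 8 <= w - 1; x += 8) {
            const uint8_t* p = src + static_cast<std::ptrdiff_t>(y - 1) * w + x;
            const uint8_t* c = p + w;
            const uint8_t* n = c + w;
            __m128i pl = load8(p - 1), pc = load8(p), pr = load8(p + 1);
            __m128i cl = load8(c - 1), cr = load8(c + 1);
            __m128i nl = load8(n - 1), nc = load8(n), nr = load8(n + 1);
            __m128i right = _mm_add_epi16(_mm_add_epi16(pr, nr), _mm_slli_epi16(cr, 1));
            __m128i left = _mm_add_epi16(_mm_add_epi16(pl, nl), _mm_slli_epi16(cl, 1));
            __m128i bottom = _mm_add_epi16(_mm_add_epi16(nl, nr), _mm_slli_epi16(nc, 1));
            __m128i top = _mm_add_epi16(_mm_add_epi16(pl, pr), _mm_slli_epi16(pc, 1));
            __m128i mag = _mm_add_epi16(abs16(_mm_sub_epi16(right, left)),
                                        abs16(_mm_sub_epi16(bottom, top)));
            // packus saturates each 16-bit lane to the 0..255 range.
            _mm_storel_epi64(
                reinterpret_cast<__m128i*>(dst + static_cast<std::ptrdiff_t>(y) * w + x),
                _mm_packus_epi16(mag, zero));
        }
#endif
        for (; x < w - 1; ++x)
            dst[static_cast<std::ptrdiff_t>(y) * w + x] = sobel_pixel(src, w, x, y);
    }
}

// Splits the interior rows into contiguous chunks, one std::thread per chunk
// (an assumption standing in for a TBB parallel loop). Border pixels stay 0.
std::vector<uint8_t> sobel_parallel(const std::vector<uint8_t>& src,
                                    int w, int h, int threads) {
    assert(w >= 3 && h >= 3 && src.size() == static_cast<std::size_t>(w) * h);
    std::vector<uint8_t> dst(src.size(), 0);
    threads = std::max(1, threads);
    const int chunk = (h - 2 + threads - 1) / threads;  // rows per task, >= 1
    std::vector<std::thread> pool;
    for (int y0 = 1; y0 < h - 1; y0 += chunk) {
        int y1 = std::min(y0 + chunk, h - 1);
        pool.emplace_back(sobel_rows, src.data(), dst.data(), w, y0, y1);
    }
    for (auto& t : pool) t.join();
    return dst;
}
```

Because each output row depends only on read-only input rows, the row chunks are independent and need no locking. In a TBB-based version the same `sobel_rows` body could plausibly be handed to `tbb::parallel_for` over a `tbb::blocked_range<int>` of rows, leaving chunk sizing to the run-time.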
Pages: 237-251
Page count: 14