Optimizing image processing on multi-core CPUs with Intel parallel programming technologies

Cited by: 21
Authors
Kim, Cheong Ghil [1 ]
Kim, Jeom Goo [1 ]
Lee, Do Hyeon [2 ]
Affiliations
[1] Namseoul Univ, Dept Comp Sci, Cheonan 331707, Choongnam, South Korea
[2] Namseoul Univ, IT Convergence Technol Res & Educ Ctr, Cheonan 331707, Choongnam, South Korea
Keywords
Multi-core; Streaming SIMD extension; Threading building block; Sobel operator; Sub-word parallelism; Task-level parallelism; Multimedia; EXTENSION;
DOI
10.1007/s11042-011-0906-y
CLC number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
The rapid advance of computer hardware and the popularity of multimedia applications have made multi-core processors with sub-word parallelism instructions a dominant market trend in desktop PCs as well as high-end mobile devices. This paper presents an efficient parallel implementation of the 2D convolution algorithm, which demands high-performance computing power, on multi-core desktop PCs. 2D convolution is a representative computation-intensive algorithm in image and signal processing applications; it is accompanied by heavy memory access, although its computational complexity is relatively low. The purpose of this study is to explore the effectiveness of exploiting the streaming SIMD (Single Instruction Multiple Data) extension (SSE) technology and the TBB (Threading Building Block) run-time library on Intel multi-core processors. By doing so, we can take advantage of all the hardware features of a multi-core processor concurrently for data- and task-level parallelism. For the performance evaluation, we implemented a 3x3 kernel based convolution algorithm using SSE2 and TBB in different combinations and compared their processing speeds. The experimental results show that both technologies have a significant effect on performance and that the processing speed can be greatly improved when the two technologies are used at the same time; for example, speedups of 6.2, 6.1, and 1.4 times compared with the implementation using either of them alone are reported for the 256x256, 512x512, and 1024x1024 data sets, respectively.
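The combination described in the abstract (SSE2 for data-level parallelism inside a row, a task per band of rows for task-level parallelism) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it computes only the horizontal Sobel response, stores pixels as 16-bit integers so that eight of them fit in one SSE2 register, and uses `std::thread` as a stand-in for TBB so the sketch has no external dependency. The function names `sobel_gx_scalar`, `sobel_gx_rows`, and `sobel_gx_parallel` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>
#include <emmintrin.h>  // SSE2 intrinsics

// Scalar reference: horizontal Sobel response at interior pixel (x, y).
// Kernel rows: [-1 0 1], [-2 0 2], [-1 0 1].
static inline int16_t sobel_gx_scalar(const int16_t* img, int w, int x, int y) {
    const int16_t* r0 = img + (y - 1) * w;
    const int16_t* r1 = img + y * w;
    const int16_t* r2 = img + (y + 1) * w;
    int g = (r0[x + 1] - r0[x - 1]) + 2 * (r1[x + 1] - r1[x - 1]) +
            (r2[x + 1] - r2[x - 1]);
    return static_cast<int16_t>(g);
}

// Data-level parallelism: process rows [y0, y1), eight pixels per SSE2 step.
// Assumes pixel values in 0..255, so every result fits in 16 bits.
void sobel_gx_rows(const int16_t* img, int16_t* out, int w, int y0, int y1) {
    for (int y = y0; y < y1; ++y) {
        const int16_t* r0 = img + (y - 1) * w;
        const int16_t* r1 = img + y * w;
        const int16_t* r2 = img + (y + 1) * w;
        int x = 1;
        for (; x + 8 <= w - 1; x += 8) {
            // right neighbours minus left neighbours, for each of the 3 rows
            __m128i d0 = _mm_sub_epi16(
                _mm_loadu_si128(reinterpret_cast<const __m128i*>(r0 + x + 1)),
                _mm_loadu_si128(reinterpret_cast<const __m128i*>(r0 + x - 1)));
            __m128i d1 = _mm_sub_epi16(
                _mm_loadu_si128(reinterpret_cast<const __m128i*>(r1 + x + 1)),
                _mm_loadu_si128(reinterpret_cast<const __m128i*>(r1 + x - 1)));
            __m128i d2 = _mm_sub_epi16(
                _mm_loadu_si128(reinterpret_cast<const __m128i*>(r2 + x + 1)),
                _mm_loadu_si128(reinterpret_cast<const __m128i*>(r2 + x - 1)));
            // middle row has weight 2 (shift left by one bit)
            __m128i gx = _mm_add_epi16(_mm_add_epi16(d0, d2),
                                       _mm_slli_epi16(d1, 1));
            _mm_storeu_si128(reinterpret_cast<__m128i*>(out + y * w + x), gx);
        }
        // scalar tail for the pixels that do not fill a whole register
        for (; x < w - 1; ++x) out[y * w + x] = sobel_gx_scalar(img, w, x, y);
    }
}

// Task-level parallelism: split interior rows into disjoint bands, one thread
// per band (std::thread here; a TBB parallel loop would play the same role).
void sobel_gx_parallel(const int16_t* img, int16_t* out, int w, int h,
                       int n_threads) {
    assert(w >= 3 && h >= 3 && n_threads > 0);
    std::vector<std::thread> pool;
    const int rows = h - 2;  // border rows are left untouched
    for (int t = 0; t < n_threads; ++t) {
        int y0 = 1 + rows * t / n_threads;
        int y1 = 1 + rows * (t + 1) / n_threads;
        pool.emplace_back(sobel_gx_rows, img, out, w, y0, y1);
    }
    for (auto& th : pool) th.join();
}
```

Because each band writes to a disjoint set of output rows, the threads need no locking; border pixels are simply not written in this sketch.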
Pages: 237-251
Number of pages: 15
Related papers
50 in total
  • [1] Optimizing image processing on multi-core CPUs with Intel parallel programming technologies
    Cheong Ghil Kim
    Jeom Goo Kim
    Do Hyeon Lee
    Multimedia Tools and Applications, 2014, 68 : 237 - 251
  • [2] A Parallel Dynamic Programming Algorithm on a Multi-core Architecture
    Tan, Guangming
    Sun, Ninghui
    Gao, Guang R.
    SPAA'07: PROCEEDINGS OF THE NINETEENTH ANNUAL SYMPOSIUM ON PARALLELISM IN ALGORITHMS AND ARCHITECTURES, 2007, : 135 - +
  • [3] A Parallel Packet Processing Method On Multi-Core Systems
    Li, Yunchun
    Qiao, Xinxin
    2011 TENTH INTERNATIONAL SYMPOSIUM ON DISTRIBUTED COMPUTING AND APPLICATIONS TO BUSINESS, ENGINEERING AND SCIENCE (DCABES), 2011, : 78 - 81
  • [4] Design of Parallel Algorithms for Super Long Integer Operation Based on Multi-core CPUs
    Zhang, Shifeng
    Su, Shenghui
    2015 11TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND SECURITY (CIS), 2015, : 335 - 339
  • [5] A Markup Language for Parallel Programming Model on Multi-Core System
    Zhang Yingqian
    Sun Bin
    Liu Jia
    2009 INTERNATIONAL CONFERENCE ON SCALABLE COMPUTING AND COMMUNICATIONS & EIGHTH INTERNATIONAL CONFERENCE ON EMBEDDED COMPUTING, 2009, : 640 - +
  • [6] Multi-core CPUs, Clusters, and Grid Computing: A Tutorial
    Creel, Michael
    Goffe, William L.
    COMPUTATIONAL ECONOMICS, 2008, 32 (04) : 353 - 382
  • [7] Leveraging Multi-Core CPUs in the Context of Demand Planning
    Tinnefeld, Christian
    Mueller, Stephan H.
    Krueger, Jens
    Grund, Martin
    Zeier, Alexander
    2009 IEEE 16TH INTERNATIONAL CONFERENCE ON INDUSTRIAL ENGINEERING AND ENGINEERING MANAGEMENT, VOLS 1 AND 2, PROCEEDINGS, 2009, : 2007 - 2011
  • [8] Multi-core CPUs, Clusters, and Grid Computing: A Tutorial
    Michael Creel
    William L. Goffe
    Computational Economics, 2008, 32
  • [9] Efficient Implementation of XPath Processor on Multi-Core CPUs
    Krulis, Martin
    Yaghob, Jakub
    PROCEEDINGS OF THE DATESO 2010 WORKSHOP - DATESO DATABASES, TEXTS, SPECIFICATIONS, AND OBJECTS, 2010, 567 : 60 - 71
  • [10] Comparative analysis of debugging tools in parallel programming for multi-core processors
    Shipunov, Valeriy
    Gavryushenko, Andrey
    Kuznetsov, Eugene
    2007 PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON THE EXPERIENCE OF DESIGNING AND APPLICATION OF CAD SYSTEMS IN MICROELECTRONICS, 2007, : 426 - 428