Multimedia processor-based implementation of an error-diffusion halftoning algorithm exploiting subword parallelism

Cited by: 13
Authors
Ahn, JW [1]
Sung, W [1]
Affiliation
[1] Seoul Natl Univ, Sch Elect Engn, Kwanak Gu, Seoul 151742, South Korea
Keywords
error-diffusion halftoning algorithm; multimedia processor; Pentium MMX; subword parallelism
DOI
10.1109/76.905980
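Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic and communication technology];
Discipline classification code
0808; 0809;
Abstract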
Multimedia processor-based implementation of digital image processing algorithms has become important because several multimedia processors, such as the Intel Pentium MMX, are now available and, thanks to their flexibility, can replace special-purpose hardware-based systems. Multimedia processors increase throughput by processing multiple pixels simultaneously with a subword-parallel arithmetic and logic unit architecture. The error-diffusion halftoning algorithm feeds back quantized output signals to faithfully convert a multi-level image to a binary image or to one with fewer quantization levels. This feedback makes it difficult to achieve speedup with the multimedia extension. In this study, the error-diffusion halftoning algorithm is implemented for a multimedia processor using three methods: single-pixel, single-line, and multiple-line processing. The single-pixel approach is the closest to conventional implementations, but it uses the multimedia extension only in the filter kernel. The single-line approach computes multiple pixels in one scan-line simultaneously, but requires a complex algorithm transformation to remove dependencies between pixels. The multiple-line method exploits parallelism by employing a skewed data structure and processing multiple pixels in different scan-lines. The Pentium MMX instruction set is used for quantitative performance evaluation, including run-time overheads and misaligned memory accesses. For the structurally sequential error-diffusion halftoning algorithm, a speedup of more than ten times is achieved compared to the software (integer C) implementation on a conventional processor.
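The dependency problem and the multiple-line idea described in the abstract can be illustrated with a minimal scalar sketch in C. Several things here are assumptions and are not taken from the paper: the Floyd-Steinberg weights (7, 1, 5, 3 over 16), the fixed threshold of 128, a skew of two columns per scan-line, and the function names `process_pixel`, `halftone_raster`, and `halftone_skewed`. The sketch contains no MMX code. It only shows that pixels visited on the same skewed wavefront read errors produced on earlier wavefronts, so they could in principle be packed into subword lanes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Error of an already processed pixel, or 0 outside the image. */
static int err_at(const int *err, int w, int r, int c)
{
    return (r < 0 || c < 0 || c >= w) ? 0 : err[r * w + c];
}

/* Quantize pixel (r, c) to 0 or 255 after adding the weighted errors of
   its left and upper neighbours (assumed Floyd-Steinberg weights). */
static void process_pixel(const uint8_t *in, uint8_t *out, int *err,
                          int w, int r, int c)
{
    int sum = 7 * err_at(err, w, r, c - 1)       /* left        */
            + 1 * err_at(err, w, r - 1, c - 1)   /* upper left  */
            + 5 * err_at(err, w, r - 1, c)       /* upper       */
            + 3 * err_at(err, w, r - 1, c + 1);  /* upper right */
    int v = in[r * w + c] + sum / 16;
    int q = (v >= 128) ? 255 : 0;
    out[r * w + c] = (uint8_t)q;
    err[r * w + c] = v - q;   /* quantization error fed back to later pixels */
}

/* Conventional raster order: each pixel waits for its left neighbour. */
void halftone_raster(const uint8_t *in, uint8_t *out, int w, int h)
{
    assert(w > 0 && h > 0);
    int *err = calloc((size_t)w * (size_t)h, sizeof *err);
    assert(err != NULL);
    for (int r = 0; r < h; r++)
        for (int c = 0; c < w; c++)
            process_pixel(in, out, err, w, r, c);
    free(err);
}

/* Skewed order: wavefront t holds the pixels with c + 2*r == t. Every
   neighbour read by process_pixel lies on wavefront t-1, t-2, or t-3,
   so the iterations of the inner loop do not depend on each other. */
void halftone_skewed(const uint8_t *in, uint8_t *out, int w, int h)
{
    assert(w > 0 && h > 0);
    int *err = calloc((size_t)w * (size_t)h, sizeof *err);
    assert(err != NULL);
    for (int t = 0; t <= (w - 1) + 2 * (h - 1); t++)
        for (int r = 0; r < h; r++) {
            int c = t - 2 * r;
            if (c >= 0 && c < w)
                process_pixel(in, out, err, w, r, c);
        }
    free(err);
}
```

Both functions take a row-major 8-bit grayscale image and are expected to produce the same halftone, since the skewed order only changes when each pixel is visited, not which errors it receives. In this sketch the inner loop of `halftone_skewed` is the part a subword-parallel implementation would map onto packed lanes.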
Pages: 129-138
Number of pages: 10