Image processing using one-dimensional processor arrays

Cited by: 23
Authors
Hammerstrom, DW
Lulich, DP
Affiliation
[1] Adaptive Solutions, Inc., Beaverton
Keywords
DOI
10.1109/5.503300
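Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808 ; 0809 ;
Abstract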
Image processing (IP) is an ideal candidate for specialized architectures because of the sheer volume of data, the natural parallelism of IP algorithms, and the high demand for IP solutions. The continually increasing circuit density of very large scale integration (VLSI) now makes it possible to integrate a complete parallel computer onto a single piece of silicon, significantly improving cost-performance for IP applications. The widespread use of IP applications such as desktop publishing and multimedia, and the limited performance of even the highest speed microprocessors on these tasks, reveal financial incentives for creating a specialized IP architecture. However, choosing the correct architecture requires the resolution of several complex design trade-offs. The first half of this paper presents the design rationale for CNAPS(TM), a specialized one-dimensional (1-D) processor array developed by Adaptive Solutions Inc. In this context, we discuss the problem of Amdahl's law, which severely constrains special-purpose architectures. We also discuss specific architectural decisions such as the kind of parallelism, the computational precision of the processors, on-chip versus off-chip processor memory, and, most importantly, the interprocessor communication architecture. We argue that, for our particular set of applications, a 1-D architecture gives the best "bang for the buck" even when compared to the more traditional two-dimensional (2-D) architecture. The rectangular structure of an image intuitively suggests that IP algorithms map efficiently to a 2-D processor array. Traditional IP architectures hence have consisted of parallel arrays of processors organized in 2-D grids. The configuration is often assumed to be "optimal" when the number of processors is equal to the number of image pixels and when each processor is interconnected with its eight nearest neighbors in a rectangular array.
Such one-to-one configurations are almost always too expensive to deploy. The number of processors is usually much smaller than the number of pixels. Under these conditions, intuitions about optimal mappings and topologies begin to break down. In our application domains, where the number of pixels greatly exceeds the number of processors, and for our target applications, a 1-D array offers the same performance as a 2-D array, usually at a lower cost. The second half of this paper describes how several simple algorithms map to the CNAPS array. Our results show that the CNAPS 1-D array offers excellent performance over a range of IP algorithms. We also briefly look at the performance of CNAPS as a pattern recognition engine because many image processing and pattern recognition problems are intimately related.
Pages: 1005-1018
Number of pages: 14
Related papers
21 in total
[1]  
Baxes G.A., 1994, DIGITAL IMAGE PROCES
[2]  
Cantoni V., 1983, Computing Structures for Image Processing. Workshop on Multicomputers and Image Processing, P43
[3]  
GRIFFIN M, 1991, ISSCC DIG TECH PAPER, P180
[4]  
Hammerstrom D, 1993, PARALLEL DIGITAL IMP, P107
[5]  
HAMMERSTROM D, 1991, VLSI ARTIFICIAL INTE
[6]   EFFICIENT IMAGE-PROCESSING ALGORITHMS ON THE SCAN LINE ARRAY PROCESSOR [J].
HELMAN, D ;
JAJA, J .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1995, 17 (01) :47-56
[7]  
HENNESSY J, 1991, COMPUTER ARCHITECTUR
[8]  
HUTCHESON GD, 1996, SCI AM JAN, P54
[9]  
JONKER PP, 1994, P 12 IAPR INT C PATT, V3, P334
[10]  
KUMA P, 1991, PARALLEL ARCHITECTUR