Compact associative-memory architecture with fully parallel search capability for the minimum Hamming distance

Cited by: 39
Authors
Mattausch, HJ [1]
Gyohten, T
Soda, Y
Koide, T
Affiliations
[1] Hiroshima Univ, Res Ctr Nanodevices & Syst, Higashihiroshima 7398527, Japan
[2] Mitsubishi Electr Corp, ULSI Dev Ctr, Itami, Hyogo 6648641, Japan
[3] Sony LSI Design Ltd, Syst Design Dept, Hodogaya Ku, Yokohama, Kanagawa 2400005, Japan
Keywords
associative memory; CAM; CMOS; fully parallel search; Hamming distance; mixed digital/analog circuit; WTA; winner-take-all circuit
DOI
10.1109/4.982428
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics, communication technology]
Discipline classification code
0808; 0809
Abstract
An associative-memory architecture for a fully parallel minimum Hamming distance search is proposed, which uses digital circuitry for bit comparison and fast analog circuitry for word comparison as well as winner-take-all (WTA) functionality. This original approach allows compact, high-performance integration in conventional CMOS technology. First, static encoding of word-comparison results as a current-sink capability reduces word-comparison circuitry to the theoretical minimum, namely, one transistor per bit and one signal line per word. Second, a new WTA principle, which we call self-adapting winner line-up amplification (WLA), automatically regulates the winner-row output into the narrow maximum-gain region of a distance amplifier. Third, winner-search circuit complexity scales linearly with the number of reference words, not quadratically as is inevitable for digital approaches. Due to static distance encoding and WLA regulation, transient noise and fabrication-process variations are largely tolerated. Only relative chip-internal transistor-parameter variations, which create an effective mismatch between matched transistors, limit the correctness of the winner-search result. Practical feasibility is verified by a 0.6-µm 2-poly 3-metal CMOS design with 32 rows and 128 columns, achieving search times below 100 ns and an integration area of 1.57 mm². The only previous work with linear-scaling search circuitry reports just static measurements (no dynamic search times) for a design with 16 rows and 12 columns, i.e., about 20 times lower complexity.
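For readers unfamiliar with the search problem itself, the following is a minimal software sketch of a minimum Hamming distance search. It is not the paper's circuit: the architecture evaluates all rows in parallel with analog circuitry, whereas this model simply loops over the rows. The function names `hamming_distance` and `nearest_reference`, the random reference data, and the 3-bit perturbation are illustrative assumptions; only the 32-row by 128-bit size is taken from the abstract.

```python
import random


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two words (stored as ints) differ."""
    return bin(a ^ b).count("1")


def nearest_reference(search_word: int, reference_words: list[int]) -> tuple[int, int]:
    """Return (row_index, distance) of the reference word closest to search_word.

    Sequential software model only; ties resolve to the lowest row index.
    """
    distances = [hamming_distance(search_word, ref) for ref in reference_words]
    winner = min(range(len(distances)), key=distances.__getitem__)
    return winner, distances[winner]


# Hypothetical data sized like the verified design: 32 rows of 128 bits.
ROWS, BITS = 32, 128
rng = random.Random(0)
references = [rng.getrandbits(BITS) for _ in range(ROWS)]

# Assumed query: row 5 with its three lowest bits flipped.
query = references[5] ^ 0b111
winner, distance = nearest_reference(query, references)
print(winner, distance)
```

In this model the winner selection is the `min` over all row distances, which corresponds functionally to the role of the winner-take-all stage described in the abstract.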
Pages: 218-227
Page count: 10