DURE: An Energy- and Resource-Efficient TCAM Architecture for FPGAs With Dynamic Updates

Cited by: 20
Authors
Ullah, Inayat [1]
Ullah, Zahid [2]
Afzaal, Umar [1]
Lee, Jeong-A [1]
Affiliations
[1] Chosun Univ, Dept Comp Engn, Gwangju 61452, South Korea
[2] CECOS Univ IT & Emerging Sci, Dept Elect Engn, Peshawar 25100, Pakistan
Funding
National Research Foundation, Singapore;
Keywords
Dynamic update; field-programmable gate array (FPGA); memory architecture; static random-access memory (SRAM)-based ternary content-addressable memory (TCAM); SRAM; ALGORITHM; DESIGN;
DOI
10.1109/TVLSI.2019.2904105
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
Ternary content-addressable memory (TCAM) designed using static random-access memory (SRAM)-based field-programmable gate arrays (FPGAs) offers promising lookup performance. However, the update process in a TCAM table poses significant challenges for efficiently employing SRAM-based TCAM. SRAM-based TCAM for FPGAs is designed using block RAM or distributed RAM resources in FPGAs. Such designs suspend search operations during an already high-latency update operation, rendering them infeasible in applications that require high-frequency updates. This paper presents a dynamically updateable energy- and resource-efficient TCAM design (DURE) based on FPGAs. DURE exploits the distributed RAM resources in FPGAs. More specifically, the lookup table RAMs (LUTRAMs) available in SLICEM resources are configured as quad-port RAM, which constitutes the basic memory (BM) block in the implementation of DURE. The contents of the TCAM table are divided into chunks of equal size and mapped onto the LUTRAMs of the proposed BM blocks. DURE implements dynamic updates by reconfiguring the LUTRAMs of only those BM blocks that are associated with the word being updated, thereby allowing search and update operations to be performed simultaneously. The design achieves a lookup rate of 335 million lookups per second and an update rate of 5.15 million updates per second for a 512 × 36 TCAM on a Virtex-6 FPGA. Compared with existing SRAM-based TCAMs, DURE has a lower search latency of a single cycle, at least 2.5 times higher energy efficiency, and 67% higher performance per area.
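The chunk-and-map idea in the abstract can be illustrated with a small software model. This is only a sketch of the general SRAM-based TCAM emulation technique, not the DURE hardware design: the class name `ChunkedTcam`, the 6-bit chunk width, and the `'0'`/`'1'`/`'x'` pattern strings are assumptions made for illustration. Each chunk of the key addresses a small RAM whose entry is a bit vector with one bit per TCAM word, and a word matches when its bit is set in every chunk.

```python
from typing import List


class ChunkedTcam:
    """Software model of a RAM-emulated TCAM (illustrative, not the DURE RTL)."""

    def __init__(self, depth: int, width: int, chunk_bits: int = 6) -> None:
        # chunk_bits=6 is an assumed value chosen for this sketch.
        if width % chunk_bits != 0:
            raise ValueError("width must be a multiple of chunk_bits")
        self.depth = depth
        self.width = width
        self.chunk_bits = chunk_bits
        self.num_chunks = width // chunk_bits
        # tables[c][addr] is a depth-bit vector: bit w is set when TCAM
        # word w matches the value addr in chunk c.
        self.tables = [[0] * (1 << chunk_bits) for _ in range(self.num_chunks)]

    def write(self, index: int, pattern: str) -> None:
        """Store a ternary pattern ('0', '1', 'x'; MSB first) as word `index`.

        Only the bit column belonging to `index` is rewritten, so the
        entries of all other words are left untouched.
        """
        if len(pattern) != self.width:
            raise ValueError("pattern length must equal the TCAM width")
        cb = self.chunk_bits
        for c in range(self.num_chunks):
            chunk = pattern[c * cb:(c + 1) * cb]
            for addr in range(1 << cb):
                bits = format(addr, f"0{cb}b")
                hit = all(p in ("x", b) for p, b in zip(chunk, bits))
                if hit:
                    self.tables[c][addr] |= 1 << index
                else:
                    self.tables[c][addr] &= ~(1 << index)

    def search(self, key: int) -> List[int]:
        """Return the indices of all words that match `key`."""
        cb = self.chunk_bits
        match = (1 << self.depth) - 1
        for c in range(self.num_chunks):
            shift = (self.num_chunks - 1 - c) * cb
            addr = (key >> shift) & ((1 << cb) - 1)
            match &= self.tables[c][addr]  # AND the per-chunk match vectors
        return [w for w in range(self.depth) if (match >> w) & 1]
```

In this model a search is one RAM read per chunk followed by a bitwise AND, while a write to a word walks every address of every chunk for that word's bit column. That asymmetry is one way to see why updates are far more expensive than lookups in RAM-emulated TCAMs.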
Pages: 1298-1307
Page count: 10