A Fast Scalable Automaton-Matching Accelerator for Embedded Content Processors

Cited by: 2
Authors
Tseng, Kuo-Kun [1]
Lai, Yuan-Cheng [2]
Lin, Ying-Dar [3]
Lee, Tsern-Huei [4]
Affiliations
[1] Hungkuang Univ, Dept Comp & Informat Engn, Taichung 433, Taiwan
[2] Natl Taiwan Univ Sci & Technol, Dept Informat Management, Taipei 106, Taiwan
[3] Natl Chiao Tung Univ, Dept Comp & Informat Sci, Hsinchu 300, Taiwan
[4] Natl Chiao Tung Univ, Dept Commun Engn, Hsinchu 300, Taiwan
Keywords
Algorithms; Performance; Design; String matching; content filtering; automaton; Aho-Corasick; Bloom filter
DOI
10.1145/1509288.1509291
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
Home and office network gateways often employ a cost-effective embedded network processor to handle their network services. Demand is strong for such gateways to support applications such as intrusion detection, keyword blocking, antivirus, and antispam. Accordingly, we propose fast scalable automaton-matching (FSAM) hardware to accelerate embedded network processors. Although automaton-matching algorithms are robust and have deterministic matching time, there is still plenty of room to improve their average-case performance. FSAM employs novel prehash and root-index techniques to accelerate matching for the nonroot states and the root state, respectively, in automaton-based hardware. The prehash approach uses hash functions to pretest the input substring for the nonroot states, while the root-index approach handles multiple bytes in a single matching step for the root state. FSAM is applied to a prevalent automaton algorithm, Aho-Corasick (AC), which is used in many content-filtering applications. When implemented in an FPGA, FSAM performs at a rate of 11.1 Gbps with a pattern set of 32,634 bytes, demonstrating that the proposed approach can use a small logic circuit to achieve competitive performance, although it uses a larger memory. Furthermore, the number of patterns in FSAM is not limited by the amount of internal circuitry and memory. If high-speed external memories are employed, FSAM can support up to 21,302 patterns while maintaining similarly high performance.
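The prehash idea in the abstract can be illustrated with a small software sketch. This is not the FSAM hardware design, whose details are not given in this record. It is a plain Aho-Corasick matcher in which a Bloom filter (listed in the keywords) pretests whether a nonroot state has a transition on the next byte before the transition table is consulted. The class and function names, the filter size, the number of hash functions, and the choice to key the filter on a (state, byte) pair are all assumptions made for illustration. The root-index technique is not modeled.

```python
from collections import deque
import hashlib


class BloomFilter:
    """Small Bloom filter; the size and hash count are illustrative assumptions."""

    def __init__(self, num_bits=1 << 16, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, key):
        # Derive several bit positions from one digest (double hashing).
        digest = hashlib.blake2b(key, digest_size=16).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:], "little") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def might_contain(self, key):
        return all(self.bits[pos >> 3] & (1 << (pos & 7))
                   for pos in self._positions(key))


def transition_key(state, byte):
    # Hypothetical key layout: 4-byte state id followed by the input byte.
    return state.to_bytes(4, "little") + bytes([byte])


def build_automaton(patterns):
    """Build Aho-Corasick goto/fail/output tables plus a prehash filter."""
    goto, fail, output = [{}], [0], [[]]
    for pattern in patterns:
        state = 0
        for byte in pattern:
            if byte not in goto[state]:
                goto.append({})
                fail.append(0)
                output.append([])
                goto[state][byte] = len(goto) - 1
            state = goto[state][byte]
        output[state].append(pattern)

    # Breadth-first pass to fill in failure links and merged outputs.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for byte, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and byte not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(byte, 0)
            output[nxt] = output[nxt] + output[fail[nxt]]

    # Record every transition out of a nonroot state in the prehash filter.
    prehash = BloomFilter()
    for state in range(1, len(goto)):
        for byte in goto[state]:
            prehash.add(transition_key(state, byte))
    return goto, fail, output, prehash


def find_matches(text, patterns):
    """Return (start_offset, pattern) pairs for every occurrence in text."""
    goto, fail, output, prehash = build_automaton(patterns)
    matches = []
    state = 0
    for i, byte in enumerate(text):
        # Nonroot states: pretest with the filter and touch the transition
        # table only when the filter reports a possible transition.
        while state and not (
            prehash.might_contain(transition_key(state, byte))
            and byte in goto[state]
        ):
            state = fail[state]
        state = goto[state].get(byte, 0)
        for pattern in output[state]:
            matches.append((i - len(pattern) + 1, pattern))
    return matches
```

Calling `find_matches(b"ushers", [b"he", b"she", b"his", b"hers"])` reports `she` at offset 1 and both `he` and `hers` at offset 2. A Bloom filter can return false positives but never false negatives, so the table check after a positive pretest keeps the results exact; the filter only saves table accesses when it answers "no".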
Pages: 30
Related papers
50 items in total
  • [31] ResiRCA: A resilient energy harvesting ReRAM crossbar-based accelerator for intelligent embedded processors
    Qiu, Keni
    Jao, Nicholas
    Zhao, Mengying
    Mishra, Cyan Subhra
    Gudukbay, Gulsum
    Jose, Sethu
    Sampson, Jack
    Kandemir, Mahmut Taylan
    Narayanan, Vijaykrishnan
    2020 IEEE INTERNATIONAL SYMPOSIUM ON HIGH PERFORMANCE COMPUTER ARCHITECTURE (HPCA 2020), 2020, : 315 - 327
  • [32] Nebula: A Scalable and Flexible Accelerator for DNN Multi-Branch Blocks on Embedded Systems
    Yang, Dawei
    Li, Xinlei
    Qi, Lizhe
    Zhang, Wenqiang
    Jiang, Zhe
    ELECTRONICS, 2022, 11 (04)
  • [33] Modified hotspot cache architecture: A low energy fast cache for embedded processors
    Ali, Kashif
    Aboelaze, Mokhtar
    Datta, Suprakash
    2006 INTERNATIONAL CONFERENCE ON EMBEDDED COMPUTER SYSTEMS: ARCHITECTURES, MODELING AND SIMULATION, PROCEEDINGS, 2006, : 35 - +
  • [34] Scalable row-based parallel H.264 decoder on embedded multicore processors
    Elias Baaklini
    Santhosh Rethinagiri
    Hassan Sbeity
    Smail Niar
    Signal, Image and Video Processing, 2015, 9 : 57 - 71
  • [35] Scalable row-based parallel H.264 decoder on embedded multicore processors
    Baaklini, Elias
    Rethinagiri, Santhosh
    Sbeity, Hassan
    Niar, Smail
    SIGNAL IMAGE AND VIDEO PROCESSING, 2015, 9 : 57 - 71
  • [36] Input-independent, Scalable and Fast String Matching on the Cray XMT
    Villa, Oreste
    Chavarria-Miranda, Daniel
    Maschhoff, Kristyn
    2009 IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL & DISTRIBUTED PROCESSING, VOLS 1-5, 2009, : 679 - +
  • [37] Fast and Scalable Design Space Exploration for Deep Learning on Embedded Systems
    Kutukcu, Basar
    Baidya, Sabur
    Dey, Sujit
    IEEE ACCESS, 2024, 12 : 148254 - 148266
  • [38] CONTENT-ADDRESSABLE MEMORY DOES FAST MATCHING
    BURSKY, D
    ELECTRONIC DESIGN, 1988, 36 (27) : 119 - 121
  • [39] SCALENet: A SCalable Low power AccELerator for Real-time Embedded Deep Neural Networks
    Shea, Colin
    Page, Adam
    Mohsenin, Tinoosh
    PROCEEDINGS OF THE 2018 GREAT LAKES SYMPOSIUM ON VLSI (GLSVLSI'18), 2018, : 129 - 134
  • [40] On-line error detection and fast recover techniques for dependable embedded processors - Introduction
    Pflanz, M
    ON-LINE ERROR DETECTION AND FAST RECOVER TECHNIQUES FOR DEPENDABLE EMBEDDED PROCESSORS, 2002, 2270 : 1 - +