A Fast and Power-Efficient Hardware Architecture for Non-Maximum Suppression

Cited by: 34
Authors
Shi, Man [1]
Ouyang, Peng [2]
Yin, Shouyi [1]
Liu, Leibo [1]
Wei, Shaojun [1]
Affiliations
[1] Tsinghua Univ, Inst Microelect, Beijing 100084, Peoples R China
[2] Beihang Univ, Sch Elect & Informat Engn, Beijing 100083, Peoples R China
Keywords
Face detection; NMS algorithm; hardware architecture;
DOI
10.1109/TCSII.2019.2893527
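Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline classification codes
0808; 0809;
Abstract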
Non-maximum suppression (NMS) is an indispensable post-processing step in face detection. The vast majority of face detection methods need NMS to merge the candidate face boxes that belong to the same face. However, standard NMS is a greedy, local optimization technique that suffers from several shortcomings, such as high computational complexity, high latency, and large power consumption. This brief alleviates these problems by presenting an efficient hardware architecture for NMS, and it also optimizes the calculation unit to reduce area. Building on multi-thread computing, the architecture uses a sliding window to obtain parallelism and a position-based bit table technique to improve data access and data reuse, which greatly decreases the cost of memory access and the power consumption. The proposed hardware architecture is implemented in TSMC 28-nm technology. Experiments show that the power consumption is 6.142 mW and the latency is 12.79 μs to cluster 1000 candidate boxes, and that the energy efficiency is higher than that of two state-of-the-art methods by factors of 3798 and 358, respectively.
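For readers unfamiliar with the baseline the abstract refers to, the following is a minimal software sketch of standard greedy NMS. It is not the hardware architecture proposed in the brief. The box format `(x1, y1, x2, y2)`, the function names `iou` and `greedy_nms`, and the default IoU threshold of 0.5 are all assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) with x2 > x1, y2 > y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float = 0.5,  # illustrative value, not from the brief
) -> List[int]:
    """Return indices of kept boxes, highest score first."""
    # Sort candidate indices by descending confidence score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while order:
        best = order.pop(0)  # greedy step: take the top-scoring survivor
        keep.append(best)
        # Suppress every remaining box that overlaps the chosen one too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


if __name__ == "__main__":
    candidates = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    confidences = [0.9, 0.8, 0.7]
    print(greedy_nms(candidates, confidences))
```

Each kept box is compared against every remaining candidate, so the number of IoU evaluations grows quadratically with the number of boxes in the worst case. This pairwise comparison cost is the kind of complexity and memory traffic that the abstract says the proposed architecture targets.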
Pages: 1870-1874
Page count: 5