High-accuracy low-latency non-maximum suppression processor for traffic object detection

Cited by: 0
Authors
Yuan, Chenbo [1 ,2 ,3 ]
Xu, Peng [1 ]
Chen, Gang [1 ,2 ,3 ]
Affiliations
[1] Chinese Acad Sci, Inst Semicond, Beijing 10083, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 10089, Peoples R China
[3] Semicond Neural Network Intelligent & Comp Techno, Beijing 100083, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
traffic object detection; NMS algorithm; hardware accelerator; NMS;
DOI
10.1587/elex.20.20230445
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808 ; 0809 ;
Abstract
As autonomous driving technology advances, the demands on object detection keep rising. The non-maximum suppression (NMS) algorithm is a key component of traffic object detection and runs as an independent post-processing stage in the detection framework. Because real-world road scenes are complex and urban traffic contains densely packed objects, the neural network generates a large number of candidate bounding boxes. Low-precision processors may therefore output many redundant bounding boxes, which not only add workload to subsequent processing but can also lead to non-optimal decisions. We propose a high-performance NMS processor that quickly processes a large number of candidate boxes without sorting their scores. It also features computing units with low precision loss and highly parallel computing arrays. Combined with the algorithm design, the processor reduces the computational complexity and the end-to-end inference time of the NMS algorithm. As a result, its speed is comparable to state-of-the-art (SOTA) architectures, while the average accuracy loss is only 0.4%.
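For readers unfamiliar with NMS, the following is a minimal Python sketch of greedy NMS that avoids an up-front sort by repeatedly selecting the highest-scoring surviving box. It is an illustration only and is not the processor architecture or algorithm described in the paper; the names `iou` and `nms_without_sort`, the `(x1, y1, x2, y2)` box layout, and the default threshold of 0.5 are all assumptions made for this example.

```python
from typing import List, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2) with x1 <= x2, y1 <= y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms_without_sort(boxes: List[Box], scores: List[float],
                     iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS that picks the current maximum instead of pre-sorting.

    Returns the indices of the kept boxes, in the order they were selected.
    """
    alive = [True] * len(boxes)
    keep: List[int] = []
    while True:
        # Linear scan for the highest-scoring box that is still a candidate.
        best = -1
        for i in range(len(boxes)):
            if alive[i] and (best < 0 or scores[i] > scores[best]):
                best = i
        if best < 0:
            break  # no candidates left
        keep.append(best)
        alive[best] = False
        # Suppress every remaining box that overlaps the selected one too much.
        for j in range(len(boxes)):
            if alive[j] and iou(boxes[best], boxes[j]) > iou_threshold:
                alive[j] = False
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.8, 0.9, 0.7]
    print(nms_without_sort(boxes, scores))
```

In this software sketch the selection scan and the suppression loop are both sequential; in hardware, each is the kind of step that could plausibly be spread across parallel comparison and IoU units, though how the paper's processor does this is not stated in the abstract.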
Pages: 5