An Energy-Efficient, Unified CNN Accelerator for Real-Time Multi-Object Semantic Segmentation for Autonomous Vehicle

Cited by: 10
Authors
Jung, Jueun [1]
Kim, Seungbin [1]
Jang, Wuyoung [2]
Seo, Bokyoung [2]
Lee, Kyuho Jason [1, 2]
Affiliations
[1] Ulsan National Institute of Science and Technology, Department of Electrical Engineering, Ulsan 44919, South Korea
[2] Ulsan National Institute of Science and Technology, Graduate School of Artificial Intelligence, Ulsan 44919, South Korea
Keywords
Convolutional neural networks; multi-object semantic segmentation; dilated convolution; transposed convolution; depth-wise separable convolution; trilateral network; autonomous electric vehicle system; NEURAL-NETWORK
DOI
10.1109/TCSI.2024.3349588
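Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract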
An energy-efficient, unified convolutional neural network (CNN) accelerator is proposed together with a lightweight RGB-D network to achieve real-time, multi-object semantic segmentation in autonomous electric vehicle systems. First, a lightweight Depth-fused Trilateral Network (DTN) is proposed to achieve high accuracy and real-time operation for simultaneous road and multi-object segmentation. Optimized with various types of convolution layers for limited hardware resources, the DTN achieves 94.73% accuracy on the KITTI Road dataset. Second, the unified CNN processor is designed with dual-mode, shift-register-based input reconfiguration units and a layer fusion architecture with two types of processing elements for depth-wise separable convolution (DSC). It supports five types of convolution layers: standard convolution, dilated convolution, transposed convolution, point-wise convolution, and DSC. With this flexible architecture, it achieves 17.97$\times$ higher throughput with the DTN, and the DSC layer fusion architecture reduces overall external memory access by 34.7%. Implemented in 28 nm CMOS technology, the unified CNN processor consumes 43.6 mW and achieves an energy efficiency of 4.94 TOPS/W. As a result, the proposed system with the DTN achieves a throughput of 40.07 frames per second (fps) in a multi-object semantic segmentation application on a high-resolution driving-scene dataset.
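The abstract lists depth-wise separable convolution (DSC) among the supported layer types. As a rough illustration of why DSC suits a lightweight network, the sketch below compares multiply-accumulate (MAC) counts for a standard convolution and for a DSC (depth-wise followed by point-wise) on the same layer shape. This is the generic textbook cost model, not the paper's own analysis; the function names and the layer dimensions are hypothetical and chosen only for illustration.

```python
def standard_conv_macs(h, w, c_in, c_out, k):
    """MACs for a standard k x k convolution producing an h x w x c_out output."""
    return h * w * c_in * c_out * k * k


def dsc_macs(h, w, c_in, c_out, k):
    """MACs for a depth-wise k x k convolution followed by a 1 x 1 point-wise one."""
    depthwise = h * w * c_in * k * k    # one k x k filter per input channel
    pointwise = h * w * c_in * c_out    # 1 x 1 convolution mixing channels
    return depthwise + pointwise


# Hypothetical layer shape (an assumption, not taken from the paper).
h, w, c_in, c_out, k = 64, 64, 32, 64, 3

std = standard_conv_macs(h, w, c_in, c_out, k)
dsc = dsc_macs(h, w, c_in, c_out, k)
ratio = dsc / std  # analytically equal to 1/c_out + 1/k**2

print(f"standard: {std} MACs, DSC: {dsc} MACs, ratio: {ratio:.4f}")
```

For this assumed shape the DSC needs roughly one eighth of the MACs of the standard convolution. The model counts arithmetic only; it says nothing about the memory-access savings that the abstract attributes to layer fusion.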
Pages: 2093-2104
Page count: 12