An Energy-Efficient, Unified CNN Accelerator for Real-Time Multi-Object Semantic Segmentation for Autonomous Vehicle

Cited by: 10

Authors:

Jung, Jueun [1]

Kim, Seungbin [1]

Jang, Wuyoung [2]

Seo, Bokyoung [2]

Lee, Kyuho Jason [1,2]

Affiliations:

[1] Ulsan Natl Inst Sci & Technol, Dept Elect Engn, Ulsan 44919, South Korea

[2] Ulsan Natl Inst Sci & Technol, Grad Sch Artificial Intelligence, Ulsan 44919, South Korea

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS | 2024, Vol. 71, Issue 05

Keywords:

Convolutional neural networks; multi-object semantic segmentation; dilated convolution; transposed convolution; depth-wise separable convolution; trilateral network; autonomous electric vehicle system; NEURAL-NETWORK;

DOI:

10.1109/TCSI.2024.3349588

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

An energy-efficient, unified convolutional neural network (CNN) accelerator is proposed together with a lightweight RGB-D network to achieve real-time, multi-object semantic segmentation in autonomous electric vehicle systems. First, a lightweight Depth-fused Trilateral Network (DTN) is proposed to achieve high accuracy and real-time operation for simultaneous road and multi-object segmentation. Optimized with various types of convolution layers for limited hardware resources, the DTN achieves 94.73% accuracy on the KITTI Road dataset. Second, the unified CNN processor is designed with dual-mode, shift-register-based input reconfiguration units and a layer fusion architecture with two types of processing elements for depth-wise separable convolution (DSC). It supports five types of convolution layers: standard convolution, dilated convolution, transposed convolution, point-wise convolution, and DSC. With its flexible architecture, the processor achieves 17.97$\times$ higher throughput with DTN, and the DSC layer fusion architecture reduces overall external memory access by 34.7%. Implemented in 28 nm CMOS technology, the unified CNN processor shows 43.6 mW power consumption and 4.94 TOPS/W energy efficiency. As a result, the proposed system with DTN achieves a throughput of 40.07 frames per second (fps) in a multi-object semantic segmentation application on a high-resolution driving-scene dataset.
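The convolution types named in the abstract differ mainly in how many multiply-accumulate (MAC) operations they need and how they change the receptive field or output size. The following is a minimal sketch using the standard textbook formulas, not the DTN layer configuration or the processor's dataflow. The layer shape used in the example (32 x 32 feature map, 64 input channels, 128 output channels, 3 x 3 kernel, stride 1 with same-size output) and all function names are hypothetical and chosen only for illustration.

```python
# Illustrative cost/shape formulas for the convolution types listed in the abstract.
# All layer dimensions below are hypothetical; they are NOT taken from the paper.


def standard_conv_macs(h, w, c_in, c_out, k):
    """MACs for a standard k x k convolution producing an h x w output map."""
    return h * w * c_in * c_out * k * k


def dsc_macs(h, w, c_in, c_out, k):
    """MACs for depth-wise separable convolution: depth-wise k x k + point-wise 1 x 1."""
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1 x 1 convolution mixes channels
    return depthwise + pointwise


def dilated_effective_kernel(k, dilation):
    """Effective kernel extent of a k-tap kernel with the given dilation rate."""
    return k + (k - 1) * (dilation - 1)


def transposed_conv_out_size(n, k, stride, padding=0):
    """Output length of a transposed convolution (no output padding, dilation 1)."""
    return (n - 1) * stride - 2 * padding + k


if __name__ == "__main__":
    std = standard_conv_macs(32, 32, 64, 128, 3)
    dsc = dsc_macs(32, 32, 64, 128, 3)
    print("standard MACs:", std)
    print("DSC MACs:     ", dsc)
    print("DSC / standard ratio: %.4f" % (dsc / std))  # equals 1/c_out + 1/k^2
    print("3-tap kernel, dilation 2 ->", dilated_effective_kernel(3, 2), "taps wide")
    print("16 -> transposed conv (k=4, s=2, p=1) ->", transposed_conv_out_size(16, 4, 2, 1))
```

For this hypothetical layer the DSC needs roughly 12% of the MACs of the standard convolution, which is the usual motivation for using DSC in lightweight networks. The actual savings for DTN depend on its real layer shapes, which the abstract does not give.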


Pages: 2093-2104

Page count: 12
