Real-Time Memory Efficient Multitask Learning Model for Autonomous Driving

Cited by: 21

Authors:

Miraliev, Shokhrukh [1]

Abdigapporov, Shakhboz [1]

Kakani, Vijay [2]

Kim, Hakil [1]

Affiliations:

[1] Inha Univ, Dept Elect & Comp Engn, Incheon 22212, South Korea

[2] Inha Univ, Dept Integrated Syst Engn, Incheon 22212, South Korea

Source:

IEEE TRANSACTIONS ON INTELLIGENT VEHICLES | 2024 / Vol. 9 / No. 1

Funding:

National Research Foundation of Singapore;

Keywords:

Task analysis; Feature extraction; Object detection; Performance evaluation; Lane detection; Roads; Decoding; Multitask learning; edge device; autonomous driving; object detection; drivable area segmentation; lane detection; convolutional neural networks;

DOI:

10.1109/TIV.2023.3270878

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Developing a self-driving system is a challenging, safety-critical task that requires a high level of scene comprehension with real-time inference. This study proposes a real-time, memory-efficient multitask learning model for joint object detection, drivable area segmentation, and lane detection. To this end, an encoder-decoder architecture is used to process input frames efficiently through a shared representation. Comprehensive experiments were conducted on the challenging public Berkeley Deep Drive (BDD100K) dataset. For further performance comparisons, a private dataset consisting of 30K frames was collected and annotated for the three aforementioned tasks. Experimental results demonstrated the superiority of the proposed method over existing baseline approaches in terms of computational efficiency, model power consumption, and accuracy. For object detection, drivable area segmentation, and lane detection, the model achieved the highest results on the BDD100K dataset: 77.5 mAP50, 91.9 mIoU, and 33.8 mIoU, respectively. In addition, the model achieved a processing speed of 112.29 fps, improving on both the accuracy and the inference speed of existing multitask models.
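The segmentation and lane detection results in the abstract are reported as mIoU (mean intersection over union). As a minimal sketch of how such a score is typically computed, assuming flat sequences of per-pixel class labels, the following may help; the function name `mean_iou`, the toy labels, and the choice to skip classes absent from both inputs are illustrative assumptions, not the authors' evaluation code.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes for flat label sequences."""
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # assumption: skip classes absent from both inputs
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with hypothetical labels: 0 = background, 1 = drivable area
pred = [0, 0, 1, 1, 1, 0]
target = [0, 1, 1, 1, 0, 0]
print(mean_iou(pred, target, num_classes=2))  # 0.5
```

In this toy case each class has an intersection of 2 pixels and a union of 4, so both per-class IoU values are 0.5 and so is their mean. Benchmark scores such as those above are usually expressed as percentages and accumulated over an entire validation set rather than a single frame.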

Pages: 247-258

Page count: 12
