Real-Time High-Performance Semantic Image Segmentation of Urban Street Scenes

Cited by: 84

Authors:

Dong, Genshun [1]

Yan, Yan [1]

Shen, Chunhua [2]

Wang, Hanzi [1]

Affiliations:

[1] Xiamen Univ, Sch Informat, Fujian Key Lab Sensing & Comp Smart City, Xiamen 361005, Peoples R China

[2] Univ Adelaide, Sch Comp Sci, Adelaide, SA 5005, Australia

Source:

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS | 2021 / Vol. 22 / Issue 06

Funding:

National Natural Science Foundation of China; National Key Research and Development Program of China;

Keywords:

Semantics; Real-time systems; Image segmentation; Convolution; Intelligent transportation systems; Task analysis; Computational modeling; Intelligent vehicles; street scene understanding; deep learning; real-time semantic image segmentation; light-weight convolutional neural networks; OBJECT RECOGNITION;

DOI:

10.1109/TITS.2020.2980426

Chinese Library Classification (CLC) number:

TU [Building Science];

Discipline classification code:

0813;

Abstract:

Deep Convolutional Neural Networks (DCNNs) have recently shown outstanding performance in semantic image segmentation. However, state-of-the-art DCNN-based semantic segmentation methods usually suffer from high computational complexity due to the use of complex network architectures. This greatly limits their application in real-world scenarios that require real-time processing. In this paper, we propose a real-time high-performance DCNN-based method for robust semantic segmentation of urban street scenes, which achieves a good trade-off between accuracy and speed. Specifically, a Lightweight Baseline Network with Atrous convolution and Attention (LBN-AA) is first used as our baseline network to efficiently obtain dense feature maps. Then, the Distinctive Atrous Spatial Pyramid Pooling (DASPP), which exploits pooling operations of different sizes to encode rich and distinctive semantic information, is developed to detect objects at multiple scales. Meanwhile, a Spatial detail-Preserving Network (SPN) with shallow convolutional layers is designed to generate high-resolution feature maps that preserve detailed spatial information. Finally, a simple but practical Feature Fusion Network (FFN) is used to effectively combine the deep features from the semantic branch (DASPP) with the shallow features from the spatial branch (SPN). Extensive experimental results show that the proposed method achieves 73.6% and 68.0% mean Intersection over Union (mIoU) at inference speeds of 51.0 fps and 39.3 fps on the challenging Cityscapes and CamVid test datasets, respectively, using only a single NVIDIA TITAN X card. This demonstrates that the proposed method offers excellent performance at real-time speed for semantic segmentation of urban street scenes.
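
The abstract reports accuracy as mean Intersection over Union (mIoU). As a rough illustration of that metric only, and not of the authors' evaluation code, the following minimal Python sketch computes mIoU from flattened per-pixel class labels. The function name `mean_iou`, the toy labels, and the choice to skip classes absent from both prediction and ground truth are assumptions made for this example; benchmark evaluations typically accumulate counts over an entire dataset rather than a single image.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean Intersection over Union over flattened per-pixel class labels.

    Assumption: classes that appear in neither pred nor target are skipped
    instead of being counted as IoU = 0 or IoU = 1.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes  # pixels where both agree on class c
    pred_count = [0] * num_classes    # pixels predicted as class c
    target_count = [0] * num_classes  # pixels labeled as class c
    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:
            ious.append(intersection[c] / union)
    if not ious:
        raise ValueError("no class is present in pred or target")
    return sum(ious) / len(ious)


# Hypothetical toy labels for six pixels and three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(round(mean_iou(pred, target, num_classes=3), 3))
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5.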


Pages: 3258-3274

Page count: 17

50 records in total

[1] DSANet: Dilated spatial attention for real-time semantic segmentation in urban street scenes
Elhassan, Mohammed A. M.
Huang, Chenxi
Yang, Chenhui
Munea, Tewodros Legesse
EXPERT SYSTEMS WITH APPLICATIONS, 2021, 183
[2] MDRNet: a lightweight network for real-time semantic segmentation in street scenes
Dai, Yingpeng
Wang, Junzheng
Li, Jiehao
Li, Jing
ASSEMBLY AUTOMATION, 2021, 41 (06) : 725 - 733
[3] Multi-directional feature refinement network for real-time semantic segmentation in urban street scenes
Zhou, Yan
Zheng, Xihong
Yang, Yin
Li, Jianxun
Mu, Jinzhen
Irampaye, Richard
IET COMPUTER VISION, 2023, 17 (04) : 431 - 444
[4] Small Object Augmentation of Urban Scenes for Real-Time Semantic Segmentation
Yang, Zhengeng
Yu, Hongshan
Feng, Mingtao
Sun, Wei
Lin, Xuefei
Sun, Mingui
Mao, Zhi-Hong
Mian, Ajmal
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 5175 - 5190
[5] Deep Multi-Resolution Network for Real-Time Semantic Segmentation in Street Scenes
Wang, Yalun
Chen, Shidong
Bian, Huicong
Li, Weixiao
Lu, Qin
2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,
[6] Exploring Scale-Aware Features for Real-Time Semantic Segmentation of Street Scenes
Li, Kaige
Geng, Qichuan
Zhou, Zhong
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, 25 (05) : 3575 - 3587
[7] Gated feature aggregate and alignment network for real-time semantic segmentation of street scenes
Liu, Qian
Li, Zhensheng
Qi, Youwei
Wang, Cunbao
MULTIMEDIA SYSTEMS, 2024, 30 (04)
[8] Joint pyramid attention network for real-time semantic segmentation of urban scenes
Hu, Xuegang
Jing, Liyuan
Sehar, Uroosa
APPLIED INTELLIGENCE, 2022, 52 (01) : 580 - 594
[9] Real-Time Semantic Segmentation Algorithm for Street Scenes Based on Attention Mechanism and Feature Fusion
Wu, Bao
Xiong, Xingzhong
Wang, Yong
ELECTRONICS, 2024, 13 (18)
