Semantic segmentation of road-scene images for autonomous driving is a dense, pixel-level prediction task that must run in real time. Deep learning research has devoted extensive effort to improving segmentation accuracy, and network architecture design is central to that effort. On edge devices, the task becomes even more challenging due to limited computing power. While very deep encoder-decoder networks achieve high accuracy, their slow inference and large parameter counts make them unsuitable for small devices; decoder-less models are fast but suffer a loss of accuracy. To address this trade-off, we propose a novel architecture with a shallow decoder. Its core building block leverages a multi-scale feature pyramid and efficiently learns semantic and contextual features, forming the basis of our network design. The network further benefits from uniquely placed encoder skip connections, which retain low-level features and thereby preserve the boundary information that is often lost in deep networks. Experiments on the highly competitive Cityscapes and CamVid datasets demonstrate the efficiency of the proposed architecture. Our model achieves mean intersection-over-union scores of 72.5% and 67.5% on the Cityscapes and CamVid test sets, respectively, with only 0.6 million parameters, while running in real time.
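
To make the abstract's central idea concrete, the following is a minimal PyTorch sketch of what a parameter-efficient multi-scale feature pyramid block might look like. The branch layout, pooling scales, use of depthwise-separable convolutions, and the residual fusion are our illustrative assumptions, not the paper's actual design; they only demonstrate the general technique of learning contextual features at multiple scales with a small parameter budget.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiScalePyramidBlock(nn.Module):
    """Illustrative multi-scale feature pyramid block (hypothetical layout).

    Pools the input at several scales, applies a lightweight separable
    convolution on each pooled map, upsamples back to the input resolution,
    and fuses the branches with a 1x1 projection plus a residual connection.
    """

    def __init__(self, channels: int, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales
        self.branches = nn.ModuleList(
            nn.Sequential(
                # depthwise 3x3 followed by pointwise 1x1 (separable conv)
                # keeps the parameter count low, matching the efficiency goal
                nn.Conv2d(channels, channels, 3, padding=1,
                          groups=channels, bias=False),
                nn.Conv2d(channels, channels, 1, bias=False),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )
            for _ in scales
        )
        # fuse the concatenated branch outputs back to `channels`
        self.fuse = nn.Conv2d(channels * len(scales), channels, 1, bias=False)

    def forward(self, x):
        h, w = x.shape[2:]
        feats = []
        for scale, branch in zip(self.scales, self.branches):
            # downsample, process, then restore the spatial resolution
            y = F.avg_pool2d(x, scale) if scale > 1 else x
            y = branch(y)
            if scale > 1:
                y = F.interpolate(y, size=(h, w), mode="bilinear",
                                  align_corners=False)
            feats.append(y)
        # residual connection preserves the original low-level features
        return x + self.fuse(torch.cat(feats, dim=1))


if __name__ == "__main__":
    block = MultiScalePyramidBlock(channels=64)
    out = block(torch.randn(1, 64, 128, 256))
    print(out.shape)  # torch.Size([1, 64, 128, 256])
```

The residual add in `forward` plays a role analogous to the abstract's skip connections: it carries the unprocessed features past the multi-scale branches, which is one common way to keep boundary detail from being washed out.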