ULAF-Net: Ultra lightweight attention fusion network for real-time semantic segmentation

Cited by: 5

Authors:

Hu, Kaidi [1]

Xie, Zongxia [1]

Hu, Qinghua [1]

Affiliations:

[1] Tianjin Univ, Coll Intelligence & Comp, Tianjin 300350, Peoples R China

Source:

INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS | 2024, Vol. 15, No. 7

Funding:

National Natural Science Foundation of China

Keywords:

Lightweight model; Real-time semantic segmentation; Multi-scale fusion; Attention mechanism; Street scene understanding;

DOI:

10.1007/s13042-023-02077-0

Chinese Library Classification (CLC):

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Real-time semantic segmentation, laying the foundation of mobile robots and autonomous driving, has attracted much attention in recent years. Currently, most deep models suffer high computational costs due to their complex architectures, making them impractical on resource-limited devices. Some lightweight models have been designed by reducing model complexity at the expense of segmentation accuracy. We propose an ultra-lightweight network, called ULAF-Net, to achieve a balance between segmentation accuracy, model complexity, and inference speed. This network abandons the straightforward concatenation of simple multi-scale fusion methods. First, we employ a parameter-free attention mechanism to process two large-scale feature maps, followed by the initial fusion. Furthermore, we consider the characteristics of varying scales and utilize lightweight spatial and channel attention modules to perform secondary processing on the fused large-scale feature map and the small-scale feature map, respectively, further highlighting important features. Finally, we combine both of them. In addition, we integrate multiple specialized convolutional methods and attention mechanisms to design a new residual module, which can make full use of the contextual features. The parameter quantity of ULAF-Net is merely 0.60M. It possesses the capability of real-time segmentation and achieves competitive segmentation outcomes on public datasets.
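The fusion order described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the attention modules or the fusion operator, so the SimAM-style energy weighting used for the parameter-free step, the pooling-based spatial and channel gates (with no learned layers), element-wise addition as the fusion operator, and all function names are assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def parameter_free_attention(x, eps=1e-4):
    """SimAM-style weighting (assumed stand-in); x has shape (C, H, W).

    Uses only per-channel statistics, so it adds no learnable parameters.
    """
    n = x.shape[1] * x.shape[2] - 1
    d = (x - x.mean(axis=(1, 2), keepdims=True)) ** 2
    v = d.sum(axis=(1, 2), keepdims=True) / n
    e_inv = d / (4.0 * (v + eps)) + 0.5
    return x * sigmoid(e_inv)

def spatial_attention(x):
    """One gate per pixel, built from channel-wise mean and max (assumption)."""
    s = 0.5 * (x.mean(axis=0, keepdims=True) + x.max(axis=0, keepdims=True))
    return x * sigmoid(s)

def channel_attention(x):
    """One gate per channel, built from global average pooling (assumption)."""
    return x * sigmoid(x.mean(axis=(1, 2), keepdims=True))

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of the two spatial axes."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def fuse(large_a, large_b, small):
    # Step 1: parameter-free attention on both large-scale maps, then initial fusion.
    fused_large = parameter_free_attention(large_a) + parameter_free_attention(large_b)
    # Step 2: spatial attention on the fused large-scale map,
    #         channel attention on the small-scale map.
    fused_large = spatial_attention(fused_large)
    small = channel_attention(small)
    # Step 3: bring the small-scale map to the same resolution and combine.
    factor = fused_large.shape[1] // small.shape[1]
    return fused_large + upsample_nearest(small, factor)

rng = np.random.default_rng(0)
large_a = rng.standard_normal((8, 16, 16))
large_b = rng.standard_normal((8, 16, 16))
small = rng.standard_normal((8, 8, 8))
out = fuse(large_a, large_b, small)
print(out.shape)
```

The sketch only shows the ordering of the three stages. A trained network would replace the fixed gates with learned lightweight layers and would operate on batched tensors.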


Pages: 2987-3003

Page count: 17

