Based on cross-scale fusion attention mechanism network for semantic segmentation for street scenes

Cited by: 2

Authors:

Ye, Xin [1]

Gao, Lang [1]

Chen, Jichen [2]

Lei, Mingyue [1]

Affiliations:

[1] Xian Technol Univ, Inst Artificial Intelligence & Data Sci, Xian, Peoples R China

[2] Xian Microelect Technol Inst, Comp Part 3, Xian, Peoples R China

Source:

FRONTIERS IN NEUROROBOTICS | 2023 / Volume 17

Keywords:

computer vision; semantic segmentation; channel attention mechanism; residual block; dilation convolution; factorized convolution;

DOI:

10.3389/fnbot.2023.1204418

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Semantic segmentation is a fundamental task in computer vision that assigns a specific semantic class to every pixel of an image. High-accuracy segmentation algorithms are difficult to deploy on embedded systems and mobile devices, and despite the rapid development of semantic segmentation, the balance between speed and accuracy still needs to be improved. To address these problems, we propose a cross-scale fusion attention mechanism network called CFANet, which fuses feature maps from different scales. We first design a novel efficient residual module (ERM), which applies both dilation convolution and factorized convolution; CFANet is mainly constructed from ERMs. We then design a new multi-branch channel attention mechanism (MCAM) to refine the feature maps at different levels. Experimental results show that CFANet, with 0.84M parameters, achieves 70.6% mean intersection over union (mIoU) on the Cityscapes dataset and 67.7% mIoU on the CamVid dataset, with inference speeds of 118 FPS and 105 FPS, respectively, on NVIDIA RTX2080Ti GPU cards.
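The abstract reports accuracy as mean intersection over union (mIoU). As a rough illustration of how that metric is commonly computed, the sketch below builds a confusion matrix from flattened label lists and averages the per-class IoU. The function names and the toy labels are hypothetical, not taken from the paper, and the paper's exact evaluation protocol (for example, which classes are ignored) may differ.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN).

    Classes that appear in neither the ground truth nor the predictions
    are skipped (an assumption; benchmarks handle this case differently).
    """
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                       # true class c, predicted as another class
        fp = sum(matrix[r][c] for r in range(n)) - tp  # predicted as c, true class differs
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six "pixels", three classes (illustrative values only).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
miou = mean_iou(confusion_matrix(y_true, y_pred, 3))
print(f"mIoU = {miou:.3f}")
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5. On a real benchmark the confusion matrix would be accumulated over all test images before the per-class IoUs are averaged.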


Pages: 12

