Adaptive Attention Mechanism Fusion for Real-Time Semantic Segmentation in Complex Scenes

Cited by: 0

Authors:

Chen, Dan [1]

Liu, Le [1]

Wang, Chenhao [2]

Bai, Xiru [1]

Wang, Zichen [1]

Affiliations:

[1] School of Automation and Information Engineering, Xi’an University of Technology, Xi’an 710048, China

[2] School of Electronic and Information Engineering, Shaanxi Vocational and Technical College, Xi’an 710038, China

Source:

Dianzi Yu Xinxi Xuebao/Journal of Electronics and Information Technology | 2024, Vol. 46, No. 08

Keywords:

Convolutional neural networks; Image fusion

DOI:

10.11999/JEIT231338

Abstract:

Achieving high accuracy with a low computational burden is a major challenge for Convolutional Neural Networks (CNNs) in real-time semantic segmentation. In this paper, an efficient real-time semantic segmentation network, the Adaptive Attention mechanism Fusion Network (AAFNet), is designed for complex urban street scenes with many target categories and large changes in lighting. The network extracts image spatial details and semantic information separately, then combines them through a Feature Fusion Network (FFN) to obtain accurate semantic images. AAFNet adopts Dilated Depth-Wise separable convolution (DDW) to enlarge the receptive field of semantic feature extraction. An Adaptive Attention mechanism Fusion Module (AAFM) is also proposed, which combines Adaptive average pooling (Avp) and Adaptive max pooling (Amp) to refine the segmentation of target edges and reduce the missed-detection rate of small targets. Finally, semantic segmentation experiments are performed on the Cityscapes and CamVid datasets of complex urban street scenes. AAFNet achieves a mean Intersection over Union (mIoU) of 73.0% at an inference speed of 32 fps on Cityscapes and 69.8% at 52 fps on CamVid. Compared with the Dilated Spatial Attention Network (DSANet), the Multi-Scale Context Fusion Network (MSCFNet), and the Lightweight Bilateral Asymmetric Residual Network (LBARNet), AAFNet has the highest segmentation accuracy. © 2024 Science Press. All rights reserved.
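
The accuracy figures in the abstract are reported as mean Intersection over Union (mIoU). As a minimal sketch of how that metric is commonly computed (this is not the authors' evaluation code), the pure-Python example below builds a confusion matrix from flattened label maps and averages the per-class IoU. The function names, the toy label arrays, and the choice to skip classes absent from both maps are assumptions made for illustration; it also assumes every label lies in the range 0 to num_classes - 1, so dataset-specific ignore labels are not handled.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels by (ground-truth class, predicted class)."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou(y_true: Sequence[int], y_pred: Sequence[int],
             num_classes: int) -> float:
    """Average IoU = TP / (TP + FP + FN) over the classes that occur."""
    cm = confusion_matrix(y_true, y_pred, num_classes)
    ious = []
    for c in range(num_classes):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                                 # true c, predicted otherwise
        fp = sum(cm[r][c] for r in range(num_classes)) - tp  # predicted c, true otherwise
        union = tp + fp + fn
        if union == 0:
            continue  # assumption: skip classes absent from both maps
        ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical toy data: a flattened 2x3 label map with 3 classes.
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
print(f"mIoU = {mean_iou(truth, pred, 3):.3f}")
```

For the toy data, the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is 0.5. Benchmark evaluations usually accumulate the confusion matrix over the whole test set before taking the per-class ratios, rather than averaging per-image scores.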

Pages: 3334-3342

