Semantic segmentation network based on attention mechanism and feature fusion

Cited by: 0
Authors
Cai, Hua [1]
Wang, Yu-Yao [1]
Fu, Qiang [2]
Ma, Zhi-Yong [3]
Wang, Wei-Gang [3]
Zhang, Chen-Jie [1]
Affiliations
[1] School of Electronic Information Engineer, Changchun University of Science and Technology, Changchun
[2] School of Opto-Electronic Engineer, Changchun University of Science and Technology, Changchun
[3] No.2 Department of Urology, The First Hospital of Jilin University, Changchun
Source
Jilin Daxue Xuebao (Gongxueban)/Journal of Jilin University (Engineering and Technology Edition) | 2025 / Vol. 55 / No. 04
Keywords
attention mechanism; contextual information; feature fusion; multi-scale features; semantic segmentation
DOI
10.13229/j.cnki.jdxbgxb.20230740
Abstract
To address multi-scale object segmentation errors and the poor correlation between multi-scale feature maps and feature maps at different stages in the DeepLabv3+ network, three modules are proposed: a global context attention module, a cascade adaptive scale awareness module, and an attention optimized fusion module. The global context attention module is embedded in the initial stage of the feature-extraction backbone network, allowing it to capture rich contextual information. The cascade adaptive scale awareness module models the dependencies between multi-scale features, enabling a stronger focus on the features relevant to the target. The attention optimized fusion module merges multiple layers of features through multiple pathways to enhance pixel continuity during decoding. The improved network is validated on the CityScapes dataset and the PASCAL VOC2012 augmented dataset, and the experimental results demonstrate its ability to overcome the limitations of DeepLabv3+. The mean intersection over union (mIoU) reaches 76.2% and 78.7% on the two datasets, respectively. © 2025 Editorial Board of Jilin University. All rights reserved.
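The abstract reports results as mean intersection over union. As a reminder of what that metric measures, the following is a minimal sketch in plain Python, not the authors' evaluation code. The function names, the toy label maps, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration. Benchmark toolkits typically accumulate the confusion matrix over the whole dataset and ignore a void label, which this sketch does not do.

```python
from typing import List, Sequence


def confusion_matrix(pred: Sequence[int], truth: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels; rows are ground-truth classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, truth):
        matrix[t][p] += 1
    return matrix


def mean_iou(pred: Sequence[int], truth: Sequence[int], num_classes: int) -> float:
    """Average the per-class IoU = TP / (TP + FP + FN) over the classes that occur."""
    matrix = confusion_matrix(pred, truth, num_classes)
    ious = []
    for c in range(num_classes):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                  # class c pixels predicted as something else
        fp = sum(row[c] for row in matrix) - tp   # other pixels predicted as class c
        union = tp + fp + fn
        if union > 0:  # assumption: skip classes absent from both maps
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: flattened 2x3 label maps with 3 classes (made-up values).
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, truth, 3)
print(f"mIoU = {score:.3f}")
```

In the toy example the per-class IoU values are 1/3, 2/3 and 1/2, so the mean is 0.5.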
Pages: 1384-1395
Number of pages: 11