Enhancing semantic scene segmentation for indoor autonomous systems using advanced attention-supported improved UNet

Cited by: 1
Authors
Tran, Hoang N. [1]
Nguyen, Nghi V. [1]
Le, Nhi Q. P. [1]
Nguyen, Nam N. N. [1]
Le, Thu A. N. [1]
Nguyen, Vinh D. [1]
Affiliations
[1] FPT Univ, Can Tho 94000, Vietnam
Keywords
Segmentation; UNet; Semantic segmentation; Attention mechanisms; Convolutional neural networks (CNN); EfficientNet; VISION
DOI
10.1007/s11760-024-03779-w
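Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract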
This paper introduces EFFB7-UNet, a semantic segmentation framework for Indoor Autonomous Vision Systems (IAVSs) built on the U-Net architecture. The framework uses EfficientNetB7 as its encoder to strengthen feature extraction, and it integrates a spatial and channel Squeeze-and-Excitation (scSE) attention block that emphasizes critical regions and features to refine the segmentation output. Evaluations were conducted on the NYUv2 dataset and several augmented datasets. The study systematically compares EFFB7-UNet with U-Net variants using other encoders, including ResNet50, ResNet101, MobileNet V2, VGG16, VGG19, and EfficientNets B0-B6. The results show that EFFB7-UNet surpasses these configurations in accuracy and that the scSE attention block contributes to the improved segmentation results. Without using depth information, EFFB7-UNet achieves a 12% improvement in mean Intersection over Union (mIoU). According to the authors, this improvement demonstrates EFFB7-UNet's adaptability across various domains and indicates substantial progress in the effectiveness and reliability of IAVS technologies.
Pages: 11