RAFNet: RGB-D attention feature fusion network for indoor semantic segmentation

Cited by: 17
Authors
Yan, Xingchao [1 ]
Hou, Sujuan [1 ]
Karim, Awudu [2 ]
Jia, Weikuan [1 ]
Affiliations
[1] Shandong Normal Univ, Sch Informat Sci & Engn, Jinan 250358, Peoples R China
[2] Beijing Univ Technol, Sch Engn, Beijing 101303, Peoples R China
Keywords
RGB-D semantic segmentation; Three parallel branches; Attention modules; MOVEMENT; HEAD;
DOI
10.1016/j.displa.2021.102082
CLC number
TP3 [Computing technology, computer technology];
Discipline code
0812;
Abstract
Semantic segmentation that exploits the complementary information in RGB and depth images has recently gained great popularity, but because RGB images and depth maps differ in nature, how to use RGB-D information effectively remains an open problem. In this paper, we propose a novel RGB-D semantic segmentation network named RAFNet, which can selectively gather features from the RGB and depth information. Specifically, we construct an architecture with three parallel branches and propose several complementary attention modules. This structure introduces a fusion branch, to which we add the Bi-directional Multi-step Propagation (BMP) strategy, so that the network not only retains the feature streams of the original RGB and depth branches but also fully exploits the feature flow of the fusion branch. We construct three kinds of complementary attention modules: the RGB-D fusion module extracts important features from the RGB and depth branch streams, the refinement module reduces the loss of semantic information, and the context aggregation module helps propagate and integrate information. We train and evaluate our model on the NYUDv2 and SUN-RGBD datasets and show that it achieves state-of-the-art performance.
Pages: 10
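The abstract describes three parallel branches (an RGB branch, a depth branch, and a fusion branch) in which the fusion branch selectively gathers features from the other two via attention. As an illustration of that general idea only, the sketch below shows one plausible way such a stage could look in PyTorch; ChannelAttentionFusion, ThreeBranchStage, and every layer choice and hyperparameter here are assumptions and are not taken from the paper.

import torch
import torch.nn as nn

class ChannelAttentionFusion(nn.Module):
    # Hypothetical RGB-D fusion module: predicts per-channel gates for both
    # modalities and returns their gated sum (not the paper's actual design).
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.mlp = nn.Sequential(
            nn.Conv2d(2 * channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, 2 * channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, rgb_feat, depth_feat):
        gates = self.mlp(self.pool(torch.cat([rgb_feat, depth_feat], dim=1)))
        g_rgb, g_depth = gates.chunk(2, dim=1)
        return g_rgb * rgb_feat + g_depth * depth_feat

class ThreeBranchStage(nn.Module):
    # One encoder stage with parallel RGB, depth, and fusion branches.
    def __init__(self, in_ch, out_ch):
        super().__init__()
        def block():
            return nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
        self.rgb_block, self.depth_block, self.fuse_block = block(), block(), block()
        self.fusion = ChannelAttentionFusion(out_ch)

    def forward(self, rgb, depth, fused):
        rgb = self.rgb_block(rgb)        # RGB branch keeps its own stream
        depth = self.depth_block(depth)  # depth branch keeps its own stream
        # Fusion branch carries its own stream and adds attention-fused RGB-D features.
        fused = self.fuse_block(fused) + self.fusion(rgb, depth)
        return rgb, depth, fused

if __name__ == "__main__":
    stage = ThreeBranchStage(3, 64)
    rgb = torch.randn(1, 3, 224, 224)
    depth = torch.randn(1, 3, 224, 224)  # assumes depth encoded as 3 channels (e.g., HHA)
    r, d, f = stage(rgb, depth, rgb)     # fusion branch seeded from the RGB input here
    print(f.shape)                       # torch.Size([1, 64, 112, 112])

Stacking several such stages and passing each stage's fused output to the next would correspond loosely to the fusion-branch feature flow the abstract mentions; the actual BMP strategy, refinement module, and context aggregation module are not reconstructed here.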