RTLNet: Recursive Triple-Path Learning Network for Scene Parsing of RGB-D Images

Cited by: 4
Authors
Yue, Yuchun [1]
Zhou, Wujie [1]
Lei, Jingsheng [1]
Yu, Lu [2]
Affiliations
[1] Zhejiang Univ Sci & Technol, Sch Informat & Elect Engn, Hangzhou 310023, Peoples R China
[2] Zhejiang Univ, Coll Informat & Elect Engn, Hangzhou 310027, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Image segmentation; Semantics; Decoding; Training; Streaming media; Sensors; Feature extraction; Scene parsing; cross-modality fusion; multiscale feature fusion; recursive learning; deep learning;
DOI
10.1109/LSP.2021.3139567
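Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809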
Abstract
Scene parsing approaches have attracted extensive attention in recent years; although several methods have been developed for scene parsing, most include complex modules for both cross-modality fusion between RGB and depth images in the encoder and image scale level recovery in the decoder under label supervision for high inference accuracy. Cross-modality information in the encoder may be diluted when processed through the decoder, and the supervision results may not be reused effectively, which adversely affects scene parsing. To address these problems, we propose a recursive triple-path learning network (RTLNet) for cross-modality interactions in the decoder using global context and cross-modality fusion modules. The proposed modules fully use cross-modality information to reduce information loss. To enhance the robustness of RTLNet, we add a path to reuse the initial predictions from the decoder and introduce a ladder-shaped feature consistency module to further leverage multiscale features. Experiments are conducted with the proposed RTLNet and nine recent RGB-D indoor scene parsing methods on the NYUv2 and SUN-RGBD indoor scene datasets; the results show that the RTLNet outperforms the other methods.
Pages: 429-433
Number of pages: 5
Related papers
37 in total
  • [1] FRNet: Feature Reconstruction Network for RGB-D Indoor Scene Parsing
    Zhou, Wujie
    Yang, Enquan
    Lei, Jingsheng
    Yu, Lu
    IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2022, 16 (04) : 677 - 687
  • [2] ResFusion: deeply fused scene parsing network for RGB-D images
    Dai, Juting
    Tang, Xinyi
    IET COMPUTER VISION, 2018, 12 (08) : 1171 - 1178
  • [3] Joint Task-Recursive Learning for RGB-D Scene Understanding
    Zhang, Zhenyu
    Cui, Zhen
    Xu, Chunyan
    Jie, Zequn
    Li, Xiang
    Yang, Jian
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2020, 42 (10) : 2608 - 2623
  • [4] CCFNet: Cross-Complementary fusion network for RGB-D scene parsing of clothing images
    Xu, Gao
    Zhou, Wujie
    Qian, Xiaohong
    Ye, Lv
    Lei, Jingsheng
    Yu, Lu
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2023, 90
  • [5] Zig-Zag Network for Semantic Segmentation of RGB-D Images
    Lin, Di
    Huang, Hui
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2020, 42 (10) : 2642 - 2655
  • [6] Learning Effective RGB-D Representations for Scene Recognition
    Song, Xinhang
    Jiang, Shuqiang
    Herranz, Luis
    Chen, Chengpeng
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2019, 28 (02) : 980 - 993
  • [7] RFNet: Reverse Fusion Network With Attention Mechanism for RGB-D Indoor Scene Understanding
    Zhou, Wujie
    Lv, Sijia
    Lei, Jingsheng
    Luo, Ting
    Yu, Lu
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2023, 7 (02) : 598 - 603
  • [8] Survey on Semantic Scene Completion Based on RGB-D Images
    Zhang K.
    An B.-Z.
    Li J.
    Yuan X.
    Zhao C.-X.
    Ruan Jian Xue Bao/Journal of Software, 2023, 34 (01) : 444 - 462
  • [9] RGB-D Scene Parsing Based on Spatial Structured Inference Deep Fusion Networks
    Wang Z.-Y.
    Wu Y.-X.
    Zhang G.-Y.
    Bu S.-H.
    Wu, Yan-Xia (wuyanxia@hrbeu.edu.cn), 2018, Chinese Institute of Electronics (46) : 1253 - 1258
  • [10] DMFNet: Deep Multi-Modal Fusion Network for RGB-D Indoor Scene Segmentation
    Yuan, Jianzhong
    Zhou, Wujie
    Luo, Ting
    IEEE ACCESS, 2019, 7 : 169350 - 169358