Multi-modal deep network for RGB-D segmentation of clothes

Cited by: 7
Authors
Joukovsky, B. [1 ]
Hu, P. [1 ]
Munteanu, A. [1 ]
Affiliations
[1] Vrije Univ Brussel, Dept Elect & Informat, Brussels, Belgium
Keywords
image fusion; learning (artificial intelligence); image segmentation; image colour analysis; synthetic data; real-world data; multimodal deep network; RGB-D segmentation; clothes; deep learning; semantic segmentation; synthetic dataset; different clothing styles; semantic classes; data generation pipeline; depth images; ground-truth label maps; novel multimodal encoder-decoder convolutional network; depth modalities; multimodal features; trained fusion modules; multiscale atrous convolutions
DOI
10.1049/el.2019.4150
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic and communication technology]
Discipline code
0808; 0809
Abstract
In this Letter, the authors propose a deep-learning-based method for semantic segmentation of clothes in RGB-D images of people. First, they present a synthetic dataset containing more than 50,000 RGB-D samples of characters in different clothing styles, featuring various poses and environments, across a total of nine semantic classes. The proposed data-generation pipeline allows fast production of RGB images, depth images, and ground-truth label maps. Secondly, a novel multi-modal encoder-decoder convolutional network is proposed that operates on the RGB and depth modalities. Multi-modal features are merged by trained fusion modules that apply multi-scale atrous convolutions in the fusion process. The method is numerically evaluated on synthetic data and visually assessed on real-world data. The experiments demonstrate the efficiency of the proposed model over existing methods.
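The abstract mentions fusion modules built on multi-scale atrous (dilated) convolutions but gives no implementation details. The following is a minimal NumPy sketch of the general idea only, not the authors' actual module: the sum-based merge, the fixed averaging kernel, and the dilation rates (1, 2, 4) are all illustrative assumptions, whereas a trained module would learn its kernels and fusion weights.

```python
import numpy as np

def atrous_conv2d(x, kernel, rate):
    """Single-channel 2-D atrous (dilated) convolution, zero-padded
    so the output keeps the input's spatial size."""
    kh, kw = kernel.shape
    # Effective kernel extent once (rate - 1) holes sit between taps.
    eh, ew = (kh - 1) * rate + 1, (kw - 1) * rate + 1
    ph, pw = eh // 2, ew // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * xp[i * rate:i * rate + x.shape[0],
                                     j * rate:j * rate + x.shape[1]]
    return out

def fuse_rgb_depth(f_rgb, f_depth, rates=(1, 2, 4)):
    """Toy fusion: merge the two feature maps by summation, run
    parallel atrous branches at several dilation rates, and average
    the branches. Kernels here are fixed placeholders; a trained
    fusion module would learn them."""
    merged = f_rgb + f_depth
    kernel = np.ones((3, 3)) / 9.0  # placeholder averaging kernel
    branches = [atrous_conv2d(merged, kernel, r) for r in rates]
    return np.mean(branches, axis=0)

# Illustrative single-channel feature maps of an 8x8 spatial grid.
rgb_feat = np.ones((8, 8))
depth_feat = np.ones((8, 8))
fused = fuse_rgb_depth(rgb_feat, depth_feat)
print(fused.shape)  # spatial size is preserved: (8, 8)
```

The larger dilation rates enlarge the receptive field without adding parameters, which is why atrous branches at several rates are a common way to capture multi-scale context during fusion.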
Pages: 432-434 (3 pages)
Related papers
(showing 10 of 50)
  • [1] Computer catwalk: A multi-modal deep network for the segmentation of RGB-D images of clothes
    Joukovsky, B.
    Hu, P.
    Munteanu, A.
    Electronics Letters, 2020, 56 (09)
  • [2] DMFNet: Deep Multi-Modal Fusion Network for RGB-D Indoor Scene Segmentation
    Yuan, Jianzhong
    Zhou, Wujie
    Luo, Ting
    IEEE ACCESS, 2019, 7 : 169350 - 169358
  • [3] A Multi-Modal RGB-D Object Recognizer
    Faeulhammer, Thomas
    Zillich, Michael
    Prankl, Johann
    Vincze, Markus
    2016 23RD INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2016, : 733 - 738
  • [4] Multi-modal deep feature learning for RGB-D object detection
    Xu, Xiangyang
    Li, Yuncheng
    Wu, Gangshan
    Luo, Jiebo
    PATTERN RECOGNITION, 2017, 72 : 300 - 313
  • [5] RGB-D BASED MULTI-MODAL DEEP LEARNING FOR FACE IDENTIFICATION
    Lin, Tzu-Ying
    Chiu, Ching-Te
    Tang, Ching-Tung
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 1668 - 1672
  • [6] RGB-D based multi-modal deep learning for spacecraft and debris recognition
    AlDahoul, Nouar
    Karim, Hezerul Abdul
    Momo, Mhd Adel
    SCIENTIFIC REPORTS, 2022, 12 (01)
  • [7] RGB-D Object Recognition Using Multi-Modal Deep Neural Network and DS Evidence Theory
    Zeng, Hui
    Yang, Bin
    Wang, Xiuqing
    Liu, Jiwei
    Fu, Dongmei
    SENSORS, 2019, 19 (03)
  • [8] LinkNet: 2D-3D linked multi-modal network for online semantic segmentation of RGB-D videos
    Cai, Jun-Xiong
    Mu, Tai-Jiang
    Lai, Yu-Kun
    Hu, Shi-Min
    COMPUTERS & GRAPHICS-UK, 2021, 98 : 37 - 47
  • [9] Multi-modal uniform deep learning for RGB-D person re-identification
    Ren, Liangliang
    Lu, Jiwen
    Feng, Jianjiang
    Zhou, Jie
    PATTERN RECOGNITION, 2017, 72 : 446 - 457