Multi-modal deep network for RGB-D segmentation of clothes

Cited by: 7
Authors
Joukovsky, B. [1 ]
Hu, P. [1 ]
Munteanu, A. [1 ]
Affiliations
[1] Vrije Univ Brussel, Dept Elect & Informat, Brussels, Belgium
Keywords
image fusion; learning (artificial intelligence); image segmentation; image colour analysis; synthetic data; real-world data; multimodal deep network; RGB-D segmentation; clothes; deep learning; semantic segmentation; synthetic dataset; different clothing styles; semantic classes; data generation pipeline; depth images; ground-truth label maps; novel multimodal encoder-decoder convolutional network; depth modalities; multimodal features; trained fusion modules; multiscale atrous convolutions
DOI
10.1049/el.2019.4150
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic and communication technology]
Discipline code
0808; 0809
Abstract
In this Letter, the authors propose a deep-learning-based method for semantic segmentation of clothes in RGB-D images of people. First, they present a synthetic dataset containing more than 50,000 RGB-D samples of characters in different clothing styles, featuring various poses and environments, across a total of nine semantic classes. The proposed data-generation pipeline allows fast production of RGB images, depth images, and ground-truth label maps. Secondly, a novel multi-modal encoder-decoder convolutional network is proposed that operates on the RGB and depth modalities. Multi-modal features are merged by trained fusion modules that apply multi-scale atrous convolutions in the fusion process. The method is numerically evaluated on synthetic data and visually assessed on real-world data. The experiments demonstrate the efficiency of the proposed model over existing methods.
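The abstract mentions fusion modules built on multi-scale atrous (dilated) convolutions but gives no implementation details. The following is a minimal NumPy sketch of the general idea only, not the authors' actual module: the sum-based merge, the fixed averaging kernel, and the dilation rates (1, 2, 4) are all illustrative assumptions, whereas a trained module would learn its kernels and fusion weights.

```python
import numpy as np

def atrous_conv2d(x, kernel, rate):
    """Single-channel 2-D atrous (dilated) convolution, zero-padded
    so the output keeps the input's spatial size."""
    kh, kw = kernel.shape
    # Effective kernel extent once (rate - 1) holes sit between taps.
    eh, ew = (kh - 1) * rate + 1, (kw - 1) * rate + 1
    ph, pw = eh // 2, ew // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * xp[i * rate:i * rate + x.shape[0],
                                     j * rate:j * rate + x.shape[1]]
    return out

def fuse_rgb_depth(f_rgb, f_depth, rates=(1, 2, 4)):
    """Toy fusion: merge the two feature maps by summation, run
    parallel atrous branches at several dilation rates, and average
    the branches. Kernels here are fixed placeholders; a trained
    fusion module would learn them."""
    merged = f_rgb + f_depth
    kernel = np.ones((3, 3)) / 9.0  # placeholder averaging kernel
    branches = [atrous_conv2d(merged, kernel, r) for r in rates]
    return np.mean(branches, axis=0)

# Illustrative single-channel feature maps of an 8x8 spatial grid.
rgb_feat = np.ones((8, 8))
depth_feat = np.ones((8, 8))
fused = fuse_rgb_depth(rgb_feat, depth_feat)
print(fused.shape)  # spatial size is preserved: (8, 8)
```

The larger dilation rates enlarge the receptive field without adding parameters, which is why atrous branches at several rates are a common way to capture multi-scale context during fusion.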
Pages: 432-434 (3 pages)
Related papers
(showing 10 of 50)
  • [1] Computer catwalk: A multi-modal deep network for the segmentation of RGB-D images of clothes
    Joukovsky, B.
    Hu, P.
    Munteanu, A.
    Electronics Letters, 2020, 56 (09)
  • [2] DMFNet: Deep Multi-Modal Fusion Network for RGB-D Indoor Scene Segmentation
    Yuan, Jianzhong
    Zhou, Wujie
    Luo, Ting
    IEEE ACCESS, 2019, 7 : 169350 - 169358
  • [3] A Multi-Modal RGB-D Object Recognizer
    Faeulhammer, Thomas
    Zillich, Michael
    Prankl, Johann
    Vincze, Markus
    2016 23RD INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2016, : 733 - 738
  • [4] Multi-modal deep feature learning for RGB-D object detection
    Xu, Xiangyang
    Li, Yuncheng
    Wu, Gangshan
    Luo, Jiebo
    PATTERN RECOGNITION, 2017, 72 : 300 - 313
  • [5] RGB-D BASED MULTI-MODAL DEEP LEARNING FOR FACE IDENTIFICATION
    Lin, Tzu-Ying
    Chiu, Ching-Te
    Tang, Ching-Tung
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 1668 - 1672
  • [6] RGB-D based multi-modal deep learning for spacecraft and debris recognition
    AlDahoul, Nouar
    Karim, Hezerul Abdul
    Momo, Mhd Adel
    SCIENTIFIC REPORTS, 2022, 12 (01)
  • [7] RGB-D Object Recognition Using Multi-Modal Deep Neural Network and DS Evidence Theory
    Zeng, Hui
    Yang, Bin
    Wang, Xiuqing
    Liu, Jiwei
    Fu, Dongmei
    SENSORS, 2019, 19 (03)
  • [8] LinkNet: 2D-3D linked multi-modal network for online semantic segmentation of RGB-D videos
    Cai, Jun-Xiong
    Mu, Tai-Jiang
    Lai, Yu-Kun
    Hu, Shi-Min
    COMPUTERS & GRAPHICS-UK, 2021, 98 : 37 - 47
  • [9] Multi-modal uniform deep learning for RGB-D person re-identification
    Ren, Liangliang
    Lu, Jiwen
    Feng, Jianjiang
    Zhou, Jie
    PATTERN RECOGNITION, 2017, 72 : 446 - 457