RGB-D Human Matting: A Real-World Benchmark Dataset and a Baseline Method

Cited by: 9
Authors
Peng, Bo [1 ]
Zhang, Mingliang [1 ]
Lei, Jianjun [1 ]
Fu, Huazhu [2 ]
Shen, Haifeng [3 ]
Huang, Qingming [4 ]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China
[2] A*STAR, Inst High Performance Comp IHPC, Singapore 138632, Singapore
[3] Didi Chuxing, AIoT Platform, Beijing 100193, Peoples R China
[4] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 100190, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Human matting; RGB-D; dataset; baseline; IMAGE; SEGMENTATION;
DOI
10.1109/TCSVT.2023.3238580
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Codes
0808; 0809;
Abstract
The last decade has witnessed increasing exploration and development of human matting. However, existing matting works primarily focus on predicting better alpha mattes from RGB images alone. So far, few efforts have been devoted to tackling human matting in real-world activity scenarios with RGB-D information. To this end, this paper concentrates on the RGB-D human matting task and provides the first public RGB-D human matting benchmark dataset, as well as a baseline method for deep learning-based RGB-D human matting. To support research on RGB-D human matting, a new RGB-D human matting dataset (HDM-2K) is collected and released, which contains 2,270 high-resolution human images captured in various real-world scenarios together with their corresponding depth maps. Additionally, a baseline method for RGB-D human matting is proposed, which automatically generates the alpha matte by jointly exploiting the spatial structure information in the depth map and the detailed texture information in the RGB image. Finally, extensive experiments conducted on the HDM-2K dataset demonstrate that depth maps are effective for the matting task and that the proposed baseline method achieves promising performance on human matting.
Pages: 4041-4053
Page count: 13