DenseAttentionSeg: Segment hands from interacted objects using depth input

Cited by: 7

Authors:

Bo, Zi-Hao [1,2]

Zhang, Hao [1,2]

Yong, Jun-Hai [1,2]

Gao, Hao [3]

Xu, Feng [1,2]

Affiliations:

[1] Tsinghua Univ, BNRist, Beijing, Peoples R China

[2] Tsinghua Univ, Sch Software, Beijing, Peoples R China

[3] Nanjing Univ Posts & Telecommun, Inst Adv Technol, Nanjing, Peoples R China

Source:

APPLIED SOFT COMPUTING | 2020 / Vol. 92

Funding:

Beijing Natural Science Foundation; National Key Research and Development Program of China;

Keywords:

Hand-object interaction; Semantic segmentation; Artificial neural networks; Human-computer interaction; POSE ESTIMATION; PREDICTION;

DOI:

10.1016/j.asoc.2020.106297

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Hand segmentation is an important computer vision task that usually underpins hand pose recognition, hand tracking, and reconstruction. Segmentation is more challenging when the hand is interacting with objects, yet handling such interactions is more important for applications such as HCI and VR. In this paper, we propose a real-time DNN-based technique that segments the hand and the object during interaction from a single depth input. Our model, DenseAttentionSeg, uses a dense attention mechanism that effectively fuses information across scales and improves result quality through skip connections. In addition, we introduce a contour loss in model training, which helps generate accurate hand and object boundaries. Finally, we present InterSegHands, a fine-scale hand segmentation dataset containing about 52k depth maps of hand-object interactions with ground-truth segmentation masks. Our experiments evaluate the effectiveness of our techniques and datasets, and indicate that our method outperforms current state-of-the-art deep segmentation methods in handling hand-object interactions. (C) 2020 Elsevier B.V. All rights reserved.


Pages: 9
