CrossFuNet: RGB and Depth Cross-Fusion Network for Hand Pose Estimation

Cited by: 5
Authors
Sun, Xiaojing [1]
Wang, Bin [1]
Huang, Longxiang [2]
Zhang, Qian [1]
Zhu, Sulei [1]
Ma, Yan [1]
Affiliations
[1] Shanghai Normal Univ, Coll Informat Mech & Elect Engn, Shanghai 200234, Peoples R China
[2] Shenzhen Guangjian Technol Co Ltd, Shanghai 200135, Peoples R China
Keywords
hand pose estimation; convolutional neural network; RGBD fusion; 3D HAND;
DOI
10.3390/s21186095
Chinese Library Classification
O65 [Analytical Chemistry];
Discipline classification codes
070302; 081704;
Abstract
Despite recent successes in hand pose estimation from RGB images or depth maps, inherent challenges remain. RGB-based methods suffer from heavy self-occlusion and depth ambiguity. Depth sensors rely heavily on distance and can only be used indoors, which limits the practical application of depth-based methods. These challenges motivate combining the two modalities so that each offsets the shortcomings of the other. In this paper, we propose CrossFuNet, a novel RGB and depth fusion network that improves the accuracy of 3D hand pose estimation. Specifically, the RGB image and its paired depth map are fed into two separate subnetworks. The resulting feature maps are combined in a fusion module, for which we propose a new approach to merging information from the two modalities. The 3D keypoints are then regressed from heatmaps using a common method. We validate our model on two public datasets, and the results show that it outperforms state-of-the-art methods.
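The abstract does not specify how keypoints are read out of the heatmaps, so the following is only a minimal sketch of one common decoding scheme, not the authors' method. It assumes one score map per joint, takes the highest-scoring cell as the 2D location, and reads the depth value at that pixel. The names `decode_keypoint` and `decode_hand` and the toy inputs are hypothetical.

```python
from typing import List, Tuple

Grid = List[List[float]]  # H x W values stored row by row


def decode_keypoint(heatmap: Grid) -> Tuple[int, int]:
    """Return (u, v) = (column, row) of the highest-scoring heatmap cell."""
    best_u, best_v, best_score = 0, 0, float("-inf")
    for v, row in enumerate(heatmap):
        for u, score in enumerate(row):
            if score > best_score:
                best_u, best_v, best_score = u, v, score
    return best_u, best_v


def decode_hand(heatmaps: List[Grid], depth: Grid) -> List[Tuple[int, int, float]]:
    """Decode one (u, v, z) triple per joint.

    Assumption for illustration: z is taken directly from the paired
    depth map at the decoded pixel, which the abstract does not state.
    """
    keypoints = []
    for heatmap in heatmaps:
        u, v = decode_keypoint(heatmap)
        keypoints.append((u, v, depth[v][u]))
    return keypoints


if __name__ == "__main__":
    # Two toy 2 x 2 heatmaps (two joints) and a matching depth map.
    heatmaps = [
        [[0.1, 0.9], [0.0, 0.2]],
        [[0.0, 0.1], [0.7, 0.3]],
    ]
    depth = [[400.0, 410.0], [420.0, 430.0]]
    print(decode_hand(heatmaps, depth))
```

A hard argmax is the simplest choice here; a differentiable soft-argmax is often preferred when the decoding step needs to be trained end to end.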
Pages: 17