AdaFuse: Adaptive Multiview Fusion for Accurate Human Pose Estimation in the Wild

Cited by: 71
Authors
Zhang, Zhe [1]
Wang, Chunyu [2]
Qiu, Weichao [3]
Qin, Wenhu [1]
Zeng, Wenjun [2]
Affiliations
[1] Southeast Univ, Nanjing, Peoples R China
[2] Microsoft Res Asia, Beijing, Peoples R China
[3] Johns Hopkins Univ, Baltimore, MD USA
Keywords
Human pose estimation; Multiple camera fusion; Epipolar geometry; Motion capture; Tracking
DOI
10.1007/s11263-020-01398-9
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Occlusion is probably the biggest challenge for human pose estimation in the wild. Typical solutions often rely on intrusive sensors such as IMUs to detect occluded joints. To make the task truly unconstrained, we present AdaFuse, an adaptive multiview fusion method, which can enhance the features in occluded views by leveraging those in visible views. The core of AdaFuse is to determine the point-point correspondence between two views which we solve effectively by exploring the sparsity of the heatmap representation. We also learn an adaptive fusion weight for each camera view to reflect its feature quality in order to reduce the chance that good features are undesirably corrupted by "bad" views. The fusion model is trained end-to-end with the pose estimation network, and can be directly applied to new camera configurations without additional adaptation. We extensively evaluate the approach on three public datasets including Human3.6M, Total Capture and CMU Panoptic. It outperforms the state-of-the-arts on all of them. We also create a large scale synthetic dataset Occlusion-Person, which allows us to perform numerical evaluation on the occluded joints, as it provides occlusion labels for every joint in the images. The dataset and code are released at .
Pages: 703-718
Page count: 16