Single-Stage is Enough: Multi-Person Absolute 3D Pose Estimation

Cited by: 20
Authors
Jin, Lei [1]
Xu, Chenyang [1]
Wang, Xiaojuan [1]
Xiao, Yabo [1]
Guo, Yandong [2]
Nie, Xuecheng [3]
Zhao, Jian [4]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing, Peoples R China
[2] OPPO Res Inst, Hyderabad, Telangana, India
[3] Natl Univ Singapore, Singapore, Singapore
[4] Inst North Elect Equipment, Bengaluru, Karnataka, India
Source
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2022
Keywords
DOI
10.1109/CVPR52688.2022.01274
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Existing multi-person absolute 3D pose estimation methods are mainly based on a two-stage paradigm, i.e., top-down or bottom-up, which leads to redundant pipelines with high computation cost. We argue that it is more desirable to simplify this two-stage paradigm into a single-stage one to improve both efficiency and performance. To this end, we present an efficient single-stage solution, the Decoupled Regression Model (DRM), with three distinct novelties. First, DRM introduces a new decoupled representation for 3D pose, which expresses the 2D pose in the image plane and the depth information of each 3D human instance via a 2D center point (the center of visible keypoints) and a root point (defined as the pelvis), respectively. Second, to learn a better feature representation for human depth regression, DRM introduces a 2D Pose-guided Depth Query Module (PDQM) to extract the features in the 2D pose regression branch, enabling the depth regression branch to perceive the scale information of instances. Third, DRM leverages a Decoupled Absolute Pose Loss (DAPL) to facilitate absolute root depth and root-relative depth estimation, thus improving the accuracy of the absolute 3D pose. Comprehensive experiments on challenging benchmarks, including MuPoTS-3D and Panoptic, verify the superiority of our framework, which outperforms the state-of-the-art bottom-up absolute 3D pose estimation methods.
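As an illustrative sketch only (not the authors' DRM implementation), the split between a 2D pose, an absolute root depth, and root-relative depths can be read as follows: once all three are estimated, each keypoint's absolute depth is the root depth plus its relative offset, and the 2D location can then be back-projected into camera space. The sketch assumes a pinhole camera with known intrinsics `fx`, `fy`, `cx`, `cy`, which the abstract does not specify; the function name `back_project` and all parameter names are hypothetical.

```python
from typing import List, Sequence, Tuple


def back_project(
    keypoints_2d: Sequence[Tuple[float, float]],
    root_depth: float,
    relative_depths: Sequence[float],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Tuple[float, float, float]]:
    """Combine a 2D pose with decoupled depths into camera-space 3D points.

    Assumption (not from the abstract): a pinhole camera model with
    focal lengths (fx, fy) and principal point (cx, cy), all in pixels.
    Depths are in any consistent unit, for example millimetres.
    """
    if len(keypoints_2d) != len(relative_depths):
        raise ValueError("need one relative depth per 2D keypoint")

    pose_3d = []
    for (u, v), dz in zip(keypoints_2d, relative_depths):
        # Absolute keypoint depth = absolute root depth + root-relative offset.
        z = root_depth + dz
        # Invert the pinhole projection u = fx * x / z + cx (same for v).
        x = (u - cx) * z / fx
        y = (v - cy) * z / fy
        pose_3d.append((x, y, z))
    return pose_3d


# Hypothetical example: a root keypoint at the principal point, 3 m away,
# and a second keypoint 100 px to the right and 100 mm farther from the camera.
example_pose = back_project(
    keypoints_2d=[(960.0, 540.0), (1060.0, 540.0)],
    root_depth=3000.0,
    relative_depths=[0.0, 100.0],
    fx=1000.0,
    fy=1000.0,
    cx=960.0,
    cy=540.0,
)
```

In this reading, an error in the root depth shifts and rescales the whole person in camera space, while an error in a relative depth affects only one joint, which is one plausible motivation for supervising the two quantities separately.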
Pages: 13076-13085
Number of pages: 10