Deep Learning Methods for 3D Human Pose Estimation under Different Supervision Paradigms: A Survey

Cited by: 14
Authors
Zhang, Dejun [1]
Wu, Yiqi [2, 3]
Guo, Mingyue [4]
Chen, Yilin [5]
Affiliations
[1] China Univ Geosci, Sch Geog & Informat Engn, Wuhan 430078, Peoples R China
[2] China Univ Geosci, Coll Comp Sci, Wuhan 430078, Peoples R China
[3] China Univ Geosci, Hubei Key Lab Intelligent Geoinformat Proc, Wuhan 430078, Peoples R China
[4] Sichuan Agr Univ, Coll Informat & Engn, Yaan 625014, Peoples R China
[5] Wuhan Inst Technol, Sch Comp Sci & Engn, Wuhan 430205, Peoples R China
Funding
U.S. National Science Foundation
Keywords
3D human pose estimation; deep learning; unsupervised; semi-supervised; fully-supervised; weakly-supervised;
DOI
10.3390/electronics10182267
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The rise of deep learning technology has broadly promoted the practical application of artificial intelligence in production and daily life. In computer vision, many human-centered applications, such as video surveillance, human-computer interaction, and digital entertainment, rely heavily on accurate and efficient human pose estimation techniques. Inspired by the remarkable achievements in learning-based 2D human pose estimation, numerous research studies are devoted to 3D human pose estimation via deep learning methods. Against this backdrop, this paper provides an extensive survey of recent literature on deep learning methods for 3D human pose estimation in order to present the development of this research, track the latest research trends, and analyze the characteristics of the different types of methods. The literature is reviewed along the general pipeline of 3D human pose estimation, which consists of human body modeling, learning-based pose estimation, and regularization for refinement. Unlike existing reviews of the same topic, this paper focuses on deep learning-based methods. Learning-based pose estimation is discussed in two categories, single-person and multi-person, and each is further divided by data type into image-based and video-based methods. Moreover, because data is so significant for learning-based methods, this paper surveys 3D human pose estimation methods according to a taxonomy of supervision form. Finally, this paper lists the current and widely used datasets and compares the performance of the reviewed methods. Based on this literature survey, it can be concluded that each branch of 3D human pose estimation starts with fully-supervised methods, and that there is still much room for research on multi-person pose estimation under other supervision paradigms, from both images and video. Despite the significant development of 3D human pose estimation via deep learning, the inherent ambiguity and occlusion problems remain challenging issues that need to be better addressed.
Pages: 25