Bidirectional Optimization Coupled Lightweight Networks for Efficient and Robust Multi-Person 2D Pose Estimation

Cited by: 0
Authors
Shuai Li
Zheng Fang
Wen-Feng Song
Ai-Min Hao
Hong Qin
Affiliations
[1] Beihang University, State Key Laboratory of Virtual Reality Technology and Systems
[2] Beihang University Qingdao Research Institute, Department of Computer Science
[3] Stony Brook University
Source
Journal of Computer Science and Technology | 2019 / Vol. 34
Keywords
bidirectional optimization; computer vision; deep learning; probability limb heat map; 2D multi-person pose estimation
Abstract
For multi-person 2D pose estimation, current deep-learning-based methods have exhibited impressive performance, but trade-offs among efficiency, robustness, and accuracy remain unavoidable in existing approaches. In principle, bottom-up methods are more efficient than top-down methods but less accurate. To make full use of their respective advantages, in this paper we design a novel bidirectional optimization coupled lightweight network (BOCLN) architecture for efficient, robust, and general-purpose multi-person 2D (2-dimensional) pose estimation from natural images. Within the BOCLN framework, the bottom-up network focuses on global features, while the top-down network emphasizes detailed features. The entire framework shares global features along the bottom-up data stream, while the top-down data stream aims to accelerate accurate pose estimation. In particular, to exploit priors on the relationships among human joints, we propose a probability limb heat map that represents the spatial context of the joints and guides the overall pose skeleton prediction, so that each person’s pose estimate in cluttered scenes (involving crowds) is as accurate and robust as possible. Benefiting from the novel BOCLN architecture, the time-consuming refinement procedure can therefore be simplified to an efficient lightweight network. Extensive experiments and evaluations on public benchmarks confirm that our new method is more efficient and robust while still attaining accuracy competitive with state-of-the-art methods. Our BOCLN shows even greater promise in online applications.
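The abstract does not specify how the probability limb heat map is constructed. As a rough illustration of the general idea only, the sketch below assumes a limb is the straight segment between two joint locations and assigns each pixel a value that decays as a Gaussian of its distance to that segment. The function name `limb_heat_map`, the parameter `sigma`, and the Gaussian form are all assumptions made for this example, not the authors' formulation.

```python
import math


def limb_heat_map(joint_a, joint_b, width, height, sigma=1.5):
    """Hypothetical limb heat map (an assumption, not the paper's definition).

    Returns a height x width grid of values in (0, 1] that peak on the
    segment joining joint_a and joint_b, both given as (x, y) pixels.
    """
    ax, ay = joint_a
    bx, by = joint_b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy

    heat = []
    for y in range(height):
        row = []
        for x in range(width):
            if length_sq == 0:
                # Degenerate limb: both joints coincide.
                t = 0.0
            else:
                # Project the pixel onto the limb and clamp to the segment.
                t = ((x - ax) * dx + (y - ay) * dy) / length_sq
                t = max(0.0, min(1.0, t))
            px, py = ax + t * dx, ay + t * dy
            dist_sq = (x - px) ** 2 + (y - py) ** 2
            # Gaussian falloff with distance from the segment.
            row.append(math.exp(-dist_sq / (2.0 * sigma ** 2)))
        heat.append(row)
    return heat


# Example: a horizontal limb from (2, 2) to (6, 2) on a 9 x 5 grid.
heat = limb_heat_map((2, 2), (6, 2), width=9, height=5)
```

In this sketch, pixels lying on the segment receive the maximum value and the response fades smoothly away from it, which is one simple way a map could encode the spatial context between two connected joints.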
Pages: 522-536
Number of pages: 14