FSA-Net: Learning Fine-Grained Structure Aggregation for Head Pose Estimation from a Single Image

Cited by: 202

Authors:

Yang, Tsun-Yi ^{[1,2]}

Chen, Yi-Ting ^{[1]}

Lin, Yen-Yu ^{[1]}

Chuang, Yung-Yu ^{[1,2]}

Affiliations:

[1] Academia Sinica, Taipei, Taiwan

[2] National Taiwan University, Taipei, Taiwan

Source:

2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019) | 2019

Keywords:

Face alignment; Models

DOI:

10.1109/CVPR.2019.00118

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This paper proposes a method for head pose estimation from a single image. Previous methods often predict head poses through landmark or depth estimation and require more computation than necessary. Our method is based on regression and feature aggregation. To keep the model compact, we employ the soft stagewise regression scheme. Existing feature aggregation methods treat inputs as a bag of features and thus ignore their spatial relationship in a feature map. We propose to learn a fine-grained structure mapping for spatially grouping features before aggregation. The fine-grained structure provides part-based information and pooled values. By utilizing learnable and non-learnable importance over the spatial location, different model variants can be generated and form a complementary ensemble. Experiments show that our method outperforms state-of-the-art methods, including both landmark-free ones and those based on landmark or depth estimation. With only a single RGB frame as input, our method even outperforms methods utilizing multimodal information (RGB-D, RGB-Time) on estimating the yaw angle. Furthermore, the memory overhead of our model is 100× smaller than that of previous methods.
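
The idea of weighting spatial locations by importance and grouping them before aggregation can be illustrated with a small sketch. This is not the authors' implementation: the function names, the fixed `mapping` matrix, the normalization by total weight, and the choice of channel variance as the non-learnable score are all assumptions made for illustration. In the paper the mapping is learned, whereas here it is supplied by hand.

```python
import math

def importance_variance(feature_map):
    """Non-learnable importance (assumed): variance across channels per location."""
    scores = []
    for feat in feature_map:  # feat holds the C channel values at one location
        mean = sum(feat) / len(feat)
        scores.append(sum((v - mean) ** 2 for v in feat) / len(feat))
    return scores

def importance_learned(feature_map, weights, bias=0.0):
    """Learnable importance (assumed): sigmoid of a per-location linear score,
    which behaves like a 1x1 convolution with one output channel."""
    return [
        1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(weights, feat)) + bias)))
        for feat in feature_map
    ]

def group_features(feature_map, importance, mapping):
    """Mix importance-weighted locations into groups.

    mapping is a hypothetical (n_groups x n_locations) matrix; each row says
    which locations contribute to one group. Rows are normalized by their
    total weight (an illustrative choice, not taken from the paper).
    """
    n_channels = len(feature_map[0])
    groups = []
    for row in mapping:
        weights = [m * a for m, a in zip(row, importance)]
        total = sum(weights) or 1.0  # avoid dividing by zero for empty groups
        groups.append([
            sum(w * feat[c] for w, feat in zip(weights, feature_map)) / total
            for c in range(n_channels)
        ])
    return groups

# A 2x2 feature map flattened to 4 locations, 2 channels each.
feature_map = [[1.0, 3.0], [2.0, 2.0], [0.0, 4.0], [5.0, 5.0]]
mapping = [[1, 1, 0, 0],   # group 0: top row
           [0, 0, 1, 1]]   # group 1: bottom row
importance = importance_variance(feature_map)
groups = group_features(feature_map, importance, mapping)
```

Swapping `importance_variance` for `importance_learned` (or for a uniform score) gives a different variant over the same features, which is the sense in which the abstract describes variants forming an ensemble.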


Pages: 1087-1096

Page count: 10
