Head and Body Orientation Estimation Using Convolutional Random Projection Forests

Cited by: 12

Authors:

Lee, Donghoon [1,2]

Yang, Ming-Hsuan [3]

Oh, Songhwai [1,2]

Affiliations:

[1] Seoul Natl Univ, Dept Elect & Comp Engn, 1 Gwanak Ro, Seoul 08826, South Korea

[2] Seoul Natl Univ, ASRI, 1 Gwanak Ro, Seoul 08826, South Korea

[3] Univ Calif Merced, Sch Engn, Merced, CA 95344 USA

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2019 / Vol. 41 / Issue 01

Funding:

National Research Foundation, Singapore;

Keywords:

Head pose estimation; body orientation estimation; random forests; convolutional neural network; compressive sensing; POSE ESTIMATION;

DOI:

10.1109/TPAMI.2017.2784424

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we consider the problem of estimating the head pose and body orientation of a person from a low-resolution image. Under this setting, it is difficult to reliably extract facial features or detect body parts. We propose a convolutional random projection forest (CRPforest) algorithm for these tasks. A convolutional random projection network (CRPnet) is used at each node of the forest. It maps an input image to a high-dimensional feature space using a rich filter bank. The filter bank is designed to generate sparse responses so that they can be efficiently computed by compressive sensing. A sparse random projection matrix can capture most essential information contained in the filter bank without using all the filters in it. Therefore, the CRPnet is fast, e.g., it requires 0.04 ms to process an image of 50 x 50 pixels, due to the small number of convolutions (e.g., 0.01 percent of a layer of a neural network) at the expense of less than 2 percent accuracy. The overall forest estimates head and body pose well on benchmark datasets, e.g., over 98 percent on the HIIT dataset, while requiring 3.8 ms without using a GPU. Extensive experiments on challenging datasets show that the proposed algorithm performs favorably against the state-of-the-art methods in low-resolution images with noise, occlusion, and motion blur.
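The abstract's claim that a sparse random projection can stand in for the full filter bank rests on the linearity of convolution: projecting the responses of many filters gives the same result as filtering once with a projected (combined) filter. The sketch below illustrates only that identity in NumPy. It is not the authors' CRPnet; the filter bank here is random, the sizes (256 filters of 5 x 5, 8 projected outputs) are arbitrary, and the projection matrix uses one standard sparse construction with entries sqrt(s) * {+1, 0, -1}, all of which are assumptions made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def correlate_valid(image, kernel):
    """Valid-mode 2-D cross-correlation using only NumPy."""
    windows = sliding_window_view(image, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)


def sparse_projection_matrix(n_out, n_in, s, rng):
    """Entries are sqrt(s) * {+1, 0, -1} with probabilities 1/(2s), 1 - 1/s, 1/(2s)."""
    values = rng.choice([1.0, 0.0, -1.0], size=(n_out, n_in),
                        p=[0.5 / s, 1.0 - 1.0 / s, 0.5 / s])
    return np.sqrt(s) * values


def project_then_filter(image, filters, projection):
    """Combine the filters first, then run only n_out correlations."""
    # combined[m] = sum_k projection[m, k] * filters[k]
    combined = np.tensordot(projection, filters, axes=([1], [0]))
    return np.stack([correlate_valid(image, f) for f in combined])


def filter_then_project(image, filters, projection):
    """Reference path: run all n_in correlations, then project the responses."""
    responses = np.stack([correlate_valid(image, f) for f in filters])
    return np.tensordot(projection, responses, axes=([1], [0]))


# Hypothetical sizes chosen for illustration only.
rng = np.random.default_rng(0)
image = rng.standard_normal((50, 50))
filters = rng.standard_normal((256, 5, 5))
projection = sparse_projection_matrix(8, 256, s=16, rng=rng)

fast = project_then_filter(image, filters, projection)
slow = filter_then_project(image, filters, projection)
print(fast.shape, np.allclose(fast, slow))
```

Both paths produce the same projected feature maps, but the first needs 8 correlations per image instead of 256, which is the kind of saving the abstract attributes to using a small number of convolutions. Any nonlinearity or tree-node decision logic used in the actual method is outside the scope of this sketch.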


Pages: 107-120

Page count: 14

