Extensive Facial Landmark Localization with Coarse-to-fine Convolutional Network Cascade

Cited by: 249
Authors
Zhou, Erjin [1]
Fan, Haoqiang [1]
Cao, Zhimin [1]
Jiang, Yuning [1]
Yin, Qi [1]
Affiliation
[1] Megvii Inc., Beijing, People's Republic of China
Source
2013 IEEE International Conference on Computer Vision Workshops (ICCVW) | 2013
DOI
10.1109/ICCVW.2013.58
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a new approach to localizing extensive facial landmarks with a coarse-to-fine convolutional network cascade. Deep convolutional neural networks (DCNNs) have been used successfully in facial landmark localization because of two advantages: 1) geometric constraints among facial points are implicitly utilized; 2) a huge amount of training data can be leveraged. However, extensive facial landmark localization requires a large number of facial landmarks (more than 50 points) to be located in a unified system, which makes both the structure design and the training of traditional convolutional networks very difficult. In this paper, we design a four-level convolutional network cascade that tackles the problem in a coarse-to-fine manner. In our system, each network level is trained to locally refine a subset of the facial landmarks generated by previous network levels. In addition, each level predicts explicit geometric constraints (the position and rotation angles of a specific facial component) to rectify the inputs of the current network level. The combination of the coarse-to-fine cascade and geometric refinement enables our system to locate extensive facial landmarks (68 points) accurately in the 300-W facial landmark localization challenge.
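The control flow the abstract describes (refine a subset of landmarks at each level, after rectifying that subset by a component's position and rotation) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the dictionary layout of a level, and the `estimate_pose` and `predict_offsets` callables are hypothetical stand-ins for the trained networks, and the rectification is assumed to be a plain 2-D translation plus rotation.

```python
import numpy as np


def rotation_matrix(angle):
    """2-D counter-clockwise rotation matrix for an angle in radians."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s], [s, c]])


def rectify(points, center, angle):
    """Move points into a component-aligned frame: shift by center, rotate by -angle."""
    return (points - center) @ rotation_matrix(-angle).T


def unrectify(points, center, angle):
    """Inverse of rectify: rotate by +angle, then shift back by center."""
    return points @ rotation_matrix(angle).T + center


def run_cascade(initial, levels):
    """Apply each level in order; each level refines only its own landmark subset.

    Every level is a dict (hypothetical layout) with:
      "indices":         landmark indices this level refines
      "estimate_pose":   callable(points) -> (center, angle), stand-in for the
                         predicted geometric constraints
      "predict_offsets": callable(local_points) -> offsets, stand-in for the
                         refinement network
    """
    landmarks = np.asarray(initial, dtype=float).copy()
    for level in levels:
        idx = level["indices"]
        center, angle = level["estimate_pose"](landmarks[idx])
        local = rectify(landmarks[idx], center, angle)     # rectified input
        local = local + level["predict_offsets"](local)    # local refinement
        landmarks[idx] = unrectify(local, center, angle)   # back to image frame
    return landmarks
```

In this sketch a coarse level would list all landmark indices and later levels would list the indices of one facial component each; how the paper actually partitions the 68 points across its four levels is not stated in the abstract.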
Pages: 386-391
Page count: 6