Multi-modal Gesture Recognition Challenge 2013: Dataset and Results

Cited by: 95
Authors
Escalera, Sergio [1, 2]
Gonzalez, Jordi [2, 3]
Baro, Xavier [2, 4]
Reyes, Miguel [1, 2]
Lopes, Oscar [5]
Guyon, Isabelle [6]
Athitsos, Vassilis [7]
Escalante, Hugo J. [8]
Affiliations
[1] Univ Barcelona, Dept Appl Math, E-08007 Barcelona, Spain
[2] UAB, Comp Vis Ctr, Barcelona, Spain
[3] UAB, Dept Comp Sci, Barcelona, Spain
[4] Open Univ Catalonia, EIMT, Barcelona, Spain
[5] Comp Vis Ctr, Barcelona, Spain
[6] ChaLearn, Berkeley, CA USA
[7] Univ Texas, Austin, TX USA
[8] INAOE, Puebla, Mexico
Source
ICMI'13: PROCEEDINGS OF THE 2013 ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION | 2013
DOI
10.1145/2522848.2532595
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
The recognition of continuous natural gestures is a complex and challenging problem due to the multi-modal nature of the visual cues involved (e.g. finger and lip movements, subtle facial expressions, body pose, etc.), as well as technical limitations such as spatial and temporal resolution and unreliable depth cues. To promote research advances in this field, we organized a challenge on multi-modal gesture recognition. We made available a large video database of 13,858 gestures from a lexicon of 20 Italian gesture categories recorded with a Kinect™ camera, providing the audio, skeletal model, user mask, RGB and depth images. The focus of the challenge was on user-independent multiple gesture learning. There are no resting positions, and the gestures are performed in continuous sequences lasting 1-2 minutes, each containing between 8 and 20 gesture instances. As a result, the dataset contains around 1,720,800 frames. In addition to the 20 main gesture categories, 'distracter' gestures are included, meaning that additional audio and out-of-vocabulary gestures are present. The final evaluation of the challenge was defined in terms of the Levenshtein edit distance, where the goal was to indicate the real order of gestures within the sequence. 54 international teams participated in the challenge, and outstanding results were obtained by the top-ranked participants.
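The abstract states that entries were scored with the Levenshtein edit distance between the predicted and the true order of gestures in a sequence. A minimal sketch of that distance follows, assuming each gesture is encoded as an integer label; the function name `levenshtein`, the label encoding, and the example sequences are illustrative assumptions, and this record does not say how the challenge normalized or aggregated the per-sequence distances.

```python
from typing import Sequence


def levenshtein(truth: Sequence[int], pred: Sequence[int]) -> int:
    """Minimum number of insertions, deletions and substitutions
    needed to turn the predicted label sequence into the true one."""
    # prev[j] is the distance between the truth prefix processed so far
    # and the first j predicted labels.
    prev = list(range(len(pred) + 1))
    for i, t in enumerate(truth, start=1):
        curr = [i]  # distance from i true labels to an empty prediction
        for j, p in enumerate(pred, start=1):
            cost = 0 if t == p else 1
            curr.append(min(
                prev[j] + 1,         # true label t is missing from the prediction
                curr[j - 1] + 1,     # predicted label p is spurious
                prev[j - 1] + cost,  # labels match, or one is substituted
            ))
        prev = curr
    return prev[-1]


# Hypothetical sequence: four true gestures, five predicted ones.
true_order = [3, 7, 12, 5]
predicted_order = [3, 12, 12, 5, 9]
print(levenshtein(true_order, predicted_order))  # one substitution plus one spurious label
```

Because the distance depends only on the order of labels, it penalizes missed, spurious and misclassified gestures alike without requiring frame-accurate boundaries.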
Pages: 445-452
Number of pages: 8
Related papers
5 items in total
  • [1] The Pascal Visual Object Classes (VOC) Challenge
    Everingham, Mark
    Van Gool, Luc
    Williams, Christopher K. I.
    Winn, John
    Zisserman, Andrew
    [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2010, 88(2): 303-338
  • [2] Guyon I., 2012, ICPR
  • [3] Human limb segmentation in depth maps based on spatio-temporal Graph-cuts optimization
    Hernandez-Vela, Antonio
    Zlateva, Nadezhda
    Marinov, Alexander
    Reyes, Miguel
    Radeva, Petia
    Dimov, Dimo
    Escalera, Sergio
    [J]. JOURNAL OF AMBIENT INTELLIGENCE AND SMART ENVIRONMENTS, 2012, 4(6): 535-546
  • [4] Pedregosa F, 2011, J MACH LEARN RES, V12, P2825
  • [5] Shotton J, 2011, PROC CVPR IEEE, P1297, DOI 10.1109/CVPR.2011.5995316