Zero-Shot Learning for Gesture Recognition

Cited by: 4
Author
Madapana, Naveen [1 ]
Affiliation
[1] Purdue Univ, W Lafayette, IN 47906 USA
Source
PROCEEDINGS OF THE 2020 INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, ICMI 2020 | 2020
Funding
US Agency for Healthcare Research and Quality;
Keywords
Gesture recognition; zero-shot learning; transfer learning; few-shot learning;
DOI
10.1145/3382507.3421161
Chinese Library Classification (CLC) Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Zero-Shot Learning (ZSL) is a paradigm in machine learning that aims to recognize classes that are not present in the training data. Hence, it can comprehend categories that were never seen before. While deep learning has pushed the limits of unseen object recognition, ZSL for temporal problems such as unfamiliar gesture recognition (referred to as ZSGL) remains unexplored. ZSGL has the potential to yield efficient human-machine interfaces that can recognize and understand the spontaneous and conversational gestures of humans. In this regard, the objective of this work is to conceptualize, model and develop a framework to tackle ZSGL problems. The first step in the pipeline is to develop a database of gesture attributes that are representative of a range of categories. Next, a deep architecture consisting of convolutional and recurrent layers is proposed to jointly optimize the semantic and classification losses. Lastly, rigorous experiments are performed to compare the proposed model with existing ZSL models on the CGD 2013 and MSRC-12 datasets. In our preliminary work, we identified a list of 64 discriminative attributes related to gestures' morphological characteristics. Our approach yields an unseen-class accuracy of 41%, which outperforms the state-of-the-art approaches by a considerable margin. Future work involves the following: (1) modifying the existing architecture to improve ZSL accuracy, (2) augmenting the database of attributes to incorporate semantic properties, (3) addressing the issue of data imbalance, which is inherent to ZSL problems, and (4) expanding this research to other domains such as surgeme and action recognition.
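A minimal sketch of the kind of model the abstract describes: convolutional layers encode each frame, a recurrent layer aggregates the sequence, and two heads are trained jointly on a semantic (attribute-regression) loss and a seen-class classification loss, with unseen classes predicted by matching attribute signatures. The layer sizes, loss weighting, and signature matching shown here are illustrative assumptions, not the paper's published configuration.

# Hypothetical ZSGL-style architecture; all hyperparameters are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ZSGLNet(nn.Module):
    def __init__(self, in_channels=3, num_attributes=64, num_seen_classes=20):
        super().__init__()
        # Per-frame convolutional encoder.
        self.cnn = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # Recurrent layer aggregates frame features over time.
        self.rnn = nn.GRU(input_size=64, hidden_size=128, batch_first=True)
        # Two heads: attribute (semantic) regression and seen-class logits.
        self.attr_head = nn.Linear(128, num_attributes)
        self.cls_head = nn.Linear(128, num_seen_classes)

    def forward(self, clips):
        # clips: (batch, time, channels, height, width)
        b, t = clips.shape[:2]
        feats = self.cnn(clips.flatten(0, 1)).flatten(1).view(b, t, -1)
        _, h = self.rnn(feats)          # h: (1, batch, 128)
        h = h.squeeze(0)
        return self.attr_head(h), self.cls_head(h)

def joint_loss(attr_pred, cls_logits, attr_target, labels, alpha=0.5):
    # Semantic loss over the 64 gesture attributes, plus seen-class loss.
    semantic = F.binary_cross_entropy_with_logits(attr_pred, attr_target)
    classification = F.cross_entropy(cls_logits, labels)
    return alpha * semantic + (1 - alpha) * classification

def predict_unseen(attr_pred, unseen_signatures):
    # Zero-shot inference: nearest unseen-class attribute signature.
    dists = torch.cdist(torch.sigmoid(attr_pred), unseen_signatures)
    return dists.argmin(dim=1)

At inference, unseen_signatures would be the per-class attribute vectors from the gesture-attribute database; the nearest-signature rule is one common ZSL decision rule and may differ from the scoring used in the paper.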
Pages: 754 / 757
Number of pages: 4