Distilling interaction knowledge for semi-supervised egocentric action recognition

Cited by: 0
Authors
Wang, Haoran [1]
Yang, Jiahao [1]
Yu, Baosheng [2]
Zhan, Yibing [3]
Tao, Dapeng [4]
Ling, Haibin [5]
Affiliations
[1] Northeastern Univ, Coll Informat Sci & Engn, Shenyang 110819, Peoples R China
[2] Univ Sydney, Sch Comp Sci, Darlington, NSW 2008, Australia
[3] JD Explore Acad, Beijing 100176, Peoples R China
[4] Yunnan Univ, Sch Informat Sci & Engn, Kunming 650091, Yunnan, Peoples R China
[5] SUNY Stony Brook, Dept Comp Sci, Stony Brook, NY USA
Keywords
Knowledge distillation; Semi-supervised learning; Egocentric action recognition
DOI
10.1016/j.patcog.2024.110927
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Egocentric action recognition, the identification of actions in video captured from a first-person perspective, is receiving increasing attention due to the widespread adoption of wearable cameras. However, annotating actions in videos with cluttered backgrounds and many objects is labor-intensive. In this paper, we consider learning egocentric action recognition in a semi-supervised manner. Because videos captured from a first-person viewpoint usually contain rich content about how human hands interact with objects, we propose to employ a popular teacher-student framework and distill the interaction knowledge between hands and objects for semi-supervised egocentric action recognition. We refer to the proposed method as Interaction Knowledge Distillation, or IKD. Specifically, the teacher network takes hands and action-related objects in the labeled videos as input and uses graph neural networks to capture their spatial-temporal relations as graph edge features. The student network then takes the detected hands and objects from both labeled and unlabeled videos as input and mimics the teacher network, learning from the interactions to improve model performance. Experiments on two popular egocentric action recognition datasets, Something-Something-V2 and EPIC-KITCHENS-100, show that the proposed approach consistently outperforms recent state-of-the-art methods in typical semi-supervised settings.
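The abstract does not give the training objective, so the following is only a minimal sketch of how a teacher-student objective of this general kind might be combined, not the authors' formulation. It assumes (hypothetically) that teacher interaction features are available for every clip the student sees, that the student is pulled toward them with a mean-squared-error term, and that a cross-entropy term is added only when a label exists. All function names and the `distill_weight` parameter are illustrative.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Supervised term: negative log-probability of the true class."""
    return -math.log(softmax(logits)[label])


def feature_mse(student_feats, teacher_feats):
    """Distillation term (assumed): mean squared error between the
    student's and the teacher's interaction (edge) features."""
    diffs = [(s - t) ** 2 for s, t in zip(student_feats, teacher_feats)]
    return sum(diffs) / len(diffs)


def semi_supervised_distillation_loss(student_logits, student_feats,
                                      teacher_feats, label=None,
                                      distill_weight=1.0):
    """Hypothetical combined loss for one clip.

    Unlabeled clips (label is None) contribute only the distillation
    term; labeled clips also contribute cross-entropy.
    """
    loss = distill_weight * feature_mse(student_feats, teacher_feats)
    if label is not None:
        loss += cross_entropy(student_logits, label)
    return loss
```

In this sketch an unlabeled clip still produces a training signal through the feature-matching term, which is the general idea behind using a teacher in a semi-supervised setting; the weighting and the choice of distance are assumptions.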
Pages: 10