Distilling interaction knowledge for semi-supervised egocentric action recognition

Citations: 0
Authors
Wang, Haoran [1]
Yang, Jiahao [1]
Yu, Baosheng [2]
Zhan, Yibing [3]
Tao, Dapeng [4]
Ling, Haibin [5]
Affiliations
[1] Northeastern Univ, Coll Informat Sci & Engn, Shenyang 110819, Peoples R China
[2] Univ Sydney, Sch Comp Sci, Darlington, NSW 2008, Australia
[3] JD Explore Acad, Beijing 100176, Peoples R China
[4] Yunnan Univ, Sch Informat Sci & Engn, Kunming 650091, Yunnan, Peoples R China
[5] SUNY Stony Brook, Dept Comp Sci, Stony Brook, NY USA
Keywords
Knowledge distillation; Semi-supervised learning; Egocentric action recognition
DOI
10.1016/j.patcog.2024.110927
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Egocentric action recognition, the identification of actions in video captured from a first-person perspective, is receiving increasing attention due to the widespread adoption of wearable cameras. However, annotating actions in videos with cluttered backgrounds and many objects is labor-intensive. In this paper, we consider learning egocentric action recognition in a semi-supervised manner. Motivated by the observation that first-person videos usually contain rich information about how human hands interact with objects, we adopt a teacher-student framework and distill the interaction knowledge between hands and objects for semi-supervised egocentric action recognition. We refer to the proposed method as Interaction Knowledge Distillation (IKD). Specifically, the teacher network takes hands and action-related objects in the labeled videos as input and uses graph neural networks to capture their spatial-temporal relations as graph edge features. The student network then takes the detected hands and objects from both labeled and unlabeled videos as input and mimics the teacher network, learning from the interactions to improve model performance. Experiments on two popular egocentric action recognition datasets, Something-Something-V2 and EPIC-KITCHENS-100, show that the proposed approach consistently outperforms recent state-of-the-art methods in typical semi-supervised settings.
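The abstract does not state the training objective, so the following is only a minimal sketch of a generic teacher-student distillation loss in a semi-supervised setting, not the IKD objective itself. It assumes the student is trained with cross-entropy on labeled samples plus a temperature-softened KL term that pulls its predictions toward the teacher's on all samples. The function names, the `temperature` and `alpha` parameters, and the use of class logits (rather than the graph edge features mentioned above) are all hypothetical choices made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Supervised loss for one labeled sample."""
    return -math.log(softmax(logits)[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with full support in q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def semi_supervised_distillation_loss(student_logits, teacher_logits, labels,
                                      temperature=2.0, alpha=0.5):
    """Hypothetical combined loss (an assumption, not the paper's formula).

    labels[i] is a class index for a labeled sample or None for an
    unlabeled one. Labeled samples contribute cross-entropy; every sample
    contributes a KL term between softened teacher and student outputs.
    """
    supervised = [cross_entropy(s, y)
                  for s, y in zip(student_logits, labels) if y is not None]
    distill = [kl_divergence(softmax(t, temperature), softmax(s, temperature))
               for s, t in zip(student_logits, teacher_logits)]
    sup_mean = sum(supervised) / len(supervised) if supervised else 0.0
    kd_mean = sum(distill) / len(distill) if distill else 0.0
    return (1.0 - alpha) * sup_mean + alpha * kd_mean
```

In this sketch an unlabeled sample influences the student only through the distillation term, which is one common way a student can learn from data the teacher's supervision did not cover.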
Pages: 10
Related papers
50 results in total
  • [31] Regularized extreme learning machine for multi-view semi-supervised action recognition
    Iosifidis, Alexandros
    Tefas, Anastasios
    Pitas, Ioannis
    NEUROCOMPUTING, 2014, 145 : 250 - 262
  • [32] RL-SSI Model: Adapting a Supervised Learning Approach to a Semi-Supervised Approach for Human Action Recognition
    dos Santos, Lucas Lisboa
    Winkler, Ingrid
    Sperandio Nascimento, Erick Giovani
    ELECTRONICS, 2022, 11 (09)
  • [33] SMIN: Semi-Supervised Multi-Modal Interaction Network for Conversational Emotion Recognition
    Lian, Zheng
    Liu, Bin
    Tao, Jianhua
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2023, 14 (03) : 2415 - 2429
  • [34] Expert-Guided Knowledge Distillation for Semi-Supervised Vessel Segmentation
    Shen, Ning
    Xu, Tingfa
    Huang, Shiqi
    Mu, Feng
    Li, Jianan
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2023, 27 (11) : 5542 - 5553
  • [35] Semi-supervised Campus Network Intrusion Detection Based on Knowledge Distillation
    Chen, Junjun
    Guo, Qiang
    Fu, Zhongnan
    Shang, Qun
    Ma, Hao
    Wang, Nai
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,
  • [36] Face and Gait Recognition Based on Semi-supervised Learning
    Yu, Qiuhong
    Yin, Yilong
    Yang, Gongping
    Ning, Yanbing
    Li, Yanan
    PATTERN RECOGNITION, 2012, 321 : 284 - 291
  • [37] Boosting semi-supervised face recognition with raw faces
    Chen, Yunze
    Huang, Junjie
    Zhu, Zheng
    Long, Xianlei
    Gu, Qingyi
    IMAGE AND VISION COMPUTING, 2022, 125
  • [38] Semi-supervised Ladder Networks for Speech Emotion Recognition
    Jian-Hua Tao
    Jian Huang
    Ya Li
    Zheng Lian
    Ming-Yue Niu
    International Journal of Automation and Computing, 2019, 16 : 437 - 448
  • [39] Semi-supervised Phoneme Recognition with Recurrent Ladder Networks
    Tietz, Marian
    Alpay, Tayfun
    Twiefel, Johannes
    Wermter, Stefan
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2017, PT I, 2017, 10613 : 3 - 10
  • [40] Boosting Semi-Supervised Face Recognition With Noise Robustness
    Liu, Yuchi
    Shi, Hailin
    Du, Hang
    Zhu, Rui
    Wang, Jun
    Zheng, Liang
    Mei, Tao
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (02) : 778 - 787