Distilling interaction knowledge for semi-supervised egocentric action recognition

Cited by: 0
Authors
Wang, Haoran [1]
Yang, Jiahao [1]
Yu, Baosheng [2]
Zhan, Yibing [3]
Tao, Dapeng [4]
Ling, Haibin [5]
Affiliations
[1] Northeastern Univ, Coll Informat Sci & Engn, Shenyang 110819, Peoples R China
[2] Univ Sydney, Sch Comp Sci, Darlington, NSW 2008, Australia
[3] JD Explore Acad, Beijing 100176, Peoples R China
[4] Yunnan Univ, Sch Informat Sci & Engn, Kunming 650091, Yunnan, Peoples R China
[5] SUNY Stony Brook, Dept Comp Sci, Stony Brook, NY USA
Keywords
Knowledge distillation; Semi-supervised learning; Egocentric action recognition
DOI
10.1016/j.patcog.2024.110927
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Egocentric action recognition, the identification of actions in video captured from a first-person perspective, is receiving increasing attention due to the widespread adoption of wearable cameras. However, annotating actions in videos with cluttered backgrounds and many objects is labor-intensive. In this paper, we consider learning for egocentric action recognition in a semi-supervised manner. Because videos captured from a first-person viewpoint usually contain rich information about how human hands interact with objects, we propose to employ a popular teacher-student framework and distill the interaction knowledge between hands and objects for semi-supervised egocentric action recognition. We refer to the proposed method as Interaction Knowledge Distillation, or IKD. Specifically, the teacher network takes hands and action-related objects in the labeled videos as input and uses graph neural networks to capture their spatial-temporal relations as graph edge features. The student network then takes the detected hands and objects from both labeled and unlabeled videos as input and mimics the teacher network, learning from the interactions to improve model performance. Experiments on two popular egocentric action recognition datasets, Something-Something-V2 and EPIC-KITCHENS-100, show that the proposed approach consistently outperforms recent state-of-the-art methods in typical semi-supervised settings.
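As a rough illustration of the teacher-student idea described in the abstract, the sketch below combines a supervised cross-entropy term on labeled samples with a distillation term that pulls the student's predictions toward the teacher's on all samples. This is a generic knowledge-distillation objective written for illustration only, not the IKD loss from the paper: the function names, the temperature, the weighting factor, and the use of KL divergence over class logits (rather than graph edge features) are all assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Supervised loss for one labeled sample (label is a class index)."""
    return -math.log(softmax(logits)[label])


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def semi_supervised_distillation_loss(student_logits, teacher_logits, labels,
                                      temperature=2.0, weight=1.0):
    """Hypothetical combined objective (an assumption, not the paper's loss).

    labels[i] is a class index for a labeled sample, or None if unlabeled.
    The supervised term is averaged over labeled samples only; the
    distillation term is averaged over every sample in the batch.
    """
    supervised, num_labeled, distill = 0.0, 0, 0.0
    for s, t, y in zip(student_logits, teacher_logits, labels):
        if y is not None:
            supervised += cross_entropy(s, y)
            num_labeled += 1
        # Soften both distributions before comparing them.
        distill += kl_divergence(softmax(t, temperature), softmax(s, temperature))
    supervised = supervised / num_labeled if num_labeled else 0.0
    distill /= len(student_logits)
    return supervised + weight * distill


if __name__ == "__main__":
    student = [[2.0, 0.5, 0.1], [0.2, 1.5, 0.3]]
    teacher = [[2.5, 0.3, 0.1], [0.1, 2.0, 0.2]]
    labels = [0, None]  # second sample is unlabeled
    print(semi_supervised_distillation_loss(student, teacher, labels))
```

In this sketch the unlabeled sample contributes only through the distillation term, which is the usual way a teacher-student setup makes use of unlabeled data.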
Pages: 10
Related papers
50 in total
  • [1] Semi-Supervised Multiple Feature Analysis for Action Recognition
    Wang, Sen
    Ma, Zhigang
    Yang, Yi
    Li, Xue
    Pang, Chaoyi
    Hauptmann, Alexander G.
    IEEE TRANSACTIONS ON MULTIMEDIA, 2014, 16 (02) : 289 - 298
  • [2] Evaluation of semi-supervised learning method on action recognition
    Shen, Haoquan
    Yan, Yan
    Xu, Shicheng
    Ballas, Nicolas
    Chen, Wenzhi
    MULTIMEDIA TOOLS AND APPLICATIONS, 2015, 74 (02) : 523 - 542
  • [3] Evaluation of semi-supervised learning method on action recognition
    Haoquan Shen
    Yan Yan
    Shicheng Xu
    Nicolas Ballas
    Wenzhi Chen
    Multimedia Tools and Applications, 2015, 74 : 523 - 542
  • [4] GRA: Graph Representation Alignment for Semi-Supervised Action Recognition
    Huang, Kuan-Hung
    Huang, Yao-Bang
    Lin, Yong-Xiang
    Hua, Kai-Lung
    Tanveer, M.
    Lu, Xuequan
    Razzak, Imran
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (09) : 11896 - 11905
  • [5] Semi-supervised Entity Recognition for Power Dispatching Knowledge Modeling
    Wang K.
    Zhao G.
    Gong X.
    Liu J.
    Wang M.
    Yu D.
    Li S.
    Dianwang Jishu/Power System Technology, 2023, 47 (09) : 3855 - 3863
  • [6] Semi-supervised action recognition with dynamic temporal information fusion
    Qian, Huifang
    Zhang, Jialun
    Shi, Zhenyu
    Zhang, Yimin
    NEUROCOMPUTING, 2025, 611
  • [7] Momentum Contrastive Teacher for Semi-Supervised Skeleton Action Recognition
    Lu, Mingqi
    Lu, Xiaobo
    Liu, Jun
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2025, 34 : 295 - 305
  • [8] Knowledge Distillation for Semi-supervised Domain Adaptation
    Orbes-Arteaga, Mauricio
    Cardoso, Jorge
    Sorensen, Lauge
    Igel, Christian
    Ourselin, Sebastien
    Modat, Marc
    Nielsen, Mads
    Pai, Akshay
    OR 2.0 CONTEXT-AWARE OPERATING THEATERS AND MACHINE LEARNING IN CLINICAL NEUROIMAGING, 2019, 11796 : 68 - 76
  • [9] MF-Match: A Semi-Supervised Model for Human Action Recognition
    Yun, Tianhe
    Wang, Zhangang
    SENSORS, 2024, 24 (15)
  • [10] Actor-Aware Contrastive Learning for Semi-Supervised Action Recognition
    Assefa, Maregu
    Jiang, Wei
    Gedamu, Kumie
    Yilma, Getinet
    Ayalew, Melese
    Seid, Mohammed
    2022 IEEE 34TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, ICTAI, 2022 : 660 - 665