Distillation of human-object interaction contexts for action recognition

Cited by: 1
Authors
Almushyti, Muna [1, 2]
Li, Frederick W. B. [1]
Affiliations
[1] Univ Durham, Dept Comp Sci, South Rd, Durham DH1 3LE, England
[2] Qassim Univ, Deanship Educ Serv, Buraydah, Saudi Arabia
Funding
UK Engineering and Physical Sciences Research Council (EPSRC)
Keywords
global context; graph attention network; local context; human-object interaction
DOI
10.1002/cav.2107
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Modeling spatial-temporal relations is imperative for recognizing human actions, especially when a human interacts with objects and the set of objects around the human changes over time. Most existing action recognition models focus on learning the overall visual cues of a scene but disregard a holistic view of human-object relationships and interactions, that is, how a human interacts with objects to complete a short-term task and to pursue a long-term goal. We therefore argue that human action recognition can be improved by exploiting both the local and global contexts of human-object interactions (HOIs). In this paper, we propose the Global-Local Interaction Distillation Network (GLIDN), which learns human and object interactions through space and time via knowledge distillation for holistic HOI understanding. GLIDN encodes humans and objects as graph nodes and learns local and global relations via a graph attention network. The local context graphs learn the relation between humans and objects at the frame level by capturing their co-occurrence at a specific time step. The global relation graph is constructed from video-level human and object interactions and identifies their long-term relations throughout a video sequence. We also investigate how knowledge from these graphs can be distilled to their counterparts to improve HOI recognition. Finally, we evaluate our model through comprehensive experiments on two datasets, Charades and CAD-120. Our method outperforms the baselines and counterpart approaches.
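
The abstract does not state which distillation objective GLIDN uses. As a rough illustration only, the sketch below shows one common choice, a temperature-softened KL divergence between a teacher's and a student's action logits. The function names, the temperature value, the example logits, and the choice of the global graph as teacher are all assumptions made for illustration, not details taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions, scaled by T^2.

    This is a generic distillation loss, assumed here for illustration;
    it is not confirmed to be the loss used in GLIDN.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return kl * temperature ** 2


# Hypothetical action logits: the global (video-level) graph acts as teacher,
# the local (frame-level) graph as student. The roles could equally be swapped.
global_logits = [2.0, 0.5, -1.0]
local_logits = [1.2, 0.9, -0.3]
loss = distillation_loss(global_logits, local_logits)
```

In practice such a term would typically be added to the student's ordinary classification loss, so that the student graph is trained on both the ground-truth labels and the teacher graph's softened predictions.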
Pages: 16