Focal Channel Knowledge Distillation for Multi-Modality Action Recognition

Cited by: 1
Authors
Gan, Lipeng [1]
Cao, Runze [1]
Li, Ning [1]
Yang, Man [1]
Li, Xiaochao [1, 2, 3]
Affiliations
[1] Xiamen Univ, Dept Microelect & Integrated Circuit, Xiamen 361005, Peoples R China
[2] Xiamen Univ Malaysia, Dept Elect & Elect Engn, Sepang 43900, Selangor, Malaysia
[3] Univ Sydney, Sch Elect & Informat Engn, Sydney, NSW 2006, Australia
Source
IEEE ACCESS | 2023 / Vol. 11
Keywords
Action recognition; knowledge distillation; multi-modality;
DOI
10.1109/ACCESS.2023.3298647
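Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract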
Multi-modality action recognition aims to learn complementary information from multiple modalities to improve action recognition performance. However, the channels of different modalities differ significantly, so transferring channel semantic features from all modalities to RGB with equal weight results in competition and redundancy during knowledge distillation. To address this issue, we propose a focal channel knowledge distillation strategy that transfers the key semantic correlations and distributions of multi-modality teachers into the RGB student network. The focal channel correlations provide the intrinsic relationships and diversity properties of key semantics, and the focal channel distributions provide the salient channel activations of features. By ignoring less discriminative and irrelevant channels, the student can use its channel capacity more efficiently to learn complementary semantic features from the other modalities. Our focal channel knowledge distillation achieves 91.2%, 95.6%, 98.3%, and 81.0% accuracy on the NTU 60 (CS), UTD-MHAD, N-UCLA, and HMDB51 datasets, improvements of 4.5%, 4.2%, 3.7%, and 7.1%, respectively, over unimodal RGB models. The framework can also be integrated with unimodal models to achieve state-of-the-art performance. Extensive experiments show that the proposed method achieves 92.5%, 96.0%, 98.9%, and 82.3% accuracy on the NTU 60 (CS), UTD-MHAD, N-UCLA, and HMDB51 datasets, respectively.
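The idea described in the abstract (keep only the salient teacher channels, then match both their activation distributions and their pairwise correlations in the student) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the saliency criterion (mean absolute activation), the temperature `tau`, the KL and mean-squared-error loss forms, and all function names are assumptions chosen for illustration. Features are assumed to be flattened to shape `(channels, positions)`.

```python
import numpy as np

EPS = 1e-12  # numerical floor for logs and norms


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def select_focal_channels(teacher, k):
    """Pick the k most salient teacher channels.

    Assumption: saliency is the mean absolute activation of a channel.
    The paper may use a different criterion.
    """
    saliency = np.abs(teacher).mean(axis=1)
    return np.sort(np.argsort(saliency)[-k:])


def channel_distribution_loss(teacher, student, idx, tau=1.0):
    """KL(teacher || student) between per-channel activation distributions.

    Each selected channel is turned into a distribution over positions
    with a temperature-scaled softmax (an assumed formulation).
    """
    p = softmax(teacher[idx] / tau, axis=1)
    q = softmax(student[idx] / tau, axis=1)
    kl = (p * (np.log(p + EPS) - np.log(q + EPS))).sum(axis=1)
    return float(kl.mean())


def channel_correlation(feat):
    """Cosine-similarity matrix between channels (rows) of feat."""
    normed = feat / (np.linalg.norm(feat, axis=1, keepdims=True) + EPS)
    return normed @ normed.T


def channel_correlation_loss(teacher, student, idx):
    """Mean squared difference between focal channel correlation matrices."""
    diff = channel_correlation(teacher[idx]) - channel_correlation(student[idx])
    return float(np.mean(diff ** 2))
```

In a training loop, the two terms would typically be added to the student's classification loss with weighting coefficients; those weights, like the number of focal channels `k`, are hyperparameters this sketch leaves open.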
Pages: 78285 - 78298
Number of pages: 14
Related papers
50 in total
  • [31] On Multi-Modality in English Listening Teaching
    Zhang, Rui
    [J]. 2013 3RD INTERNATIONAL CONFERENCE ON SOCIAL SCIENCES AND SOCIETY (ICSSS 2013), PT 9, 2013, 40 : 222 - 225
  • [32] Modality mixer exploiting complementary information for multi-modal action recognition
    Lee, Sumin
    Woo, Sangmin
    Nugroho, Muhammad Adi
    Kim, Changick
    [J]. COMPUTER VISION AND IMAGE UNDERSTANDING, 2025, 256
  • [33] 3D shape recognition and retrieval based on multi-modality deep learning
    Bu, Shuhui
    Wang, Lei
    Han, Pengcheng
    Liu, Zhenbao
    Li, Ke
    [J]. NEUROCOMPUTING, 2017, 259 : 183 - 193
  • [34] Impact of Domain Knowledge and Multi-Modality on Intelligent Molecular Property Prediction: A Systematic Survey
    Kuang, Taojie
    Liu, Pengfei
    Ren, Zhixiang
    [J]. BIG DATA MINING AND ANALYTICS, 2024, 7 (03) : 858 - 888
  • [35] Semantics-Aware Adaptive Knowledge Distillation for Sensor-to-Vision Action Recognition
    Liu, Yang
    Wang, Keze
    Li, Guanbin
    Lin, Liang
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 5573 - 5588
  • [36] Multi-modality artificial intelligence in digital pathology
    Qiao, Yixuan
    Zhao, Lianhe
    Luo, Chunlong
    Luo, Yufan
    Wu, Yang
    Li, Shengtong
    Bu, Dechao
    Zhao, Yi
    [J]. BRIEFINGS IN BIOINFORMATICS, 2022, 23 (06)
  • [37] Multi-modality in the Likelihood Function of GARCH Model
    Mahmood, Farrukh
    Khan, Saud Ahmed
    [J]. REVIEW OF PACIFIC BASIN FINANCIAL MARKETS AND POLICIES, 2020, 23 (03)
  • [38] On the multi-modality, materiality and contingency of organizational discourse
    Iedema, Rick
    [J]. ORGANIZATION STUDIES, 2007, 28 (06) : 931 - 946
  • [39] Hardware and Software Approaches to Multi-Modality Imaging
    Klausen, Thomas Levin
    Andersen, Flemming
    Kemp, Brad
    [J]. CURRENT MEDICAL IMAGING REVIEWS, 2011, 7 (03) : 169 - 174
  • [40] Progressive Cross-modal Knowledge Distillation for Human Action Recognition
    Ni, Jianyuan
    Ngu, Anne H. H.
    Yan, Yan
    [J]. PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022 : 5903 - 5912