Few-Shot Action Recognition with Hierarchical Matching and Contrastive Learning

Cited by: 26
Authors
Zheng, Sipeng [1]
Chen, Shizhe [2]
Jin, Qin [1]
Affiliations
[1] Renmin Univ China, Beijing, Peoples R China
[2] INRIA, Paris, France
Source
COMPUTER VISION - ECCV 2022, PT IV | 2022 / Vol. 13664
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
Few-shot learning; Action recognition; Contrastive learning;
DOI
10.1007/978-3-031-19772-7_18
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Few-shot action recognition aims to recognize actions in test videos based on limited annotated data of target action classes. The dominant approaches project videos into a metric space and classify them by nearest-neighbor search. They mainly measure video similarities using global or temporal alignment alone, while optimal matching should be multi-level. However, the complexity of learning coarse-to-fine matching quickly rises as we focus on finer-grained visual cues, and the lack of detailed local supervision is another challenge. In this work, we propose a hierarchical matching model to support comprehensive similarity measurement at global, temporal and spatial levels via a zoom-in matching module. We further propose a mixed-supervised hierarchical contrastive learning (HCL), which not only employs supervised contrastive learning to differentiate videos at different levels, but also utilizes cycle consistency as weak supervision to align discriminative temporal clips or spatial patches. Our model achieves state-of-the-art performance on four benchmarks, especially under the most challenging 1-shot recognition setting.
Pages: 297 - 313
Page count: 17
Related papers
50 in total
  • [21] Learning Spatial-Preserved Skeleton Representations for Few-Shot Action Recognition
    Ma, Ning
    Zhang, Hongyi
    Li, Xuhui
    Zhou, Sheng
    Zhang, Zhen
    Wen, Jun
    Li, Haifeng
    Gu, Jingjun
    Bu, Jiajun
    COMPUTER VISION - ECCV 2022, PT IV, 2022, 13664 : 174 - 191
  • [22] Multimodal Cross-Domain Few-Shot Learning for Egocentric Action Recognition
    Hatano, Masashi
    Hachiuma, Ryo
    Fujii, Ryo
    Saito, Hideo
    COMPUTER VISION - ECCV 2024, PT XXXIII, 2025, 15091 : 182 - 199
  • [23] Task Adaptive Modeling for Few-shot Action Recognition
    Wang, Jiayi
    Jin, Yi
    Feng, Songhe
    Li, Yidong
    2022 IEEE 24TH INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP), 2022,
  • [24] Few-Shot Action Recognition via Multi-View Representation Learning
    Wang, Xiao
    Lu, Yang
    Yu, Wanchuan
    Pang, Yanwei
    Wang, Hanzi
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (09) : 8522 - 8535
  • [25] On the Importance of Spatial Relations for Few-shot Action Recognition
    Zhang, Yilun
    Fu, Yuqian
    Ma, Xingjun
    Qi, Lizhe
    Chen, Jingjing
    Wu, Zuxuan
    Jiang, Yu-Gang
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 2243 - 2251
  • [26] Few-shot learning for ear recognition
    Zhang, Jie
    Yu, Wen
    Yang, Xudong
    Deng, Fang
    PROCEEDINGS OF 2019 INTERNATIONAL CONFERENCE ON IMAGE, VIDEO AND SIGNAL PROCESSING (IVSP 2019), 2019, : 50 - 54
  • [27] Advances in Few-Shot Action Recognition: A Comprehensive Review
    Ruan, Zanxi
    Wei, Yingmei
    Yuan, Yifei
    Li, Yu
    Guo, Yanming
    Xie, Yuxiang
    2024 7TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND BIG DATA, ICAIBD 2024, 2024, : 390 - 398
  • [28] Few-shot Object Detection with Refined Contrastive Learning
    Shangguan, Zeyu
    Huai, Lian
    Liu, Tong
    Jiang, Xingqun
    2023 IEEE 35TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, ICTAI, 2023, : 991 - 996
  • [29] Graph Few-shot Learning with Attribute Matching
    Wang, Ning
    Luo, Minnan
    Ding, Kaize
    Zhang, Lingling
    Li, Jundong
    Zheng, Qinghua
    CIKM '20: PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, 2020, : 1545 - 1554
  • [30] Attentive matching network for few-shot learning
    Mai, Sijie
    Hu, Haifeng
    Xu, Jia
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2019, 187