UPL-Net: Uncertainty-aware prompt learning network for semi-supervised action recognition

Cited by: 0
Authors
Yang, Shu [1]
Li, Ya-Li [1]
Wang, Shengjin [1]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Beijing 100084, Peoples R China
Keywords
Semi-supervised learning; Prompt learning; Vision-language pre-training; Action recognition; Uncertainty estimation
DOI
10.1016/j.neucom.2024.129126
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper focuses on understanding human behavior in videos by reframing the traditional video classification task as a transfer learning problem centered on visual concepts. Unlike existing action recognition approaches that rely solely on single-modal representations and video classifiers, our method leverages an uncertainty-aware prompt learning network (UPL-Net). This network is designed to extract spatiotemporal features that are pertinent to action-related concepts in videos while ensuring that the visual concepts derived from images are preserved. Furthermore, we introduce an uncertainty-guided semi-supervised learning strategy that harnesses unlabeled videos to enhance the model's generalizability. Extensive experiments conducted on benchmark datasets, namely UCF and HMDB, demonstrate the superiority of our approach over state-of-the-art semi-supervised action recognition methods. Notably, under a 1% labeling rate on the UCF dataset, our method achieves a significant improvement of 12.8%, underscoring its effectiveness in leveraging limited labeled data and abundant unlabeled videos for improved performance.
Pages: 11