Multi-granularity hierarchical contrastive learning between foreground and background for semi-supervised video action detection

Cited by: 0
Authors
Zhang, Qiming [1]
Hu, Zhengping [1, 2, 3]
Wang, Yulu [1]
Zhang, Hehao [1]
Di, Jirui [1]
Affiliations
[1] Yanshan Univ, Sch Informat & Engn, Qinhuangdao 066004, Hebei, Peoples R China
[2] Yanshan Univ, Qinhuangdao 066004, Hebei, Peoples R China
[3] Hebei Key Lab Informat Transmiss & Signal Proc, Qinhuangdao 066004, Hebei, Peoples R China
Keywords
Semi-supervised learning; Video action detection; Multi-granularity; Contrastive learning; NETWORK;
DOI
10.1016/j.knosys.2024.112853
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Semi-supervised video action detection has received increasing attention because it lowers data annotation cost while achieving performance comparable to fully supervised methods. However, because videos contain dynamic background regions, existing methods can be biased when interpreting the foreground and background. This bias causes the model to mistake dynamic background areas for action foreground, or to overlook background information and thereby misjudge the foreground. To address this issue, this paper proposes Multi-FB, a multi-granularity hierarchical contrastive learning method between foreground and background for semi-supervised video action detection. First, this paper proposes a multi-granularity encoding network based on foreground and background. This network uses a unified encoder to represent and learn foreground and background regions in videos at different granularities, thereby improving the model's understanding of the action foreground and its related background. Second, this paper proposes an intra-model multi-granularity hierarchical contrastive strategy, which minimizes the foreground-to-foreground and background-to-background representation discrepancies across granularities within the same video, while maximizing the representation differences between foreground and background at various granularities within the video. Third, this paper proposes a cross-model multi-granularity hierarchical contrastive strategy, which enhances the consistency of the model's foreground and background representations between the original data and the augmented data. Extensive experiments on JHMDB-21 and UCF101-24 show that the proposed method clearly separates the feature representations of different categories and achieves performance comparable to state-of-the-art methods under semi-supervised conditions.
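The contrastive objective described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a generic InfoNCE-style loss over cosine similarities, and the function names (`cosine`, `fb_contrastive_loss`), the temperature value, and the toy feature vectors are all hypothetical. Each list entry stands for one granularity's embedding of the foreground or the background of a single video.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def fb_contrastive_loss(fg_feats, bg_feats, temperature=0.1):
    """Illustrative InfoNCE-style loss (an assumption, not the paper's formula).

    fg_feats / bg_feats: lists of embeddings, one per granularity.
    Positives: same region type (foreground or background) at another granularity.
    Negatives: the other region type at any granularity.
    """
    # Label 0 marks foreground embeddings, label 1 marks background embeddings.
    feats = [(f, 0) for f in fg_feats] + [(b, 1) for b in bg_feats]
    total, count = 0.0, 0
    for i, (anchor, label) in enumerate(feats):
        pos, neg = 0.0, 0.0
        for j, (other, other_label) in enumerate(feats):
            if i == j:
                continue
            weight = math.exp(cosine(anchor, other) / temperature)
            if other_label == label:
                pos += weight
            else:
                neg += weight
        if pos > 0.0:  # skip anchors that have no positive partner
            total += -math.log(pos / (pos + neg))
            count += 1
    return total / count if count else 0.0


# Toy 2-D embeddings at two granularities (hypothetical values).
separated = fb_contrastive_loss([[1.0, 0.0], [0.9, 0.1]], [[0.0, 1.0], [0.1, 0.9]])
mixed = fb_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]])
print(separated < mixed)
```

In this sketch the loss is small when foreground embeddings agree across granularities and differ from background embeddings, and large when the two region types are entangled. A cross-model variant could apply the same function to embeddings of the original and augmented clips, though how the paper combines the two terms is not stated in the abstract.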
Pages: 16