Multi-granularity hierarchical contrastive learning between foreground and background for semi-supervised video action detection

Authors
Zhang, Qiming [1]
Hu, Zhengping [1, 2, 3]
Wang, Yulu [1]
Zhang, Hehao [1]
Di, Jirui [1]
Affiliations
[1] Yanshan Univ, Sch Informat & Engn, Qinhuangdao 066004, Hebei, Peoples R China
[2] Yanshan Univ, Qinhuangdao 066004, Hebei, Peoples R China
[3] Hebei Key Lab Informat Transmiss & Signal Proc, Qinhuangdao 066004, Hebei, Peoples R China
Keywords
Semi-supervised learning; Video action detection; Multi-granularity; Contrastive learning; NETWORK;
DOI
10.1016/j.knosys.2024.112853
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Semi-supervised video action detection has attracted growing attention because it lowers annotation cost while achieving performance comparable to fully supervised methods. However, videos often contain dynamic background regions, and existing methods can be biased when separating foreground from background. This bias leads the model either to mistake dynamic background regions for action foreground or to overlook background information, and in both cases the foreground is misjudged. To address this issue, this paper proposes Multi-FB, a multi-granularity hierarchical contrastive learning method between foreground and background for semi-supervised video action detection. First, this paper proposes a multi-granularity encoding network based on foreground and background. This network uses a unified encoder to represent and learn foreground and background regions in videos at different granularities, thereby improving the model's understanding of the action foreground and its related background. Second, this paper proposes an Intra-model multi-granularity hierarchical contrastive strategy, which minimizes the representation discrepancies of foreground-to-foreground and background-to-background pairs at different granularities within the same video, while maximizing the representation differences between foreground and background at those granularities. Third, this paper proposes a Cross-model multi-granularity hierarchical contrastive strategy, which enhances the consistency of the model's foreground and background representations between the original data and the augmented data. Extensive experiments on JHMDB-21 and UCF101-24 show that the proposed method clearly separates the feature representations of different categories and achieves performance comparable to state-of-the-art methods under semi-supervised conditions.
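The two contrastive objectives described in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss formulas, so everything below is an assumption made for illustration: the InfoNCE-style form, the cosine similarity, the temperature value, and the hypothetical function names `fg_bg_contrastive_loss` and `cross_view_consistency`. It is not the authors' implementation. Each input is a list of feature vectors, one per granularity.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def _one_sided_loss(anchors, negatives, temperature):
    """InfoNCE-style term (assumed form): each anchor is pulled toward the
    other vectors in `anchors` and pushed away from every vector in `negatives`."""
    total, count = 0.0, 0
    for i, anchor in enumerate(anchors):
        pos = [math.exp(cosine(anchor, p) / temperature)
               for j, p in enumerate(anchors) if j != i]
        if not pos:
            continue  # a single granularity has no positive pair
        neg = [math.exp(cosine(anchor, n) / temperature) for n in negatives]
        total += -math.log(sum(pos) / (sum(pos) + sum(neg)))
        count += 1
    return total / max(count, 1)


def fg_bg_contrastive_loss(fg, bg, temperature=0.1):
    """Within one video: foreground-to-foreground and background-to-background
    pairs are positives, foreground-to-background pairs are negatives."""
    return 0.5 * (_one_sided_loss(fg, bg, temperature)
                  + _one_sided_loss(bg, fg, temperature))


def cross_view_consistency(feats_original, feats_augmented):
    """Mean cosine distance between matching granularities of the original
    and augmented views (an assumed stand-in for the cross-model strategy)."""
    pairs = list(zip(feats_original, feats_augmented))
    return sum(1.0 - cosine(a, b) for a, b in pairs) / len(pairs)


# Toy 2-D features at two granularities.
fg = [[1.0, 0.0], [0.9, 0.1]]
bg = [[0.0, 1.0], [0.1, 0.9]]
print(fg_bg_contrastive_loss(fg, bg))
print(cross_view_consistency(fg, fg))
```

In this sketch the intra-video loss shrinks as foreground and background features at all granularities move apart, and the consistency term shrinks as the two views agree. A real system would compute both on encoder outputs and combine them with the supervised detection loss.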
Pages: 16