Video Complicated-Information Extraction and Filtering Network for Weakly-Supervised Temporal Action Localization

Citations: 0
Authors
Li, Jiaxuan [1, 2]
Ma, Tiancheng [1, 2]
Yang, Xiaohui [1, 2]
Yang, Lijun [1, 2]
Zheng, Chen [1, 2]
Affiliations
[1] Henan Univ, Sch Math & Stat, Kaifeng 475004, Peoples R China
[2] Henan Univ, Henan Engn Res Ctr Artificial Intelligence Theory, Kaifeng 475004, Peoples R China
Keywords
Videos; Feature extraction; Filtering; Location awareness; Training; Kernel; Convolution; Accuracy; Annotations; Data mining; Weakly supervised learning; Temporal action localization; Multi-scale features; Action recognition
DOI
10.1109/LSP.2025.3575626
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic and Communication Technology]
Discipline classification codes
0808; 0809
Abstract
Weakly-supervised temporal action localization aims to identify action instances and localize them in untrimmed videos using only video-level labels. Because video data are temporally continuous, most methods that use a single-scale convolution kernel cannot model the characteristics of video data effectively, which reduces accuracy. However, simply using multi-scale features can introduce redundant information and noise, reducing model efficiency and impairing the model's judgement during training. To alleviate this problem, a video complicated-information extraction and filtering network (VCEF-Net) is proposed. It contains two main modules. The first, a multi-scale feature extraction module, enriches the information the model receives. The second, a pseudo-label filtering module, suppresses interference from redundant information. Together, these two modules make better use of video information. Experiments on THUMOS14 and ActivityNet1.2 demonstrate the improved performance of the proposed VCEF-Net and validate its effectiveness.
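The abstract gives no implementation details, so the following is only an illustrative sketch of the two ideas it names, not the authors' VCEF-Net. It assumes per-snippet action scores as input, uses fixed box filters as a stand-in for learned convolution kernels of several temporal sizes, and filters pseudo-labels with two confidence thresholds. All function names, kernel sizes, and threshold values are hypothetical.

```python
from typing import List, Sequence


def smooth_same(scores: Sequence[float], kernel_size: int) -> List[float]:
    """Box-filter `scores` along time with zero padding (output length = input length)."""
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd")
    half = kernel_size // 2
    padded = [0.0] * half + list(scores) + [0.0] * half
    return [sum(padded[t:t + kernel_size]) / kernel_size for t in range(len(scores))]


def multi_scale_scores(scores: Sequence[float],
                       kernel_sizes: Sequence[int] = (1, 3, 5)) -> List[float]:
    """Fuse several temporal scales by averaging their responses (assumed fusion rule)."""
    per_scale = [smooth_same(scores, k) for k in kernel_sizes]
    return [sum(vals) / len(per_scale) for vals in zip(*per_scale)]


def filter_pseudo_labels(scores: Sequence[float],
                         fg_thresh: float = 0.7,
                         bg_thresh: float = 0.3) -> List[int]:
    """Keep only confident snippets: 1 = action, 0 = background, -1 = ignored."""
    labels = []
    for s in scores:
        if s >= fg_thresh:
            labels.append(1)
        elif s <= bg_thresh:
            labels.append(0)
        else:
            labels.append(-1)  # ambiguous snippet, excluded from supervision
    return labels


# Toy example: seven snippets with an action in the middle.
snippet_scores = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0]
fused = multi_scale_scores(snippet_scores)
labels = filter_pseudo_labels(fused)
```

Snippets whose fused score falls between the two thresholds are marked -1 so that a training loss could skip them, which is one simple way to keep noisy multi-scale responses from acting as supervision.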
Pages: 2334-2338
Page count: 5