Local Correspondence Network for Weakly Supervised Temporal Sentence Grounding

Cited by: 54
Authors
Yang, Wenfei [1 ]
Zhang, Tianzhu [1 ]
Zhang, Yongdong [1 ]
Wu, Feng [1 ]
Affiliations
[1] University of Science and Technology of China, School of Information Science, Hefei 230027, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Grounding; Annotations; Two dimensional displays; Training; Feature extraction; Computational modeling; Task analysis; Weakly supervised; Temporal sentence grounding
DOI
10.1109/TIP.2021.3058614
Chinese Library Classification
TP18 (Theory of artificial intelligence)
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Weakly supervised temporal sentence grounding offers better scalability and practicality than fully supervised methods in real-world application scenarios. However, most existing methods cannot model the fine-grained video-text local correspondences well and lack effective supervision signals for correspondence learning, yielding unsatisfactory performance. To address these issues, we propose an end-to-end Local Correspondence Network (LCNet) for weakly supervised temporal sentence grounding. The proposed LCNet enjoys several merits. First, we represent video and text features in a hierarchical manner to model fine-grained video-text correspondences. Second, we design a self-supervised cycle-consistent loss as learning guidance for video-text matching. To the best of our knowledge, this is the first work to fully explore the fine-grained correspondences between video and text for temporal sentence grounding using self-supervised learning. Extensive experimental results on two benchmark datasets demonstrate that the proposed LCNet significantly outperforms existing weakly supervised methods.
Pages: 3252-3262
Page count: 11
Related Papers
50 records in total (first 10 shown)
  • [1] Contrastive Perturbation Network for Weakly Supervised Temporal Sentence Grounding
    Han, Tingting
    Lv, Yuanxin
    Yu, Zhou
    Yu, Jun
    Fan, Jianping
    Yuan, Liu
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT I, 2024, 14425 : 446 - 460
  • [2] Dual Semantic Reconstruction Network for Weakly Supervised Temporal Sentence Grounding
    Tang, Kefan
    He, Lihuo
    Wang, Nannan
    Gao, Xinbo
    IEEE TRANSACTIONS ON MULTIMEDIA, 2025, 27 : 95 - 107
  • [3] Counterfactual contrastive learning for weakly supervised temporal sentence grounding
    Xu, Yenan
    Xu, Wanru
    Miao, Zhenjiang
    NEUROCOMPUTING, 2025, 624
  • [4] Adaptive proposal network based on generative adversarial learning for weakly supervised temporal sentence grounding
    Wang, Weikang
    Su, Yuting
    Liu, Jing
    Jing, Peiguang
    PATTERN RECOGNITION LETTERS, 2024, 179 : 9 - 16
  • [5] Weakly Supervised Temporal Adjacent Network for Language Grounding
    Wang, Yuechen
    Deng, Jiajun
    Zhou, Wengang
    Li, Houqiang
    IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 24 : 3276 - 3286
  • [6] Conditional Video-Text Reconstruction Network with Cauchy Mask for Weakly Supervised Temporal Sentence Grounding
    Wei, Jueqi
    Xu, Yuanwu
    Chen, Mohan
    Zhang, Yuejie
    Feng, Rui
    Gao, Shang
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 1511 - 1516
  • [7] Query-aware multi-scale proposal network for weakly supervised temporal sentence grounding in videos
    Zhou, Mingyao
    Chen, Wenjing
    Sun, Hao
    Xie, Wei
    Dong, Ming
    Lu, Xiaoqiang
    KNOWLEDGE-BASED SYSTEMS, 2024, 304
  • [8] Weakly Supervised Temporal Sentence Grounding with Gaussian-based Contrastive Proposal Learning
    Zheng, Minghang
    Huang, Yanjie
    Chen, Qingchao
    Peng, Yuxin
    Liu, Yang
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 15534 - 15543
  • [9] Weakly Supervised Temporal Sentence Grounding with Uncertainty-Guided Self-training
    Huang, Yifei
    Yang, Lijin
    Sato, Yoichi
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 18908 - 18918
  • [10] Reinforcement Learning with Multi-Policy Movement Strategy for Weakly Supervised Temporal Sentence Grounding
    Jiang, Shan
    Kong, Yuqiu
    Zhang, Lihe
    Yin, Baocai
    APPLIED SCIENCES-BASEL, 2024, 14 (21)