80 entries in total
[71]
Yuan Y., 2019, PROC NEURIPS, P536
[72]
Yuan YT, 2019, AAAI CONF ARTIF INTE, P9159
[73]
Dense Regression Network for Video Grounding [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020: 10284-10293
[74]
Zhang B., 2008, PROC INT SYMPOSIUMS, V27, P703
[75]
Zhang Hao, 2022, ARXIV220108071
[76]
Zhang S., 2019, ARXIV COMPUTER VISIO
[77]
Cross-Modal Interaction Networks for Query-Based Moment Retrieval in Videos [J]. PROCEEDINGS OF THE 42ND INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '19), 2019: 655-664
[78]
Embracing Uncertainty: Decoupling and De-bias for Robust Temporal Grounding [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2021), 2021: 8441-8450
[79]
Zhou Z. H., 2004, Multi-Instance Learning: A Survey
[80]
Vision-Language Navigation with Self-Supervised Auxiliary Reasoning Tasks [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020), 2020: 10009-10019