50 items in total
[21] VLANet: Video-Language Alignment Network for Weakly-Supervised Video Moment Retrieval [J]. COMPUTER VISION - ECCV 2020, PT XXVIII, 2020, 12373: 156-171
[23] Weakly-supervised Visual Grounding of Phrases with Linguistic Structures [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 5253-5262
[25] Bi-calibration Networks for Weakly-Supervised Video Representation Learning [J]. International Journal of Computer Vision, 2023, 131: 1704-1721
[26] Not All Frames Are Equal: Weakly-Supervised Video Grounding with Contextual Similarity and Visual Clustering Losses [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 10436-10444
[27] Local-Global Multi-Modal Distillation for Weakly-Supervised Temporal Video Grounding [J]. THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 2, 2024: 738-746
[29] Semi-supervised Video Paragraph Grounding with Contrastive Encoder [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022: 2456-2465
[30] Weakly-supervised learning of visual relations [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017: 5189-5198