51 entries in total
[1]
Devlin J, 2019, 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, P4171
[2]
Escorcia V, 2022, Arxiv, DOI arXiv:1907.12763
[3]
SlowFast Networks for Video Recognition [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019: 6201-6210
[4]
TALL: Temporal Activity Localization via Language Query [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017: 5277-5285
[5]
Fast Video Moment Retrieval [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 1503-1512
[6]
Gemmeke JF, 2017, INT CONF ACOUST SPEE, P776, DOI 10.1109/ICASSP.2017.7952261
[7]
Video2GIF: Automatic Generation of Animated GIFs from Video [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 1001-1009
[8]
Hao Wang, Zheng-Jun Zha, 2022, Semantic and relation modulation for audio-visual event localization
[9]
Localizing Moments in Video with Natural Language [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017: 5804-5813
[10]
Cross-Sentence Temporal and Semantic Relations in Video Activity Localisation [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 7179-7188