96 entries in total
[71]
Purushwalkam S, 2020, ADV NEUR IN
[72]
Qian R, 2020, ARXIV200803800
[74]
Richemond Pierre H, 2020, ARXIV201010241
[75]
Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding [J]. COMPUTER VISION - ECCV 2016, PT I, 2016, 9905: 510-526
[76]
Simonyan K, 2015, Arxiv, DOI arXiv:1409.1556
[77]
Soomro K., 2012, CRCVTR1201, V1212, P0402
[78]
Srivastava N, 2015, PR MACH LEARN RES, V37, P843
[79]
Sun Chen, 2019, Learning video representations using contrastive bidirectional transformer
[80]
Szegedy C, 2015, PROC CVPR IEEE, P1, DOI 10.1109/CVPR.2015.7298594