126 entries in total
- [1] ViViT: A Video Vision Transformer [J]. 2021 IEEE/CVF International Conference on Computer Vision (ICCV 2021), 2021: 6816 - 6826
- [2] Ba JL, 2016, arXiv
- [3] Bertasius G, 2021, PR MACH LEARN RES, V139
- [4] Brown TB, 2020, ADV NEUR IN, V33
- [5] Carion Nicolas, 2020, Computer Vision - ECCV 2020. 16th European Conference. Proceedings. Lecture Notes in Computer Science (LNCS 12346), P213, DOI 10.1007/978-3-030-58452-8_13
- [6] Cazenavette George, 2021, arXiv, DOI 10.48550/arXiv.2105.14110
- [7] Chen K, 2019, arXiv, DOI arXiv:1906.07155
- [8] Chen LC, 2017, arXiv, DOI arXiv:1706.05587
- [9] Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation [J]. Computer Vision - ECCV 2018, Pt VII, 2018, 11211: 833 - 851
- [10] Choe J, 2022, arXiv, DOI arXiv:2111.11187