60 entries in total
- [1] Bahdanau D, 2016, arXiv, arXiv:1409.0473
- [2] One-Shot Video Object Segmentation [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 5320 - 5329
- [3] Multiple Temporal Fusion based Weakly-supervised Pre-training Techniques for Video Categorization [J]. PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 7089 - 7093
- [4] Carion Nicolas, 2020, Computer Vision - ECCV 2020. 16th European Conference. Proceedings. Lecture Notes in Computer Science (LNCS 12346), P213, DOI 10.1007/978-3-030-58452-8_13
- [5] Cheng HK, 2021, ADV NEUR IN, V34
- [6] Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 5555 - 5564
- [7] Cheng Ming-Ming, 2014, Global contrast based salient region detection, V37, P569
- [8] Cho Kyunghyun, 2014, EMNLP 2014, P1724, DOI 10.3115/v1/D14-1179
- [9] Learning Contextual Transformer Network for Image Inpainting [J]. PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 2529 - 2538
- [10] Devlin J, 2019, 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, P4171