共 39 条
[31]
Su W., 2020, INT C LEARNING REPRE
[32]
Transform and Tell: Entity-Aware News Image Captioning
[J].
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020),
2020,
:13032-13042
[33]
Unal Mesut Erhan, 2022, IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
[34]
Wei Fangyun, 2021, Advances in Neural Information Processing Systems, V34
[35]
Wolf T, 2020, PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING: SYSTEM DEMONSTRATIONS, P38
[36]
Yuan L., 2021, arXiv, DOI DOI 10.48550/ARXIV.2111.11432
[37]
Open-Vocabulary Object Detection Using Captions
[J].
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021,
2021,
:14388-14397
[38]
RegionCLIP: Region-based Language-Image Pretraining
[J].
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022),
2022,
:16772-16782
[39]
DAP: Detection-Aware Pre-training with Weak Supervision
[J].
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021,
2021,
:4535-4544