61 entries in total
[3] The Unreasonable Effectiveness of CLIP Features for Image Captioning: An Experimental Analysis [J]. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2022, 2022: 4661-4669
[4] MUTAN: Multimodal Tucker Fusion for Visual Question Answering [J]. 2017 IEEE International Conference on Computer Vision (ICCV), 2017: 2631-2639
[5] Chauhan Geeticka, 2020, Med Image Comput Comput Assist Interv, V12262, P529, DOI 10.1007/978-3-030-59713-9_51
[7] CLIP-Art: Contrastive Pre-training for Fine-Grained Art Classification [J]. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2021, 2021: 3951-3955
[8] Dao I., 2023, P IEEE 16 INT S DEC, P1
[9] Dao T., 2023, P IEEE INT C CONS EL, P1
[10] Dao T. N., 2022, P IEEE INT C CONS EL, P1