51 entries in total
[1]
Alam HMT, 2025, Arxiv, DOI arXiv:2412.16086
[2]
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 6077-6086
[3]
Banerjee S., 2005, P ACL WORKSH INTR EX, P65
[4]
Chen SZ, 2020, PROC CVPR IEEE, P9959, DOI 10.1109/CVPR42600.2020.00998
[5]
Chen ZH, 2020, PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), P1439
[6]
Cornia M, 2020, PROC CVPR IEEE, P10575, DOI 10.1109/CVPR42600.2020.01059
[7]
Cross-Domain Image Captioning with Discriminative Finetuning [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023: 6935-6944
[8]
IIHT: Medical Report Generation with Image-to-Indicator Hierarchical Transformer [J]. NEURAL INFORMATION PROCESSING, ICONIP 2023, PT VI, 2024, 14452: 57-71
[9]
Fan YJ, 2024, Arxiv, DOI arXiv:2409.00250
[10]
Han QH, 2024, Arxiv, DOI arXiv:2412.07141