53 entries in total
[1]
Audio Visual Scene-Aware Dialog [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 7550-7559
[2]
Allen-Zhu Z., 2022, INT C LEARN REPR
[3]
[Anonymous], 2011, Advances in neural information processing systems
[4]
Brown TB, 2020, ADV NEUR IN, V33
[5]
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021: 3557-3567
[6]
Chen X, 2016, 30 C NEURAL INFORM P, V29
[7]
Dai WL, 2022, FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), P2383
[8]
Visual Dialog [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017: 1080-1089
[9]
Devlin J, 2019, 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, P4171
[10]
Ding ZY, 2024, AAAI CONF ARTIF INTE, P17907