Multimodal Graph Transformer for Multimodal Question Answering

Cited by: 0
Authors
He, Xuehai [1]
Wang, Xin Eric [1]
Affiliation
[1] UC Santa Cruz, United States
Conference
17th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2023
Index terms
Computational linguistics; Natural language processing systems; Semantics
Pages: 189-200
Related papers
50 in total
  • [31] Multimodal Encoders and Decoders with Gate Attention for Visual Question Answering
    Li, Haiyan
    Han, Dezhi
    COMPUTER SCIENCE AND INFORMATION SYSTEMS, 2021, 18 (03) : 1023 - 1040
  • [32] Multimodal fusion: advancing medical visual question-answering
    Mudgal, Anjali
    Kush, Udbhav
    Kumar, Aditya
    Jafari, Amir
    Neural Computing and Applications, 2024, 36 (33) : 20949 - 20962
  • [33] Multimodal Question Answering over Structured Data with Ambiguous Entities
    Li, Huadong
    Wang, Yafang
    de Melo, Gerard
    Tu, Changhe
    Chen, Baoquan
    WWW'17 COMPANION: PROCEEDINGS OF THE 26TH INTERNATIONAL CONFERENCE ON WORLD WIDE WEB, 2017, : 79 - 88
  • [34] Multimodal Local Perception Bilinear Pooling for Visual Question Answering
    Lao, Mingrui
    Guo, Yanming
    Wang, Hui
    Zhang, Xin
    IEEE ACCESS, 2018, 6 : 57923 - 57932
  • [35] FlowVQA: Mapping Multimodal Logic in Visual Question Answering with Flowcharts
    Singh, Shubhankar
    Chaurasia, Purvi
    Varun, Yerram
    Pandya, Pranshu
    Gupta, Vatsal
    Gupta, Vivek
    Roth, Dan
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 1330 - 1350
  • [36] Hierarchical Conditional Relation Networks for Multimodal Video Question Answering
    Le, Thao Minh
    Le, Vuong
    Venkatesh, Svetha
    Tran, Truyen
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2021, 129 (11) : 3027 - 3050
  • [37] Dual-Key Multimodal Backdoors for Visual Question Answering
    Walmer, Matthew
    Sikka, Karan
    Sur, Indranil
    Shrivastava, Abhinav
    Jha, Susmit
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 15354 - 15364
  • [38] Prompt-Enhanced Generation for Multimodal Open Question Answering
    Cui, Chenhao
    Li, Zhoujun
    ELECTRONICS, 2024, 13 (08)
  • [39] EduVQA: A multimodal Visual Question Answering framework for smart education
    Xiao, Jiongen
    Zhang, Zifeng
    ALEXANDRIA ENGINEERING JOURNAL, 2025, 122 : 615 - 624
  • [40] A Universal Quaternion Hypergraph Network for Multimodal Video Question Answering
    Guo, Zhicheng
    Zhao, Jiaxuan
    Jiao, Licheng
    Liu, Xu
    Liu, Fang
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 38 - 49