A Survey on Representation Learning in Visual Question Answering

Authors
Sahani, Manish [1]
Singh, Priyadarshan [1]
Jangpangi, Sachin [1]
Kumar, Shailender [1]
Affiliations
[1] Delhi Technol Univ, Delhi, India
Keywords
Computer vision; Visual Question Answering; Natural language processing; Representation learning
DOI
10.1007/978-3-030-82469-3_29
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
Visual question answering (VQA) is among the most researched problems in computer vision, pattern recognition, and natural language processing. VQA extends the challenges of computer vision and calls for basic reasoning over visual scenes in order to answer questions about specific elements, actions, and relationships between objects in an image. Reasoning over images has long been popular among computer vision and natural language processing researchers, and it depends directly on the expressivity of the representations learned from the datasets. Over the past decade, advances in computing hardware and neural networks, together with highly optimized and efficient software, have driven a substantial amount of research on solving VQA efficiently. In this survey, we present an in-depth examination of representation learning in state-of-the-art VQA methods proposed in the literature and compare them to discuss future directions in the field.
Pages: 326-336
Number of pages: 11