Generative Bias for Robust Visual Question Answering

Cited by: 25
Authors
Cho, Jae Won [1]
Kim, Dong-Jin [2]
Ryu, Hyeonggon [1]
Kweon, In So [1]
Affiliations
[1] Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea
[2] Hanyang University, Seoul, South Korea
Source
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) | 2023
DOI
10.1109/CVPR52729.2023.01124
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Visual Question Answering (VQA) is known to be plagued by models exploiting biases within the dataset to make their final predictions. Various ensemble-based debiasing methods have been proposed in which an additional model is purposefully trained to be biased in order to train a robust target model. However, these methods compute the bias for a model simply from the label statistics of the training data or from single-modal branches. In this work, to better learn the bias a target VQA model suffers from, we propose a generative method, called GenB, that trains the bias model directly from the target model. In particular, GenB employs a generative network to learn the bias in the target model through a combination of an adversarial objective and knowledge distillation. We then debias our target model with GenB as a bias model, demonstrate the effects of our method through extensive experiments on various VQA bias datasets including VQA-CP2, VQA-CP1, GQA-OOD, and VQA-CE, and show state-of-the-art results with the LXMERT architecture on VQA-CP2.
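The abstract names two training signals for the bias model: an adversarial objective and knowledge distillation from the target model. The sketch below illustrates one plausible way such a combined loss could be computed for a single example. It is not the authors' implementation: the function names, the non-saturating form of the adversarial term, the direction of the KL divergence, and the `distill_weight` parameter are all assumptions made for illustration.

```python
import math

def softmax(logits):
    """Convert raw answer scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same answer set."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def adversarial_loss(disc_score_on_bias, eps=1e-12):
    """Assumed non-saturating generator loss: small when a (hypothetical)
    discriminator believes the bias model's output came from the target model."""
    return -math.log(disc_score_on_bias + eps)

def bias_model_loss(target_logits, bias_logits, disc_score_on_bias,
                    distill_weight=1.0):
    """Assumed combined objective: adversarial term plus weighted distillation,
    treating the target model as the teacher and the bias model as the student."""
    target_probs = softmax(target_logits)
    bias_probs = softmax(bias_logits)
    distill = kl_divergence(target_probs, bias_probs)
    adv = adversarial_loss(disc_score_on_bias)
    return adv + distill_weight * distill

# Hypothetical logits over three candidate answers.
loss_matched = bias_model_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0], 1.0)
loss_mismatched = bias_model_loss([2.0, 0.5, -1.0], [-1.0, 0.5, 2.0], 0.5)
```

Under these assumptions the loss is near zero when the bias model reproduces the target model's answer distribution and fully fools the discriminator, and it grows as either signal degrades.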
Pages: 11681-11690
Number of pages: 10