Generative Bias for Robust Visual Question Answering

Cited by: 25

Authors:

Cho, Jae Won [1]

Kim, Dong-Jin [2]

Ryu, Hyeonggon [1]

Kweon, In So [1]

Affiliations:

[1] Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea

[2] Hanyang University, Seoul, South Korea

Source:

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) | 2023

DOI:

10.1109/CVPR52729.2023.01124

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

The task of Visual Question Answering (VQA) is known to be plagued by VQA models exploiting biases within the dataset to make their final predictions. Various ensemble-based debiasing methods have been proposed in which an additional model is purposefully trained to be biased in order to train a robust target model. However, these methods compute the bias for a model simply from the label statistics of the training data or from single-modal branches. In this work, to better learn the bias that a target VQA model suffers from, we propose GenB, a generative method that trains the bias model directly from the target model. In particular, GenB employs a generative network to learn the bias in the target model through a combination of an adversarial objective and knowledge distillation. We then debias our target model with GenB as the bias model. Extensive experiments show the effects of our method on various VQA bias datasets, including VQA-CP2, VQA-CP1, GQA-OOD, and VQA-CE, and we report state-of-the-art results with the LXMERT architecture on VQA-CP2.
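
The abstract says the bias model is trained with a combination of an adversarial objective and knowledge distillation, but gives no formulas. The following is only a rough sketch of how such a combined objective could look, not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the KL direction (target distribution against bias distribution), the non-saturating generator loss, and the `distill_weight` parameter are all hypothetical.

```python
import math


def softmax(logits):
    """Convert a list of raw answer scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def distillation_loss(target_logits, bias_logits):
    """Assumed distillation term: pull the bias model's answer
    distribution toward the target model's answer distribution."""
    return kl_divergence(softmax(target_logits), softmax(bias_logits))


def adversarial_generator_loss(disc_prob_on_bias, eps=1e-12):
    """Assumed non-saturating generator loss: small when a (hypothetical)
    discriminator believes the bias model's output came from the target."""
    return -math.log(disc_prob_on_bias + eps)


def bias_model_loss(target_logits, bias_logits, disc_prob_on_bias,
                    distill_weight=1.0):
    """Hypothetical combined objective for the bias model."""
    return (adversarial_generator_loss(disc_prob_on_bias)
            + distill_weight * distillation_loss(target_logits, bias_logits))
```

In this sketch the loss shrinks as the bias model's predictions approach the target model's and as the discriminator is fooled more often, which is the qualitative behavior the abstract describes. How the resulting bias model is then used to debias the target model is not specified in the abstract and is not modeled here.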

Pages: 11681-11690

Page count: 10
