Generative Bias for Robust Visual Question Answering

Cited by: 25
Authors
Cho, Jae Won [1 ]
Kim, Dong-Jin [2 ]
Ryu, Hyeonggon [1 ]
Kweon, In So [1 ]
Affiliations
[1] Korea Adv Inst Sci & Technol, Daejeon, South Korea
[2] Hanyang Univ, Seoul, South Korea
Source
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2023
DOI
10.1109/CVPR52729.2023.01124
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The task of Visual Question Answering (VQA) is known to be plagued by VQA models exploiting biases within the dataset to make their final predictions. Various ensemble-based debiasing methods have been proposed in which an additional model is purposefully trained to be biased in order to train a robust target model. However, these methods compute the bias for a model simply from the label statistics of the training data or from single-modal branches. In this work, to better learn the bias a target VQA model suffers from, we propose a generative method, called GenB, that trains the bias model directly from the target model. In particular, GenB employs a generative network to learn the bias in the target model through a combination of an adversarial objective and knowledge distillation. We then debias our target model with GenB as the bias model, demonstrate the effects of our method through extensive experiments on various VQA bias datasets including VQA-CP2, VQA-CP1, GQA-OOD, and VQA-CE, and show state-of-the-art results with the LXMERT architecture on VQA-CP2.
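The abstract says the bias model is trained with a combination of an adversarial objective and knowledge distillation. The sketch below is only a rough illustration of how such a combined loss could be assembled. It is not the authors' implementation: the function names, the KL direction (target to bias), the non-saturating generator loss, and the `distill_weight` parameter are all assumptions made for this example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def generator_adversarial_loss(disc_prob_on_bias, eps=1e-12):
    """Non-saturating generator loss, -log D(bias output).

    disc_prob_on_bias is a hypothetical discriminator's probability that
    the bias model's answer distribution came from the target model.
    """
    return -math.log(disc_prob_on_bias + eps)


def bias_model_loss(bias_logits, target_logits, disc_prob_on_bias,
                    distill_weight=1.0):
    """Assumed combined objective: adversarial term plus weighted distillation."""
    p_target = softmax(target_logits)
    p_bias = softmax(bias_logits)
    distill = kl_divergence(p_target, p_bias)  # pull bias output toward target
    adv = generator_adversarial_loss(disc_prob_on_bias)
    return adv + distill_weight * distill


if __name__ == "__main__":
    # Toy answer logits over three candidate answers.
    print(bias_model_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0], 0.5))
```

In this toy setup the loss shrinks as the bias model's answer distribution approaches the target model's and as the assumed discriminator is fooled more often. How the paper actually weights and schedules the two terms is not stated in the abstract.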
Pages: 11681-11690
Page count: 10
Related papers
50 in total
  • [31] HCCL: Hierarchical Counterfactual Contrastive Learning for Robust Visual Question Answering
    Hao, Dongze
    Wang, Qunbo
    Zhu, Xinxin
    Liu, Jing
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2024, 20 (10)
  • [32] Robust visual question answering via semantic cross modal augmentation
    Mashrur, Akib
    Luo, Wei
    Zaidi, Nayyar A.
    Robles-Kelly, Antonio
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2024, 238
  • [33] Robust Visual Question Answering Based on Counterfactual Samples and Relationship Perception
    Qin, Hong
    An, Gaoyun
    Ruan, Qiuqi
    IMAGE AND GRAPHICS TECHNOLOGIES AND APPLICATIONS, IGTA 2021, 2021, 1480 : 145 - 158
  • [34] Robust data augmentation and contrast learning for debiased visual question answering
    Ning, Ke
    Li, Zhixin
    NEUROCOMPUTING, 2025, 626
  • [35] GViG: Generative Visual Grounding Using Prompt-Based Language Modeling for Visual Question Answering
    Li, Yi-Ting
    Lin, Ying-Jia
    Yeh, Chia-Jen
    Lin, Chun-Yi
    Kao, Hung-Yu
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PT VI, PAKDD 2024, 2024, 14650 : 83 - 94
  • [36] Dataset Bias Mitigation in Multiple-Choice Visual Question Answering and Beyond
    Wang, Zhecan
    Chen, Long
    You, Haoxuan
    Xu, Keyang
    He, Yicheng
    Li, Wenhao
    Codella, Noel
    Chang, Kai-Wei
    Chang, Shih-Fu
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 8598 - 8617
  • [37] A Causal Approach to Mitigate Modality Preference Bias in Medical Visual Question Answering
    Ye, Shuchang
    Naseem, Usman
    Meng, Mingyuan
    Feng, Dagan
    Kim, Jinman
    PROCEEDINGS OF THE FIRST INTERNATIONAL WORKSHOP ON VISION-LANGUAGE MODELS FOR BIOMEDICAL APPLICATIONS, VLM4BIO 2024, 2024, : 13 - 17
  • [38] Multi-stage Reasoning on Introspecting and Revising Bias for Visual Question Answering
    L., An-An
    Lu, Zimu
    Xu, Ning
    Liu, Min
    Yan, Chenggang
    Zheng, Bolun
    Lv, Bo
    Duan, Yulong
    Shao, Zhuang
    Xuanya, Li
    ACM TRANSACTIONS ON THE WEB, 2024, 18 (04)
  • [39] GGM: Graph Generative Modeling for Out-of-Distribution Generalization in Visual Question Answering
    Jiang, Jingjing
    Liu, Ziyi
    Liu, Yifan
    Nan, Zhixiong
    Zheng, Nanning
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 199 - 208
  • [40] LiGT: layout-infused generative transformer for visual question answering on Vietnamese receipts
    Le, Thanh-Phong
    Phan, Trung Le Chi
    Nguyen, Nghia Hieu
    Van Nguyen, Kiet
    INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, 2025