VLC-BERT: Visual Question Answering with Contextualized Commonsense Knowledge

Cited by: 23
Authors
Ravi, Sahithya [1, 2]
Chinchure, Aditya [1, 2]
Sigal, Leonid [1, 2]
Liao, Renjie [1]
Shwartz, Vered [1, 2]
Affiliations
[1] Univ British Columbia, Vancouver, BC, Canada
[2] Vector Inst AI, Toronto, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
DOI
10.1109/WACV56688.2023.00121
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
There has been a growing interest in solving Visual Question Answering (VQA) tasks that require the model to reason beyond the content present in the image. In this work, we focus on questions that require commonsense reasoning. In contrast to previous methods which inject knowledge from static knowledge bases, we investigate the incorporation of contextualized knowledge using Commonsense Transformer (COMET), an existing knowledge model trained on human-curated knowledge bases. We propose a method to generate, select, and encode external commonsense knowledge alongside visual and textual cues in a new pre-trained Vision-Language-Commonsense transformer model, VLC-BERT. Through our evaluation on the knowledge-intensive OK-VQA and A-OKVQA datasets, we show that VLC-BERT is capable of outperforming existing models that utilize static knowledge bases. Furthermore, through a detailed analysis, we explain which questions benefit, and which don't, from contextualized commonsense knowledge from COMET. Code: https://github.com/aditya10/VLC-BERT
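The generate, select, and encode steps named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation (see the linked repository for that). It rests on several assumptions: the candidate sentences are hard-coded stand-ins for what a knowledge model such as COMET might generate, the selection step uses a plain bag-of-words cosine similarity instead of a learned sentence encoder, and the `[SEP]`-joined string is only one hypothetical way to place knowledge next to the question. The names `select_knowledge` and `build_text_input` are illustrative.

```python
import math
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"a", "an", "the", "is", "of", "for", "to", "in", "on", "what"}


def bag_of_words(text):
    """Lowercased content-word counts; a crude stand-in for a sentence embedding."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_knowledge(question, candidates, k=2):
    """Keep the k candidate sentences most similar to the question."""
    q = bag_of_words(question)
    ranked = sorted(candidates, key=lambda c: cosine(q, bag_of_words(c)), reverse=True)
    return ranked[:k]


def build_text_input(question, knowledge):
    """Join the question and selected knowledge into one text sequence."""
    return " [SEP] ".join([question] + knowledge)


question = "What is the umbrella used for on a rainy day?"

# Hypothetical candidates; in the paper these would come from a knowledge model.
candidates = [
    "An umbrella is used for staying dry in the rain.",
    "A dog is capable of barking.",
    "Rain makes people want to carry an umbrella.",
    "A kitchen is a place for cooking food.",
]

selected = select_knowledge(question, candidates, k=2)
model_input = build_text_input(question, selected)
print(model_input)
```

In this sketch the two umbrella-related sentences are kept and the unrelated ones are dropped before the text reaches the model. A real system would pair the resulting text with image features inside a vision-language transformer, which this sketch omits.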
Pages: 1155-1165
Page count: 11
Related papers
50 in total
  • [21] Distinguish Before Answer: Generating Contrastive Explanation as Knowledge for Commonsense Question Answering
    Chen, Qianglong
    Xu, Guohai
    Yang, Ming
    Zhang, Ji
    Huang, Fei
    Si, Luo
    Zhang, Yin
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023, : 13207 - 13224
  • [22] JointLK: Joint Reasoning with Language Models and Knowledge Graphs for Commonsense Question Answering
    Sun, Yueqing
    Shi, Qi
    Qi, Le
    Zhang, Yu
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 5049 - 5060
  • [23] Benchmarking Knowledge-Enhanced Commonsense Question Answering via Knowledge-to-Text Transformation
    Bian, Ning
    Han, Xianpei
    Chen, Bo
    Sun, Le
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 12574 - 12582
  • [24] Learning Visual Knowledge Memory Networks for Visual Question Answering
    Su, Zhou
    Zhu, Chen
    Dong, Yinpeng
    Cai, Dongqi
    Chen, Yurong
    Li, Jianguo
    2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 7736 - 7745
  • [25] BERT Representations for Video Question Answering
    Yang, Zekun
    Garcia, Noa
    Chu, Chenhui
    Otani, Mayu
    Nakashima, Yuta
    Takemura, Haruo
    2020 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2020, : 1545 - 1554
  • [26] CooKie: commonsense knowledge-guided mixture-of-experts framework for fine-grained visual question answering
    Wang, Chao
    Yang, Jianming
    Zhou, Yang
    Yue, Xiaodong
    INFORMATION SCIENCES, 2025, 695
  • [27] Learning Contextualized Knowledge Structures for Commonsense Reasoning
    Yan, Jun
    Raman, Mrigank
    Chan, Aaron
    Zhang, Tianyu
    Rossi, Ryan
    Zhao, Handong
    Kim, Sungchul
    Lipka, Nedim
    Ren, Xiang
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 4038 - 4051
  • [28] Chop Chop BERT: Visual Question Answering by Chopping VisualBERT's Heads
    Gao, Chenyu
    Zhu, Qi
    Wang, Peng
    Wu, Qi
    PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 664 - 670
  • [29] Question answering over knowledge graphs using BERT based relation mapping
    Suneera, C. M.
    Prakash, Jay
    Singh, Pramod Kumar
    EXPERT SYSTEMS, 2023, 40 (10)
  • [30] BB-KBQA: BERT-Based Knowledge Base Question Answering
    Liu, Aiting
    Huang, Ziqi
    Lu, Hengtong
    Wang, Xiaojie
    Yuan, Caixia
    CHINESE COMPUTATIONAL LINGUISTICS, CCL 2019, 2019, 11856 : 81 - 92