Expert Knowledge-Aware Image Difference Graph Representation Learning for Difference-Aware Medical Visual Question Answering

Cited by: 8
Authors
Hu, Xinyue [1]
Gu, Lin [2, 3]
An, Qiyuan [1]
Zhang, Mengliang [1]
Liu, Liangchen [4]
Kobayashi, Kazuma [5]
Harada, Tatsuya [2, 3]
Summers, Ronald M. [4]
Zhu, Yingying [1]
Affiliations
[1] Univ Texas Arlington, Arlington, TX 76019 USA
[2] RIKEN, Tokyo, Japan
[3] Univ Tokyo, Tokyo, Japan
[4] NIH, Clin Ctr, Bethesda, MD 20892 USA
[5] Natl Canc Ctr, Res Inst, Tokyo, Japan
Funding
Japan Society for the Promotion of Science; National Institutes of Health (USA)
Keywords
visual question answering; medical imaging; datasets
DOI
10.1145/3580305.3599819
Chinese Library Classification
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
To contribute to automating the medical vision-language model, we propose a novel Chest-Xray Difference Visual Question Answering (VQA) task. Given a pair of main and reference images, this task attempts to answer several questions on both diseases and, more importantly, the differences between them. This is consistent with the radiologist's diagnosis practice that compares the current image with the reference before concluding the report. We collect a new dataset, namely MIMIC-Diff-VQA, including 700,703 QA pairs from 164,324 pairs of main and reference images. Compared to existing medical VQA datasets, our questions are tailored to the Assessment-Diagnosis-Intervention-Evaluation treatment procedure used by clinical professionals. Meanwhile, we also propose a novel expert knowledge-aware graph representation learning model to address this task. The proposed baseline model leverages expert knowledge such as anatomical structure prior, semantic, and spatial knowledge to construct a multi-relationship graph, representing the image differences between two images for the image difference VQA task. The dataset and code can be found at https://github.com/Holipori/MIMIC-Diff-VQA. We believe this work would further push forward the medical vision language model.
Pages: 4156-4165
Number of pages: 10