Expert Knowledge-Aware Image Difference Graph Representation Learning for Difference-Aware Medical Visual Question Answering

Cited by: 8
Authors
Hu, Xinyue [1]
Gu, Lin [2,3]
An, Qiyuan [1]
Zhang, Mengliang [1]
Liu, Liangchen [4]
Kobayashi, Kazuma [5]
Harada, Tatsuya [2,3]
Summers, Ronald M. [4]
Zhu, Yingying [1]
Affiliations
[1] Univ Texas Arlington, Arlington, TX 76019 USA
[2] RIKEN, Tokyo, Japan
[3] Univ Tokyo, Tokyo, Japan
[4] NIH, Clin Ctr, Bethesda, MD 20892 USA
[5] Natl Canc Ctr, Res Inst, Tokyo, Japan
Funding
Japan Society for the Promotion of Science; U.S. National Institutes of Health;
Keywords
visual question answering; medical imaging; datasets;
DOI
10.1145/3580305.3599819
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
To contribute to automating the medical vision-language model, we propose a novel Chest-Xray Difference Visual Question Answering (VQA) task. Given a pair of main and reference images, this task attempts to answer several questions on both diseases and, more importantly, the differences between them. This is consistent with the radiologist's diagnosis practice that compares the current image with the reference before concluding the report. We collect a new dataset, namely MIMIC-Diff-VQA, including 700,703 QA pairs from 164,324 pairs of main and reference images. Compared to existing medical VQA datasets, our questions are tailored to the Assessment-Diagnosis-Intervention-Evaluation treatment procedure used by clinical professionals. Meanwhile, we also propose a novel expert knowledge-aware graph representation learning model to address this task. The proposed baseline model leverages expert knowledge such as anatomical structure prior, semantic, and spatial knowledge to construct a multi-relationship graph, representing the image differences between two images for the image difference VQA task. The dataset and code can be found at https://github.com/Holipori/MIMIC-Diff-VQA. We believe this work would further push forward the medical vision language model.
Pages: 4156-4165
Page count: 10
Related papers
50 in total
  • [42] Object-difference drived graph convolutional networks for visual question answering
    Zhu, Xi
    Mao, Zhendong
    Chen, Zhineng
    Li, Yangyang
    Wang, Zhaohui
    Wang, Bin
    MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (11) : 16247 - 16265
  • [43] Knowledge-aware Graph Attention Network with Distributed & Cross Learning for Collaborative Recommendation
    Dai, Yang
    Meng, Shunmei
    Liu, Qiyan
    Liu, Xiao
    2022 IEEE INTL CONF ON PARALLEL & DISTRIBUTED PROCESSING WITH APPLICATIONS, BIG DATA & CLOUD COMPUTING, SUSTAINABLE COMPUTING & COMMUNICATIONS, SOCIAL COMPUTING & NETWORKING, ISPA/BDCLOUD/SOCIALCOM/SUSTAINCOM, 2022, : 294 - 301
  • [44] Knowledge-aware recommendation based on hypergraph representation learning and transformer model optimization
    Zuo, Yuqi
    Zhang, Yunfeng
    Zhang, Qiuyue
    Zhang, Wenbo
    APPLIED INTELLIGENCE, 2025, 55 (05)
  • [45] Deep Knowledge Graph Representation Learning for Completion, Alignment, and Question Answering
    Chakrabarti, Soumen
    PROCEEDINGS OF THE 45TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '22), 2022, : 3451 - 3454
  • [46] Time-Aware Representation Learning for Time-Sensitive Question Answering
    Son, Jungbin
    Oh, Alice
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023, : 70 - 77
  • [47] See and Learn More: Dense Caption-Aware Representation for Visual Question Answering
    Bi, Yandong
    Jiang, Huajie
    Hu, Yongli
    Sun, Yanfeng
    Yin, Baocai
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (02) : 1135 - 1146
  • [48] Semi-supervised Medical Image Classification with Temporal Knowledge-Aware Regularization
    Yang, Qiushi
    Liu, Xinyu
    Chen, Zhen
    Ibragimov, Bulat
    Yuan, Yixuan
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2022, PT VIII, 2022, 13438 : 119 - 129
  • [49] KaFSP: Knowledge-Aware Fuzzy Semantic Parsing for Conversational Question Answering over a Large-Scale Knowledge Base
    Li, Junzhuo
    Xiong, Deyi
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 461 - 473
  • [50] Enhancing question answering in educational knowledge bases using question-aware graph convolutional network
    He, Ping
    Chen, Jingfang
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2023, 45 (06) : 12037 - 12048