Medical visual question answering based on question-type reasoning and semantic space constraint

Cited: 11
|
Authors
Wang, Meiling [1 ]
He, Xiaohai [1 ]
Liu, Luping [1 ]
Qing, Linbo [1 ]
Chen, Honggang [1 ]
Liu, Yan [2 ]
Ren, Chao [1 ]
Affiliations
[1] Sichuan Univ, Coll Elect & Informat Engn, Chengdu 610065, Sichuan, Peoples R China
[2] Southwest Jiaotong Univ, Dept Neurol, Affiliated Hosp, Peoples Hosp 3, Chengdu, Sichuan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Medical visual question answering; Question-type reasoning; Semantic space constraint; Attention mechanism; DYNAMIC MEMORY NETWORKS; LANGUAGE;
DOI
10.1016/j.artmed.2022.102346
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
Medical visual question answering (Med-VQA) aims to accurately answer clinical questions about medical images. Despite its enormous potential for application in the medical domain, the technology is still in its infancy. Compared with the general visual question answering task, Med-VQA involves more demanding challenges. First, clinical questions about medical images are usually diverse, owing to differences among clinicians and the complexity of diseases; consequently, noise is inevitably introduced when extracting question features. Second, Med-VQA has typically been treated as a classification problem over predefined answers, ignoring the relationships between candidate responses, so the model pays equal attention to all candidate answers when making predictions. In this paper, a novel Med-VQA framework is proposed to alleviate these problems. Specifically, we apply a question-type reasoning module separately to closed-ended and open-ended questions, extracting the important information contained in the questions through an attention mechanism and filtering out noise to obtain more valuable question features. To exploit the relational information between answers, we design a semantic constraint space that calculates the similarity between answers and assigns higher attention to highly correlated answers. To evaluate the effectiveness of the proposed method, extensive experiments were conducted on the public VQA-RAD dataset. Experimental results show that the proposed method outperforms other state-of-the-art methods, achieving overall, closed-ended, and open-ended accuracies of 74.1%, 82.7%, and 60.9%, respectively. Notably, the absolute accuracy on closed-ended questions improved by 5.5%.
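The semantic-constraint idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; the function names, the use of cosine similarity over answer embeddings, and the blending weight `alpha` are all assumptions chosen to show how correlated candidate answers could receive shared attention instead of being treated as independent classes.

```python
import numpy as np

def cosine_similarity_matrix(emb: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between answer embeddings (N x D)."""
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    unit = emb / np.clip(norms, 1e-12, None)
    return unit @ unit.T

def semantically_weighted_scores(logits: np.ndarray, emb: np.ndarray,
                                 alpha: float = 0.5) -> np.ndarray:
    """Blend raw classifier probabilities with scores propagated from
    semantically similar answers, so correlated answers share attention."""
    sim = cosine_similarity_matrix(emb)      # (N, N) answer-answer similarity
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                     # softmax over candidate answers
    propagated = sim @ probs                 # pull in mass from similar answers
    propagated /= propagated.sum()
    return (1 - alpha) * probs + alpha * propagated

# Toy example: 3 candidate answers with 4-d embeddings;
# answer 1 is semantically close to answer 0, answer 2 is unrelated.
emb = np.array([[1.0, 0.0, 0.0, 0.0],
                [0.9, 0.1, 0.0, 0.0],
                [0.0, 0.0, 1.0, 0.0]])
logits = np.array([2.0, 0.5, 0.4])
weights = semantically_weighted_scores(logits, emb)
```

In this sketch, answer 1 ends up weighted above answer 2 even though its raw logit is only slightly higher, because it is highly similar to the top-scoring answer 0.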
Pages: 11
Related Papers
50 records total
  • [21] Improving Visual Question Answering by Semantic Segmentation
    Pham, Viet-Quoc
    Mishima, Nao
    Nakasu, Toshiaki
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2021, PT III, 2021, 12893 : 459 - 470
  • [22] STRUCTURED SEMANTIC REPRESENTATION FOR VISUAL QUESTION ANSWERING
    Yu, Dongchen
    Gao, Xing
    Xiong, Hongkai
    2018 25TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2018, : 2286 - 2290
  • [23] Improving reasoning with contrastive visual information for visual question answering
    Long, Yu
    Tang, Pengjie
    Wang, Hanli
    Yu, Jian
    ELECTRONICS LETTERS, 2021, 57 (20) : 758 - 760
  • [24] Cascade Reasoning Network for Text-based Visual Question Answering
    Liu, Fen
    Xu, Guanghui
    Wu, Qi
    Du, Qing
    Jia, Wei
    Tan, Mingkui
    MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 4060 - 4069
  • [25] Hierarchical reasoning based on perception action cycle for visual question answering
    Mohamud, Safaa Abdullahi Moallim
    Jalali, Amin
    Lee, Minho
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 241
  • [26] Visual question answering method based on relational reasoning and gating mechanism
    Wang X.
    Chen Q.-H.
    Sun Q.
    Jia Y.-B.
    Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science), 2022, 56 (01): : 36 - 46
  • [27] Learning Hierarchical Reasoning for Text-Based Visual Question Answering
    Li, Caiyuan
    Du, Qinyi
    Wang, Qingqing
    Jin, Yaohui
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2021, PT III, 2021, 12893 : 305 - 316
  • [28] Visual Question Answering via Combining Inferential Attention and Semantic Space Mapping
    Liu, Yun
    Zhang, Xiaoming
    Huang, Feiran
    Zhou, Zhibo
    Zhao, Zhonghua
    Li, Zhoujun
    KNOWLEDGE-BASED SYSTEMS, 2020, 207
  • [29] Coarse-to-Fine Reasoning for Visual Question Answering
    Nguyen, Binh X.
    Tuong Do
    Huy Tran
    Tjiputra, Erman
    Tran, Quang D.
    Anh Nguyen
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022, : 4557 - 4565
  • [30] Interpretable Visual Question Answering by Reasoning on Dependency Trees
    Cao, Qingxing
    Liang, Xiaodan
    Li, Bailin
    Lin, Liang
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2021, 43 (03) : 887 - 901