Vision-knowledge fusion model for multi-domain medical report generation

Cited by: 12
Authors
Xu, Dexuan [1, 2]
Zhu, Huashi [1, 2]
Huang, Yu [1]
Jin, Zhi [3]
Ding, Weiping [4]
Li, Hang [5, 6]
Ran, Menglong [5, 6]
Affiliations
[1] Peking Univ, Natl Engn Res Ctr Software Engn, Beijing 100871, Peoples R China
[2] Peking Univ, Sch Software & Microelect, Beijing 100871, Peoples R China
[3] Peking Univ, Key Lab High Confidence Software Technol, Beijing 100871, Peoples R China
[4] Nantong Univ, Sch Informat Sci & Technol, Nantong 226019, Peoples R China
[5] Peking Univ, Dept Dermatol, Hosp 1, Beijing 100034, Peoples R China
[6] Natl Clin Res Ctr Skin & Immune Dis, Beijing 100034, Peoples R China
Keywords
Medical report generation; Knowledge graph; Multi-modal fusion; Graph neural network
DOI
10.1016/j.inffus.2023.101817
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Medical report generation with knowledge graphs is an essential task in the medical field. Although existing knowledge graphs contain many entities, their semantics are insufficient because expert knowledge from different diseases is difficult to extract and fuse uniformly. It is therefore necessary to construct domain-specific knowledge graphs automatically. In this paper, we propose a vision-knowledge fusion model based on medical images and knowledge graphs that fully utilizes high-quality data from different diseases and languages. First, we give a general method for automatically constructing a knowledge graph for each domain based on medical standards. Second, we design a knowledge-based attention mechanism to fuse image and knowledge effectively. Third, we build a triples restoration module to obtain fine-grained knowledge, and we propose, for the first time, knowledge-based evaluation metrics that are more reasonable and measure report quality along different dimensions. Finally, we conduct experiments to verify the effectiveness of our model on datasets covering two different diseases: the public IU-Xray chest radiograph dataset and NCRC-DS, a dataset of Chinese dermoscopy reports that we compiled. Our model outperforms previous benchmark methods and achieves excellent evaluation scores on both datasets. Additionally, the interpretability and clinical usefulness of the model are validated, and our method can be generalized to multiple domains and different diseases.
Pages: 12