Weakly guided attention model with hierarchical interaction for brain CT report generation

Cited by: 3
Authors
Zhang, Xiaodan [1]
Yang, Sisi [1]
Shi, Yanzhao [1]
Ji, Junzhong [1]
Liu, Ying [2]
Wang, Zheng [2]
Xu, Huimin [2]
Affiliations
[1] Beijing Univ Technol, Fac Informat Technol, Beijing, Peoples R China
[2] Peking Univ Third Hosp, Dept Radiol, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Weakly guided attention; Hierarchical interaction; Brain CT; Medical report generation; NETWORK;
DOI
10.1016/j.compbiomed.2023.107650
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
Brain Computed Tomography (CT) report generation, which aims to help radiologists diagnose cerebrovascular diseases efficiently, is challenging because it requires feature representations for dozens of images and language descriptions spanning several sentences. Existing report generation methods have made significant progress based on the encoder-decoder framework and the attention mechanism. However, current methods are limited in solving the many-to-many alignment between the multiple images of a Brain CT scan and the multiple sentences of a Brain CT report, and they fail to attend to critical images and lesion areas, resulting in inaccurate descriptions. In this paper, we propose a novel Weakly Guided Attention Model with Hierarchical Interaction, named WGAM-HI, to improve Brain CT report generation. Specifically, WGAM-HI conducts many-to-many matching between multiple visual images and semantic sentences via a hierarchical interaction framework with a two-layer attention model and a two-layer report generator. In addition, two weakly guided mechanisms are proposed to help the attention model focus more on important images and lesion areas, under the guidance of pathological events and Gradient-weighted Class Activation Mapping (Grad-CAM), respectively. The pathological event acts as a bridge between the essential serial images and the corresponding sentence, and Grad-CAM bridges the lesion areas and pathology words. Therefore, under the hierarchical interaction with the weakly guided attention model, the report generator generates more accurate words and sentences. Experiments on the Brain CT dataset demonstrate the effectiveness of WGAM-HI in gradually attending to important images and lesion areas and in generating more accurate reports.
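The two-layer attention and the weak guidance described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the scoring function, the pooling, or the loss, so the plain dot-product attention, the mean pooling of regions, the cross-entropy guidance loss, and all function names below (`attend`, `hierarchical_attention`, `guidance_loss`) are assumptions chosen only to show the general idea of attending first over images and then over regions, with an external map nudging the attention weights.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend(query, features):
    """Dot-product attention (an assumption): weights and weighted feature sum."""
    weights = softmax([dot(query, f) for f in features])
    dim = len(features[0])
    context = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return weights, context


def hierarchical_attention(sentence_query, word_query, scan):
    """scan[i][r] is the feature vector of region r in image i (hypothetical layout)."""
    # Layer 1 (image level): summarize each image by the mean of its regions,
    # then weight the images with a sentence-level query.
    summaries = [[sum(col) / len(img) for col in zip(*img)] for img in scan]
    image_weights, _ = attend(sentence_query, summaries)

    # Layer 2 (region level): weight regions inside each image with a
    # word-level query, then mix the per-image contexts by the image weights.
    context = [0.0] * len(word_query)
    region_weights = []
    for w_img, img in zip(image_weights, scan):
        r_weights, r_context = attend(word_query, img)
        region_weights.append(r_weights)
        context = [c + w_img * x for c, x in zip(context, r_context)]
    return image_weights, region_weights, context


def guidance_loss(attention, guide, eps=1e-8):
    """Cross-entropy between a normalized guide map (for example a Grad-CAM
    heat map or an image-relevance label) and the attention weights.
    The choice of loss is an assumption, not taken from the paper."""
    total = sum(guide)
    target = [g / total for g in guide]
    return -sum(t * math.log(a + eps) for t, a in zip(target, attention))


# Toy scan: 2 images, 2 regions each, 2-dimensional features.
scan = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 0.0], [0.0, 0.5]],
]
image_weights, region_weights, context = hierarchical_attention(
    [1.0, 0.0], [1.0, 0.0], scan
)
```

In a trainable model, a term like `guidance_loss` would be added to the report generation loss so that the guidance shapes the attention without requiring region-level annotations, which is the sense in which such supervision is "weak".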
Pages: 12