HistGen: Histopathology Report Generation via Local-Global Feature Encoding and Cross-Modal Context Interaction

Cited by: 1
Authors
Guo, Zhengrui [1]
Ma, Jiabo [1]
Xu, Yingxue [1]
Wang, Yihui [1]
Wang, Liansheng [3]
Chen, Hao [1, 2]
Affiliations
[1] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Hong Kong, Peoples R China
[2] Hong Kong Univ Sci & Technol, Dept Chem & Biol Engn, Hong Kong, Peoples R China
[3] Xiamen Univ, Sch Informat Sci & Engn, Xiamen, Peoples R China
Source
MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2024, PT IV | 2024 / Vol. 15004
Funding
National Natural Science Foundation of China;
Keywords
Histopathology Report Generation; Multiple Instance Learning; Cross-Modal Alignment;
DOI
10.1007/978-3-031-72083-3_18
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Histopathology serves as the gold standard in cancer diagnosis, with clinical reports being vital in interpreting and understanding this process, guiding cancer treatment and patient care. The automation of histopathology report generation with deep learning stands to significantly enhance clinical efficiency and lessen the labor-intensive, time-consuming burden on pathologists in report writing. In pursuit of this advancement, we introduce HistGen, a multiple instance learning-empowered framework for histopathology report generation together with the first benchmark dataset for evaluation. Inspired by diagnostic and report-writing workflows, HistGen features two delicately designed modules, aiming to boost report generation by aligning whole slide images (WSIs) and diagnostic reports at both local and global granularities. To achieve this, a local-global hierarchical encoder is developed for efficient visual feature aggregation from a region-to-slide perspective. Meanwhile, a cross-modal context module is proposed to explicitly facilitate alignment and interaction between distinct modalities, effectively bridging the gap between the extensive visual sequences of WSIs and corresponding highly summarized reports. Experimental results on WSI report generation show the proposed model outperforms state-of-the-art (SOTA) models by a large margin. Moreover, the results of fine-tuning our model on cancer subtyping and survival analysis tasks further demonstrate superior performance compared to SOTA methods, showcasing strong transfer learning capability. Dataset and code are available here.
Pages: 189 - 199
Page count: 11
Related papers
33 in total
  • [1] Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
    Anderson, Peter
    He, Xiaodong
    Buehler, Chris
    Teney, Damien
    Johnson, Mark
    Gould, Stephen
    Zhang, Lei
    [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018 : 6077 - 6086
  • [2] Araujo A., 2019, Distill, V4, DOI DOI 10.23915/DISTILL.00021
  • [3] From Detection of Individual Metastases to Classification of Lymph Node Status at the Patient Level: The CAMELYON17 Challenge
    Bandi, Peter
    Geessink, Oscar
    Manson, Quirine
    van Dijk, Marcory
    Balkenhol, Maschenka
    Hermsen, Meyke
    Bejnordi, Babak Ehteshami
    Lee, Byungjae
    Paeng, Kyunghyun
    Zhong, Aoxiao
    Li, Quanzheng
    Zanjani, Farhad Ghazvinian
    Zinger, Svitlana
    Fukuta, Keisuke
    Komura, Daisuke
    Ovtcharov, Vlado
    Cheng, Shenghua
    Zeng, Shaoqun
    Thagaard, Jeppe
    Dahl, Anders B.
    Lin, Huangjing
    Chen, Hao
    Jacobsson, Ludwig
    Hedlund, Martin
    Cetin, Melih
    Halici, Eren
    Jackson, Hunter
    Chen, Richard
    Both, Fabian
    Franke, Joerg
    Kusters-Vandevelde, Heidi
    Vreuls, Willem
    Bult, Peter
    van Ginneken, Bram
    van der Laak, Jeroen
    Litjens, Geert
    [J]. IEEE TRANSACTIONS ON MEDICAL IMAGING, 2019, 38 (02) : 550 - 560
  • [4] Diagnostic Assessment of Deep Learning Algorithms for Detection of Lymph Node Metastases in Women With Breast Cancer
    Bejnordi, Babak Ehteshami
    Veta, Mitko
    van Diest, Paul Johannes
    van Ginneken, Bram
    Karssemeijer, Nico
    Litjens, Geert
    van der Laak, Jeroen A. W. M.
    [J]. JAMA-JOURNAL OF THE AMERICAN MEDICAL ASSOCIATION, 2017, 318 (22) : 2199 - 2210
  • [5] Clinical-grade computational pathology using weakly supervised deep learning on whole slide images
    Campanella, Gabriele
    Hanna, Matthew G.
    Geneslaw, Luke
    Miraflor, Allen
    Silva, Vitor Werneck Krauss
    Busam, Klaus J.
    Brogi, Edi
    Reuter, Victor E.
    Klimstra, David S.
    Fuchs, Thomas J.
    [J]. NATURE MEDICINE, 2019, 25 (08) : 1301 - +
  • [6] Chen PY, 2024, Arxiv, DOI arXiv:2311.16480
  • [7] Chen Z., 2022, arXiv
  • [8] Chen ZH, 2022, Arxiv, DOI arXiv:2010.16056
  • [9] Cornia M, 2020, PROC CVPR IEEE, P10575, DOI 10.1109/CVPR42600.2020.01059
  • [10] Preparing a collection of radiology examinations for distribution and retrieval
    Demner-Fushman, Dina
    Kohli, Marc D.
    Rosenman, Marc B.
    Shooshan, Sonya E.
    Rodriguez, Laritza
    Antani, Sameer
    Thoma, George R.
    McDonald, Clement J.
    [J]. JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2016, 23 (02) : 304 - 310