Transformer-based multiple instance learning network with 2D positional encoding for histopathology image classification

Cited by: 0
Authors
Bin Yang [1]
Lei Ding [2]
Jianqiang Li [2]
Yong Li [2]
Guangzhi Qu [2]
Jingyi Wang [3]
Qiang Wang [2]
Bo Liu [2]
Affiliations
[1] Center for Strategic Assessment and Consulting, Academy of Military Science, Beijing
[2] Faculty of Information Technology, Beijing University of Technology, Beijing
[3] Computer Science and Engineering Department, Oakland University, Rochester
[4] School of Mathematical and Computational Sciences, Massey University, Auckland
Funding
National Natural Science Foundation of China
Keywords
Image classification; Multiple instance learning; Weakly supervised training
DOI
10.1007/s40747-025-01779-y
Abstract
Digital medical imaging, particularly pathology images, is essential for cancer diagnosis but faces challenges in direct model training due to its super-resolution nature. Although weakly supervised learning has reduced the need for manual annotations, many multiple instance learning (MIL) methods struggle to effectively capture crucial spatial relationships in histopathological images. Existing methods incorporating positional information often overlook nuanced spatial correlations or use positional encoding strategies that do not fully capture the unique spatial dynamics of pathology images. To address this issue, we propose a new framework named TMIL (Transformer-based Multiple Instance Learning Network with 2D positional encoding), which leverages multiple instance learning for weakly supervised classification of histopathological images. TMIL incorporates a 2D positional encoding module, based on the Transformer, to model positional information and explore correlations between instances. Furthermore, TMIL divides histopathological images into pseudo-bags and trains patch-level feature vectors with deep metric learning to enhance classification performance. Finally, the proposed approach is evaluated on a public colorectal adenoma dataset. The experimental results show that TMIL outperforms existing MIL methods, achieving an AUC of 97.28% and an ACC of 95.19%. These findings suggest that TMIL’s integration of deep metric learning and positional encoding offers a promising approach for improving the efficiency and accuracy of pathology image analysis in cancer diagnosis. © The Author(s) 2025.
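The abstract does not say how TMIL's 2D positional encoding is constructed. As a minimal sketch only, one common way to encode patch positions in two dimensions is to split the embedding in half and apply the standard Transformer sinusoidal encoding to the row index and the column index separately. The function names, the (row, col) grid coordinates, and the base of 10000 below are assumptions for illustration, not the authors' implementation.

```python
import numpy as np


def sincos_1d(positions, dim):
    """Standard sinusoidal encoding for 1D integer positions (dim must be even)."""
    positions = np.asarray(positions, dtype=np.float64)[:, None]   # (n, 1)
    freqs = 1.0 / (10000.0 ** (np.arange(0, dim, 2) / dim))        # (dim/2,)
    angles = positions * freqs[None, :]                            # (n, dim/2)
    enc = np.empty((positions.shape[0], dim))
    enc[:, 0::2] = np.sin(angles)   # even channels: sine
    enc[:, 1::2] = np.cos(angles)   # odd channels: cosine
    return enc


def positional_encoding_2d(coords, dim):
    """Encode (row, col) patch coordinates into a dim-sized vector.

    Hypothetical helper: the first half of the channels encodes the row,
    the second half encodes the column.
    """
    if dim % 4 != 0:
        raise ValueError("dim must be divisible by 4")
    coords = np.asarray(coords)
    half = dim // 2
    row_enc = sincos_1d(coords[:, 0], half)
    col_enc = sincos_1d(coords[:, 1], half)
    return np.concatenate([row_enc, col_enc], axis=1)


# Usage: add the encoding to patch-level feature vectors before the Transformer.
coords = np.array([[0, 0], [0, 1], [3, 2]])        # (row, col) of three patches
features = np.zeros((3, 8))                        # placeholder patch features
tokens = features + positional_encoding_2d(coords, 8)
```

With this layout, two patches in the same row share the first half of the encoding and differ only in the second half, so the attention layers can distinguish horizontal from vertical neighbours.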
Related papers
45 in total
  • [31] Pseudo-label attention-based multiple instance learning for whole slide image classification
    He, Jing
    Wang, Ping
    Cai, Jingwen
    Tang, Dan
    Yao, Shaowen
    Liu, Renyang
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2025, 142
  • [32] SMIL-DeiT: Multiple Instance Learning and Self-supervised Vision Transformer network for Early Alzheimer's disease classification
    Yin, Yue
    Jin, Weikang
    Bai, Jing
    Liu, Ruotong
    Zhen, Haowei
    [J]. 2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022
  • [33] Pseudo-Bag Mixup Augmentation for Multiple Instance Learning-Based Whole Slide Image Classification
    Liu, Pei
    Ji, Luping
    Zhang, Xinyu
    Ye, Feng
    [J]. IEEE TRANSACTIONS ON MEDICAL IMAGING, 2024, 43 (05) : 1841 - 1852
  • [34] Development of a deep learning image classification network based on vehicular collision using instance mask guided attention
    Madhumitha, G.
    Senthilnathan, R.
    [J]. SIGNAL IMAGE AND VIDEO PROCESSING, 2025, 19 (04)
  • [35] E2-MIL: An explainable and evidential multiple instance learning framework for whole slide image classification
    Shi, Jiangbo
    Li, Chen
    Gong, Tieliang
    Fu, Huazhu
    [J]. MEDICAL IMAGE ANALYSIS, 2024, 97
  • [36] CoLM: Contrastive learning and multiple instance learning network for lung cancer classification of surgical options based on frozen pathological images
    Zhao, Lu
    Zhao, Wangyuan
    Qiu, Lu
    Jiang, Mengqi
    Qian, Liqiang
    Ting, Hua-Nong
    Fu, Xiaolong
    Zhang, Puming
    Han, Yuchen
    Zhao, Jun
    [J]. BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2024, 100
  • [37] Image segmentation and classification based on a 2D distributed hidden Markov model
    Ma, Xiang
    Schonfeld, Dan
    Khokhar, Ashfaq
    [J]. VISUAL COMMUNICATIONS AND IMAGE PROCESSING 2008, PTS 1 AND 2, 2008, 6822
  • [38] Improved deep learning image classification algorithm based on Swin Transformer V2
    Wei, Jiangshu
    Chen, Jinrong
    Wang, Yuchao
    Luo, Hao
    Li, Wujie
    [J]. PEERJ COMPUTER SCIENCE, 2023, 9
  • [40] Extracting 2D weak labels from volume labels using multiple instance learning in CT hemorrhage detection
    Remedios, Samuel W.
    Wu, Zihao
    Bermudez, Camilo
    Kerley, Cailey I.
    Roy, Snehashis
    Patel, Mayur B.
    Butman, John A.
    Landman, Bennett A.
    Pham, Dzung L.
    [J]. MEDICAL IMAGING 2020: IMAGE PROCESSING, 2021, 11313