TriAxial Low-Rank Transformer for Efficient Medical Image Segmentation

Authors
Shang, Jiang [1]
Fang, Xi [2, 3]
Affiliations
[1] City Univ Hong Kong, Dept Informat Syst, Kowloon Tong, Hong Kong, Peoples R China
[2] Rensselaer Polytech Inst, Dept Biomed Engn, Troy, NY 12180 USA
[3] Rensselaer Polytech Inst, Ctr Biotechnol & Interdisciplinary Studies, Troy, NY 12180 USA
Keywords
Low-rank Representation; Efficient Self-attention; Vision Transformer; TriLoRa Attention; 3D Medical Image Segmentation;
DOI
10.1007/978-981-99-8432-9_8
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Transformer-CNN architectures have achieved state-of-the-art results on 3D medical image segmentation because they capture both long-range dependencies and local information. However, directly using existing transformers as encoders can be inefficient, particularly for high-resolution 3D medical images, because self-attention computes pixel-to-pixel relationships, which is computationally expensive. Local-window attention and axial-wise attention have been used to mitigate this cost, but these methods may lose the interaction between certain local regions during the self-attention computation. Instead of using sparsified attention, we aim to incorporate the relationships between all pixels while substantially reducing the computational demand. Inspired by the low-rank property of attention, we hypothesize that the pixel-to-pixel relationship can be approximated by the composition of plane-to-plane relationships. We propose the TriAxial Low-Rank Transformer Network (TALoRT-Net) for medical image segmentation. The core of this model is its attention module, which approximates the pixel-to-pixel attention matrix using the low-rank representation of the product of plane-to-plane matrices and significantly reduces the computational complexity inherent in 3D self-attention. Moreover, we replace the linear projection and vanilla Multi-Layer Perceptron (MLP) in the Vision Transformer with a convolutional stem and a depthwise convolution layer (DCL) to further reduce the number of model parameters. We evaluated the method on the public BTCV dataset, where it significantly reduces the computational effort while maintaining uncompromised accuracy.
Pages: 91-102
Page count: 12
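
The abstract's central idea, approximating the pixel-to-pixel attention matrix by composing plane-to-plane matrices, can be illustrated with a small NumPy sketch. This is not the paper's TriLoRa attention, whose details are not given in this record. It is a minimal illustration under stated assumptions: each plane is summarised by average pooling, one softmax attention matrix is built per axis, and the three are composed as a Kronecker product that is applied without ever being materialised. The names `plane_attention` and `triaxial_attention` and the pooling choice are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def plane_attention(q, k, axis):
    """Plane-to-plane attention along one spatial axis.

    q, k have shape (D, H, W, C). Each plane orthogonal to `axis` is
    summarised by averaging over the other two spatial axes (an
    assumption made for this sketch), giving one token per plane.
    """
    other = tuple(a for a in range(3) if a != axis)
    qp = q.mean(axis=other)                      # (n, C)
    kp = k.mean(axis=other)                      # (n, C)
    scores = qp @ kp.T / np.sqrt(q.shape[-1])    # (n, n)
    return softmax(scores, axis=-1)


def triaxial_attention(q, k, v):
    """Apply the three plane-to-plane matrices to v, one axis at a time.

    This equals multiplying the flattened volume by kron(A_d, A_h, A_w)
    but never builds that (D*H*W) x (D*H*W) matrix.
    """
    a_d = plane_attention(q, k, 0)
    a_h = plane_attention(q, k, 1)
    a_w = plane_attention(q, k, 2)
    out = np.einsum("id,dhwc->ihwc", a_d, v)
    out = np.einsum("jh,ihwc->ijwc", a_h, out)
    out = np.einsum("kw,ijwc->ijkc", a_w, out)
    return out, (a_d, a_h, a_w)


rng = np.random.default_rng(0)
D, H, W, C = 4, 5, 6, 8                          # tiny volume for checking
q, k, v = (rng.standard_normal((D, H, W, C)) for _ in range(3))

out, (a_d, a_h, a_w) = triaxial_attention(q, k, v)

# Reference: build the dense pixel-to-pixel matrix explicitly (only
# feasible for tiny volumes) and apply it to the flattened values.
full = np.kron(np.kron(a_d, a_h), a_w)           # (D*H*W, D*H*W)
dense_out = (full @ v.reshape(-1, C)).reshape(D, H, W, C)

n = D * H * W
full_entries = n * n                             # dense attention entries
plane_entries = D * D + H * H + W * W            # entries actually stored
print(np.allclose(out, dense_out), full_entries, plane_entries)
```

Even on this toy volume the dense matrix has 14400 entries against 77 for the three plane matrices, and the gap widens quickly at realistic 3D resolutions. Every output position still receives a contribution from every input position, which is the property the abstract contrasts with local-window and axial-wise attention.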