TriAxial Low-Rank Transformer for Efficient Medical Image Segmentation

Cited: 0
Authors
Shang, Jiang [1 ]
Fang, Xi [2 ,3 ]
Affiliations
[1] City Univ Hong Kong, Dept Informat Syst, Kowloon Tong, Hong Kong, Peoples R China
[2] Rensselaer Polytech Inst, Dept Biomed Engn, Troy, NY 12180 USA
[3] Rensselaer Polytech Inst, Ctr Biotechnol & Interdisciplinary Studies, Troy, NY 12180 USA
Keywords
Low-rank Representation; Efficient Self-attention; Vision Transformer; TriLoRa Attention; 3D Medical Image Segmentation;
DOI
10.1007/978-981-99-8432-9_8
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Transformer-CNN architectures have achieved state-of-the-art results in 3D medical image segmentation owing to their ability to capture both long-range dependencies and local information. However, directly using existing transformers as encoders can be inefficient, particularly for high-resolution 3D medical images, because self-attention computes pixel-to-pixel relationships, which is computationally expensive. Although local-window attention and axial attention mitigate this cost, such sparsified attention can lose interactions between certain local regions. Instead of sparsifying attention, we aim to retain the relationships between all pixels while substantially reducing the computational demand. Inspired by the low-rank property of attention, we hypothesize that the pixel-to-pixel relationship can be approximated by a composition of plane-to-plane relationships. We propose the TriAxial Low-Rank Transformer Network (TALoRT-Net) for medical image segmentation. Its core attention module approximates the pixel-to-pixel attention matrix with a low-rank representation, the product of plane-to-plane matrices, which significantly reduces the computational complexity inherent in 3D self-attention. Moreover, we replace the linear projection and vanilla Multi-Layer Perceptron (MLP) of the Vision Transformer with a convolutional stem and a depthwise convolution layer (DCL) to further reduce the number of model parameters. Evaluated on the public BTCV dataset, the method significantly reduces computational effort while maintaining uncompromised accuracy.
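The plane-to-plane approximation described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the mean-pooled plane descriptors, the random projection matrices `Wq`/`Wk`, and the function names (`axis_attention`, `trilora_attention`) are placeholders for illustration; the actual TriLoRa module learns its projections inside a transformer block.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def axis_attention(x, axis, rng):
    # x: (D, H, W, C). Tokens are the planes orthogonal to `axis`;
    # each plane is summarized by mean-pooling the other two spatial
    # axes (a simplifying assumption about the plane encoding).
    C = x.shape[-1]
    pool_axes = tuple(a for a in range(3) if a != axis)
    desc = x.mean(axis=pool_axes)                  # (L, C) plane descriptors
    Wq = rng.standard_normal((C, C)) / np.sqrt(C)  # placeholder projections
    Wk = rng.standard_normal((C, C)) / np.sqrt(C)
    q, k = desc @ Wq, desc @ Wk
    return softmax(q @ k.T / np.sqrt(C))           # (L, L), rows sum to 1

def trilora_attention(x, seed=0):
    # Apply the three plane-to-plane matrices along each axis in turn.
    # This realizes their Kronecker product, a low-rank approximation
    # of the full (DHW x DHW) pixel-to-pixel attention matrix.
    rng = np.random.default_rng(seed)
    Ad = axis_attention(x, 0, rng)                 # (D, D)
    Ah = axis_attention(x, 1, rng)                 # (H, H)
    Aw = axis_attention(x, 2, rng)                 # (W, W)
    y = np.einsum('ad,dhwc->ahwc', Ad, x)
    y = np.einsum('bh,ahwc->abwc', Ah, y)
    y = np.einsum('ew,abwc->abec', Aw, y)
    return y
```

Note the cost profile this buys: the three attention matrices take O(D² + H² + W²) entries instead of O((DHW)²) for the full pixel-to-pixel matrix, which is the source of the efficiency claimed in the abstract.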
Pages: 91-102
Page count: 12