TransUMobileNet: Integrating multi-channel attention fusion with hybrid CNN-Transformer architecture for medical image segmentation

Cited by: 0

Authors:

Cai, Sijing ^{[1]}

Jiang, Yukun ^{[2]}

Xiao, Yuwei ^{[2]}

Zeng, Jian ^{[2]}

Zhou, Guangming ^{[2]}

Affiliations:

[1] Fujian Univ Technol, Sch Transportat, Fuzhou 350108, Peoples R China

[2] Fujian Univ Technol, Sch Elect Elect Engn & Phys, Fuzhou 350108, Peoples R China

Source:

BIOMEDICAL SIGNAL PROCESSING AND CONTROL | 2025 / Vol. 107

Funding:

National Natural Science Foundation of China;

Keywords:

Transformer; Unet; Medical image segmentation; Attention Mechanism; NET; NETWORK;

DOI:

10.1016/j.bspc.2025.107850

Chinese Library Classification (CLC) number:

R318 [Biomedical Engineering];

Discipline classification code:

0831 ;

Abstract:

To address the strong similarity and blurred boundaries between lesions and normal tissue that are common in medical images, we propose TransUMobileNet, a model with a symmetrical encoder-decoder structure. First, the feature encoder uses a hybrid CNN-Transformer architecture in which the Transformer takes tokenized image patches from convolutional neural network (CNN) feature maps as its input sequence and extracts global context. The Transformer's sequence-prediction attention mechanism enhances the encoding of long-range dependencies and expressive learning, strengthening global information representation. Second, the feature decoder mirrors the encoder in a fully symmetrical form. Symmetrical skip connections mitigate the loss of positional information along the Transformer decoding path, improving the depiction of target boundaries. The feature decoder uses cascaded upsampling to restore local spatial information and recover finer details. Additionally, a Multi-Channel Attention Fusion (MCAF) module is incorporated into the decoder. This module has few channels at both ends and many in the middle, and it includes an attention mechanism; it enriches feature information and automatically adjusts the weights of key regions, sharpening the focus on target areas. TransUMobileNet was evaluated on three public medical image segmentation datasets and a custom thyroid nodule segmentation dataset. The results show that TransUMobileNet achieves a recall rate of 82.23% and a mean average precision of 95.62%, outperforming current mainstream methods for medical image segmentation.
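The abstract describes the MCAF module only in outline (narrow channels at both ends, wide in the middle, plus attention). The following is a minimal sketch of that general pattern, not the authors' implementation: it assumes a 1x1 (pointwise) expansion, a ReLU, a simplified squeeze-and-excitation-style channel gate with no learned layers, and a 1x1 projection back down. The function names, the weights, and the choice of gate are all hypothetical and chosen for illustration.

```python
import math

def pointwise(x, weights):
    """1x1 convolution: mix channels independently at every pixel.

    x: list of pixels, each a list of C_in floats.
    weights: C_out rows, each a list of C_in floats (illustrative values).
    """
    return [[sum(w * v for w, v in zip(row, pixel)) for row in weights]
            for pixel in x]

def channel_attention(x):
    """Simplified channel gate (assumption: no learned layers).

    Averages each channel over all pixels, squashes the mean with a
    sigmoid, and rescales that channel by the resulting gate.
    """
    n = len(x)
    channels = len(x[0])
    means = [sum(pixel[c] for pixel in x) / n for c in range(channels)]
    gates = [1.0 / (1.0 + math.exp(-m)) for m in means]
    gated = [[v * g for v, g in zip(pixel, gates)] for pixel in x]
    return gated, gates

def narrow_wide_narrow_block(x, expand_w, project_w):
    """Hypothetical narrow -> wide -> narrow block with channel attention."""
    wide = pointwise(x, expand_w)                        # few -> many channels
    wide = [[max(0.0, v) for v in pixel] for pixel in wide]  # ReLU
    attended, gates = channel_attention(wide)            # reweight channels
    out = pointwise(attended, project_w)                 # many -> few channels
    return out, gates

# Three "pixels" with 2 channels, expanded to 4 channels and projected back to 2.
features = [[1.0, 2.0], [0.0, 1.0], [2.0, 0.0]]
expand_w = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
project_w = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
out, gates = narrow_wide_narrow_block(features, expand_w, project_w)
```

In this sketch each gate lies strictly between 0 and 1, so channels whose average activation is high are attenuated less than channels with low average activation. A real module would typically learn the gate through small fully connected layers and operate on 2-D feature maps rather than a flat list of pixels.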


Pages: 12
