An efficient medical image classification network based on multi-branch CNN, token grouping Transformer and mixer MLP

Citations: 17
Authors
Liu, Shiwei [1]
Wang, Liejun [1]
Yue, Wenwen [1]
Affiliations
[1] Xinjiang Univ, Sch Comp Sci & Technol, Urumqi 830017, Xinjiang, Peoples R China
Funding
National Science Foundation (United States);
Keywords
CNN; MLP; Transformer; Medical image classification;
DOI
10.1016/j.asoc.2024.111323
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, medical image classification techniques based on deep learning have made remarkable achievements, but most current models sacrifice efficiency for performance gains, which poses a great challenge in practical clinical applications. Meanwhile, Convolutional Neural Network (CNN)-based, Visual Transformer (ViT)-based and Multi-layer Perceptron (MLP)-based methods each have their own advantages and disadvantages in capturing the local and global features of medical images, and there is no good method for combining the three to achieve a better trade-off between model scale and performance. To address these problems, we propose Eff-CTM, a hybrid efficient medical image classification network based on multi-branch CNN, token grouping Transformer and mixer MLP. It combines the advantages of all three and uses a small number of parameters to classify pneumonia, colon cancer histopathology and dermatology images quickly and accurately. Eff-CTM uses an efficient CNN module with a multi-branch structure to learn local detail information in the shallow CNN stage of the network, and an efficient CNN-Transformer (ECT) module and an efficient MLP (EM) module in the middle stage of the network to extract local and global features. An efficient Transformer (ET) module is used in the final stage to fuse the rich feature information. We conducted extensive experiments on three publicly available medical image classification datasets, and the results show that Eff-CTM achieves a better trade-off between efficiency and performance than methods based on CNN, Transformer and MLP.
Pages: 16
Related papers
39 in total
[21]   A deep temporal network for motor imagery classification based on multi-branch feature fusion and attention mechanism [J].
Zhao, Jinke ;
Liu, Mingliang .
BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2025, 100
[23]   Efficient classification of different medical image multimodalities based on simple CNN architecture and augmentation algorithms [J].
El-Shafai, Walid ;
Mahmoud, Amira A. ;
Ali, Anas M. ;
El-Rabaie, El-Sayed M. ;
Taha, Taha E. ;
El-Fishawy, Adel S. ;
Zahran, Osama ;
Abd El-Samie, Fathi E. .
JOURNAL OF OPTICS-INDIA, 2024, 53 (02) :775-787
[24]   DCTN: Dual-Branch Convolutional Transformer Network With Efficient Interactive Self-Attention for Hyperspectral Image Classification [J].
Zhou, Yunfei ;
Huang, Xiaohui ;
Yang, Xiaofei ;
Peng, Jiangtao ;
Ban, Yifang .
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 :1-16
[25]   C-TUnet: A CNN-Transformer Architecture-Based Ultrasound Breast Image Classification Network [J].
Wu, Ying ;
Li, Faming ;
Xu, Bo .
INTERNATIONAL JOURNAL OF IMAGING SYSTEMS AND TECHNOLOGY, 2025, 35 (01)
[26]   HYPERSPECTRAL IMAGE CLASSIFICATION BASED ON MULTI-LEVEL SPECTRAL-SPATIAL TRANSFORMER NETWORK [J].
Yang, Hao ;
Yu, Haoyang ;
Hong, Danfeng ;
Xu, Zhen ;
Wang, Yulei ;
Song, Meiping .
2022 12TH WORKSHOP ON HYPERSPECTRAL IMAGING AND SIGNAL PROCESSING: EVOLUTION IN REMOTE SENSING (WHISPERS), 2022,
[27]   A hybrid network of CNN and transformer for subpixel shifting-based multi-image super-resolution [J].
Wu, Qiang ;
Zeng, Hongfei ;
Zhang, Jin ;
Li, Weishi ;
Xia, Haojie .
OPTICS AND LASERS IN ENGINEERING, 2024, 182
[28]   Multi-level wavelet network based on CNN-Transformer hybrid attention for single image deraining [J].
Liu, Bin ;
Fang, Siyan .
NEURAL COMPUTING & APPLICATIONS, 2023, 35 (30) :22387-22404
[30]   DM-CNN: Dynamic Multi-scale Convolutional Neural Network with uncertainty quantification for medical image classification [J].
Han, Qi ;
Qian, Xin ;
Xu, Hongxiang ;
Wu, Kepeng ;
Meng, Lun ;
Qiu, Zicheng ;
Weng, Tengfei ;
Zhou, Baoping ;
Gao, Xianqiang .
COMPUTERS IN BIOLOGY AND MEDICINE, 2024, 168