Small Sample Image Segmentation by Coupling Convolutions and Transformers

Cited by: 0
Authors
Qi, Hao [1]
Zhou, Huiyu [2]
Dong, Junyu [1]
Dong, Xinghui [1]
Affiliations
[1] Ocean Univ China, Sch Comp Sci & Technol, Qingdao 266100, Peoples R China
[2] Univ Leicester, Sch Comp & Math Sci, Leicester LE1 7RH, England
Funding
National Natural Science Foundation of China
Keywords
Image segmentation; convolutional neural networks; transformers; cross-attention; network
DOI
10.1109/TCSVT.2023.3343632
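Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract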
Compared with natural image segmentation, small sample image segmentation tasks, such as medical image segmentation and defect detection, have been less studied. Recent studies have sought to combine Convolutional Neural Networks (CNNs) and Transformers in a serial or interleaved architecture in order to incorporate long-range dependencies into the features extracted using CNNs. In this study, we argue that these architectures limit the capability of the combination of CNNs and Transformers. To this end, we propose a dual-stream small sample image segmentation network, namely, the Interactive Coupling of Convolutions and Transformers Based UNet (ICCT-UNet; code and models are available at https://indtlab.github.io/projects/ICCTUNet), motivated by the success achieved using the UNet in the scenario of small sample image segmentation. Within this network, a CNN stream runs in parallel with a Transformer stream, and the two streams exchange features inside each block through the proposed Window-Based Multi-head Cross-Attention (W-MHCA) mechanism. To derive an overall segmentation, the features learned by both streams are further fused using a Residual Fusion Module (RFM). Experimental results show that the ICCT-UNet outperforms, or at least performs comparably to, its counterparts on eight sets of medical and defective images. These promising results should be attributed to the effective combination of the local and global features achieved by the proposed interactive coupling method.
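The abstract does not spell out how W-MHCA is computed, so the following is only a minimal NumPy sketch of the general idea of window-restricted multi-head cross-attention between two parallel feature streams. It is not the authors' implementation. The function names, the random projection matrices, the feature-map sizes, and the residual addition used for the exchange are all assumptions made for illustration.

```python
import numpy as np

def window_partition(x, ws):
    """Split an (H, W, C) map into non-overlapping ws x ws windows: (num_windows, ws*ws, C)."""
    H, W, C = x.shape
    x = x.reshape(H // ws, ws, W // ws, ws, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, ws * ws, C)

def window_merge(windows, ws, H, W):
    """Inverse of window_partition: (num_windows, ws*ws, C) back to (H, W, C)."""
    C = windows.shape[-1]
    x = windows.reshape(H // ws, W // ws, ws, ws, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(H, W, C)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def window_cross_attention(query_feat, context_feat, Wq, Wk, Wv, num_heads, ws):
    """Queries come from one stream; keys and values come from the other stream.
    Attention is computed independently inside each window (hypothetical sketch)."""
    H, W, C = query_feat.shape
    d = C // num_heads
    q = window_partition(query_feat, ws) @ Wq
    k = window_partition(context_feat, ws) @ Wk
    v = window_partition(context_feat, ws) @ Wv

    def split_heads(t):  # (nW, N, C) -> (nW, heads, N, d)
        return t.reshape(t.shape[0], t.shape[1], num_heads, d).transpose(0, 2, 1, 3)

    q, k, v = split_heads(q), split_heads(k), split_heads(v)
    attn = softmax(q @ k.transpose(0, 1, 3, 2) / np.sqrt(d))  # (nW, heads, N, N)
    out = (attn @ v).transpose(0, 2, 1, 3).reshape(-1, ws * ws, C)
    return window_merge(out, ws, H, W)

# Toy feature maps standing in for the CNN and Transformer streams (assumed sizes).
rng = np.random.default_rng(0)
H, W, C, heads, ws = 8, 8, 16, 4, 4
cnn_feat = rng.standard_normal((H, W, C))
trans_feat = rng.standard_normal((H, W, C))
Wq, Wk, Wv = (rng.standard_normal((C, C)) / np.sqrt(C) for _ in range(3))

# Bidirectional exchange with a residual connection (an assumption, not from the paper).
cnn_out = cnn_feat + window_cross_attention(cnn_feat, trans_feat, Wq, Wk, Wv, heads, ws)
trans_out = trans_feat + window_cross_attention(trans_feat, cnn_feat, Wq, Wk, Wv, heads, ws)
print(cnn_out.shape, trans_out.shape)
```

Because attention is confined to each window, changing the context stream inside one window affects only the outputs for that window, which keeps the cost linear in the number of windows rather than quadratic in the number of pixels.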
Pages: 5282-5294
Page count: 13