Transformer-based multi-task learning for classification and segmentation of gastrointestinal tract endoscopic images

Cited by: 20
Authors
Tang, Suigu [1]
Yu, Xiaoyuan [1]
Cheang, Chak Fong [1]
Liang, Yanyan [1]
Zhao, Penghui [1]
Yu, Hon Ho [2]
Choi, I. Cheong [2]
Affiliations
[1] Macau Univ Sci & Technol, Fac Innovat Engn, Sch Comp Sci & Engn, Cotai, Macao, Peoples R China
[2] Kiang Wu Hosp, Macau, Macau, Peoples R China
Keywords
Transformer; Multi-task learning; Classification; Segmentation; Active learning; NETWORK; DIAGNOSIS
DOI
10.1016/j.compbiomed.2023.106723
Chinese Library Classification number
Q [Biological sciences]
Discipline classification code
07; 0710; 09
Abstract
Models based on convolutional neural networks (CNNs) are widely used to help endoscopists identify gastrointestinal (GI) tract diseases through classification and segmentation. However, they have difficulty distinguishing between ambiguous, similar-looking lesion types in endoscopic images, and they are hard to train when labeled data are scarce. Both problems limit further improvement in diagnostic accuracy. To address these challenges, we first propose a multi-task network (TransMT-Net) that learns two tasks (classification and segmentation) simultaneously. It uses a transformer to learn global features and combines this with the strength of CNNs in learning local features, so as to predict lesion types and regions in GI tract endoscopic images more accurately. We further adopt active learning in TransMT-Net to reduce its dependence on large numbers of labeled images. A dataset built from the CVC-ClinicDB dataset, Macau Kiang Wu Hospital, and Zhongshan Hospital was used to evaluate the model. Experimental results show that our model achieved 96.94% accuracy in the classification task and a 77.76% Dice Similarity Coefficient in the segmentation task, outperforming the other models on our test set. Active learning also improved performance when starting from a small initial training set: with only 30% of the initial training set, the model performed comparably to most of the compared models trained on the full training set. The proposed TransMT-Net therefore shows promise for GI tract endoscopic image analysis, and active learning can alleviate the shortage of labeled images.
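For readers unfamiliar with the segmentation metric quoted in the abstract, the Dice Similarity Coefficient between a predicted mask A and a ground-truth mask B is commonly defined as 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `dice_coefficient`, the nested-list mask format, and the choice to score two empty masks as 1.0 are assumptions made for illustration.

```python
def dice_coefficient(pred, target):
    """Dice Similarity Coefficient for two binary masks (nested lists of 0/1).

    Hypothetical helper for illustration; not taken from the paper.
    """
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")

    # Pixels marked as lesion in both masks.
    intersection = sum(1 for p, t in zip(flat_pred, flat_target) if p and t)
    # Total lesion pixels in each mask, added together.
    total = sum(1 for p in flat_pred if p) + sum(1 for t in flat_target if t)

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy 2x3 masks: 2 overlapping pixels, 3 lesion pixels in each mask.
pred_mask = [[1, 1, 0], [0, 1, 0]]
true_mask = [[1, 0, 0], [0, 1, 1]]
score = dice_coefficient(pred_mask, true_mask)
```

A score of 1.0 means the predicted and reference regions coincide exactly, and 0.0 means they do not overlap at all; the 77.76% reported above corresponds to a value of 0.7776 on this scale.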
Pages: 11
Related papers
50 in total
  • [41] Multi-task learning architecture for brain tumor detection and segmentation in MRI images
    Nazir, Maria
    Ali, Muhammad Junaid
    Tufail, Hafiz Zahid
    Shahid, Ahmad Raza
    Raza, Basit
    Shakil, Sadia
    Khurshid, Khurram
    JOURNAL OF ELECTRONIC IMAGING, 2022, 31 (05)
  • [42] Simultaneous segmentation and classification of colon cancer polyp images using a dual branch multi-task learning network
    Li C.
    Liu J.
    Tang J.
    Mathematical Biosciences and Engineering, 2024, 21 (02) : 2024 - 2049
  • [43] Multi-Task Mean Teacher Medical Image Segmentation Based on Swin Transformer
    Zhang, Jie
    Li, Fan
    Zhang, Xin
    Cheng, Yue
    Hei, Xinhong
    APPLIED SCIENCES-BASEL, 2024, 14 (07)
  • [44] Multi-task Learning for Brain Tumor Segmentation
    Weninger, Leon
    Liu, Qianyu
    Merhof, Dorit
    BRAINLESION: GLIOMA, MULTIPLE SCLEROSIS, STROKE AND TRAUMATIC BRAIN INJURIES (BRAINLES 2019), PT I, 2020, 11992 : 327 - 337
  • [45] Multi-task learning framework for echocardiography segmentation
    Monkam, Patrice
    Jin, Songbai
    Lu, Wenkai
    2022 IEEE INTERNATIONAL ULTRASONICS SYMPOSIUM (IEEE IUS), 2022
  • [46] Multi-task Swin Transformer for Motion Artifacts Classification and Cardiac Magnetic Resonance Image Segmentation
    Grzeszczyk, Michal K.
    Plotka, Szymon
    Sitek, Arkadiusz
    STATISTICAL ATLASES AND COMPUTATIONAL MODELS OF THE HEART: REGULAR AND CMRXMOTION CHALLENGE PAPERS, STACOM 2022, 2022, 13593 : 409 - 417
  • [47] Bidirectional Transformer Based Multi-Task Learning for Natural Language Understanding
    Tripathi, Suraj
    Singh, Chirag
    Kumar, Abhay
    Pandey, Chandan
    Jain, Nishant
    NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS (NLDB 2019), 2019, 11608 : 54 - 65
  • [48] Semi-Supervised Segmentation Of Pathological Images Based on Multi-Task Regularization And Contrastive Representation Learning
    Shi, Peng
    Zhong, Jing
    Wu, Xinting
    Lin, Liyan
    SSRN
  • [49] A Multi-Task Model for Pulmonary Nodule Segmentation and Classification
    Tang, Tiequn
    Zhang, Rongfu
    JOURNAL OF IMAGING, 2024, 10 (09)
  • [50] A transformer-based multi-task framework for joint detection of aggression and hate on social media data
    Ghosh, Soumitra
    Priyankar, Amit
    Ekbal, Asif
    Bhattacharyya, Pushpak
    NATURAL LANGUAGE ENGINEERING, 2023, 29 (06) : 1495 - 1515