Transformer-based multi-task learning for classification and segmentation of gastrointestinal tract endoscopic images

Cited by: 20

Authors:

Tang, Suigu [1]

Yu, Xiaoyuan [1]

Cheang, Chak Fong [1]

Liang, Yanyan [1]

Zhao, Penghui [1]

Yu, Hon Ho [2]

Choi, I. Cheong [2]

Affiliations:

[1] Macau Univ Sci & Technol, Fac Innovat Engn, Sch Comp Sci & Engn, Cotai, Macao, Peoples R China

[2] Kiang Wu Hosp, Macau, Macau, Peoples R China

Source:

COMPUTERS IN BIOLOGY AND MEDICINE | 2023 / Vol. 157

Keywords:

Transformer; Multi-task learning; Classification; Segmentation; Active learning; NETWORK; DIAGNOSIS;

DOI:

10.1016/j.compbiomed.2023.106723

Chinese Library Classification (CLC) number:

Q [Biological Sciences];

Discipline classification code:

07 ; 0710 ; 09 ;

Abstract:

Models based on convolutional neural networks (CNNs) are widely used to help endoscopists identify gastrointestinal (GI) tract diseases through classification and segmentation. However, they have difficulty distinguishing between ambiguous lesion types that look similar in endoscopic images, and they are hard to train when labeled data are scarce. These limitations prevent CNNs from further improving diagnostic accuracy. To address these challenges, we first propose a multi-task network (TransMT-Net) that learns two tasks (classification and segmentation) simultaneously. It uses a transformer to learn global features and combines this with the strength of CNNs in learning local features, so as to predict lesion types and regions in GI tract endoscopic images more accurately. We further adopt active learning in TransMT-Net to tackle the shortage of labeled images. A dataset was created from the CVC-ClinicDB dataset, Macau Kiang Wu Hospital, and Zhongshan Hospital to evaluate model performance. The experimental results show that our model achieved 96.94% accuracy in the classification task and a 77.76% Dice Similarity Coefficient in the segmentation task, outperforming the other models on our test set. Active learning also benefited the performance of our model when starting from a small-scale initial training set; with only 30% of the initial training set, its performance was comparable to that of most comparable models trained on the full training set. Consequently, the proposed TransMT-Net shows potential for GI tract endoscopic image analysis, and with active learning it can alleviate the shortage of labeled images.
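
For readers unfamiliar with the segmentation metric reported in the abstract, the Dice Similarity Coefficient of a predicted mask A and a ground-truth mask B is 2|A∩B| / (|A| + |B|). The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `dice_coefficient`, the nested-list mask format, and the convention of returning 1.0 for two empty masks are assumptions made for illustration.

```python
def dice_coefficient(pred, target):
    """Dice = 2|A∩B| / (|A| + |B|) for two binary masks.

    Masks are nested lists of 0/1 values (an illustrative format,
    not the one used in the paper).
    """
    flat_pred = [int(bool(v)) for row in pred for v in row]
    flat_target = [int(bool(v)) for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must contain the same number of pixels")

    # Pixels marked as lesion in both masks.
    intersection = sum(p & t for p, t in zip(flat_pred, flat_target))
    total = sum(flat_pred) + sum(flat_target)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


predicted = [[1, 1], [0, 0]]
ground_truth = [[1, 0], [0, 0]]
score = dice_coefficient(predicted, ground_truth)  # 2*1 / (2+1)
```

A score of 1.0 means the two masks coincide exactly and 0.0 means they share no lesion pixels; the 77.76% figure in the abstract is this quantity expressed as a percentage.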


Pages: 11

50 items in total

[41] Multi-task learning architecture for brain tumor detection and segmentation in MRI images
Nazir, Maria
Ali, Muhammad Junaid
Tufail, Hafiz Zahid
Shahid, Ahmad Raza
Raza, Basit
Shakil, Sadia
Khurshid, Khurram
JOURNAL OF ELECTRONIC IMAGING, 2022, 31 (05)
[42] Simultaneous segmentation and classification of colon cancer polyp images using a dual branch multi-task learning network
Li, C.
Liu, J.
Tang, J.
Mathematical Biosciences and Engineering, 2024, 21 (02) : 2024 - 2049
[43] Multi-Task Mean Teacher Medical Image Segmentation Based on Swin Transformer
Zhang, Jie
Li, Fan
Zhang, Xin
Cheng, Yue
Hei, Xinhong
APPLIED SCIENCES-BASEL, 2024, 14 (07)
[44] Multi-task Learning for Brain Tumor Segmentation
Weninger, Leon
Liu, Qianyu
Merhof, Dorit
BRAINLESION: GLIOMA, MULTIPLE SCLEROSIS, STROKE AND TRAUMATIC BRAIN INJURIES (BRAINLES 2019), PT I, 2020, 11992 : 327 - 337
[45] Multi-task learning framework for echocardiography segmentation
Monkam, Patrice
Jin, Songbai
Lu, Wenkai
2022 IEEE INTERNATIONAL ULTRASONICS SYMPOSIUM (IEEE IUS), 2022
[46] Multi-task Swin Transformer for Motion Artifacts Classification and Cardiac Magnetic Resonance Image Segmentation
Grzeszczyk, Michal K.
Plotka, Szymon
Sitek, Arkadiusz
STATISTICAL ATLASES AND COMPUTATIONAL MODELS OF THE HEART: REGULAR AND CMRXMOTION CHALLENGE PAPERS, STACOM 2022, 2022, 13593 : 409 - 417
[47] Bidirectional Transformer Based Multi-Task Learning for Natural Language Understanding
Tripathi, Suraj
Singh, Chirag
Kumar, Abhay
Pandey, Chandan
Jain, Nishant
NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS (NLDB 2019), 2019, 11608 : 54 - 65
[48] Semi-Supervised Segmentation Of Pathological Images Based on Multi-Task Regularization And Contrastive Representation Learning
Shi, Peng
Zhong, Jing
Wu, Xinting
Lin, Liyan
SSRN
[49] A Multi-Task Model for Pulmonary Nodule Segmentation and Classification
Tang, Tiequn
Zhang, Rongfu
JOURNAL OF IMAGING, 2024, 10 (09)
[50] A transformer-based multi-task framework for joint detection of aggression and hate on social media data
Ghosh, Soumitra
Priyankar, Amit
Ekbal, Asif
Bhattacharyya, Pushpak
NATURAL LANGUAGE ENGINEERING, 2023, 29 (06) : 1495 - 1515
