Task-aware asynchronous multi-task model with class incremental contrastive learning for surgical scene understanding

Cited: 0
|
Authors
Seenivasan, Lalithkumar [1 ]
Islam, Mobarakol [2 ]
Xu, Mengya [1 ]
Lim, Chwee Ming [3 ]
Ren, Hongliang [1 ,4 ,5 ]
Affiliations
[1] Natl Univ Singapore, Dept Biomed Engn, Singapore, Singapore
[2] Imperial Coll London, Dept Comp, London, England
[3] Singapore Gen Hosp, Head & Neck Surg, Singapore, Singapore
[4] Chinese Univ Hong Kong, Dept Elect Engn, Shatin, Hong Kong, Peoples R China
[5] Chinese Univ Hong Kong, Shun Hing Inst Adv Engn, Shatin, Hong Kong, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
Surgical scene understanding; Domain generalization; Scene graph; Curriculum learning;
DOI
10.1007/s11548-022-02800-2
Chinese Library Classification
R318 [Biomedical Engineering];
Discipline Classification Code
0831;
Abstract
Purpose: Surgical scene understanding with tool-tissue interaction recognition and automatic report generation can play an important role in intra-operative guidance, decision-making and post-operative analysis in robotic surgery. However, domain shifts between different surgeries, with inter- and intra-patient variation and the appearance of novel instruments, degrade the performance of model prediction. Moreover, performing both tasks requires output from multiple models, which can be computationally expensive and affect real-time performance.
Methodology: A multi-task learning (MTL) model is proposed for surgical report generation and tool-tissue interaction prediction that deals with domain shift problems. The model consists of a shared feature extractor, a mesh-transformer branch for captioning and a graph attention branch for tool-tissue interaction prediction. The shared feature extractor employs class incremental contrastive learning to tackle intensity shift and the appearance of novel classes in the target domain. Laplacian-of-Gaussian-based curriculum learning is designed into both the shared and task-specific branches to enhance model learning. A task-aware asynchronous MTL optimization technique is incorporated to fine-tune the shared weights and converge both tasks optimally.
Results: The proposed MTL model trained using task-aware optimization and fine-tuning techniques reported a balanced performance (BLEU score of 0.4049 for scene captioning and accuracy of 0.3508 for interaction detection) for both tasks on the target domain and performed on par with single-task models in domain adaptation.
Conclusion: The proposed multi-task model was able to adapt to domain shifts, incorporate novel instruments in the target domain, and perform tool-tissue interaction detection and report generation on par with single-task models.
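The task-aware asynchronous optimization described in the abstract can be illustrated with a minimal sketch: two tasks share parameters, and at each step only the currently lagging task updates the shared weights. The toy quadratic losses, learning rate and selection rule below are illustrative assumptions, not the paper's actual objectives or algorithm.

```python
# Hedged sketch of task-aware asynchronous multi-task optimization on a
# single shared scalar weight. Real models would use neural losses and
# backpropagation; here we use toy quadratics and finite differences.

def task_a_loss(w):
    # toy surrogate for the captioning branch: optimum at w = 2.0 (assumed)
    return (w - 2.0) ** 2

def task_b_loss(w):
    # toy surrogate for the interaction branch: optimum at w = 3.0 (assumed)
    return (w - 3.0) ** 2

def grad(loss, w, eps=1e-6):
    # central finite-difference gradient of a scalar loss
    return (loss(w + eps) - loss(w - eps)) / (2 * eps)

def asynchronous_mtl(w=0.0, lr=0.1, steps=200):
    for _ in range(steps):
        la, lb = task_a_loss(w), task_b_loss(w)
        # task-aware choice: fine-tune the shared weight for whichever
        # task currently has the higher loss (is lagging)
        lagging = task_a_loss if la > lb else task_b_loss
        w -= lr * grad(lagging, w)
    return w

w_star = asynchronous_mtl()
```

Because updates always favor the lagging task, the shared weight settles between the two task optima rather than collapsing onto either one, which mirrors the balanced two-task performance the abstract reports.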
Pages: 921 - 928
Page count: 8
Related papers
50 records
  • [11] A Contrastive Sharing Model for Multi-Task Recommendation
    Bai, Ting
    Xiao, Yudong
    Wu, Bin
    Yang, Guojun
    Yu, Hongyong
    Nie, Jian-Yun
    PROCEEDINGS OF THE ACM WEB CONFERENCE 2022 (WWW'22), 2022, : 3239 - 3247
  • [12] TASK AWARE MULTI-TASK LEARNING FOR SPEECH TO TEXT TASKS
    Indurthi, Sathish
    Zaidi, Mohd Abbas
    Lakumarapu, Nikhil Kumar
    Lee, Beomseok
    Han, Hyojung
    Ahn, Seokchan
    Kim, Sangha
    Kim, Chanwoo
    Hwang, Inchul
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 7723 - 7727
  • [13] Entity-aware Multi-task Learning for Query Understanding at Walmart
    Peng, Zhiyuan
    Dave, Vachik
    McNabb, Nicole
    Sharnagat, Rahul
    Magnani, Alessandro
    Liao, Ciya
    Fang, Yi
    Rajanala, Sravanthi
    PROCEEDINGS OF THE 29TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2023, 2023, : 4733 - 4742
  • [14] AdaMT-Net: An Adaptive Weight Learning Based Multi-Task Learning Model For Scene Understanding
    Jha, Ankit
    Kumar, Awanish
    Banerjee, Biplab
    Chaudhuri, Subhasis
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020), 2020, : 3027 - 3035
  • [15] Encoder augmentation for multi-task graph contrastive learning
    Wang, Xiaoyu
    Zhang, Qiqi
    Liu, Gen
    Zhao, Zhongying
    Cui, Hongzhi
    NEUROCOMPUTING, 2025, 630
  • [16] Research on Road Scene Understanding of Autonomous Vehicles Based on Multi-Task Learning
    Guo, Jinghua
    Wang, Jingyao
    Wang, Huinian
    Xiao, Baoping
    He, Zhifei
    Li, Lubin
    SENSORS, 2023, 23 (13)
  • [17] Service recommendation based on contrastive learning and multi-task learning
    Yu, Ting
    Zhang, Lihua
    Liu, Hailin
    Liu, Hongbing
    Wang, Jiaojiao
    COMPUTER COMMUNICATIONS, 2024, 213 : 285 - 295
  • [18] Scale-Aware Task Message Transferring for Multi-Task Learning
    Sirejiding, Shalayiding
    Lu, Yuxiang
    Lu, Hongtao
    Ding, Yue
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 1859 - 1864
  • [19] Context-Aware Multi-Task Learning for Traffic Scene Recognition in Autonomous Vehicles
    Lee, Younkwan
    Jeon, Jihyo
    Yu, Jongmin
    Jeon, Moongu
    2020 IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV), 2020, : 723 - 730
  • [20] Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding
    Zhang, Yu
    Cheng, Hao
    Shen, Zhihong
    Liu, Xiaodong
    Wang, Ye-Yi
    Gao, Jianfeng
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 12259 - 12275