Task-aware asynchronous multi-task model with class incremental contrastive learning for surgical scene understanding

Cited: 0
|
Authors
Seenivasan, Lalithkumar [1 ]
Islam, Mobarakol [2 ]
Xu, Mengya [1 ]
Lim, Chwee Ming [3 ]
Ren, Hongliang [1 ,4 ,5 ]
Affiliations
[1] Natl Univ Singapore, Dept Biomed Engn, Singapore, Singapore
[2] Imperial Coll London, Dept Comp, London, England
[3] Singapore Gen Hosp, Head & Neck Surg, Singapore, Singapore
[4] Chinese Univ Hong Kong, Dept Elect Engn, Shatin, Hong Kong, Peoples R China
[5] Chinese Univ Hong Kong, Shun Hing Inst Adv Engn, Shatin, Hong Kong, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
Surgical scene understanding; Domain generalization; Scene graph; Curriculum learning;
DOI
10.1007/s11548-022-02800-2
Chinese Library Classification
R318 [Biomedical Engineering];
Discipline Classification Code
0831;
Abstract
Purpose: Surgical scene understanding with tool-tissue interaction recognition and automatic report generation can play an important role in intra-operative guidance, decision-making and post-operative analysis in robotic surgery. However, domain shifts between different surgeries, with inter- and intra-patient variation and the appearance of novel instruments, degrade the performance of model prediction. Moreover, performing both tasks requires output from multiple models, which can be computationally expensive and affect real-time performance.
Methodology: A multi-task learning (MTL) model is proposed for surgical report generation and tool-tissue interaction prediction that deals with domain shift problems. The model consists of a shared feature extractor, a mesh-transformer branch for captioning and a graph attention branch for tool-tissue interaction prediction. The shared feature extractor employs class incremental contrastive learning to tackle intensity shift and the appearance of novel classes in the target domain. Laplacian-of-Gaussian-based curriculum learning is designed into both the shared and task-specific branches to enhance model learning. A task-aware asynchronous MTL optimization technique is incorporated to fine-tune the shared weights and converge both tasks optimally.
Results: The proposed MTL model trained using task-aware optimization and fine-tuning techniques reported a balanced performance (BLEU score of 0.4049 for scene captioning and accuracy of 0.3508 for interaction detection) for both tasks on the target domain and performed on par with single-task models in domain adaptation.
Conclusion: The proposed multi-task model was able to adapt to domain shifts, incorporate novel instruments in the target domain, and perform tool-tissue interaction detection and report generation on par with single-task models.
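The task-aware asynchronous optimization described in the abstract can be illustrated with a minimal sketch: two tasks share parameters, and at each step only the currently lagging task updates the shared weights. The toy quadratic losses, learning rate and selection rule below are illustrative assumptions, not the paper's actual objectives or algorithm.

```python
# Hedged sketch of task-aware asynchronous multi-task optimization on a
# single shared scalar weight. Real models would use neural losses and
# backpropagation; here we use toy quadratics and finite differences.

def task_a_loss(w):
    # toy surrogate for the captioning branch: optimum at w = 2.0 (assumed)
    return (w - 2.0) ** 2

def task_b_loss(w):
    # toy surrogate for the interaction branch: optimum at w = 3.0 (assumed)
    return (w - 3.0) ** 2

def grad(loss, w, eps=1e-6):
    # central finite-difference gradient of a scalar loss
    return (loss(w + eps) - loss(w - eps)) / (2 * eps)

def asynchronous_mtl(w=0.0, lr=0.1, steps=200):
    for _ in range(steps):
        la, lb = task_a_loss(w), task_b_loss(w)
        # task-aware choice: fine-tune the shared weight for whichever
        # task currently has the higher loss (is lagging)
        lagging = task_a_loss if la > lb else task_b_loss
        w -= lr * grad(lagging, w)
    return w

w_star = asynchronous_mtl()
```

Because updates always favor the lagging task, the shared weight settles between the two task optima rather than collapsing onto either one, which mirrors the balanced two-task performance the abstract reports.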
Pages: 921 - 928
Page count: 8
Related papers
50 records
  • [11] A Contrastive Sharing Model for Multi-Task Recommendation
    Bai, Ting
    Xiao, Yudong
    Wu, Bin
    Yang, Guojun
    Yu, Hongyong
    Nie, Jian-Yun
    PROCEEDINGS OF THE ACM WEB CONFERENCE 2022 (WWW'22), 2022, : 3239 - 3247
  • [12] TASK AWARE MULTI-TASK LEARNING FOR SPEECH TO TEXT TASKS
    Indurthi, Sathish
    Zaidi, Mohd Abbas
    Lakumarapu, Nikhil Kumar
    Lee, Beomseok
    Han, Hyojung
    Ahn, Seokchan
    Kim, Sangha
    Kim, Chanwoo
    Hwang, Inchul
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 7723 - 7727
  • [13] Entity-aware Multi-task Learning for Query Understanding at Walmart
    Peng, Zhiyuan
    Dave, Vachik
    McNabb, Nicole
    Sharnagat, Rahul
    Magnani, Alessandro
    Liao, Ciya
    Fang, Yi
    Rajanala, Sravanthi
    PROCEEDINGS OF THE 29TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2023, 2023, : 4733 - 4742
  • [14] AdaMT-Net: An Adaptive Weight Learning Based Multi-Task Learning Model For Scene Understanding
    Jha, Ankit
    Kumar, Awanish
    Banerjee, Biplab
    Chaudhuri, Subhasis
    2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020), 2020, : 3027 - 3035
  • [15] Encoder augmentation for multi-task graph contrastive learning
    Wang, Xiaoyu
    Zhang, Qiqi
    Liu, Gen
    Zhao, Zhongying
    Cui, Hongzhi
    NEUROCOMPUTING, 2025, 630
  • [16] Research on Road Scene Understanding of Autonomous Vehicles Based on Multi-Task Learning
    Guo, Jinghua
    Wang, Jingyao
    Wang, Huinian
    Xiao, Baoping
    He, Zhifei
    Li, Lubin
    SENSORS, 2023, 23 (13)
  • [17] Service recommendation based on contrastive learning and multi-task learning
    Yu, Ting
    Zhang, Lihua
    Liu, Hailin
    Liu, Hongbing
    Wang, Jiaojiao
    COMPUTER COMMUNICATIONS, 2024, 213 : 285 - 295
  • [18] Scale-Aware Task Message Transferring for Multi-Task Learning
    Sirejiding, Shalayiding
    Lu, Yuxiang
    Lu, Hongtao
    Ding, Yue
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 1859 - 1864
  • [19] Context-Aware Multi-Task Learning for Traffic Scene Recognition in Autonomous Vehicles
    Lee, Younkwan
    Jeon, Jihyo
    Yu, Jongmin
    Jeon, Moongu
    2020 IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV), 2020, : 723 - 730
  • [20] Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding
    Zhang, Yu
    Cheng, Hao
    Shen, Zhihong
    Liu, Xiaodong
    Wang, Ye-Yi
    Gao, Jianfeng
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EMNLP 2023), 2023, : 12259 - 12275