Prompt Guided Transformer for Multi-Task Dense Prediction

Cited by: 6
Authors
Lu, Yuxiang [1 ]
Sirejiding, Shalayiding [1 ]
Ding, Yue [1 ]
Wang, Chunlin [2 ]
Lu, Hongtao [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China
[2] Chuxiong Normal Univ, Sch Informat Sci & Technol, Chuxiong 675099, Peoples R China
Keywords
Multi-task learning; dense prediction; prompting; vision transformer;
DOI
10.1109/TMM.2024.3349865
Chinese Library Classification (CLC)
TP [Automation technology, computer technology];
Discipline Classification Code
0812;
Abstract
Task-conditional architectures offer an advantage in parameter efficiency but fall short in performance compared to state-of-the-art multi-decoder methods. How to trade off performance against model parameters is an important and difficult problem. In this paper, we introduce a simple and lightweight task-conditional model called Prompt Guided Transformer (PGT) to address this challenge. Our approach designs a Prompt-conditioned Transformer block, which incorporates task-specific prompts into the self-attention mechanism to achieve global dependency modeling and parameter-efficient feature adaptation across multiple tasks. This block is integrated into both the shared encoder and decoder, enhancing the capture of intra- and inter-task features. Moreover, we design a lightweight decoder that further reduces parameter usage, accounting for only 2.7% of the total model parameters. Extensive experiments on two multi-task dense prediction benchmarks, PASCAL-Context and NYUD-v2, demonstrate that our approach achieves state-of-the-art results among task-conditional methods while using fewer parameters, maintaining a favorable balance between performance and parameter size.
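The abstract describes a block in which learnable, task-specific prompt tokens condition the shared self-attention so that one set of weights adapts to multiple dense-prediction tasks. The following is only a minimal, illustrative PyTorch sketch of that general idea, not the authors' implementation; names such as PromptConditionedAttention, num_prompts, and task_id are assumptions made for illustration.

```python
# Minimal sketch (assumption, not the paper's code): task-specific prompt
# tokens are concatenated to the keys/values of a shared attention block,
# so the same parameters adapt per task at very small extra cost.
import torch
import torch.nn as nn


class PromptConditionedAttention(nn.Module):
    def __init__(self, dim: int, num_heads: int, num_tasks: int, num_prompts: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # One small set of learnable prompt tokens per task (hypothetical sizes).
        self.prompts = nn.Parameter(torch.randn(num_tasks, num_prompts, dim) * 0.02)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor, task_id: int) -> torch.Tensor:
        # x: (batch, tokens, dim) patch tokens from a shared encoder/decoder stage.
        b = x.size(0)
        prompt = self.prompts[task_id].unsqueeze(0).expand(b, -1, -1)
        # Queries are the image tokens; keys/values see prompts + image tokens,
        # letting the task prompt steer global attention without task-specific heads.
        kv = torch.cat([prompt, x], dim=1)
        out, _ = self.attn(query=self.norm(x), key=kv, value=kv)
        return x + out  # residual connection


if __name__ == "__main__":
    block = PromptConditionedAttention(dim=64, num_heads=4, num_tasks=2)
    feats = torch.randn(2, 196, 64)        # e.g. a 14x14 patch grid
    seg_feats = block(feats, task_id=0)    # same block conditioned for task 0
    depth_feats = block(feats, task_id=1)  # same block conditioned for task 1
    print(seg_feats.shape, depth_feats.shape)
```

In this sketch the only task-specific parameters are the prompt tokens themselves, which is what keeps a task-conditional design parameter-efficient relative to per-task decoders.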
Pages: 6375-6385
Number of pages: 11
Related Papers
50 records in total
  • [1] TFUT: Task fusion upward transformer model for multi-task learning on dense prediction
    Xin, Zewei
    Sirejiding, Shalayiding
    Lu, Yuxiang
    Ding, Yue
    Wang, Chunlin
    Alsarhan, Tamam
    Lu, Hongtao
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2024, 244
  • [2] Multi-Task Learning With Multi-Query Transformer for Dense Prediction
    Xu, Yangyang
    Li, Xiangtai
    Yuan, Haobo
    Yang, Yibo
    Zhang, Lefei
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (02) : 1228 - 1240
  • [3] MTLFormer: Multi-Task Learning Guided Transformer Network for Business Process Prediction
    Wang, Jiaojiao
    Huang, Jiawei
    Ma, Xiaoyu
    Li, Zhongjin
    Wang, Yaqi
    Yu, Dingguo
    IEEE ACCESS, 2023, 11 : 76722 - 76738
  • [4] Multi-Task Learning for Dense Prediction Tasks: A Survey
    Vandenhende, Simon
    Georgoulis, Stamatios
    Van Gansbeke, Wouter
    Proesmans, Marc
    Dai, Dengxin
    Van Gool, Luc
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (07) : 3614 - 3633
  • [5] A Multi-task Transformer Architecture for Drone State Identification and Trajectory Prediction
    Souli, Nicolas
    Palamas, Andreas
    Panayiotou, Tania
    Kolios, Panayiotis
    Ellinas, Georgios
    2024 20TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING IN SMART SYSTEMS AND THE INTERNET OF THINGS, DCOSS-IOT 2024, 2024, : 285 - 291
  • [6] A Transformer-Embedded Multi-Task Model for Dose Distribution Prediction
    Wen, Lu
    Xiao, Jianghong
    Tan, Shuai
    Wu, Xi
    Zhou, Jiliu
    Peng, Xingchen
    Wang, Yan
    INTERNATIONAL JOURNAL OF NEURAL SYSTEMS, 2023, 33 (08)
  • [7] Paraphrase Bidirectional Transformer with Multi-Task Learning
    Ko, Bowon
    Choi, Ho-Jin
    2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP 2020), 2020, : 217 - 220
  • [8] Improving Vision Transformer with Multi-Task Training
    Ahn, Woo Jin
    Yang, Geun Yeong
    Choi, Hyun Duck
    Lim, Myo Taeg
    Kang, Tae Koo
    2022 22ND INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND SYSTEMS (ICCAS 2022), 2022, : 1963 - 1965
  • [9] HTML: Hierarchical Transformer-based Multi-task Learning for Volatility Prediction
    Yang, Linyi
    Ng, Tin Lok James
    Smyth, Barry
    Dong, Ruihai
    WEB CONFERENCE 2020: PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE (WWW 2020), 2020, : 441 - 451
  • [10] Multi-task learning for pKa prediction
    Skolidis, Grigorios
    Hansen, Katja
    Sanguinetti, Guido
    Rupp, Matthias
    JOURNAL OF COMPUTER-AIDED MOLECULAR DESIGN, 2012, 26 : 883 - 895