Prompt Guided Transformer for Multi-Task Dense Prediction

Cited by: 6
Authors
Lu, Yuxiang [1 ]
Sirejiding, Shalayiding [1 ]
Ding, Yue [1 ]
Wang, Chunlin [2 ]
Lu, Hongtao [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China
[2] Chuxiong Normal Univ, Sch Informat Sci & Technol, Chuxiong 675099, Peoples R China
Keywords
Multi-task learning; dense prediction; prompting; vision transformer
DOI
10.1109/TMM.2024.3349865
Chinese Library Classification (CLC)
TP [Automation & Computer Technology]
Discipline Classification Code
0812
Abstract
Task-conditional architectures offer an advantage in parameter efficiency but fall short in performance compared to state-of-the-art multi-decoder methods. How to trade off performance against model parameters is an important and difficult problem. In this paper, we introduce a simple and lightweight task-conditional model, the Prompt Guided Transformer (PGT), to address this challenge. Our approach designs a Prompt-conditioned Transformer block that incorporates task-specific prompts into the self-attention mechanism, achieving global dependency modeling and parameter-efficient feature adaptation across multiple tasks. This block is integrated into both the shared encoder and the decoder, enhancing the capture of intra- and inter-task features. Moreover, we design a lightweight decoder that further reduces parameter usage, accounting for only 2.7% of the total model parameters. Extensive experiments on two multi-task dense prediction benchmarks, PASCAL-Context and NYUD-v2, demonstrate that our approach achieves state-of-the-art results among task-conditional methods while using fewer parameters, striking a favorable balance between performance and model size.
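To make the mechanism described in the abstract concrete, below is a minimal PyTorch sketch of one plausible prompt-conditioned self-attention block: learnable task-specific prompt tokens are concatenated onto the keys and values so a single shared transformer adapts its features per task. The class name, the prompt bank, and the prepend-to-keys/values design are illustrative assumptions for exposition, not the authors' actual implementation.

```python
import torch
import torch.nn as nn


class PromptConditionedAttention(nn.Module):
    """Attention layer whose keys/values are augmented with task-specific
    learnable prompts (illustrative sketch, not the paper's code)."""

    def __init__(self, dim: int, num_heads: int, num_tasks: int, prompt_len: int):
        super().__init__()
        # One small bank of learnable prompt tokens per task (assumed design).
        self.prompts = nn.Parameter(torch.randn(num_tasks, prompt_len, dim) * 0.02)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor, task_id: int) -> torch.Tensor:
        # x: (batch, num_tokens, dim) patch embeddings from the shared backbone.
        b = x.size(0)
        h = self.norm(x)
        p = self.prompts[task_id].unsqueeze(0).expand(b, -1, -1)
        # Prepend this task's prompts to keys/values; queries remain the image
        # tokens, so the output sequence length is unchanged.
        kv = torch.cat([p, h], dim=1)
        out, _ = self.attn(h, kv, kv)
        return x + out  # residual connection, as in a standard ViT block


# Usage: the same shared weights serve every task; only the prompt bank differs.
block = PromptConditionedAttention(dim=256, num_heads=8, num_tasks=4, prompt_len=8)
tokens = torch.randn(2, 196, 256)        # e.g. 14x14 patch tokens
seg_feat = block(tokens, task_id=0)      # conditioned for semantic segmentation
depth_feat = block(tokens, task_id=1)    # conditioned for depth estimation
```

In a full model along the lines the abstract describes, a block like this would replace the standard attention in the layers of the shared encoder and the lightweight decoder, with one forward pass per task selecting its own prompts, so only the small prompt banks grow with the number of tasks.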
Pages: 6375-6385 (11 pages)