Cross-task Attention Mechanism for Dense Multi-task Learning

被引:25
作者
Lopes, Ivan [1 ]
Tuan-Hung Vu [1 ,2 ]
de Charette, Raoul [1 ]
机构
[1] INRIA, Paris, France
[2] Valeo Ai, Paris, France
来源
2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV) | 2023年
关键词
D O I
10.1109/WACV56688.2023.00236
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Multi-task learning has recently become a promising solution for comprehensive understanding of complex scenes. With an appropriate design, multi-task models can not only be memory-efficient but also favour the exchange of complementary signals across tasks. In this work, we jointly address 2D semantic segmentation, and two geometry-related tasks, namely dense depth, surface normal estimation as well as edge estimation showing their benefit on several datasets. We propose a novel multi-task learning architecture that exploits pair-wise cross-task exchange through correlation-guided attention and self-attention to enhance the average representation learning for all tasks. We conduct extensive experiments on three multi-task setups, showing the benefit of our proposal in comparison to competitive baselines in both synthetic and real benchmarks. We also extend our method to the novel multi-task unsupervised domain adaptation setting. Our code is available at https://github.com/cv-rits/DenseMTL
引用
收藏
页码:2328 / 2337
页数:10
相关论文
共 38 条
[1]  
[Anonymous], 2018, CVPR, DOI DOI 10.1109/CVPR.2018.00412
[2]   Large-Scale Machine Learning with Stochastic Gradient Descent [J].
Bottou, Leon .
COMPSTAT'2010: 19TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL STATISTICS, 2010, :177-186
[3]  
Bruggemann David, 2021, ICCV
[4]   DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs [J].
Chen, Liang-Chieh ;
Papandreou, George ;
Kokkinos, Iasonas ;
Murphy, Kevin ;
Yuille, Alan L. .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2018, 40 (04) :834-848
[5]  
Chen Z, 2018, PR MACH LEARN RES, V80
[6]   The Cityscapes Dataset for Semantic Urban Scene Understanding [J].
Cordts, Marius ;
Omran, Mohamed ;
Ramos, Sebastian ;
Rehfeld, Timo ;
Enzweiler, Markus ;
Benenson, Rodrigo ;
Franke, Uwe ;
Roth, Stefan ;
Schiele, Bernt .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :3213-3223
[7]  
Deng J, 2009, PROC CVPR IEEE, P248, DOI 10.1109/CVPRW.2009.5206848
[8]   Dual Attention Network for Scene Segmentation [J].
Fu, Jun ;
Liu, Jing ;
Tian, Haijie ;
Li, Yong ;
Bao, Yongjun ;
Fang, Zhiwei ;
Lu, Hanqing .
2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, :3141-3149
[9]   Virtual Worlds as Proxy for Multi-Object Tracking Analysis [J].
Gaidon, Adrien ;
Wang, Qiao ;
Cabon, Yohann ;
Vig, Eleonora .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :4340-4349
[10]   Digging Into Self-Supervised Monocular Depth Estimation [J].
Godard, Clement ;
Mac Aodha, Oisin ;
Firman, Michael ;
Brostow, Gabriel .
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, :3827-3837