Multi-task learning with deformable convolution

Cited by: 15
Authors
Li, Jie [1]
Huang, Lei [1, 2]
Wei, Zhiqiang [1, 2]
Zhang, Wenfeng [1]
Qin, Qibing [1]
Affiliations
[1] Ocean Univ China, Coll Informat Sci & Engn, Qingdao 266000, Peoples R China
[2] Pilot Natl Lab Marine Sci & Technol Qingdao, Qingdao 266000, Peoples R China
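Funding
National Natural Science Foundation of China
Keywords
Multi-task learning; Deformable convolution; Recognition
DOI
10.1016/j.jvcir.2021.103109
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract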
Multi-task learning aims to tackle various tasks with branched feature-sharing architectures. Given the diversity and complexity of these tasks, discriminative feature representations need to be extracted for each individual task. The fixed geometric structure of convolutional neural networks (CNNs), a known limitation in model building, also exists in multi-task learning and poses a severe challenge there, since geometric variations increase when multiple tasks are handled. In this paper, we go beyond these limitations and propose a novel multi-task network by introducing deformable convolution. Our design, the Deformable Multi-Task Network (DMTN), starts with a single shared network that constructs a shared feature pool. We then present task-specific deformable modules that extract discriminative features tailored to each task from the shared feature pool. These modules use two new parts, a deformable part and an alignment part, to extract more discriminative task-specific features while greatly enhancing the transformation modeling capability. Experiments on various types of multi-task learning demonstrate the effectiveness of the proposed method. On multiple classification, semantic segmentation, and depth estimation tasks, our DMTN outperforms state-of-the-art approaches when compared against strong baselines.
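The abstract does not give implementation details, so the following is only a minimal sketch of the general deformable convolution operation that the task-specific modules build on, not the DMTN deformable or alignment part itself. It assumes a single-channel 2D feature map, a 3x3 kernel with stride 1 and zero padding, and offsets passed in directly (in a real network they would typically be predicted by a learned layer). The names `bilinear` and `deform_conv2d` are hypothetical helpers introduced for illustration.

```python
import math


def bilinear(x, r, c):
    """Sample 2D map x at fractional position (r, c).

    Positions outside the map contribute 0 (zero padding).
    """
    h, w = len(x), len(x[0])
    r0, c0 = math.floor(r), math.floor(c)
    total = 0.0
    for rr in (r0, r0 + 1):
        for cc in (c0, c0 + 1):
            if 0 <= rr < h and 0 <= cc < w:
                # Bilinear weight shrinks linearly with distance to the neighbor.
                weight = (1 - abs(r - rr)) * (1 - abs(c - cc))
                total += weight * x[rr][cc]
    return total


def deform_conv2d(x, weight, offsets):
    """Single-channel 3x3 deformable convolution (illustrative assumption).

    x:       H x W feature map (list of lists)
    weight:  3 x 3 kernel
    offsets: offsets[i][j] is a list of 9 (dr, dc) pairs, one per kernel tap,
             shifting where that tap samples the input for output (i, j).
    """
    h, w = len(x), len(x[0])
    taps = [(kr, kc) for kr in (-1, 0, 1) for kc in (-1, 0, 1)]
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for k, (kr, kc) in enumerate(taps):
                dr, dc = offsets[i][j][k]
                # Regular grid position plus the per-tap offset.
                acc += weight[kr + 1][kc + 1] * bilinear(x, i + kr + dr, j + kc + dc)
            out[i][j] = acc
    return out


def constant_offsets(h, w, dr, dc):
    """Build an offset field that applies the same (dr, dc) to every tap."""
    return [[[(dr, dc)] * 9 for _ in range(w)] for _ in range(h)]


if __name__ == "__main__":
    x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
    identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
    # With zero offsets the operation reduces to an ordinary convolution.
    print(deform_conv2d(x, identity, constant_offsets(3, 3, 0.0, 0.0)))
    # Non-zero offsets move the sampling grid; here half a pixel downward.
    print(deform_conv2d(x, identity, constant_offsets(3, 3, 0.5, 0.0)))
```

With all offsets set to zero the sketch behaves like a standard 3x3 convolution, which is a useful sanity check; the extra flexibility comes entirely from the offsets moving each kernel tap off the fixed grid.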
Pages: 13