Multi-task learning with deformable convolution

Cited by: 15

Authors:

Li, Jie ^{[1]}

Huang, Lei ^{[1,2]}

Wei, Zhiqiang ^{[1,2]}

Zhang, Wenfeng ^{[1]}

Qin, Qibing ^{[1]}

Affiliations:

[1] Ocean Univ China, Coll Informat Sci & Engn, Qingdao 266000, Peoples R China

[2] Pilot Natl Lab Marine Sci & Technol Qingdao, Qingdao 266000, Peoples R China

Source:

JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION | 2021 / Vol. 77

Funding:

National Natural Science Foundation of China;

Keywords:

Multi-task learning; Deformable convolution; Recognition;

DOI:

10.1016/j.jvcir.2021.103109

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Multi-task learning aims to tackle multiple tasks with branched, feature-sharing architectures. Because the tasks are diverse and complex, discriminative feature representations need to be extracted for each individual task. The fixed geometric structure of convolution, a known limitation of convolutional neural networks (CNNs), also exists in multi-task learning and poses a severe challenge there, since geometric variations increase when multiple tasks are handled together. In this paper, we go beyond these limitations and propose a novel multi-task network that introduces deformable convolution. Our design, the Deformable Multi-Task Network (DMTN), starts with a single shared network that constructs a shared feature pool. We then present task-specific deformable modules that extract discriminative features tailored to each task from the shared feature pool. Each task-specific deformable module uses two new parts, a deformable part and an alignment part, to extract more discriminative task-specific features while greatly enhancing the transformation modeling capability. Experiments on several types of multi-task learning demonstrate the effectiveness of the proposed method. On multiple classification tasks and on semantic segmentation and depth estimation tasks, DMTN outperforms state-of-the-art approaches when compared against strong baselines.
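The abstract does not spell out how the deformable part samples features. As a rough illustration only, the sketch below implements the generic deformable convolution operation (a regular 3x3 kernel grid plus a per-tap offset, with bilinear interpolation at fractional positions) for a single output location in pure Python. It is not the DMTN module itself: the function names, the hand-supplied offsets, and the zero padding outside the image are assumptions made for this example. In a trained network the offsets would be predicted from the input features rather than passed in.

```python
import math


def bilinear_sample(image, y, x):
    """Sample a 2D list-of-lists at fractional (y, x); outside the image counts as 0."""
    h, w = len(image), len(image[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    # Blend the (up to) four integer neighbours of the fractional position.
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < h and 0 <= xi < w:
                weight = (1 - abs(y - yi)) * (1 - abs(x - xi))
                value += weight * image[yi][xi]
    return value


def deformable_conv_at(image, weights, offsets, cy, cx):
    """Output of a 3x3 deformable convolution at one location (cy, cx).

    weights: 3x3 kernel.
    offsets: 3x3 grid of (dy, dx) pairs (hypothetical input; supplied by hand
             here, learned from the features in a real network).
    """
    out = 0.0
    for i in range(3):
        for j in range(3):
            dy, dx = offsets[i][j]
            # Regular grid position (cy + i - 1, cx + j - 1) shifted by the offset.
            out += weights[i][j] * bilinear_sample(
                image, cy + i - 1 + dy, cx + j - 1 + dx
            )
    return out
```

With all offsets set to (0, 0) the function reduces to an ordinary 3x3 convolution at that location, which is a convenient sanity check; non-zero offsets let each kernel tap read from a different, possibly fractional, position.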

Pages: 13

References: 58 in total

