Deep multi-task learning with flexible and compact architecture search

Citations: 0
Authors
Jiejie Zhao
Weifeng Lv
Bowen Du
Junchen Ye
Leilei Sun
Guixi Xiong
Affiliation
[1] Beihang University, SKLSDE and BDBC Lab
Source
International Journal of Data Science and Analytics | 2023 / Vol. 15
Keywords
Multi-task learning; Network architecture; Feedforward neural networks; Parameter generation; Task relationship;
DOI
Not available
Abstract
Multi-task learning has been applied successfully in various applications. Recent research shows that the performance of multi-task learning methods can be improved by appropriately sharing model architectures. However, existing work either identifies the multi-task architecture manually based on prior knowledge, or simply uses an identical model structure for all tasks with a parameter-sharing mechanism. In this paper, we propose a novel architecture search method that automatically discovers flexible and compact architectures for deep multi-task learning; it not only extends the expressiveness of existing reinforcement learning-based neural architecture search methods, but also enhances the flexibility of existing hand-crafted multi-task learning methods. The discovered architecture shares structure and parameters adaptively to handle different levels of task relatedness, resulting in improved effectiveness. Specifically, for deep multi-task learning, we first propose an architecture search space that combines partially shared modules at the low-level layer with a set of task-specific modules of various depths at the high-level layers. Second, a parameter generation mechanism is proposed that not only explores all possible cross-layer connections but also reduces the search cost. Third, we propose a task-specific shadow batch normalization mechanism to stabilize the training process and improve search effectiveness. Finally, an auxiliary module is designed to guide the model training process. Experimental results demonstrate that the learned architectures outperform state-of-the-art methods with fewer learnable parameters.
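The "task-specific shadow batch normalization" idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name, momentum semantics, and plain-Python vectors below are assumptions chosen for clarity. The key point it demonstrates is that each task keeps its own ("shadow") running mean and variance, so mini-batches from one task do not distort the normalization statistics used by another.

```python
# Illustrative sketch only (not the paper's code): per-task batch
# normalization with separate running statistics for each task.
import math

class TaskSpecificBatchNorm:
    def __init__(self, num_tasks, eps=1e-5, momentum=0.1):
        self.eps = eps
        self.momentum = momentum
        # One "shadow" pair of running statistics per task.
        self.running_mean = [0.0] * num_tasks
        self.running_var = [1.0] * num_tasks

    def __call__(self, batch, task_id, training=True):
        if training:
            # Use the current batch's statistics and update only the
            # shadow statistics belonging to this task.
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            m = self.momentum
            self.running_mean[task_id] = (1 - m) * self.running_mean[task_id] + m * mean
            self.running_var[task_id] = (1 - m) * self.running_var[task_id] + m * var
        else:
            # At evaluation time, normalize with this task's own
            # accumulated statistics.
            mean = self.running_mean[task_id]
            var = self.running_var[task_id]
        return [(x - mean) / math.sqrt(var + self.eps) for x in batch]
```

In a shared-backbone setting, the backbone layers would hold one such normalizer and route each batch to the statistics slot matching its task id, which is one plausible way to stabilize training when task data distributions differ.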
Pages: 187-199 (12 pages)