In Defense of the Unitary Scalarization for Deep Multi-Task Learning

Cited by: 0
Authors
Kurin, Vitaly [1]
De Palma, Alessandro [1]
Kostrikov, Ilya [2,3]
Whiteson, Shimon [1]
Kumar, M. Pawan [1]
Affiliations
[1] Univ Oxford, Oxford, England
[2] New York Univ, New York, NY USA
[3] Univ Calif Berkeley, Berkeley, CA USA
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022 | 2022
Funding
Engineering and Physical Sciences Research Council (UK); European Research Council;
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
Recent multi-task learning research argues against unitary scalarization, where training simply minimizes the sum of the task losses. Several ad-hoc multi-task optimization algorithms have instead been proposed, inspired by various hypotheses about what makes multi-task settings difficult. The majority of these optimizers require per-task gradients, and introduce significant memory, runtime, and implementation overhead. We show that unitary scalarization, coupled with standard regularization and stabilization techniques from single-task learning, matches or improves upon the performance of complex multi-task optimizers in popular supervised and reinforcement learning settings. We then present an analysis suggesting that many specialized multi-task optimizers can be partly interpreted as forms of regularization, potentially explaining our surprising results. We believe our results call for a critical reevaluation of recent research in the area.
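For illustration, below is a minimal PyTorch sketch of unitary scalarization as the abstract describes it: a single optimization step that minimizes the plain sum of the task losses. The names model, batch, and task_loss_fns are hypothetical stand-ins (a model with a shared trunk and one output head per task is assumed); this is not the authors' released code.

# Minimal sketch of unitary scalarization (illustrative; model, batch, and
# task_loss_fns are hypothetical stand-ins, not the paper's implementation).
import torch

def training_step(model, batch, task_loss_fns, optimizer):
    """One step that simply minimizes the sum of the task losses."""
    optimizer.zero_grad()
    outputs = model(batch["inputs"])  # shared trunk with one head per task
    # A single backward pass on the summed losses: no per-task gradients
    # need to be stored or manipulated, unlike specialized MTL optimizers.
    total_loss = torch.stack([
        loss_fn(outputs[task], batch["targets"][task])
        for task, loss_fn in task_loss_fns.items()
    ]).sum()
    total_loss.backward()
    optimizer.step()
    return float(total_loss.detach())

Because the summed loss requires only one backward pass, this baseline avoids the memory, runtime, and implementation overhead that the abstract attributes to optimizers requiring per-task gradients.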
Pages: 15