Multi-task Neural Shared Structure Search: A Study Based on Text Mining

Cited by: 2
Authors
Li, Jiyi [1 ]
Fukumoto, Fumiyo [1 ]
Affiliations
[1] Univ Yamanashi, Kofu, Yamanashi, Japan
Source
DATABASE SYSTEMS FOR ADVANCED APPLICATIONS (DASFAA 2021), PT II | 2021, Vol. 12682
Keywords
Multi-task; Shared structure search; Text mining
DOI
10.1007/978-3-030-73197-7_13
CLC number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Multi-task techniques are effective for handling the problem of small datasets: they can leverage additional rich information from other tasks to improve performance on the target task. Two open problems in multi-task methods are which resources are appropriate to use as auxiliary tasks, and how to select the shared structures with an effective search mechanism. We propose a novel neural-based multi-task Shared Structure Encoding (SSE) that defines the exploration space by which we can easily formulate the multi-task architecture search. For the search approaches, because existing Network Architecture Search (NAS) techniques are not specially designed for the multi-task scenario, we propose two original search approaches, i.e., the m-Sparse Search approach by Shared Structure encoding for neural-based Multi-Task models (m-S4MT) and the Task-wise Greedy Generation Search approach by Shared Structure encoding for neural-based Multi-Task models (TGG-S3MT). Experiments on real text datasets with multiple text mining tasks show that SSE is effective for formulating the multi-task architecture search. Moreover, both m-S4MT and TGG-S3MT outperform the single-task method, the multi-label method, naive multi-task methods, and a variant of an existing NAS approach on the target aspects. In particular, 1-S4MT, which makes a sparse assumption on the auxiliary tasks, achieves good performance at very low computation cost.
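The core idea described in the abstract (a binary shared-structure encoding searched greedily, with an m-sparse variant) can be illustrated with a minimal sketch. The encoding granularity, the toy evaluator, and all function names below are assumptions for illustration only, not the authors' actual SSE / TGG-S3MT implementation:

```python
# Hypothetical sketch: encode which candidate layers are shared across
# tasks as a binary vector, and search that space greedily. This is an
# illustrative toy, not the paper's code.

def evaluate(encoding):
    """Toy stand-in for 'train the multi-task model implied by this
    encoding and return its validation score'. Here it simply rewards
    sharing lower layers more than upper ones."""
    n = len(encoding)
    return sum(bit * (n - i) for i, bit in enumerate(encoding))

def greedy_search(num_layers, m=None):
    """Greedily decide, layer by layer, whether each layer is shared (1)
    or task-specific (0), keeping a flip only when it improves the score.
    Passing m caps the number of shared layers, loosely mirroring the
    paper's m-sparse assumption on auxiliary resources."""
    encoding = [0] * num_layers
    for layer in range(num_layers):
        if m is not None and sum(encoding) >= m:
            break  # sparsity budget exhausted
        candidate = encoding[:layer] + [1] + encoding[layer + 1:]
        if evaluate(candidate) > evaluate(encoding):
            encoding = candidate
    return encoding

print(greedy_search(4))       # all layers shared under this toy evaluator
print(greedy_search(4, m=1))  # sparse variant: at most one shared layer
```

The appeal of such a greedy, task-wise scheme over exhaustive search is cost: it needs O(num_layers) evaluations instead of 2^num_layers, which matches the abstract's claim that the 1-sparse variant is cheap.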
Pages: 202-218 (17 pages)