Low-resource text classification using domain-adversarial learning

Cited by: 11
Authors
Griesshaber, Daniel [1 ]
Ngoc Thang Vu [2 ]
Maucher, Johannes [1 ]
Affiliations
[1] Stuttgart Media Univ, Nobelstr 10, D-70569 Stuttgart, Germany
[2] Univ Stuttgart, Inst Nat Language Proc IMS, Pfaffenwaldring 5b, D-70569 Stuttgart, Germany
Keywords
NLP; Low-resource; Deep learning; Domain-adversarial
DOI
10.1016/j.csl.2019.101056
Chinese Library Classification
TP18 [Theory of artificial intelligence]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Deep learning techniques have recently been shown to be successful in many natural language processing tasks, forming the basis of state-of-the-art systems. However, they require large amounts of annotated data, which are often unavailable. This paper explores the use of domain-adversarial learning as a regularizer to avoid overfitting when training domain-invariant features for deep, complex neural networks in low-resource and zero-resource settings in new target domains or languages. In the case of new languages, we show that monolingual word vectors can be used directly for training without prior alignment: their projection into a common space can be learned ad hoc at training time, matching the final performance of pretrained multilingual word vectors. (C) 2019 Elsevier Ltd. All rights reserved.
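
For reference, the sketch below illustrates the general gradient-reversal mechanism that domain-adversarial training (in the DANN style of Ganin & Lempitsky, 2015) builds on: a shared feature extractor is trained to fool a domain discriminator while still serving the task classifier. This is a minimal illustration under assumed shapes and hyperparameters, not the authors' actual architecture; the class names (GradReverse, DomainAdversarialClassifier) and all dimensions here are invented for the example.

# Minimal domain-adversarial training sketch (DANN-style gradient reversal).
# Illustrative only: names, dimensions, and loss weighting are assumptions,
# not the model described in the paper.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; scales gradients by -lambda backward."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reversed gradient flows into the shared feature extractor.
        return -ctx.lambd * grad_output, None

class DomainAdversarialClassifier(nn.Module):
    def __init__(self, input_dim=300, hidden_dim=128, n_classes=2, n_domains=2):
        super().__init__()
        # Shared features that the adversarial signal pushes toward
        # domain invariance.
        self.features = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.label_head = nn.Linear(hidden_dim, n_classes)
        self.domain_head = nn.Linear(hidden_dim, n_domains)

    def forward(self, x, lambd=1.0):
        h = self.features(x)
        y_logits = self.label_head(h)
        # The domain head sees reversed gradients, so minimizing its loss
        # trains the features to confuse the domain discriminator.
        d_logits = self.domain_head(GradReverse.apply(h, lambd))
        return y_logits, d_logits

# Toy usage: total loss = task loss + domain-confusion loss.
model = DomainAdversarialClassifier()
x = torch.randn(8, 300)        # e.g. averaged word vectors (assumed input)
y = torch.randint(0, 2, (8,))  # task labels (source domain only, in practice)
d = torch.randint(0, 2, (8,))  # domain labels (source vs. target)
y_logits, d_logits = model(x, lambd=0.5)
loss = (nn.functional.cross_entropy(y_logits, y)
        + nn.functional.cross_entropy(d_logits, d))
loss.backward()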
Pages: 11