Low-resource text classification using domain-adversarial learning

Cited by: 11
Authors
Griesshaber, Daniel [1]
Vu, Ngoc Thang [2]
Maucher, Johannes [1]
Affiliations
[1] Stuttgart Media Univ, Nobelstr 10, D-70569 Stuttgart, Germany
[2] Univ Stuttgart, Inst Nat Language Proc IMS, Pfaffenwaldring 5b, D-70569 Stuttgart, Germany
Keywords
NLP; Low-resource; Deep learning; Domain-adversarial
DOI
10.1016/j.csl.2019.101056
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deep learning techniques have recently been shown to be successful in many natural language processing tasks, forming state-of-the-art systems. They require, however, a large amount of annotated data, which is often missing. This paper explores the use of domain-adversarial learning as a regularizer to avoid overfitting when training domain-invariant features for deep, complex neural networks in low-resource and zero-resource settings in new target domains or languages. In the case of new languages, we show that monolingual word vectors can be used directly for training without prealignment. Their projection into a common space can be learnt ad hoc at training time, reaching the final performance of pretrained multilingual word vectors. © 2019 Elsevier Ltd. All rights reserved.
Pages: 11