MetaXL: Meta Representation Transformation for Low-resource Cross-lingual Learning

被引:0
作者
Xia, Mengzhou [1 ]
Zheng, Guoqing [3 ]
Mukherjee, Subhabrata [3 ]
Shokouhi, Milad [3 ]
Neubig, Graham [2 ]
Awadallah, Ahmed Hassan [3 ]
机构
[1] Princeton Univ, Princeton, NJ 08544 USA
[2] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[3] Microsoft Res, Redmond, WA USA
来源
2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021) | 2021年
关键词
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
The combination of multilingual pre-trained representations and cross-lingual transfer learning is one of the most effective methods for building functional NLP systems for low-resource languages. However, for extremely low-resource languages without large-scale monolingual corpora for pre-training or sufficient annotated data for fine-tuning, transfer learning remains an under-studied and challenging task. Moreover, recent work shows that multilingual representations are surprisingly disjoint across languages (Singh et al., 2019), bringing additional challenges for transfer onto extremely low-resource languages. In this paper, we propose MetaXL, a meta-learning based framework that learns to transform representations judiciously from auxiliary languages to a target one and brings their representation spaces closer for effective transfer. Extensive experiments on real-world low-resource languages - without access to large-scale monolingual corpora or large amounts of labeled data - for tasks like cross-lingual sentiment analysis and named entity recognition show the effectiveness of our approach. Code for MetaXL is publicly available at github.com/microsoft/MetaXL.
引用
收藏
页码:499 / 511
页数:13
相关论文
共 39 条
[1]  
Adams O, 2017, 15TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (EACL 2017), VOL 1: LONG PAPERS, P937
[2]  
Artetxe M, 2020, Arxiv, DOI arXiv:2004.14958
[3]  
Artetxe M, 2020, Arxiv, DOI arXiv:1910.11856
[4]  
Bapna A., 2019, Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, P1538
[5]  
Chen Xilun, 2018, Transactions of the Association for Computational Linguistics, V6, P557, DOI DOI 10.1162/TACL_A_00039
[6]  
Conneau A., 2020, P 58 ANN M ASS COMP, P8440, DOI [DOI 10.18653/V1/2020.ACL-MAIN.747, 10.18653/v1/2020.acl-main.747]
[7]  
Devlin J, 2019, Arxiv, DOI [arXiv:1810.04805, 10.48550/arXiv.1810.04805, DOI 10.48550/ARXIV.1810.04805]
[8]  
Devlin J, 2019, 2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, P4171
[9]  
Dou ZY, 2019, 2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019), P1192
[10]  
DUBUISSON MP, 1994, INT C PATT RECOG, P566, DOI 10.1109/ICPR.1994.576361