Solving Hungarian natural language processing tasks with multilingual generative models

Times Cited: 1
Authors
Yang, Zijian Gyozo [1 ,2 ,3 ]
Laki, Laszlo Janos [1 ,2 ,3 ]
Affiliations
[1] Hungarian Res Ctr Linguist, Budapest, Hungary
[2] MTA PPKE Hungarian Language Technol Res Grp, Budapest, Hungary
[3] Pazmany Peter Catholic Univ, Fac Informat Technol & Bion, Budapest, Hungary
Source
ANNALES MATHEMATICAE ET INFORMATICAE | 2023, Vol. 57
Keywords
natural language processing; multilingual model; sentiment analysis; abstractive summarization; machine translation; Marian NMT; M2M100
DOI
10.33039/ami.2022.11.001
Chinese Library Classification
O1 [Mathematics]
Discipline Classification Code
0701; 070101
Abstract
Generative ability is a crucial requirement for artificial intelligence applications such as chatbots, virtual assistants, and machine translation systems. In recent years, transformer-based neural architectures have given a huge boost to generating human-like English text. In our research, we carried out experiments to create pre-trained generative transformer models for the Hungarian language and fine-tune them for multiple types of natural language processing tasks. Our focus was on training multilingual models. We pre-trained a multilingual BART and then fine-tuned it for various NLP tasks, such as text classification and abstractive summarization. In our experiments, we focused on transfer learning techniques to increase performance. Furthermore, an M2M100 multilingual model was fine-tuned for 12-language Hungarian-centric machine translation. Last but not least, a Marian NMT based machine translation system was also built from scratch for the same 12-language Hungarian-centric machine translation task. Using the cross-lingual transfer method, we achieved higher performance in all of our tasks. In our machine translation experiment, our fine-tuned M2M100 model outperformed Google Translate, Microsoft Translator, and eTranslation.
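The abstract describes fine-tuning the M2M100 multilingual model for Hungarian-centric translation. As a minimal illustrative sketch only (not the authors' fine-tuned model, which is not reproduced here), the publicly available facebook/m2m100_418M checkpoint can be loaded through the Hugging Face transformers library and used for English-to-Hungarian translation; the checkpoint name and example sentence are assumptions for illustration.

# Minimal sketch: English-to-Hungarian translation with an off-the-shelf M2M100
# checkpoint via Hugging Face transformers. The paper fine-tunes M2M100 on a
# 12-language Hungarian-centric corpus; that fine-tuned model is not used here,
# so the public facebook/m2m100_418M base checkpoint stands in as an assumption.
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model_name = "facebook/m2m100_418M"  # assumed base checkpoint, not the paper's model
tokenizer = M2M100Tokenizer.from_pretrained(model_name)
model = M2M100ForConditionalGeneration.from_pretrained(model_name)

# The source language must be set before tokenization so the correct language token is prepended.
tokenizer.src_lang = "en"
inputs = tokenizer("Machine translation systems are improving rapidly.", return_tensors="pt")

# Forcing the decoder to start with the Hungarian language token selects Hungarian as the target.
generated = model.generate(**inputs, forced_bos_token_id=tokenizer.get_lang_id("hu"))
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])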
Pages: 92-106
Number of pages: 15