Multilingual Sequence-to-Sequence Models for Hebrew NLP

Cited by: 0
Authors
Eyal, Matan [1 ]
Noga, Hila [1 ]
Aharoni, Roee [1 ]
Szpektor, Idan [1 ]
Tsarfaty, Reut [1 ]
Affiliations
[1] Google Res, Mountain View, CA 94043 USA
Source
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023) | 2023
DOI
Not available
Abstract
Recent work attributes progress in NLP to large language models (LMs) with increased model size and large quantities of pretraining data. Despite this, current state-of-the-art LMs for Hebrew are both under-parameterized and under-trained compared to LMs in other languages. Additionally, previous work on pretrained Hebrew LMs focused on encoder-only models. While the encoder-only architecture is beneficial for classification tasks, it does not cater well for sub-word prediction tasks, such as Named Entity Recognition, when considering the morphologically rich nature of Hebrew. In this paper we argue that sequence-to-sequence generative architectures are more suitable for large LMs in morphologically rich languages (MRLs) such as Hebrew. We demonstrate this by casting tasks in the Hebrew NLP pipeline as text-to-text tasks, for which we can leverage powerful multilingual, pretrained sequence-to-sequence models such as mT5, eliminating the need for a separate, specialized, morpheme-based decoder. Using this approach, our experiments show substantial improvements over previously published results on all existing Hebrew NLP benchmarks. These results suggest that multilingual sequence-to-sequence models present a promising building block for NLP for MRLs.
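To make the text-to-text casting concrete, here is a minimal sketch of the idea for a task like Hebrew NER: the input sentence is wrapped in a plain-text task prompt and the model's answer is read back from a flat string, so a pretrained seq2seq model such as mT5 can emit multi-morpheme spans directly with no specialized decoder. The prompt prefix (`ner:`) and the `span $$ TYPE` output format below are illustrative assumptions, not the formats used in the paper.

```python
# Sketch of casting Hebrew NER as a text-to-text task.
# The prompt prefix and output delimiters are hypothetical; the paper's
# exact serialization is not reproduced here.

def cast_ner_as_text_to_text(sentence: str) -> str:
    """Wrap a sentence in an (assumed) mT5-style task prefix."""
    return f"ner: {sentence}"

def parse_entities(generated: str) -> list[tuple[str, str]]:
    """Parse an assumed 'span $$ TYPE ;; span $$ TYPE' string
    into (span, label) pairs."""
    pairs = []
    for chunk in generated.split(" ;; "):
        if " $$ " in chunk:
            span, label = chunk.split(" $$ ", 1)
            pairs.append((span.strip(), label.strip()))
    return pairs

# A seq2seq model would map the cast input to a flat output string;
# here we illustrate the round trip with a made-up model output.
model_input = cast_ner_as_text_to_text("דני גר בתל אביב")
model_output = "דני $$ PER ;; תל אביב $$ LOC"
print(parse_entities(model_output))  # [('דני', 'PER'), ('תל אביב', 'LOC')]
```

Because both sides of the task are plain strings, the same pattern extends to other pipeline stages (e.g. morphological segmentation or lemmatization) by swapping the task prefix and output format.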
Pages: 7700-7708
Page count: 9
Related Papers
50 records in total
  • [1] Sparse Sequence-to-Sequence Models
    Peters, Ben
    Niculae, Vlad
    Martins, Andre F. T.
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 1504 - 1519
  • [2] Analysis of Multilingual Sequence-to-Sequence Speech Recognition Systems
    Karafiat, Martin
    Baskar, Murali Karthick
    Watanabe, Shinji
    Hori, Takaaki
    Wiesner, Matthew
    Cernocky, Jan Honza
    INTERSPEECH 2019, 2019, : 2220 - 2224
  • [3] PARADISE: Exploiting Parallel Data for Multilingual Sequence-to-Sequence Pretraining
    Reid, Machel
    Artetxe, Mikel
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 800 - 810
  • [4] Assessing incrementality in sequence-to-sequence models
    Ulmer, Dennis
    Hupkes, Dieuwke
    Bruni, Elia
    4TH WORKSHOP ON REPRESENTATION LEARNING FOR NLP (REPL4NLP-2019), 2019, : 209 - 217
  • [5] An Analysis of "Attention" in Sequence-to-Sequence Models
    Prabhavalkar, Rohit
    Sainath, Tara N.
    Li, Bo
    Rao, Kanishka
    Jaitly, Navdeep
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3702 - 3706
  • [6] Pre-Trained Multilingual Sequence-to-Sequence Models: A Hope for Low-Resource Language Translation?
    Lee, En-Shiun Annie
    Thillainathan, Sarubi
    Nayak, Shravan
    Ranathunga, Surangika
    Adelani, David Ifeoluwa
    Su, Ruisi
    McCarthy, Arya D.
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 58 - 67
  • [7] Deep Reinforcement Learning for Sequence-to-Sequence Models
    Keneshloo, Yaser
    Shi, Tian
    Ramakrishnan, Naren
    Reddy, Chandan K.
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31 (07) : 2469 - 2489
  • [8] Sequence-to-Sequence Models for Automated Text Simplification
    Botarleanu, Robert-Mihai
    Dascalu, Mihai
    Crossley, Scott Andrew
    McNamara, Danielle S.
    ARTIFICIAL INTELLIGENCE IN EDUCATION (AIED 2020), PT II, 2020, 12164 : 31 - 36
  • [9] Sequence-to-Sequence Models for Emphasis Speech Translation
    Quoc Truong Do
    Sakti, Sakriani
    Nakamura, Satoshi
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2018, 26 (10) : 1873 - 1883
  • [10] On Evaluation of Adversarial Perturbations for Sequence-to-Sequence Models
    Michel, Paul
    Li, Xian
    Neubig, Graham
    Pino, Juan Miguel
    2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 3103 - 3114