Multilingual Sequence-to-Sequence Models for Hebrew NLP

Cited by: 0
Authors
Eyal, Matan [1]
Noga, Hila [1]
Aharoni, Roee [1]
Szpektor, Idan [1]
Tsarfaty, Reut [1]
Institution
[1] Google Research, Mountain View, CA 94043 USA
Source
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023) | 2023年
DOI: none available
Abstract
Recent work attributes progress in NLP to large language models (LMs) with increased model size and large quantities of pretraining data. Despite this, current state-of-the-art LMs for Hebrew are both under-parameterized and under-trained compared to LMs in other languages. Additionally, previous work on pretrained Hebrew LMs focused on encoder-only models. While the encoder-only architecture is beneficial for classification tasks, it is not well suited to sub-word prediction tasks, such as Named Entity Recognition, when considering the morphologically rich nature of Hebrew. In this paper we argue that sequence-to-sequence generative architectures are more suitable for large LMs in morphologically rich languages (MRLs) such as Hebrew. We demonstrate this by casting tasks in the Hebrew NLP pipeline as text-to-text tasks, for which we can leverage powerful multilingual, pretrained sequence-to-sequence models such as mT5, eliminating the need for a separate, specialized, morpheme-based decoder. Using this approach, our experiments show substantial improvements over previously published results on all existing Hebrew NLP benchmarks. These results suggest that multilingual sequence-to-sequence models present a promising building block for NLP for MRLs.
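The abstract's central idea is to recast pipeline tasks like NER as text-to-text problems so a generative model can emit the full analysis as plain text. The sketch below illustrates one hypothetical way such a casting could look (the bracket-inlining output format and the function name are illustrative assumptions, not the paper's actual scheme): the source side is the raw sentence, and the target side inlines labeled entity spans, so no specialized morpheme-based decoder is required.

```python
# Illustrative sketch: casting NER as a text-to-text task for a
# seq2seq model such as mT5. The "[TYPE span]" target format is a
# hypothetical example, not the paper's actual output scheme.

def cast_ner_as_text_to_text(tokens, labels):
    """Turn tokens with BIO labels into a (source, target) string pair.

    source: the plain sentence.
    target: the sentence with each entity span rewritten as "[TYPE span]".
    """
    source = " ".join(tokens)
    out, span, span_type = [], [], None
    for tok, lab in zip(tokens, labels):
        if lab.startswith("B-"):
            # Close any open span, then start a new one.
            if span:
                out.append(f"[{span_type} {' '.join(span)}]")
            span, span_type = [tok], lab[2:]
        elif lab.startswith("I-") and span:
            span.append(tok)  # continue the current entity span
        else:
            if span:
                out.append(f"[{span_type} {' '.join(span)}]")
                span, span_type = [], None
            out.append(tok)
    if span:  # flush a span that ends the sentence
        out.append(f"[{span_type} {' '.join(span)}]")
    return source, " ".join(out)


# "Danny lives in Tel Aviv" -- note the fused preposition in "בתל",
# the kind of morphological phenomenon the abstract highlights.
src, tgt = cast_ner_as_text_to_text(
    ["דני", "גר", "בתל", "אביב"],
    ["B-PER", "O", "B-LOC", "I-LOC"],
)
```

A pair like `(src, tgt)` can then be fed directly to any text-to-text model's standard fine-tuning loop, which is what lets a single pretrained seq2seq checkpoint cover multiple pipeline tasks.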
Pages: 7700-7708 (9 pages)
Related Papers (50 total)
  • [31] Unleashing the True Potential of Sequence-to-Sequence Models for Sequence Tagging and Structure Parsing
    He, Han
    Choi, Jinho D.
    TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2023, 11 : 582 - 599
  • [32] Retrosynthetic Reaction Prediction Using Neural Sequence-to-Sequence Models
    Liu, Bowen
    Ramsundar, Bharath
    Kawthekar, Prasad
    Shi, Jade
    Gomes, Joseph
    Quang Luu Nguyen
    Ho, Stephen
    Sloane, Jack
    Wender, Paul
    Pande, Vijay
    ACS CENTRAL SCIENCE, 2017, 3 (10) : 1103 - 1113
  • [34] Runoff predictions in ungauged basins using sequence-to-sequence models
    Yin, Hanlin
    Guo, Zilong
    Zhang, Xiuwei
    Chen, Jiaojiao
    Zhang, Yanning
    JOURNAL OF HYDROLOGY, 2021, 603
  • [35] Reformulating natural language queries using sequence-to-sequence models
    Liu, Xiaoyu
    Pan, Shunda
    Zhang, Qi
    Jiang, Yu-Gang
    Huang, Xuanjing
    SCIENCE CHINA-INFORMATION SCIENCES, 2019, 62 (12)
  • [36] Guiding Attention in Sequence-to-Sequence Models for Dialogue Act prediction
    Colombo, Pierre
    Chapuis, Emile
    Manica, Matteo
    Vignon, Emmanuel
    Varni, Giovanna
    Clavel, Chloe
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 7594 - 7601
  • [37] STATE-OF-THE-ART SPEECH RECOGNITION WITH SEQUENCE-TO-SEQUENCE MODELS
    Chiu, Chung-Cheng
    Sainath, Tara N.
    Wu, Yonghui
    Prabhavalkar, Rohit
    Nguyen, Patrick
    Chen, Zhifeng
    Kannan, Anjuli
    Weiss, Ron J.
    Rao, Kanishka
    Gonina, Ekaterina
    Jaitly, Navdeep
    Li, Bo
    Chorowski, Jan
    Bacchiani, Michiel
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 4774 - 4778
  • [38] COUPLED TRAINING OF SEQUENCE-TO-SEQUENCE MODELS FOR ACCENTED SPEECH RECOGNITION
    Unni, Vinit
    Joshi, Nitish
    Jyothi, Preethi
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 8254 - 8258
  • [39] BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese
    Nguyen Luong Tran
    Duong Minh Le
    Dat Quoc Nguyen
    INTERSPEECH 2022, 2022, : 1751 - 1755
  • [40] Multitask Sequence-to-Sequence Models for Grapheme-to-Phoneme Conversion
    Milde, Benjamin
    Schmidt, Christoph
    Koehler, Joachim
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 2536 - 2540