Multilingual Sequence-to-Sequence Models for Hebrew NLP

Cited by: 0
Authors
Eyal, Matan [1 ]
Noga, Hila [1 ]
Aharoni, Roee [1 ]
Szpektor, Idan [1 ]
Tsarfaty, Reut [1 ]
Affiliations
[1] Google Res, Mountain View, CA 94043 USA
Source
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023) | 2023
DOI
Not available
Abstract
Recent work attributes progress in NLP to large language models (LMs) with increased model size and large quantities of pretraining data. Despite this, current state-of-the-art LMs for Hebrew are both under-parameterized and under-trained compared to LMs in other languages. Additionally, previous work on pretrained Hebrew LMs focused on encoder-only models. While the encoder-only architecture is beneficial for classification tasks, it does not cater well for sub-word prediction tasks, such as Named Entity Recognition, when considering the morphologically rich nature of Hebrew. In this paper we argue that sequence-to-sequence generative architectures are more suitable for large LMs in morphologically rich languages (MRLs) such as Hebrew. We demonstrate this by casting tasks in the Hebrew NLP pipeline as text-to-text tasks, for which we can leverage powerful multilingual, pretrained sequence-to-sequence models such as mT5, eliminating the need for a separate, specialized, morpheme-based decoder. Using this approach, our experiments show substantial improvements over previously published results on all existing Hebrew NLP benchmarks. These results suggest that multilingual sequence-to-sequence models present a promising building block for NLP for MRLs.
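To make the text-to-text casting concrete, here is a minimal sketch of the idea for a task like Hebrew NER: the input sentence is wrapped in a plain-text task prompt and the model's answer is read back from a flat string, so a pretrained seq2seq model such as mT5 can emit multi-morpheme spans directly with no specialized decoder. The prompt prefix (`ner:`) and the `span $$ TYPE` output format below are illustrative assumptions, not the formats used in the paper.

```python
# Sketch of casting Hebrew NER as a text-to-text task.
# The prompt prefix and output delimiters are hypothetical; the paper's
# exact serialization is not reproduced here.

def cast_ner_as_text_to_text(sentence: str) -> str:
    """Wrap a sentence in an (assumed) mT5-style task prefix."""
    return f"ner: {sentence}"

def parse_entities(generated: str) -> list[tuple[str, str]]:
    """Parse an assumed 'span $$ TYPE ;; span $$ TYPE' string
    into (span, label) pairs."""
    pairs = []
    for chunk in generated.split(" ;; "):
        if " $$ " in chunk:
            span, label = chunk.split(" $$ ", 1)
            pairs.append((span.strip(), label.strip()))
    return pairs

# A seq2seq model would map the cast input to a flat output string;
# here we illustrate the round trip with a made-up model output.
model_input = cast_ner_as_text_to_text("דני גר בתל אביב")
model_output = "דני $$ PER ;; תל אביב $$ LOC"
print(parse_entities(model_output))  # [('דני', 'PER'), ('תל אביב', 'LOC')]
```

Because both sides of the task are plain strings, the same pattern extends to other pipeline stages (e.g. morphological segmentation or lemmatization) by swapping the task prefix and output format.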
Pages: 7700-7708
Page count: 9
Related Papers
50 records in total
  • [1] Sparse Sequence-to-Sequence Models
    Peters, Ben
    Niculae, Vlad
    Martins, Andre F. T.
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 1504 - 1519
  • [2] Analysis of Multilingual Sequence-to-Sequence Speech Recognition Systems
    Karafiat, Martin
    Baskar, Murali Karthick
    Watanabe, Shinji
    Hori, Takaaki
    Wiesner, Matthew
    Cernocky, Jan Honza
    INTERSPEECH 2019, 2019, : 2220 - 2224
  • [3] PARADISE: Exploiting Parallel Data for Multilingual Sequence-to-Sequence Pretraining
    Reid, Machel
    Artetxe, Mikel
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 800 - 810
  • [4] Assessing incrementality in sequence-to-sequence models
    Ulmer, Dennis
    Hupkes, Dieuwke
    Bruni, Elia
    4TH WORKSHOP ON REPRESENTATION LEARNING FOR NLP (REPL4NLP-2019), 2019, : 209 - 217
  • [5] An Analysis of "Attention" in Sequence-to-Sequence Models
    Prabhavalkar, Rohit
    Sainath, Tara N.
    Li, Bo
    Rao, Kanishka
    Jaitly, Navdeep
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3702 - 3706
  • [6] Pre-Trained Multilingual Sequence-to-Sequence Models: A Hope for Low-Resource Language Translation?
    Lee, En-Shiun Annie
    Thillainathan, Sarubi
    Nayak, Shravan
    Ranathunga, Surangika
    Adelani, David Ifeoluwa
    Su, Ruisi
    McCarthy, Arya D.
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 58 - 67
  • [7] Deep Reinforcement Learning for Sequence-to-Sequence Models
    Keneshloo, Yaser
    Shi, Tian
    Ramakrishnan, Naren
    Reddy, Chandan K.
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31 (07) : 2469 - 2489
  • [8] Sequence-to-Sequence Models for Automated Text Simplification
    Botarleanu, Robert-Mihai
    Dascalu, Mihai
    Crossley, Scott Andrew
    McNamara, Danielle S.
    ARTIFICIAL INTELLIGENCE IN EDUCATION (AIED 2020), PT II, 2020, 12164 : 31 - 36
  • [9] Sequence-to-Sequence Models for Emphasis Speech Translation
    Quoc Truong Do
    Sakti, Sakriani
    Nakamura, Satoshi
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2018, 26 (10) : 1873 - 1883
  • [10] On Evaluation of Adversarial Perturbations for Sequence-to-Sequence Models
    Michel, Paul
    Li, Xian
    Neubig, Graham
    Pino, Juan Miguel
    2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 3103 - 3114