Multilingual Sequence-to-Sequence Models for Hebrew NLP

Cited by: 0
Authors
Eyal, Matan [1]
Noga, Hila [1]
Aharoni, Roee [1]
Szpektor, Idan [1]
Tsarfaty, Reut [1]
Institution
[1] Google Research, Mountain View, CA 94043 USA
Source
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023) | 2023年
DOI: none available
Abstract
Recent work attributes progress in NLP to large language models (LMs) with increased model size and large quantities of pretraining data. Despite this, current state-of-the-art LMs for Hebrew are both under-parameterized and under-trained compared to LMs in other languages. Additionally, previous work on pretrained Hebrew LMs focused on encoder-only models. While the encoder-only architecture is beneficial for classification tasks, it is not well suited to sub-word prediction tasks, such as Named Entity Recognition, when considering the morphologically rich nature of Hebrew. In this paper we argue that sequence-to-sequence generative architectures are more suitable for large LMs in morphologically rich languages (MRLs) such as Hebrew. We demonstrate this by casting tasks in the Hebrew NLP pipeline as text-to-text tasks, for which we can leverage powerful multilingual, pretrained sequence-to-sequence models such as mT5, eliminating the need for a separate, specialized, morpheme-based decoder. Using this approach, our experiments show substantial improvements over previously published results on all existing Hebrew NLP benchmarks. These results suggest that multilingual sequence-to-sequence models present a promising building block for NLP for MRLs.
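The abstract's central idea is to recast pipeline tasks like NER as text-to-text problems so a generative model can emit the full analysis as plain text. The sketch below illustrates one hypothetical way such a casting could look (the bracket-inlining output format and the function name are illustrative assumptions, not the paper's actual scheme): the source side is the raw sentence, and the target side inlines labeled entity spans, so no specialized morpheme-based decoder is required.

```python
# Illustrative sketch: casting NER as a text-to-text task for a
# seq2seq model such as mT5. The "[TYPE span]" target format is a
# hypothetical example, not the paper's actual output scheme.

def cast_ner_as_text_to_text(tokens, labels):
    """Turn tokens with BIO labels into a (source, target) string pair.

    source: the plain sentence.
    target: the sentence with each entity span rewritten as "[TYPE span]".
    """
    source = " ".join(tokens)
    out, span, span_type = [], [], None
    for tok, lab in zip(tokens, labels):
        if lab.startswith("B-"):
            # Close any open span, then start a new one.
            if span:
                out.append(f"[{span_type} {' '.join(span)}]")
            span, span_type = [tok], lab[2:]
        elif lab.startswith("I-") and span:
            span.append(tok)  # continue the current entity span
        else:
            if span:
                out.append(f"[{span_type} {' '.join(span)}]")
                span, span_type = [], None
            out.append(tok)
    if span:  # flush a span that ends the sentence
        out.append(f"[{span_type} {' '.join(span)}]")
    return source, " ".join(out)


# "Danny lives in Tel Aviv" -- note the fused preposition in "בתל",
# the kind of morphological phenomenon the abstract highlights.
src, tgt = cast_ner_as_text_to_text(
    ["דני", "גר", "בתל", "אביב"],
    ["B-PER", "O", "B-LOC", "I-LOC"],
)
```

A pair like `(src, tgt)` can then be fed directly to any text-to-text model's standard fine-tuning loop, which is what lets a single pretrained seq2seq checkpoint cover multiple pipeline tasks.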
Pages: 7700-7708 (9 pages)
Related Papers (50 total)
  • [31] Unleashing the True Potential of Sequence-to-Sequence Models for Sequence Tagging and Structure Parsing
    He, Han
    Choi, Jinho D.
    TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, 2023, 11 : 582 - 599
  • [32] Retrosynthetic Reaction Prediction Using Neural Sequence-to-Sequence Models
    Liu, Bowen
    Ramsundar, Bharath
    Kawthekar, Prasad
    Shi, Jade
    Gomes, Joseph
    Quang Luu Nguyen
    Ho, Stephen
    Sloane, Jack
    Wender, Paul
    Pande, Vijay
    ACS CENTRAL SCIENCE, 2017, 3 (10) : 1103 - 1113
  • [34] Runoff predictions in ungauged basins using sequence-to-sequence models
    Yin, Hanlin
    Guo, Zilong
    Zhang, Xiuwei
    Chen, Jiaojiao
    Zhang, Yanning
    JOURNAL OF HYDROLOGY, 2021, 603
  • [35] Reformulating natural language queries using sequence-to-sequence models
    Liu, Xiaoyu
    Pan, Shunda
    Zhang, Qi
    Jiang, Yu-Gang
    Huang, Xuanjing
    SCIENCE CHINA-INFORMATION SCIENCES, 2019, 62 (12)
  • [36] Guiding Attention in Sequence-to-Sequence Models for Dialogue Act prediction
    Colombo, Pierre
    Chapuis, Emile
    Manica, Matteo
    Vignon, Emmanuel
    Varni, Giovanna
    Clavel, Chloe
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 7594 - 7601
  • [37] STATE-OF-THE-ART SPEECH RECOGNITION WITH SEQUENCE-TO-SEQUENCE MODELS
    Chiu, Chung-Cheng
    Sainath, Tara N.
    Wu, Yonghui
    Prabhavalkar, Rohit
    Nguyen, Patrick
    Chen, Zhifeng
    Kannan, Anjuli
    Weiss, Ron J.
    Rao, Kanishka
    Gonina, Ekaterina
    Jaitly, Navdeep
    Li, Bo
    Chorowski, Jan
    Bacchiani, Michiel
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 4774 - 4778
  • [38] COUPLED TRAINING OF SEQUENCE-TO-SEQUENCE MODELS FOR ACCENTED SPEECH RECOGNITION
    Unni, Vinit
    Joshi, Nitish
    Jyothi, Preethi
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 8254 - 8258
  • [39] BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese
    Nguyen Luong Tran
    Duong Minh Le
    Dat Quoc Nguyen
    INTERSPEECH 2022, 2022, : 1751 - 1755
  • [40] Multitask Sequence-to-Sequence Models for Grapheme-to-Phoneme Conversion
    Milde, Benjamin
    Schmidt, Christoph
    Koehler, Joachim
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 2536 - 2540