Contextualized Embeddings Encode Monolingual and Cross-lingual Knowledge of Idiomaticity

Authors
Fakharian, Samin [1]
Cook, Paul [1]
Affiliation
[1] Univ New Brunswick, Fac Comp Sci, Fredericton, NB E3B 5A3, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Potentially idiomatic expressions (PIEs) are ambiguous between non-compositional idiomatic interpretations and transparent literal interpretations. For example, 'hit the road' can have an idiomatic meaning corresponding to 'start a journey' or a literal interpretation. In this paper we propose a supervised model based on contextualized embeddings for predicting whether usages of PIEs are idiomatic or literal. We consider monolingual experiments for English and Russian, and show that the proposed model outperforms previous approaches, including when the model is tested on instances of PIE types that were not observed during training. We then consider cross-lingual experiments in which the model is trained on PIE instances in one language, English or Russian, and tested on the other language. We find that the model outperforms baselines in this setting. These findings suggest that contextualized embeddings are able to learn representations that encode knowledge of idiomaticity that is not restricted to specific expressions, nor to a specific language.
Pages: 23-32
Number of pages: 10