Linear transformations for cross-lingual semantic textual similarity

Cited by: 13
Authors
Brychcin, Tomas [1]
Affiliations
[1] Univ West Bohemia, Fac Sci Appl, NTIS, Plzen, Czech Republic
Keywords
Semantic textual similarity; Semantic spaces; Linear transformations; Word embeddings; Cross-lingual semantic spaces;
DOI
10.1016/j.knosys.2019.06.027
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Cross-lingual semantic textual similarity systems estimate the degree of semantic similarity between two sentences, each in a different language. State-of-the-art algorithms usually employ machine translation and combine a vast number of features, making the approach strongly supervised, resource-rich, and difficult to use for poorly-resourced languages. In this paper, we study linear transformations, which project monolingual semantic spaces into a shared space using bilingual dictionaries. We propose a novel transformation, which builds on the best ideas from prior work. We experiment with unsupervised techniques for sentence similarity based only on semantic spaces, and we show that they can be significantly improved by word weighting. Our transformation outperforms other methods and, together with word weighting, leads to very promising results on several datasets in different languages. (C) 2019 Elsevier B.V. All rights reserved.
Pages: 9
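
The general idea described in the abstract (projecting one monolingual space into another with a linear map learned from a bilingual dictionary, then comparing weighted sentence vectors) can be illustrated with a minimal sketch. It uses the well-known orthogonal Procrustes solution as a stand-in and is not the novel transformation proposed in the paper. The toy vocabularies, the random vectors, the function names, and the word weights are all hypothetical.

```python
import numpy as np

def orthogonal_map(X, Y):
    # Orthogonal Procrustes: W = argmin ||X W - Y||_F subject to W^T W = I.
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

def sentence_vector(tokens, emb, weights):
    # Weighted average of the vectors of known words (default weight 1.0).
    known = [t for t in tokens if t in emb]
    w = np.array([weights.get(t, 1.0) for t in known])
    V = np.stack([emb[t] for t in known])
    return (w[:, None] * V).sum(axis=0) / w.sum()

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy data (hypothetical): the Spanish space is a rotated copy of the English one.
rng = np.random.default_rng(0)
en_words = ["dog", "cat", "house", "red", "runs"]
es_words = ["perro", "gato", "casa", "rojo", "corre"]
E = rng.normal(size=(5, 3))
R, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # random orthogonal matrix
S = E @ R
en = dict(zip(en_words, E))
es = dict(zip(es_words, S))

# Fit the map on four dictionary pairs; the pair "corre"/"runs" is held out.
W = orthogonal_map(S[:4], E[:4])
es_mapped = {word: vec @ W for word, vec in es.items()}

# Hypothetical IDF-like weights; translation pairs share the same weight here.
weights = {"dog": 2.0, "perro": 2.0, "house": 2.0, "casa": 2.0}
query = sentence_vector(["perro", "corre"], es_mapped, weights)
sim_match = cosine(query, sentence_vector(["dog", "runs"], en, weights))
sim_other = cosine(query, sentence_vector(["house", "red"], en, weights))
print(sim_match, sim_other)
```

Because the toy Spanish space is an exact rotation of the English one, the learned map recovers the held-out word as well, and the translated sentence pair scores higher than the unrelated pair. Real embedding spaces are only approximately related by a linear map, so results on actual data would be noisier.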