Semantic Textual Similarity on Brazilian Portuguese: An approach based on language-mixture models

Cited by: 2
Authors
Silva, A. [1]
Lozkins, A. [2]
Bertoldi, L. R. [1]
Rigo, S. [1]
Bure, V. M. [2]
Affiliations
[1] Univ Vale Rio dos Sinos, 950 Av Unisinos, BR-93020190 Sao Leopoldo, RS, Brazil
[2] St Petersburg State Univ, 7-9 Universitetskaya Nab, St Petersburg 199034, Russia
Source
VESTNIK SANKT-PETERBURGSKOGO UNIVERSITETA SERIYA 10 PRIKLADNAYA MATEMATIKA INFORMATIKA PROTSESSY UPRAVLENIYA | 2019 / Vol. 15 / No. 02
Keywords
Semantic Textual Similarity; natural language processing; computational linguistics; ontologies;
DOI
10.21638/11702/spbu10.2019.207
Chinese Library Classification (CLC) number
O1 [Mathematics];
Discipline classification code
0701 ; 070101 ;
Abstract
The literature describes Semantic Textual Similarity (STS) as a fundamental part of many Natural Language Processing (NLP) tasks. STS approaches depend on the availability of lexical-semantic resources. There have been several efforts to improve lexical-semantic resources for English, and the state of the art reports a large number of applications for this language. Brazilian Portuguese linguistic resources, compared with English ones, do not offer the same coverage of relations and content, which causes a loss of precision in STS tasks. The current work therefore presents an approach that combines Brazilian Portuguese and English lexical-semantic ontology resources to exploit the full potential of the linguistic relations of both languages and to generate a language-mixture model for measuring STS. We evaluated the proposed approach on a well-known and respected Brazilian Portuguese STS dataset, which brought to light some considerations about mixture models and their relation to ontology language semantics.
Pages: 235-244
Page count: 10