Semantic Textual Similarity on Brazilian Portuguese: An approach based on language-mixture models

Cited by: 2

Authors:

Silva, A. [1]

Lozkins, A. [2]

Bertoldi, L. R. [1]

Rigo, S. [1]

Bure, V. M. [2]

Affiliations:

[1] Univ Vale Rio dos Sinos, 950 Av Unisinos, BR-93020190 Sao Leopoldo, RS, Brazil

[2] St Petersburg State Univ, 7-9 Universitetskaya Nab, St Petersburg 199034, Russia

Source:

VESTNIK SANKT-PETERBURGSKOGO UNIVERSITETA SERIYA 10 PRIKLADNAYA MATEMATIKA INFORMATIKA PROTSESSY UPRAVLENIYA | 2019 / Vol. 15 / No. 2

Keywords:

Semantic Textual Similarity; natural language processing; computational linguistics; ontologies;

DOI:

10.21638/11702/spbu10.2019.207

Chinese Library Classification number:

O1 [Mathematics];

Subject classification code:

0701 ; 070101 ;

Abstract:

The literature describes Semantic Textual Similarity (STS) as a fundamental part of many Natural Language Processing (NLP) tasks. STS approaches depend on the availability of lexical-semantic resources. There have been several efforts to improve the lexical-semantic resources for the English language, and the state of the art reports a large number of applications for this language. Brazilian Portuguese linguistic resources, when compared with English ones, do not have the same availability regarding relations and content, which causes a loss of precision in STS tasks. Therefore, the current work presents an approach that combines Brazilian Portuguese and English lexical-semantic ontology resources to exploit the full potential of the linguistic relations of both languages and to generate a language-mixture model to measure STS. We evaluated the proposed approach on a well-known and respected Brazilian Portuguese STS dataset, which brought to light some considerations about mixture models and their relations with ontology language semantics.


Pages: 235-244

Page count: 10

References: 24 entries in total

[11] Freire J., 2016, PROPOR INT C COMP PR

[12] Gomaa W.H., 2013, INT J COMPUT APPL, V68, P975, DOI 10.5120/11638-7118

[13] Hartmann N, 2017, arXiv:1708.06025

[14] Hartmann NS, 2016, LINGUAMATICA, V8, P59

[15] An iterative approach for the global estimation of sentence similarity [J]. Kajiwara, Tomoyuki; Bollegala, Danushka; Yoshida, Yuichi; Kawarabayashi, Ken-ichi. PLOS ONE, 2017, 12 (09)

[16] Robust semantic text similarity using LSA, machine learning, and linguistic resources [J]. Kashyap, Abhay; Han, Lushan; Yus, Roberto; Sleeman, Jennifer; Satyapanich, Taneeya; Gandhi, Sunil; Finin, Tim. LANGUAGE RESOURCES AND EVALUATION, 2016, 50 (01): 125-161

[17] Lozkins A, 2016, VESTN ST PET U-P MAT, V12, P28

[18] Mikolov T., 2013, P 1 INT C LEARN REPR, DOI 10.48550/arXiv.1301.3781

[19] WORDNET - A LEXICAL DATABASE FOR ENGLISH [J]. MILLER, GA. COMMUNICATIONS OF THE ACM, 1995, 38 (11): 39-41

[20] Paiva V., 2012, COLING 2012
