Semantic Textual Similarity on Brazilian Portuguese: An approach based on language-mixture models

Cited by: 2
Authors
Silva, A. [1]
Lozkins, A. [2]
Bertoldi, L. R. [1]
Rigo, S. [1]
Bure, V. M. [2]
Affiliations
[1] Univ Vale Rio dos Sinos, 950 Av Unisinos, BR-93020190 Sao Leopoldo, RS, Brazil
[2] St Petersburg State Univ, 7-9 Univ Skaya Nab, St Petersburg 199034, Russia
Source
VESTNIK SANKT-PETERBURGSKOGO UNIVERSITETA SERIYA 10 PRIKLADNAYA MATEMATIKA INFORMATIKA PROTSESSY UPRAVLENIYA | 2019 / Vol. 15 / No. 2
Keywords
Semantic Textual Similarity; natural language processing; computational linguistics; ontologies;
DOI
10.21638/11702/spbu10.2019.207
Chinese Library Classification
O1 [Mathematics];
Discipline classification code
0701 ; 070101 ;
Abstract
The literature describes Semantic Textual Similarity (STS) as a fundamental part of many Natural Language Processing (NLP) tasks. STS approaches depend on the availability of lexical-semantic resources. There have been several efforts to improve the lexical-semantic resources for the English language, and the state of the art reports a large number of applications for this language. Brazilian Portuguese linguistic resources, when compared with English ones, do not have the same availability regarding relations and contents, which causes a loss of precision in STS tasks. Therefore, the current work presents an approach that combines Brazilian Portuguese and English lexical-semantic ontology resources to exploit the full potential of the linguistic relations of both languages and to generate a language-mixture model for measuring STS. We evaluated the proposed approach with a well-known and respected Brazilian Portuguese STS dataset, which brought to light some considerations about mixture models and their relations with ontology language semantics.
Pages: 235-244
Page count: 10