SICK through the SemEval glasses. Lesson learned from the evaluation of compositional distributional semantic models on full sentences through semantic relatedness and textual entailment

Cited by: 30
Authors
Bentivogli, Luisa [1]
Bernardi, Raffaella [2]
Marelli, Marco [2]
Menini, Stefano [1]
Baroni, Marco [2]
Zamparelli, Roberto [2]
Affiliations
[1] FBK, Via Sommarive 18, I-38123 Povo, TN, Italy
[2] Univ Trento, Corso Bettini 31, I-38068 Rovereto, TN, Italy
Funding
European Research Council
Keywords
Compositionality; Computational semantics; Distributional semantics models;
DOI
10.1007/s10579-015-9332-5
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
This paper is an extended description of SemEval-2014 Task 1, the task on the evaluation of Compositional Distributional Semantics Models on full sentences. Systems participating in the task were presented with pairs of sentences and were evaluated on their ability to predict human judgments on (1) semantic relatedness and (2) entailment. Training and testing data were subsets of the SICK (Sentences Involving Compositional Knowledge) data set. SICK was developed with the aim of providing a proper benchmark to evaluate compositional semantic systems, though task participation was open to systems based on any approach. Taking advantage of the SemEval experience, in this paper we analyze the SICK data set, in order to evaluate the extent to which it meets its design goal and to shed light on the linguistic phenomena that are still challenging for state-of-the-art computational semantic systems. Qualitative and quantitative error analyses show that many systems are quite sensitive to changes in the proportion of sentence pair types, and degrade in the presence of additional lexico-syntactic complexities which do not affect human judgements. More compositional systems seem to perform better when the task proportions are changed, but the effect needs further confirmation.
Pages: 95-124
Number of pages: 30