A Siamese Neural Network for Learning Semantically-Informed Sentence Embeddings

Cited by: 12
Authors
Bolucu, Necva [1 ,3 ]
Can, Burcu [2 ]
Artuner, Harun [1 ]
Affiliations
[1] Hacettepe Univ, Dept Comp Engn, Ankara, Turkiye
[2] Univ Wolverhampton, Res Inst Informat & Language Proc, Wolverhampton, England
[3] Hacettepe Univ, Grad Sch Sci & Engn, Ankara, Turkiye
Keywords
Semantic parsing; UCCA; Self-attention; Semantic textual similarity; Siamese Network; Recursive Neural Network
DOI
10.1016/j.eswa.2022.119103
CLC Classification
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Semantic representation is a way of expressing the meaning of a text in a form that a machine can process for natural language processing (NLP) tasks that require meaning comprehension, such as text summarisation, question answering, or machine translation. In this paper, we present a neural semantic parsing model that produces the semantic representation of a given sentence. We use the semantic representation of each sentence to generate semantically informed sentence embeddings for extrinsic evaluation of the proposed semantic parser, in particular on the semantic textual similarity task. Our neural parser employs a self-attention mechanism to learn semantic relations between the words in a sentence and generates the sentence's semantic representation in the UCCA (Universal Conceptual Cognitive Annotation) framework (Abend and Rappoport, 2013), a cross-linguistically applicable graph-based semantic representation. The UCCA representations are fed into a Siamese Neural Network built on top of two Recursive Neural Networks (Siamese-RvNN) to derive semantically informed sentence embeddings, which are evaluated on the semantic textual similarity task. We conduct both monolingual and cross-lingual experiments with zero-shot and few-shot learning, which show superior performance even in low-resource scenarios. The experimental results show that the proposed self-attentive neural parser outperforms other parsers in the literature on English and German, and yields a significant improvement in the cross-lingual setting for French, which has comparatively fewer resources. Moreover, results on other downstream tasks such as sentiment analysis confirm that semantically informed sentence embeddings are of higher quality than embeddings from pre-trained models such as SBERT (Reimers et al., 2019) or SimCSE (Gao et al., 2021), which do not utilise such structured information.
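To illustrate the Siamese weight-sharing scheme the abstract describes, the sketch below (Python/PyTorch) encodes two parse trees with a single shared recursive encoder and scores their similarity with cosine distance. The tree format, the mean-pooling composition function, and all dimensions are illustrative assumptions; the paper's actual Siamese-RvNN (UCCA edge labels, child ordering, training objective) is not specified in this record.

import torch
import torch.nn as nn

class TreeNode:
    """A node in a simplified UCCA-style tree; leaves carry word ids (assumed format)."""
    def __init__(self, word_id=None, children=None):
        self.word_id = word_id
        self.children = children or []

class RecursiveEncoder(nn.Module):
    """Composes child embeddings into a parent embedding, bottom-up."""
    def __init__(self, vocab_size, dim):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        # Children are mean-pooled before composition so the encoder handles
        # an arbitrary number of children (an assumption, not the paper's design).
        self.compose = nn.Sequential(nn.Linear(dim, dim), nn.Tanh())

    def forward(self, node):
        if not node.children:  # leaf: look up the word embedding
            return self.embed(torch.tensor(node.word_id))
        child_vecs = torch.stack([self.forward(c) for c in node.children])
        return self.compose(child_vecs.mean(dim=0))

class SiameseRvNN(nn.Module):
    """Two inputs, one shared encoder: the Siamese weight-tying scheme."""
    def __init__(self, vocab_size, dim=128):
        super().__init__()
        self.encoder = RecursiveEncoder(vocab_size, dim)

    def forward(self, tree_a, tree_b):
        emb_a = self.encoder(tree_a)
        emb_b = self.encoder(tree_b)
        # Cosine similarity in [-1, 1] serves as the STS score.
        return torch.cosine_similarity(emb_a, emb_b, dim=0)

# Toy usage with two tiny trees over a 10-word vocabulary.
model = SiameseRvNN(vocab_size=10)
t1 = TreeNode(children=[TreeNode(word_id=1), TreeNode(word_id=2)])
t2 = TreeNode(children=[TreeNode(word_id=1), TreeNode(word_id=3)])
print(model(t1, t2).item())

In the paper's setting, the input trees would be the UCCA graphs produced by the self-attentive parser, and the similarity score would be evaluated against gold STS labels.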
Pages: 12
References
132 in total
  • [1] Abend, O. (2013). Proceedings of the 10th International Conference on Computational Semantics (IWCS 2013), Long Papers, p. 1.
  • [2] Abend, O., Rappoport, A. (2017). The State of the Art in Semantic Representation. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017), Vol. 1, pp. 77-89.
  • [3] Abzianidze, L. (2017). Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2017), p. 242. DOI: 10.18653/v1/E17-2039.
  • [4] Afzal, N. (2016). Proceedings of SemEval, p. 674.
  • [5] Agirre, E. (2016). Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval 2016), p. 497. DOI: 10.18653/v1/S16-1081.
  • [6] Agirre, E. (2015). Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), p. 252. DOI: 10.18653/v1/S15-2045.
  • [7] Agirre, E. (2013). Proceedings of the Main Conference and the Shared Task, Vol. 1, p. 32.
  • [8] Agirre, E. (2012). *SEM 2012: The First Joint Conference on Lexical and Computational Semantics, p. 385. DOI: 10.5555/2387636.2387697.
  • [9] Ando, R. K. (2000). SIGIR Forum, 34, p. 216.
  • [10] [Anonymous] (2006). AAAI.