Syntactic-Semantic Similarity Based on Dependency Tree Kernel

Cited by: 4
Authors
Alian, Marwah [1]
Awajan, Arafat [2]
Affiliations
[1] Hashemite Univ, Fac Sci, Basic Sci Dept, POB 330127, Zarqa 13133, Jordan
[2] Princess Sumaya Univ Technol, Comp Sci Dept, Amman, Jordan
Keywords
Dependency tree kernel; Semantic similarity; Syntactic semantic; Sentence similarity; Word embeddings; Arabic sentence similarity;
DOI
10.1007/s13369-023-07694-z
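Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Subject classification codes
07; 0710; 09
Abstract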
Word representations in the vector space generated by the Word2Vec model capture the semantic similarity between sentences by considering the context of words, but not their syntactic similarity. To address this problem, we propose to exploit the dependency tree, which gives the syntactic grammatical relations among the words of a sentence, and to combine it with the vector representations of words when computing the similarity between sentences. We also adapt a dependency tree kernel that incorporates the dependency relation type into the similarity of tree nodes, and investigate its effect on measuring the similarity between two sentences. Using three pre-trained embedding models and the FARASA dependency parser, we conduct experiments on an Arabic paraphrasing benchmark and the SemEval-MSRvid dataset to evaluate the adapted dependency tree kernels. The results show that these kernel functions provide satisfactory results in terms of precision and recall with the AraBERT embedding model, while AraVec and fastText achieve the best correlation values.
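The idea in the abstract, a tree kernel whose node similarity combines the dependency relation type with word-embedding similarity, can be illustrated with a minimal sketch. This is not the authors' kernel: the recursion (summing over all child pairs with a `decay` weight), the hand-built trees, and the three-dimensional toy vectors in `VECTORS` are all assumptions made for illustration, standing in for FARASA parser output and pre-trained embeddings.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    word: str
    relation: str  # dependency label linking this word to its head
    children: list = field(default_factory=list)


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def node_similarity(a, b, vectors):
    """Embedding similarity of two words, counted only if their relation types match."""
    if a.relation != b.relation:
        return 0.0
    return cosine(vectors[a.word], vectors[b.word])


def tree_kernel(a, b, vectors, decay=0.5):
    """Simplified recursive kernel: node similarity times (1 + decayed child-pair sum)."""
    s = node_similarity(a, b, vectors)
    if s == 0.0:
        return 0.0
    child_sum = sum(
        tree_kernel(ca, cb, vectors, decay)
        for ca in a.children
        for cb in b.children
    )
    return s * (1.0 + decay * child_sum)


def sentence_similarity(a, b, vectors, decay=0.5):
    """Kernel value normalized by the self-similarities of both trees."""
    kaa = tree_kernel(a, a, vectors, decay)
    kbb = tree_kernel(b, b, vectors, decay)
    if kaa <= 0 or kbb <= 0:
        return 0.0
    return tree_kernel(a, b, vectors, decay) / math.sqrt(kaa * kbb)


# Hypothetical toy embeddings; a real setup would load pre-trained vectors.
VECTORS = {
    "eats": [1.0, 0.0, 0.0],
    "cat": [0.0, 1.0, 0.2],
    "dog": [0.0, 0.9, 0.3],
    "fish": [0.1, 0.0, 1.0],
    "meat": [0.2, 0.0, 0.9],
}

# Hand-built dependency trees (assumed labels: root, nsubj, obj).
t1 = Node("eats", "root", [Node("cat", "nsubj"), Node("fish", "obj")])
t2 = Node("eats", "root", [Node("dog", "nsubj"), Node("meat", "obj")])
t3 = Node("eats", "root", [Node("fish", "nsubj"), Node("cat", "obj")])

print(sentence_similarity(t1, t2, VECTORS))  # similar words in the same roles
print(sentence_similarity(t1, t3, VECTORS))  # same words, roles swapped
```

In this toy setup, swapping subject and object lowers the score even though both sentences contain the same words, which is the kind of syntactic sensitivity that a plain average of word vectors would miss.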
Pages: 10937-10948
Number of pages: 12