Syntactic-Semantic Similarity Based on Dependency Tree Kernel

Cited by: 4
Authors
Alian, Marwah [1]
Awajan, Arafat [2]
Affiliations
[1] Hashemite Univ, Fac Sci, Basic Sci Dept, POB 330127, Zarqa 13133, Jordan
[2] Princess Sumaya Univ Technol, Comp Sci Dept, Amman, Jordan
Keywords
Dependency tree kernel; Semantic similarity; Syntactic semantic; Sentence similarity; Word embeddings; Arabic sentence similarity
DOI
10.1007/s13369-023-07694-z
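Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract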
Word representations in the vector space generated by the Word2Vec model capture the semantic similarity between sentences through the context of words, but not their syntactic similarity. To address this problem, we propose to exploit the dependency tree, which gives the syntactic (grammatical) relations among the words of a sentence, and to combine it with the vector representations of words when computing sentence similarity. We also adapt a dependency tree kernel that incorporates the dependency relation type together with the similarity of tree nodes to measure the similarity between two sentences, and we investigate its effect. Using three pre-trained embedding models and the FARASA dependency parser, we conduct experiments on an Arabic paraphrasing benchmark and the SemEval-MSRvid dataset to evaluate the adapted dependency tree kernels. The results show that these kernel functions provide satisfactory precision and recall with the AraBERT embedding model, while AraVec and fastText achieve the best correlation values.
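The general idea in the abstract, scoring two dependency trees by combining node (word-vector) similarity with a check on the dependency relation type, can be illustrated with a minimal sketch. This is not the authors' kernel: the recursion, the `decay` parameter, the `Node` class, and the two-dimensional toy vectors are all assumptions made for illustration, and the trees are written by hand instead of coming from a parser or a pre-trained embedding model.

```python
from dataclasses import dataclass, field
from math import sqrt


@dataclass
class Node:
    word: str
    relation: str   # dependency label linking this node to its head
    vector: list    # word embedding (hand-made toy values here)
    children: list = field(default_factory=list)


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def tree_kernel(a, b, decay=0.5):
    """Hypothetical kernel-style score: nodes contribute only when their
    dependency relation types match; child pairs are added with a decay."""
    if a.relation != b.relation:
        return 0.0
    score = cosine(a.vector, b.vector)
    for ca in a.children:
        for cb in b.children:
            score += decay * tree_kernel(ca, cb, decay)
    return score


def normalized_kernel(a, b, decay=0.5):
    """Scale the score by the self-similarities of both trees."""
    prod = tree_kernel(a, a, decay) * tree_kernel(b, b, decay)
    return tree_kernel(a, b, decay) / sqrt(prod) if prod > 0 else 0.0


# Toy trees for "cat eats fish" and "kitten devours fish" (made-up vectors).
s1 = Node("eats", "root", [1.0, 0.0], [
    Node("cat", "nsubj", [0.0, 1.0]),
    Node("fish", "obj", [1.0, 1.0]),
])
s2 = Node("devours", "root", [0.9, 0.1], [
    Node("kitten", "nsubj", [0.1, 0.9]),
    Node("fish", "obj", [1.0, 1.0]),
])

print(normalized_kernel(s1, s2))
```

In this sketch the relation-type test acts as a hard gate, so a subject is never compared with an object; a softer weighting of mismatched relations would be an equally plausible design choice.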
Pages: 10937-10948
Number of pages: 12