Word sense disambiguation for Arabic text using Wikipedia and Vector Space Model

Cited by: 12
Authors
Alian, Marwah [1]
Awajan, Arafat [2]
Al-Kouz, Akram [2]
Affiliations
[1] Hashemite Univ, Zarqa, Jordan
[2] Princess Sumaya Univ Technol, Amman, Jordan
Keywords
Arabic word sense disambiguation; Disambiguation resource; Vector space model; Wikipedia
DOI
10.1007/s10772-016-9376-y
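Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809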
Abstract
In this research we introduce a new approach to Arabic word sense disambiguation that uses Wikipedia as the lexical resource. The nearest sense of an ambiguous word is selected by representing the word's context and the senses retrieved from Wikipedia in a Vector Space Model (VSM) and measuring the cosine similarity between them. Three experiments were conducted to evaluate the proposed approach: two use the first sentence retrieved from Wikipedia for each sense but differ in their VSM representation, while the third uses the first paragraph retrieved for each sense. The experiments show that using the first paragraph gives better results than using the first sentence, and that TF-IDF weighting gives better results than abstract frequency in the VSM. The proposed approach was also tested on English words, where it gives better results using the first sentence retrieved from Wikipedia for each sense.
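The selection step summarized in the abstract (represent the context and each candidate sense as vectors, then pick the sense with the highest cosine similarity) can be illustrated with a minimal sketch. This is not the authors' implementation: the tokenizer, the smoothed IDF formula, the function names `tfidf_vectors`, `cosine`, and `nearest_sense`, and the English sample glosses are all assumptions made for illustration, and retrieval from Wikipedia is left out, so the sense texts are passed in directly.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase word characters; \w is Unicode-aware in Python 3.
    return re.findall(r"\w+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Assumed weighting: raw term count times a smoothed IDF.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_sense(context, senses):
    """Pick the sense whose text is most similar to the context.

    `senses` maps a hypothetical sense label to the text retrieved for it
    (for example a first sentence or a first paragraph).
    """
    labels = list(senses)
    docs = [tokenize(context)] + [tokenize(senses[label]) for label in labels]
    vectors = tfidf_vectors(docs)
    scores = {label: cosine(vectors[0], vec) for label, vec in zip(labels, vectors[1:])}
    return max(scores, key=scores.get), scores


# Illustrative English example; the glosses are made up, not Wikipedia text.
senses = {
    "financial": "A bank is a financial institution that accepts deposits of money and makes loans",
    "river": "A bank is land alongside a river or stream",
}
best, scores = nearest_sense("She deposited money into her bank account", senses)
print(best, scores)
```

Replacing the TF-IDF weight with the raw count `c` gives a plain frequency representation, which is one way to compare the two kinds of VSM weighting the abstract contrasts.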
Pages: 857-867
Number of pages: 11