Arabic word sense disambiguation using sense inventories

Cited by: 3
Authors
Alian M. [1]
Awajan A. [2]
Affiliations
[1] Basic Sciences Department, Faculty of Science, The Hashemite University, Zarqa
[2] Mutah University, Karak
Keywords
Sense inventory; Sentence similarity; Word embeddings; Word sense disambiguation
DOI
10.1007/s41870-022-01147-w
Abstract
Disambiguating words that have more than one meaning in context is one of the challenging tasks in natural language processing. Unsupervised approaches use word-embedding representations of words and their contexts to construct sense inventories. In this research, we propose a disambiguation approach that builds a sense inventory from pre-trained embeddings in an unsupervised manner. Part-of-speech tagging is applied to the retrieved senses, and their tags are compared with the tag of the ambiguous word to improve the selection of an appropriate sense. To evaluate the selected senses, sentence similarity experiments compare the use of the selected sense vector with the use of the ambiguous word's vector. The sense vectors significantly improve sentence similarity in terms of Pearson correlation. Using AraVec embeddings yields an enhanced correlation of 0.423. © 2023, The Author(s), under exclusive licence to Bharati Vidyapeeth's Institute of Computer Applications and Management.
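The abstract does not give implementation details, so the following is only a minimal sketch of the general idea it describes: filter candidate senses by part-of-speech tag, pick the sense whose vector is closest to the averaged context vector, and score results with Pearson correlation. The function names, the three-dimensional toy vectors, the example sense labels, and the fallback to all senses when no tag matches are illustrative assumptions, not the authors' method or data.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean_vector(vectors):
    """Element-wise average of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def select_sense(senses, target_pos, context_vectors):
    """Pick the sense closest to the context.

    senses: list of (label, pos_tag, vector) tuples (hypothetical layout).
    Senses whose tag matches target_pos are preferred; if none match,
    all senses are considered (an assumption of this sketch).
    """
    context = mean_vector(context_vectors)
    candidates = [s for s in senses if s[1] == target_pos] or senses
    return max(candidates, key=lambda s: cosine(s[2], context))


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


# Toy sense inventory for one ambiguous word (made-up labels and vectors).
senses = [
    ("eye", "NOUN", [1.0, 0.0, 0.0]),
    ("spring", "NOUN", [0.0, 1.0, 0.0]),
    ("appoint", "VERB", [0.0, 0.9, 0.1]),
]
# Toy vectors for the surrounding context words.
context_vectors = [[0.1, 0.9, 0.0], [0.0, 0.8, 0.2]]

best = select_sense(senses, "NOUN", context_vectors)
print(best[0])
```

In a sentence similarity evaluation of the kind the abstract mentions, the selected sense vector would replace the ambiguous word's vector when building the sentence representation, and `pearson` would compare the predicted similarity scores against human ratings.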
Pages: 735-744
Number of pages: 9