Combination of information retrieval methods with LESK algorithm for Arabic word sense disambiguation

Cited by: 31
Authors
Zouaghi, Anis [1]
Merhbene, Laroussi [1]
Zrigui, Mounir [1]
Affiliations
[1] ISI Medenine, UTIC Lab, Unit Monastir, Medenine, Tunisia
Keywords
Arabic word sense disambiguation (AWSD); Unsupervised and incremental approach; Knowledge-based approach; Information retrieval methods; Lesk algorithm
DOI
10.1007/s10462-011-9249-3
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose using the Harman, Croft, and Okapi measures with the Lesk algorithm to develop a system for Arabic word sense disambiguation that combines unsupervised and knowledge-based methods. The system is intended to resolve lexical semantic ambiguity in Arabic. The information retrieval measures estimate the most relevant sense of the ambiguous word by returning a semantic coherence score for the context that is semantically closest to the original sentence containing the ambiguous word. The Lesk algorithm then selects the appropriate sense from those proposed by the information retrieval measures. This selection is based on a comparison between the glosses of the word to be disambiguated and its different contexts of use extracted from a corpus. Our experimental study shows that using the Lesk algorithm with the Harman, Croft, and Okapi measures yields an accuracy rate of 73%.
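The pipeline described in the abstract can be illustrated with a minimal sketch. It is one possible reading, not the authors' implementation: it uses only an Okapi BM25 score (the Harman and Croft measures are not shown), a simplified Lesk overlap count, and toy English tokens in place of Arabic data. The function names `bm25_scores`, `lesk_overlap`, and `disambiguate`, the parameter values `k1=1.2` and `b=0.75`, and the sample glosses and contexts are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against a tokenized query (Okapi BM25)."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in set(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def lesk_overlap(gloss, context):
    """Simplified Lesk: count distinct words shared by a gloss and a context."""
    return len(set(gloss) & set(context))


def disambiguate(sentence, glosses, contexts):
    """Retrieve the corpus context closest to the sentence, then pick the
    sense whose gloss overlaps most with the sentence plus that context."""
    scores = bm25_scores(sentence, contexts)
    best = max(range(len(contexts)), key=scores.__getitem__)
    evidence = set(sentence) | set(contexts[best])
    return max(glosses, key=lambda sense: lesk_overlap(glosses[sense], evidence))


# Hypothetical toy data for the ambiguous word "bank".
glosses = {
    "finance": ["institution", "money", "deposit", "loan"],
    "river": ["land", "edge", "river", "water"],
}
contexts = [
    ["she", "opened", "an", "account", "to", "deposit", "money"],
    ["they", "walked", "along", "the", "river", "near", "the", "water"],
]
sentence = ["he", "went", "to", "the", "bank", "to", "deposit", "his", "salary"]

context_scores = bm25_scores(sentence, contexts)
chosen_sense = disambiguate(sentence, glosses, contexts)
print(chosen_sense)
```

In this toy run the first context scores higher against the sentence, and the "finance" gloss shares "deposit" and "money" with the combined evidence, so that sense is selected. A real system would need Arabic tokenization, stop-word handling, and a sense inventory, none of which are covered here.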
Pages: 257-269
Number of pages: 13
Related papers
6 items in total
  • [1] Combination of information retrieval methods with LESK algorithm for Arabic word sense disambiguation
    Anis Zouaghi
    Laroussi Merhbene
    Mounir Zrigui
    Artificial Intelligence Review, 2012, 38 : 257 - 269
  • [2] Modified lesk algorithm for word sense disambiguation in Bengali
    Das, Ratul
    Pal, Alok Ranjan
    Saha, Diganta
    SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES, 2024, 49 (02):
  • [3] Word Sense Disambiguation (WSD) for Indonesian Homograph Word Meaning Determination by LESK Algorithm Application
    Basuki, Setio
    Kholimi, Ali Sofyan
    Minarno, Agus Eko
    Sumadi, Fauzi Dwi Setiawan
    Effendy, M. Rizal Arif
    PROCEEDINGS OF 2019 12TH INTERNATIONAL CONFERENCE ON INFORMATION & COMMUNICATION TECHNOLOGY AND SYSTEM (ICTS), 2019: 8 - 15
  • [4] Information Retrieval with Word Sense Disambiguation for Spanish
    Ledo Mezquita, Yoel
    COMPUTACION Y SISTEMAS, 2008, 11 (03): 288 - 300
  • [5] ARABIC WORD SENSE DISAMBIGUATION
    Merhbene, Laroussi
    Zouaghi, Anis
    Zrigui, Mounir
    ICAART 2010: PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE, VOL 1: ARTIFICIAL INTELLIGENCE, 2010: 652 - 655
  • [6] Hindi Word Sense Disambiguation Using Lesk Approach on Bigram and Trigram Words
    Gautam, Chandra Bhal Singh
    Sharma, Dilip Kumar
    INTERNATIONAL CONFERENCE ON ADVANCES IN INFORMATION COMMUNICATION TECHNOLOGY & COMPUTING, 2016, 2016,