A Supervised Approach for Word Sense Disambiguation based on Arabic Diacritics

Cited by: 0
Authors
Alrakaf, Alaa Abdullah [1 ]
Rahman, Sk. Md. Mizanur [1 ]
Affiliations
[1] King Saud Univ, Coll Comp & Informat Sci, Riyadh, Saudi Arabia
Source
2016 5TH INTERNATIONAL CONFERENCE ON INFORMATICS, ELECTRONICS AND VISION (ICIEV) | 2016
Keywords
Arabic natural language processing; Machine learning; Machine translation; Naive Bayes Classifier; word sense disambiguation;
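DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes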
0808 ; 0809 ;
Abstract
Over the last two decades, Arabic natural language processing (ANLP) has become increasingly important. One of the key issues in ANLP is ambiguity: in Arabic, different pronunciations of the same written word can carry different meanings. Ambiguity also reduces the effectiveness and efficiency of Machine Translation (MT), and it has limited the usefulness and accuracy of translation from Arabic to English. The lack of Arabic resources complicates the problem further. Additionally, the orthographic level of representation alone cannot specify the exact meaning of a word. This paper examines the diacritics of Arabic and uses them to disambiguate ambiguous words. The proposed word sense disambiguation approach uses a Diacritizer application to diacritize Arabic text, then finds the most accurate sense of an ambiguous word using a Naive Bayes classifier. The system achieves 91% precision and a 12.11% error rate. This experimental study shows that using Arabic diacritics with a Naive Bayes classifier improves the accuracy of choosing the appropriate sense for ambiguous Arabic words.
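The abstract does not give implementation details, so the following is only a minimal sketch of how a Naive Bayes classifier could pick a sense from diacritized context. Everything in it is an assumption made for illustration: the function names `train_naive_bayes` and `predict_sense`, the bag-of-words features, the add-one (Laplace) smoothing, and the toy training data. The tokens are Latin transliterations with vowels written out, standing in for the output of a diacritizer, which is not shown. The ambiguous word is the undiacritized form علم, which can be read as "flag" or as "knowledge".

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Count senses and context tokens from (tokens, sense) pairs."""
    sense_counts = Counter()
    token_counts = defaultdict(Counter)
    vocabulary = set()
    for tokens, sense in examples:
        sense_counts[sense] += 1
        token_counts[sense].update(tokens)
        vocabulary.update(tokens)
    return sense_counts, token_counts, vocabulary


def predict_sense(model, tokens):
    """Return the sense with the highest log-posterior for the context."""
    sense_counts, token_counts, vocabulary = model
    total_examples = sum(sense_counts.values())
    vocab_size = len(vocabulary)
    best_sense, best_score = None, -math.inf
    for sense in sorted(sense_counts):
        # Log prior of the sense.
        score = math.log(sense_counts[sense] / total_examples)
        # Add-one smoothed log likelihood of each known context token.
        denominator = sum(token_counts[sense].values()) + vocab_size
        for token in tokens:
            if token in vocabulary:  # tokens never seen in training are skipped
                score += math.log((token_counts[sense][token] + 1) / denominator)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Hypothetical toy data: vowelled context tokens around the ambiguous word,
# labelled with the intended sense. Not taken from the paper.
EXAMPLES = [
    (["rafaa", "jundi"], "alam_flag"),
    (["dawla", "rafaa"], "alam_flag"),
    (["talaba", "nafi"], "ilm_knowledge"),
    (["kitab", "talaba"], "ilm_knowledge"),
]

model = train_naive_bayes(EXAMPLES)
print(predict_sense(model, ["rafaa", "dawla"]))
print(predict_sense(model, ["kitab", "nafi"]))
```

In this sketch the diacritics matter only through the tokens: two words that share an undiacritized spelling become distinct features once vowels are marked, which is one plausible way the diacritization step could help the classifier.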
Pages: 1015-1021
Page count: 7