A hybrid possibilistic approach for Arabic full morphological disambiguation

Cited by: 17
Authors
Bounhas, Ibrahim [1]
Ayed, Raja [2]
Elayeb, Bilel [2, 3]
Ben Saoud, Narjes Bellamine [2, 4]
Affiliations
[1] Carthage Univ, LISI Lab Comp Sci Ind Syst, Carthage, Tunisia
[2] Manouba Univ, RIADI Res Lab, ENSI, Manouba 2010, Tunisia
[3] Emirates Coll Technol, Abu Dhabi, U Arab Emirates
[4] Tunis El Manar Univ, Higher Inst Informat ISI, Ariana 2080, Tunisia
Keywords
Arabic morphological disambiguation; Possibilistic classification; Imperfect data; Linguistic rules; Out-of-Vocabulary word analysis; Classifier
DOI
10.1016/j.datak.2015.06.008
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Morphological ambiguity is an important phenomenon that affects several tasks in Arabic text analysis, indexing and mining, yet it has not been well studied in related work. In this paper, we investigate new approaches to disambiguating the morphological features of non-vocalized Arabic texts by combining statistical classification and linguistic rules. We perform unsupervised training on unlabelled vocalized Arabic corpora, so the training and testing sets contain imperfect instances (i.e., instances with ambiguous attributes and/or classes). To handle imperfect data, we compare two approaches: (i) a possibilistic approach that handles imperfection directly; and (ii) a data transformation-based approach that converts an imperfect dataset into a perfect one, so that classical classifiers can be used. We also present an approach for dealing with unknown (Out-of-Vocabulary) words. The experiments focus mainly on classical texts, which have not been sufficiently studied in related work. We show that the possibilistic approach performs better than the transformation-based one. We also report encouraging results regarding (i) the role of linguistic rules in improving disambiguation rates; and (ii) the accuracy of our approach for full morphological disambiguation of unknown words. © 2015 Elsevier B.V. All rights reserved.
Pages: 240-254
Page count: 15