Distributional Semantic Model Based on Convolutional Neural Network for Arabic Textual Similarity

Cited by: 3
Authors
Mahmoud, Adnen [1]
Zrigui, Mounir [2]
Affiliations
[1] Higher Inst Comp Sci & Commun Tech, Monastir, Tunisia
[2] Fac Sci Monastir, Monastir, Tunisia
Keywords
Arabic Language; Context Based Approach; Global Vectors Representation; Natural Language Processing; Paraphrase Detection; Semantic Similarity; Word Embedding; Word2vec
DOI
10.4018/IJCINI.2020010103
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The problem addressed is to develop a model that can reliably identify whether a previously unseen pair of documents is a paraphrase. Paraphrase detection in Arabic documents is challenging because of the variability in features and the lack of publicly available corpora. To address these problems, the authors propose a semantic approach. At the feature extraction level, the authors use the Global Vectors (GloVe) representation, which combines global co-occurrence counts with a contextual skip-gram model. At the paraphrase identification level, the authors apply a convolutional neural network to learn more contextual and semantic information between documents. For the experiments, the authors use the Open Source Arabic Corpora as the source corpus and collect several additional datasets to build a vocabulary model. To construct the paraphrased corpus, the authors replace each word in the source corpus with its most similar word of the same grammatical class, using the word2vec algorithm and part-of-speech annotation. Experiments show that the model achieves promising precision and recall compared with existing approaches in the literature.
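The corpus construction step described in the abstract (replace each word with its most similar word of the same grammatical class) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the `VECTORS` and `POS` tables are hypothetical toy values standing in for a trained word2vec model and a part-of-speech tagger, English words stand in for Arabic tokens, and the function names are invented for this example.

```python
import math

# Hypothetical toy embeddings and POS tags, for illustration only.
# In practice these would come from a trained word2vec model and a POS tagger.
VECTORS = {
    "big": [0.9, 0.1, 0.0],
    "large": [0.85, 0.15, 0.05],
    "house": [0.1, 0.9, 0.2],
    "home": [0.12, 0.88, 0.25],
    "build": [0.15, 0.8, 0.3],
}
POS = {"big": "ADJ", "large": "ADJ", "house": "NOUN", "home": "NOUN", "build": "VERB"}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar_same_pos(word):
    """Return the nearest neighbour of `word` that shares its POS tag.

    Words that are out of vocabulary, or that have no same-class
    candidate, are returned unchanged.
    """
    if word not in VECTORS:
        return word
    candidates = [w for w in VECTORS if w != word and POS[w] == POS[word]]
    if not candidates:
        return word
    return max(candidates, key=lambda w: cosine(VECTORS[word], VECTORS[w]))


def paraphrase(tokens):
    """Build a paraphrased token sequence by word-level substitution."""
    return [most_similar_same_pos(t) for t in tokens]
```

In this sketch the part-of-speech filter is applied before the similarity ranking, so a word that is close in the vector space but belongs to a different grammatical class (here, `build` relative to `house`) is never chosen as a replacement.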
Pages: 35-50
Number of pages: 16
Related papers
50 in total
  • [11] Short Text Semantic Similarity Measurement Approach Based on Semantic Network
    Hameed, Naamah Hussien
    Alimi, Adel M.
    Sadiq, Ahmed T.
    BAGHDAD SCIENCE JOURNAL, 2022, 19 (06) : 1581 - 1591
  • [12] Word Embedding based Textual Semantic Similarity Measure in Bengali
    Iqbal, Md Asif
    Sharif, Omar
    Hoque, Mohammed Moshiul
    Sarker, Iqbal H.
    10TH INTERNATIONAL YOUNG SCIENTISTS CONFERENCE IN COMPUTATIONAL SCIENCE (YSC2021), 2021, 193 : 92 - 101
  • [13] Semantic Similarity Algorithm Based on Generalized Regression Neural Network
    Cao, Rui
    Wu, Lingda
    Wang, Rui
    Yang, Chao
    PROCEEDINGS OF THE FIRST INTERNATIONAL CONFERENCE ON INFORMATION SCIENCES, MACHINERY, MATERIALS AND ENERGY (ICISMME 2015), 2015, 126 : 1333 - 1336
  • [14] Evaluation of semantic similarity using vector space model based on textual corpus
    Hssina, Badr
    Bouikhalene, Belaid
    Merbouha, Abdelkrim
    2016 13TH INTERNATIONAL CONFERENCE ON COMPUTER GRAPHICS, IMAGING AND VISUALIZATION (CGIV), 2016, : 295 - 300
  • [15] Stemming for Arabic Words Similarity Measures based on Latent Semantic Analysis Model
    Froud, Hanane
    Lachkar, Abdelmonaime
    Alaoui Ouatik, Said
    2012 INTERNATIONAL CONFERENCE ON MULTIMEDIA COMPUTING AND SYSTEMS (ICMCS), 2012, : 780 - 784
  • [16] Semantic Template-based Convolutional Neural Network for Text Classification
    Chang, Yung-Chun
    Ng, Siu Hin
    Chen, Jung-Peng
    Liang, Yu-Chi
    Hsu, Wen-Lian
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2023, 22 (11)
  • [17] Multi-Channel Embedding Convolutional Neural Network Model for Arabic Sentiment Classification
    Dahou, Abdelghani
    Xiong, Shengwu
    Zhou, Junwei
    Abd Elaziz, Mohamed
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2019, 18 (04)
  • [18] Semantic Convolutional Neural Network model for Safe Business Investment by Using BERT
    Heidari, Maryam
    Rafatirad, Setareh
    2020 SEVENTH INTERNATIONAL CONFERENCE ON SOCIAL NETWORK ANALYSIS, MANAGEMENT AND SECURITY (SNAMS), 2020, : 142 - 147
  • [19] Phrase-based Semantic Textual Similarity for Linking Researchers
    Reyes-Ortiz, Jose A.
    Bravo, Maricela
    Padilla, Omar E.
    2015 26TH INTERNATIONAL WORKSHOP ON DATABASE AND EXPERT SYSTEMS APPLICATIONS (DEXA), 2015, : 202 - 206
  • [20] Crossmodal Network-Based Distributional Semantic Models
    Iosif, Elias
    Potamianos, Alexandros
    LREC 2016 - TENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2016, : 3973 - 3979