Comparative study of word embeddings models and their usage in Arabic language applications

Cited: 0
Authors
Suleiman, Dima [1 ,2 ]
Awajan, Arafat [1 ]
Affiliations
[1] Princess Sumaya Univ Technol, King Hussein Fac Comp Sci, Dept Comp Sci, Amman, Jordan
[2] Univ Jordan, Dept Informat Technol, Amman, Jordan
Source
2018 19TH INTERNATIONAL ARAB CONFERENCE ON INFORMATION TECHNOLOGY (ACIT) | 2018
Keywords
word embeddings; deep learning; sentiment analysis; word2vec; GloVe; semantic similarity; CBOW; Skip-gram;
DOI
Not available
CLC number
TP [Automation technology, computer technology]
Discipline code
0812
Abstract
Word embeddings represent text as vectors such that words with similar syntax and semantics have similar vector representations. Representing words as vectors is crucial for most natural language processing applications: when a neural network is used for processing, the word vectors are fed to the network as its input. In this paper, a comparative study of several word embedding models is conducted, covering GloVe and the two approaches of the word2vec model, CBOW and Skip-gram. Furthermore, this study surveys the state of the art in applying word embeddings to Arabic language applications such as sentiment analysis, semantic similarity, short answer grading, information retrieval, paraphrase identification, plagiarism detection, and textual entailment.
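To illustrate the core property the abstract describes (semantically related words receive nearby vectors, typically compared by cosine similarity), here is a minimal sketch; the embedding values are hypothetical toy numbers, not taken from the paper or from any trained model.

```python
import math

# Toy embedding table (hypothetical values, not from a trained model):
# semantically related words are given nearby vectors on purpose.
embeddings = {
    "king":  [0.90, 0.80, 0.10],
    "queen": [0.85, 0.82, 0.15],
    "apple": [0.10, 0.20, 0.90],
}

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 1.0 means identical direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

sim_related = cosine_similarity(embeddings["king"], embeddings["queen"])
sim_unrelated = cosine_similarity(embeddings["king"], embeddings["apple"])
# The related pair scores higher, which is exactly the property that
# trained models such as word2vec (CBOW/Skip-gram) and GloVe optimize for.
print(sim_related > sim_unrelated)
```

In a trained model these vectors are learned from co-occurrence statistics rather than set by hand, but downstream applications consume them in the same way: as fixed-length numeric inputs to a neural network.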
Pages: 64-70 (7 pages)
Related papers (50 in total)
  • [32] ArWordVec: efficient word embedding models for Arabic tweets
    Fouad, Mohammed M.
    Mahany, Ahmed
    Aljohani, Naif
    Abbasi, Rabeeh Ayaz
    Hassan, Saeed-Ul
    SOFT COMPUTING, 2020, 24 (11) : 8061 - 8068
  • [34] Age of Exposure 2.0: Estimating word complexity using iterative models of word embeddings
    Botarleanu, Robert-Mihai
    Dascalu, Mihai
    Watanabe, Micah
    Crossley, Scott Andrew
    McNamara, Danielle S.
    BEHAVIOR RESEARCH METHODS, 2022, 54 (06) : 3015 - 3042
  • [35] Enriching Word Embeddings with Global Information and Testing on Highly Inflected Language
    Svoboda, Lukas
    Brychcin, Tomas
    COMPUTACION Y SISTEMAS, 2019, 23 (03): 773 - 783
  • [36] Biomedical Semantic Embeddings: Using hybrid sentences to construct biomedical word embeddings and its applications
    Shaik, Arshad
    Jin, Wei
    2019 IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS (ICHI), 2019,
  • [37] Word Embeddings Reveal How Fundamental Sentiments Structure Natural Language
    van Loon, Austin
    Freese, Jeremy
    AMERICAN BEHAVIORAL SCIENTIST, 2023, 67 (02) : 175 - 200
  • [38] Application of WordNet and word embeddings in the development of prototypes for automatic language generation
    Dominguez Vazquez, Maria Jose
    LINGUAMATICA, 2020, 12 (02): 71 - 80
  • [39] What do BERT word embeddings learn about the French language?
    Goliakova, Ekaterina
    Langlois, David
    PROCEEDINGS OF THE SIXTH INTERNATIONAL CONFERENCE COMPUTATIONAL LINGUISTICS IN BULGARIA, CLIB 2024, 2024, : 14 - 32
  • [40] Affective Knowledge-enhanced Emotion Detection in Arabic Language: A Comparative Study
    Serrano-Guerrero, Jesus
    Alshouha, Bashar
    Romero, Francisco P.
    Olivas, Jose A.
    JOURNAL OF UNIVERSAL COMPUTER SCIENCE, 2022, 28 (07) : 733 - 757