Comparative study of word embeddings models and their usage in Arabic language applications

Authors
Suleiman, Dima [1 ,2 ]
Awajan, Arafat [1 ]
Affiliations
[1] Princess Sumaya Univ Technol, King Hussein Fac Comp Sci, Dept Comp Sci, Amman, Jordan
[2] Univ Jordan, Dept Informat Technol, Amman, Jordan
Source
2018 19th International Arab Conference on Information Technology (ACIT) | 2018
Keywords
word embeddings; deep learning; sentiment analysis; word2vec; GloVe; semantic similarity; CBOW; Skip-gram
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Word embeddings represent text as vectors such that words with similar syntax and semantics have similar vector representations. Representing words as vectors is crucial for most natural language processing applications. When a neural network is used to process natural language, the word vectors are fed to the network as input. This paper presents a comparative study of several word embedding models, including GloVe and the two approaches of the word2vec model, CBOW and Skip-gram. Furthermore, the study surveys most of the state of the art in using word embeddings in Arabic language applications such as sentiment analysis, semantic similarity, short answer grading, information retrieval, paraphrase identification, plagiarism detection, and textual entailment.
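
The claim that similar words receive similar vectors is commonly checked with cosine similarity. The following is a minimal sketch of that check, not the paper's method: the three-dimensional vectors and the `toy_vectors` name are made up for illustration, whereas real word2vec or GloVe vectors are learned from a corpus and typically have hundreds of dimensions.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings (assumption: values invented for illustration,
# not taken from any trained model).
toy_vectors = {
    "king": [0.8, 0.6, 0.1],
    "queen": [0.7, 0.7, 0.1],
    "banana": [0.1, 0.0, 0.9],
}

# Related words should score higher than unrelated ones.
sim_close = cosine_similarity(toy_vectors["king"], toy_vectors["queen"])
sim_far = cosine_similarity(toy_vectors["king"], toy_vectors["banana"])
print(f"king vs queen:  {sim_close:.3f}")
print(f"king vs banana: {sim_far:.3f}")
```

With these toy values, the pair chosen to be related scores much closer to 1 than the unrelated pair, which is the property the abstract describes.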
Pages: 64-70
Page count: 7