Gender identification for Egyptian Arabic dialect in twitter using deep learning models

Cited by: 14
Authors
ElSayed, Shereen [1]
Farouk, Mona [1]
Affiliations
[1] Cairo Univ, Fac Engn, Giza, Egypt
Keywords
Gender identification; Egyptian Arabic text classification; Deep learning; Natural language processing; Social Media analysis and mining;
DOI
10.1016/j.eij.2020.04.001
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Although the number of Arabic-language writers on social media is increasing, research targeting Author Profiling (AP) is still at an early stage. This paper investigates Gender Identification (GI) (male or female) of authors posting Egyptian-dialect tweets using Neural Network (NN) models. Several NN architectures are explored with extensive parameter selection: a simple Artificial Neural Network (ANN), a Convolutional Neural Network (CNN), a Long Short-Term Memory (LSTM) network, a Convolutional Bidirectional Long Short-Term Memory (C-Bi-LSTM) network, and a Convolutional Bidirectional Gated Recurrent Unit (C-Bi-GRU) network, which is tuned for the GI problem at hand. The best GI accuracy, obtained with the multichannel C-Bi-GRU model, is 91.37%. Notably, the presence of the bidirectional layer and the convolutional layer in the NN models significantly improved GI accuracy. © 2020 Production and hosting by Elsevier B.V. on behalf of Faculty of Computers and Artificial Intelligence, Cairo University.
Pages: 159-167
Number of pages: 9