Gender Identification Using Marginalised Stacked Denoising Autoencoders on Twitter Data

Cited by: 1
Authors
Al-onazi, Badriyya B. [1]
Nour, Mohamed K. [2]
Alshamrani, Hassan [3]
Al Duhayyim, Mesfer [4]
Mohsen, Heba [5]
Abdelmageed, Amgad Atta [6]
Mohammed, Gouse Pasha [6]
Zamani, Abu Sarwar [6]
Affiliations
[1] Princess Nourah Bint Abdulrahman Univ, Arab Language Teaching Inst, Dept Language Preparat, POB 84428, Riyadh 11671, Saudi Arabia
[2] Umm Al Qura Univ, Coll Comp & Informat Syst, Dept Comp Sci, Mecca 24211, Saudi Arabia
[3] King Saud Univ, Arab Linguist Inst, Dept Teachers Training, POB 145111, Riyadh 4545, Saudi Arabia
[4] Prince Sattam Bin Abdulaziz Univ, Coll Sci & Humanities Aflaj, Dept Comp Sci, Al Aflaj 16733, Saudi Arabia
[5] Future Univ Egypt, Fac Comp & Informat Technol, Dept Comp Sci, New Cairo 11835, Egypt
[6] Prince Sattam Bin Abdulaziz Univ, Dept Comp & Self Dev Preparatory Year Deanship, AlKharj, Saudi Arabia
Keywords
Arabic Twitter; gender identification; bat algorithm; hybrid deep learning; social media; Arabic corpus; dialect
DOI
10.32604/iasc.2023.034623
CLC number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Gender analysis of Twitter could reveal significant socio-cultural differences between female and male users. Efforts have been made to analyze and automatically infer gender for content in more commonly spoken languages, but only limited work has been undertaken for Arabic. Most research targets English, with far less effort devoted to non-English languages. Studies of Arabic demographic inference, such as gender, remain relatively uncommon for social networking users, especially on Twitter. Therefore, this study aims to design an optimal marginalized stacked denoising autoencoder for gender identification on Arabic Twitter (OMSDAE-GIAT) model. The presented OMSDAE-GIAT technique mainly concentrates on the identification and classification of gender in Twitter data. To attain this, the OMSDAE-GIAT model first performs data pre-processing and word embedding. Next, the MSDAE model is used to classify gender into two classes, namely male and female. In the final stage, the OMSDAE-GIAT technique uses an enhanced bat optimization algorithm (EBOA) for parameter tuning, which constitutes the novelty of the work. The performance of the OMSDAE-GIAT model is validated on an Arabic corpus dataset, and the results are measured under distinct metrics. The comparison study reported enhanced performance of the OMSDAE-GIAT model over other recent approaches.
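The abstract names a marginalized stacked denoising autoencoder but gives no formulas. As a rough illustration only, the sketch below follows the generic closed-form marginalized denoising layer usually attributed to Chen et al. (2012), not the authors' implementation. The function names `msda_layer` and `msda_stack`, the corruption probability `p`, the ridge term `reg`, and the random input matrix are all assumptions made for this example. The pre-processing, word embedding, classifier, and EBOA tuning stages are not shown.

```python
import numpy as np

def msda_layer(X, p, reg=1e-5):
    """One marginalized denoising layer (illustrative sketch).

    X   : array of shape (d, n), features by samples (e.g. embedded tweets).
    p   : probability that a feature is zeroed by the corruption.
    reg : small ridge term for numerical stability (assumed value).
    """
    d, n = X.shape
    Xb = np.vstack([X, np.ones((1, n))])      # append a constant bias row
    S = Xb @ Xb.T                             # scatter matrix, (d+1, d+1)
    q = np.full(d + 1, 1.0 - p)               # survival probability per feature
    q[-1] = 1.0                               # the bias row is never corrupted
    Q = S * np.outer(q, q)                    # expected corrupted scatter, off-diagonal
    np.fill_diagonal(Q, q * np.diag(S))       # diagonal entries survive only once
    P = S[:d, :] * q[np.newaxis, :]           # expected clean-by-corrupted scatter
    # W = P Q^{-1}; Q is symmetric, so solve Q W^T = P^T instead of inverting.
    W = np.linalg.solve(Q + reg * np.eye(d + 1), P.T).T
    H = np.tanh(W @ Xb)                       # nonlinear hidden representation
    return W, H

def msda_stack(X, p, n_layers):
    """Stack layers and concatenate the input with every hidden output."""
    feats = [X]
    H = X
    for _ in range(n_layers):
        _, H = msda_layer(H, p)
        feats.append(H)
    return np.vstack(feats)

# Hypothetical input: 5-dimensional embeddings for 20 tweets.
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((5, 20))
features = msda_stack(X_demo, p=0.5, n_layers=2)
```

In this sketch the stacked features would be passed to a separate male/female classifier, and values such as `p` and the number of layers are the kind of parameters a tuning procedure could search over.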
Pages: 2529-2544
Page count: 16