Measuring Gender: A Machine Learning Approach to Social Media Demographics and Author Profiling

Cited by: 0
Authors
Kovacs, Erik-Robert [1]
Cotfas, Liviu-Adrian [1]
Delcea, Camelia [1]
Affiliations
[1] Bucharest Univ Econ Studies, Dept Econ Informat & Cybernet, Bucharest 010552, Romania
Source
COMPUTATIONAL COLLECTIVE INTELLIGENCE, ICCCI 2023 | 2023 / Vol. 14162
Keywords
author profiling; gender identification; ensemble methods; social media analysis; COVID-19; sentiment analysis; Twitter; networks; tweets
DOI
10.1007/978-3-031-41456-5_26
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Social media has become a preeminent medium of communication in the early 21st century, facilitating dialogue between the political sphere, businesses, scientific experts, and everyday people. Researchers in the social sciences increasingly treat social media as a central site of social discourse, but their work is hampered by the lack of demographic data that could connect phenomena originating in social media spaces to their larger social context. Computational social science methods that apply machine learning and deep learning natural language processing (NLP) tools to the task of author profiling (AP) can serve as an essential complement to such research. One of the major demographic categories of interest concerning social media is the gender distribution of users. We propose an ensemble of multiple machine learning classifiers that first determines whether a user is anonymous, with an F1 score of 90.24%, and then predicts the gender of the user based on their name, obtaining an F1 score of 89.22%. We apply the classification pipeline to a set of approximately 44,000,000 posts related to COVID-19 extracted from the social media platform Twitter and compare our results to a benchmark classifier trained on the PAN18 Author Profiling dataset, showing the validity of the proposed approach. We also perform an n-gram analysis of the tweet text to further compare the two methods.
Pages: 337-349
Page count: 13
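
The n-gram analysis mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the function names `word_ngrams` and `top_ngrams`, and the sample texts are all assumptions made for illustration. It counts word n-grams over a group of texts so that the most frequent ones can be compared between groups, for example tweets labeled by two different classifiers.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the word n-grams (as tuples) of a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()  # assumption: naive whitespace tokenization
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams(texts, n=2, k=3):
    """Count n-grams across all texts and return the k most frequent with their counts."""
    counts = Counter()
    for text in texts:
        counts.update(word_ngrams(text, n))
    return counts.most_common(k)


# Hypothetical sample groups; real input would be tweets grouped by predicted label.
group_a = ["stay home stay safe", "please stay home"]
group_b = ["get the vaccine", "the vaccine works"]

for name, texts in (("group_a", group_a), ("group_b", group_b)):
    print(name, top_ngrams(texts, n=2, k=2))
```

Comparing the top-ranked n-grams of each group gives a rough view of whether two labeling methods pick up similar vocabulary; a production analysis would likely need a tweet-aware tokenizer and stop-word handling.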