Measuring Gender: A Machine Learning Approach to Social Media Demographics and Author Profiling

Cited by: 0
Authors
Kovacs, Erik-Robert [1]
Cotfas, Liviu-Adrian [1]
Delcea, Camelia [1]
Affiliations
[1] Bucharest Univ Econ Studies, Dept Econ Informat & Cybernet, Bucharest 010552, Romania
Source
COMPUTATIONAL COLLECTIVE INTELLIGENCE, ICCCI 2023 | 2023 / Vol. 14162
Keywords
author profiling; gender identification; ensemble methods; social media analysis; COVID-19; SENTIMENT ANALYSIS; TWITTER; NETWORKS; TWEETS;
DOI
10.1007/978-3-031-41456-5_26
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Social media has become a preeminent medium of communication during the early 21st century, facilitating dialogue between the political sphere, businesses, scientific experts, and everyday people. Researchers in the social sciences are focusing their attention on social media as a central site of social discourse, but such approaches are hampered by the lack of demographic data that could help them connect phenomena originating in social media spaces to their larger social context. Computational social science methods that use machine learning and deep learning natural language processing (NLP) tools for the task of author profiling (AP) can serve as an essential complement to such research. One of the major demographic categories of interest concerning social media is the gender distribution of users. We propose an ensemble of multiple machine learning classifiers that first distinguishes whether a user is anonymous, with an F1 score of 90.24%, and then predicts the gender of the user based on their name, obtaining an F1 score of 89.22%. We apply the classification pipeline to a set of approximately 44,000,000 posts related to COVID-19 extracted from the social media platform Twitter and compare our results to a benchmark classifier trained on the PAN18 Author Profiling dataset, showing the validity of the proposed approach. We also perform an n-gram analysis of the tweet text to further compare the two methods.
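The abstract reports F1 scores and an n-gram analysis without defining either. As a minimal sketch (not the authors' code), the Python below shows the standard F1 formula computed from raw counts and a simple word-level n-gram counter. The function names `f1_score` and `word_ngrams`, the whitespace tokenization, and the toy inputs are all assumptions made for illustration.

```python
from collections import Counter


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 is the harmonic mean of precision and recall, from raw counts."""
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def word_ngrams(text: str, n: int) -> Counter:
    """Count word-level n-grams after lowercasing and whitespace splitting."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Toy counts and text, invented for illustration only.
print(f1_score(tp=8, fp=2, fn=2))
print(word_ngrams("stay home stay safe stay home", 2).most_common(1))
```

A real analysis of tweets would likely need a tokenizer that handles hashtags, mentions, and URLs; plain whitespace splitting is used here only to keep the sketch self-contained.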
Pages: 337-349
Page count: 13
Related papers
50 in total
  • [31] A machine learning approach predicts future risk to suicidal ideation from social media data
    Roy, Arunima
    Nikolitch, Katerina
    McGinn, Rachel
    Jinah, Safiya
    Klement, William
    Kaminsky, Zachary A.
    NPJ DIGITAL MEDICINE, 2020, 3 (01)
  • [32] Sentiment analysis of Arabic social media texts: A machine learning approach to deciphering customer perceptions
    Alsemaree, Ohud
    Alam, Atm S.
    Gill, Sukhpal Singh
    Uhlig, Steve
    HELIYON, 2024, 10 (09)
  • [33] A Collaborative Approach to Identifying Social Media Markers of Schizophrenia by Employing Machine Learning and Clinical Appraisals
    Birnbaum, Michael L.
    Ernala, Sindhu Kiranmai
    Rizvi, Asra F.
    De Choudhury, Munmun
    Kane, John M.
    JOURNAL OF MEDICAL INTERNET RESEARCH, 2017, 19 (08)
  • [34] Social media research: The application of supervised machine learning in organizational communication research
    van Zoonen, Ward
    van der Meer, Toni G. L. A.
    COMPUTERS IN HUMAN BEHAVIOR, 2016, 63 : 132 - 141
  • [35] Harnessing Machine Learning to Unveil Emotional Responses to Hateful Content on Social Media
    Louati, Ali
    Louati, Hassen
    Albanyan, Abdullah
    Lahyani, Rahma
    Kariri, Elham
    Alabduljabbar, Abdulrahman
    COMPUTERS, 2024, 13 (05)
  • [36] Sentiment analysis of COVID-19 social media data through machine learning
    Dangi, Dharmendra
    Dixit, Dheeraj K.
    Bhagat, Amit
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (29) : 42261 - 42283
  • [37] Using social media and machine learning to understand sentiments towards Brazilian National Parks
    Souza, Carolina Neves
    Martinez-Arribas, Javier
    Correia, Ricardo A.
    Almeida, Joao A. G. R.
    Ladle, Richard
    Vaz, Ana Sofia
    Malhado, Ana Claudia
    BIOLOGICAL CONSERVATION, 2024, 293
  • [38] An Empirical Study and Analysis of the Machine Learning Algorithms Used in Detecting Cyberbullying in Social Media
    Sintaha, Mifta
    Mostakim, Moin
    2018 21ST INTERNATIONAL CONFERENCE OF COMPUTER AND INFORMATION TECHNOLOGY (ICCIT), 2018,
  • [39] Optimization of machine learning models for sentiment analysis in social media
    Brandao, Jhonathan Godoi
    Castro Junior, Antonio P.
    Pacheco, Viviane M. Gomes
    Rodrigues, Cloves Gonsalves
    Belo, Orlando M. Oliveira
    Coimbra, Antonio Paulo
    Calixto, Wesley Pacheco
    INFORMATION SCIENCES, 2025, 694
  • [40] Detection of Cyberbullying on Social Media Platforms Using Machine Learning
    Ali, Mohammad Usmaan
    Lefticaru, Raluca
    ADVANCES IN COMPUTATIONAL INTELLIGENCE SYSTEMS, UKCI 2023, 2024, 1453 : 220 - 233