Tweet topics on cancer among Indian Twitter users—computational approach using latent Dirichlet allocation topic modelling

Cited by: 0
Authors
Thilagavathi Ramamoorthy
Bagavandas Mappillairaju
Affiliations
[1] SRM Institute of Science and Technology, School of Public Health
[2] SRM Institute of Science and Technology, Centre for Statistics
Source
Journal of Computational Social Science | 2023 / Vol. 6 / Issue 2
Keywords
Twitter; Cancer; Latent Dirichlet allocation; Machine learning; Natural language processing; Social media; Topic modelling
DOI
Not available
Abstract
Understanding the extent and content of conversations on cancer informs stakeholders about the needs of the community in terms of knowledge, support and interventions. This study identified the topics of tweets shared about cancer, the sources of the messages and the degree of reachability of the identified topics among Twitter users in India. Twitter messages geocoded within India, related to cancer and posted between September 15, 2021 and October 15, 2021 were retrieved using the Twitter application programming interface, based on keywords identified from Symplur Signals. The tweets were pre-processed to remove stop words, hashtags and Uniform Resource Locators. Tweets were visualized using word clouds and correlations between word tokens. A latent Dirichlet allocation (LDA) topic model, an unsupervised machine learning technique, was used to identify the commonly discussed cancer topics. A total of 6374 tweets from 3135 unique Twitter users were analysed in the study. The majority of the tweets (60.8%) were from individual Twitter users. The LDA model identified four topics: (1) prevention, early detection and promotion (36.1%); (2) seeking support and sharing personal experience (15.8%); (3) Human Papillomavirus vaccine and cancer research (13.4%); and (4) risk factors, treatment and raising awareness (34.7%). Among the four identified topics, prevention, early detection and promotion had the highest reachability. Twitter is being used as a potential alternative communication platform for disseminating cancer-related information in India. The topics identified in the study provide useful insights for public health professionals and organizations for aligning cancer-related engagement and education for the target audience.
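The pre-processing step described in the abstract (removing stop words, hashtags and URLs) can be illustrated with a minimal Python sketch. This is not the authors' code: the stop-word list, the regular expressions, the `preprocess` function name and the choice to drop the whole hashtag token (rather than only the `#` symbol) are all assumptions made for illustration.

```python
import re
from typing import List

# Hypothetical, deliberately tiny stop-word list (an assumption);
# a real analysis would use a fuller list.
STOP_WORDS = {"a", "an", "and", "the", "is", "are", "of", "to",
              "in", "for", "on", "with", "this", "that"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")  # web links
HASHTAG_RE = re.compile(r"#\w+")               # whole hashtag token (assumption)
TOKEN_RE = re.compile(r"[a-z]+")               # lowercase alphabetic tokens


def preprocess(tweet: str) -> List[str]:
    """Lowercase a tweet, strip URLs and hashtags, and drop stop words."""
    text = tweet.lower()
    text = URL_RE.sub(" ", text)      # remove URLs first
    text = HASHTAG_RE.sub(" ", text)  # then remove hashtags
    tokens = TOKEN_RE.findall(text)   # keep alphabetic tokens only
    return [t for t in tokens if t not in STOP_WORDS]


example = ("Early detection of breast cancer saves lives! "
           "#CancerAwareness https://example.org/info")
print(preprocess(example))
# → ['early', 'detection', 'breast', 'cancer', 'saves', 'lives']
```

The resulting token lists are the kind of input that a topic model such as LDA would take, typically after conversion to a document-term matrix.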
Pages: 1033-1054
Page count: 21