CoWIN Twitter dataset: A comprehensive collection of public discourse on India's COVID-19 vaccination platform

Cited by: 0
Authors
Mittal, Shubham [1, 2]
Umamaheswaran, Swarnalakshmi [1, 2]
Affiliations
[1] Symbiosis Int Deemed Univ, Symbiosis Inst Business Management, Bengaluru 560100, India
[2] Mckinsey, Gurugram 122018, India
Keywords
CoWIN; COVID-19; Social media analytics; Digital health; Sentiment analysis; Health informatics; Twitter data; India; FUTURE;
DOI
10.1016/j.dib.2024.111252
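Chinese Library Classification (CLC) codes
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes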
07 ; 0710 ; 09 ;
Abstract
The CoWIN Twitter Dataset offers a wide-ranging collection of public opinions on CoWIN, India's COVID-19 vaccination platform. The raw dataset has 635,000 tweets that mention "cowin," collected from January to December 2021. The data were extracted using the Twitter Academic API. In addition to the raw data, the dataset includes a cleaned and processed set of 419,409 English tweets and a labeled subset with sentiment analysis. The raw data file has tweet details such as ID, text, timestamp, user ID, and language. The processed dataset is stripped of URLs, hashtags, and other noise, and adds month and category groupings. Finally, the labeled dataset classifies the relevant tweets as positive or negative in sentiment. This dataset enables researchers to analyze themes and sentiments related to India's vaccination administration. It can help policymakers gain insights into issues related to large-scale health initiatives and digital health systems. The mix of languages in the data also makes it useful for language processing research. (c) 2024 Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/)
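The cleaning step described in the abstract (keeping English tweets, removing URLs and hashtags, adding a month grouping) could look roughly like the sketch below. This is not the authors' pipeline: the field names `id`, `text`, `created_at`, and `lang`, the timestamp format, and the helper names `clean_tweet`, `month_of`, and `process` are all assumptions made for illustration, and the regular expressions cover only the noise types the abstract names explicitly.

```python
import re
from datetime import datetime

# Hypothetical patterns; the abstract names URLs and hashtags as removed noise.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
HASHTAG_RE = re.compile(r"#\w+")
WHITESPACE_RE = re.compile(r"\s+")


def clean_tweet(text: str) -> str:
    """Remove URLs and hashtags, then collapse leftover whitespace."""
    text = URL_RE.sub(" ", text)
    text = HASHTAG_RE.sub(" ", text)
    return WHITESPACE_RE.sub(" ", text).strip()


def month_of(timestamp: str) -> str:
    """Return a YYYY-MM label, assuming timestamps like 2021-05-01T10:15:30.000Z."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.strftime("%Y-%m")


def process(rows):
    """Keep English tweets, clean their text, and attach a month grouping.

    `rows` is an iterable of dicts with the assumed keys
    `id`, `text`, `created_at`, and `lang`.
    """
    processed = []
    for row in rows:
        if row["lang"] != "en":
            continue  # the processed set in the abstract is English-only
        cleaned = clean_tweet(row["text"])
        if not cleaned:
            continue  # skip tweets that contained nothing but noise
        processed.append(
            {"id": row["id"], "text": cleaned, "month": month_of(row["created_at"])}
        )
    return processed


if __name__ == "__main__":
    # Made-up rows for illustration only; they are not taken from the dataset.
    sample = [
        {"id": "1", "text": "Booked my slot on cowin! https://t.co/abc #vaccine",
         "created_at": "2021-05-01T10:15:30.000Z", "lang": "en"},
        {"id": "2", "text": "cowin par slot nahi mil raha",
         "created_at": "2021-05-02T08:00:00.000Z", "lang": "hi"},
    ]
    for row in process(sample):
        print(row)
```

The category grouping and sentiment labels mentioned in the abstract are left out here because the abstract does not say how they were derived.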
Pages: 9