Arabic Twitter Conversation Dataset about the COVID-19 Vaccine

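Cited by: 3
Authors
Alhazmi, Huda [1]
Affiliations
[1] Umm Al Qura Univ, Dept Comp Sci, Mecca 24236, Saudi Arabia
Keywords
COVID-19; pandemic; vaccine; Twitter; dataset; Arabic
DOI
10.3390/data7110152
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract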
The development and rollout of COVID-19 vaccination around the world offers hope for controlling the pandemic. People turned to social media such as Twitter to seek information or voice their opinions. Mining such conversations can therefore provide a rich source of data for different applications related to the COVID-19 vaccine. In this data article, we developed an Arabic Twitter dataset of 1.1 million Arabic posts regarding the COVID-19 vaccine. The dataset was streamed over one year, covering the period from January to December 2021. We used a set of Arabic crawling keywords related to the conversation about the vaccine. The dataset consists of seven databases that can be analyzed separately or merged for further analysis. The initial analysis describes the features embedded within the posts, including hashtags, media, and the dynamics of replies and retweets. Further, the textual analysis reveals the most frequent words, which capture the trends of the discussions. The dataset was designed to facilitate research across different fields, such as social network analysis, information retrieval, health informatics, and social science.
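The hashtag and word-frequency analysis mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the author's pipeline: the sample posts, the function names `hashtag_counts` and `word_counts`, and the simple regex tokenization are all assumptions made for illustration, and the real dataset's schema is not described in this record.

```python
import re
from collections import Counter

# Hypothetical sample posts (assumption); not taken from the actual dataset.
posts = [
    "لقاح كورونا متوفر #لقاح_كورونا",
    "لقاح اليوم #لقاح_كورونا #كوفيد19",
]

# In Python 3, \w matches Unicode letters, digits, and underscore,
# so Arabic hashtags are captured without extra configuration.
HASHTAG = re.compile(r"#(\w+)")


def hashtag_counts(texts):
    """Count how often each hashtag (without the leading '#') appears."""
    counts = Counter()
    for text in texts:
        counts.update(HASHTAG.findall(text))
    return counts


def word_counts(texts):
    """Count word tokens after removing hashtags from each post."""
    counts = Counter()
    for text in texts:
        without_tags = HASHTAG.sub(" ", text)
        counts.update(re.findall(r"\w+", without_tags))
    return counts


if __name__ == "__main__":
    print(hashtag_counts(posts).most_common(3))
    print(word_counts(posts).most_common(3))
```

A real analysis of Arabic text would likely also need normalization (for example, removing diacritics and stop words), which this sketch deliberately omits.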
Pages: 17