IRLCov19: A Large COVID-19 Multilingual Twitter Dataset of Indian Regional Languages

Cited by: 0
Authors
Uniyal, Deepak [1]
Agarwal, Amit [2]
Affiliations
[1] Graph Era Univ, Dehra Dun, Uttarakhand, India
[2] IIT Roorkee, Roorkee, Uttar Pradesh, India
Source
MACHINE LEARNING AND PRINCIPLES AND PRACTICE OF KNOWLEDGE DISCOVERY IN DATABASES, PT II | 2021 / Vol. 1525
Keywords
COVID-19; Twitter; Indian Regional Languages; Natural Language Processing;
DOI
10.1007/978-3-030-93733-1_22
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
COVID-19, which emerged in Wuhan, China, in December 2019, continues to spread rapidly across the world even though authorities have made a number of vaccines available. Although the coronavirus has been around for a significant period of time, people and authorities still feel the need for awareness because the virus mutates, and its symptoms and prevention strategies vary accordingly. People and authorities rely most on social media platforms to share awareness information and voice their opinions, because these platforms can spread the word to a massive audience in practically no time. People communicate on social media in a number of languages, chosen according to their familiarity, the language's reach, and its availability on the platform. The entire world has been hit by the coronavirus, and India is the second worst-hit country in terms of the number of active coronavirus cases. India, being a multilingual country, offers a great opportunity to study the reach of the various languages actively used across social media platforms. In this study, we examine a COVID-19 dataset collected between February 2020 and July 2020 specifically for regional languages in India. This could help the Government of India, various state governments, NGOs, researchers, and policymakers study different issues related to the pandemic. We found that English was the mode of communication in over 64% of tweets, while as many as twelve regional languages in India account for approximately 4.77% of tweets.
Pages: 309-324
Number of pages: 16