IRLCov19: A Large COVID-19 Multilingual Twitter Dataset of Indian Regional Languages

Cited by: 0
Authors
Uniyal, Deepak [1]
Agarwal, Amit [2]
Affiliations
[1] Graph Era Univ, Dehra Dun, Uttarakhand, India
[2] IIT Roorkee, Roorkee, Uttar Pradesh, India
Source
MACHINE LEARNING AND PRINCIPLES AND PRACTICE OF KNOWLEDGE DISCOVERY IN DATABASES, PT II | 2021 / Vol. 1525
Keywords
COVID-19; Twitter; Indian Regional Languages; Natural Language Processing;
DOI
10.1007/978-3-030-93733-1_22
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Having emerged in Wuhan, China, in December 2019, COVID-19 continues to spread rapidly across the world even though authorities have made a number of vaccines available. Although the coronavirus has been around for a significant period of time, people and authorities still feel the need for awareness because the virus mutates, and its symptoms and prevention strategies vary accordingly. People and authorities rely most on social media platforms to share awareness information and voice their opinions, because these platforms can spread the word to a massive audience in practically no time. People communicate on social media in a number of languages, chosen according to their familiarity, the language's reach, and its availability on the platform. The entire world has been hit by the coronavirus, and India is the second worst-hit country in terms of the number of active coronavirus cases. India, being a multilingual country, offers a great opportunity to study the reach of the various languages actively used across social media platforms. In this study, we examine a COVID-19 dataset collected between February 2020 and July 2020, focusing specifically on regional languages in India. This could help the Government of India, various state governments, NGOs, researchers, and policymakers study different issues related to the pandemic. We found that English was the mode of communication in over 64% of tweets, while as many as twelve regional languages in India account for approximately 4.77% of tweets.
Pages: 309-324
Page count: 16