Mining association rules from COVID-19 related Twitter data to discover word patterns, topics and inferences

Cited by: 17
Authors
Koukaras, Paraskevas [1]
Tjortjis, Christos [1]
Rousidis, Dimitrios [1]
Affiliations
[1] Int Hellen Univ, Sch Sci & Technol, Data Min & Analyt Res Grp, 14th Km Thessaloniki-N Moudania, Thermi 57001, Greece
Keywords
Social media; Topic extraction; Association rule mining; Data mining; COVID-19
DOI
10.1016/j.is.2022.102054
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This work uses Twitter data to mine association rules and extract knowledge about public attitudes toward worldwide crises. It takes the COVID-19 pandemic as a use case and analyzes tweets gathered between February and August 2020. The proposed methodology comprises topic extraction and visualization techniques, such as WordClouds, to form clusters or themes of opinions. It then uses Association Rule Mining (ARM) to discover frequent wordsets and generate rules from which user attitudes can be inferred. The goal is to use ARM as a postprocessing technique that enhances the output of any topic extraction method. To that end, only strong wordsets are kept and trivial ones are discarded. We also employ frequent wordset identification to reduce the number of extracted topics. Our findings show that combining Latent Dirichlet Allocation with ARM narrows 50 initially retrieved topics down to just 4. Our methodology helps produce more accurate and generalizable results, while exposing implications regarding social media user attitudes. © 2022 Elsevier Ltd. All rights reserved.
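As a rough illustration of the ARM step the abstract describes, the sketch below computes frequent wordsets and association rules over a handful of tokenized tweets. It is not the authors' implementation: the function names, the toy tweets, and the thresholds are all hypothetical, and it uses brute-force enumeration of word combinations rather than an optimized algorithm such as Apriori or FP-Growth.

```python
from collections import Counter
from itertools import combinations


def frequent_wordsets(docs, min_support, max_size=2):
    """Return {frozenset(words): support} for wordsets meeting min_support.

    Support is the fraction of documents that contain every word in the set.
    Brute-force enumeration; only suitable for small illustrative inputs.
    """
    n = len(docs)
    counts = Counter()
    for doc in docs:
        words = sorted(set(doc))  # ignore repeated words within one tweet
        for k in range(1, max_size + 1):
            for combo in combinations(words, k):
                counts[frozenset(combo)] += 1
    return {ws: c / n for ws, c in counts.items() if c / n >= min_support}


def association_rules(supports, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples.

    confidence(A -> B) = support(A and B) / support(A).
    Every subset of a frequent wordset is itself frequent, so the
    lookup of the antecedent's support always succeeds.
    """
    rules = []
    for ws, sup in supports.items():
        if len(ws) < 2:
            continue
        for k in range(1, len(ws)):
            for ante in combinations(sorted(ws), k):
                ante = frozenset(ante)
                conf = sup / supports[ante]
                if conf >= min_confidence:
                    rules.append((ante, ws - ante, sup, conf))
    return rules


# Hypothetical tokenized tweets, not data from the paper.
tweets = [
    ["covid", "lockdown", "home"],
    ["covid", "lockdown", "mask"],
    ["covid", "vaccine"],
    ["lockdown", "home"],
]

supports = frequent_wordsets(tweets, min_support=0.5)
rules = association_rules(supports, min_confidence=0.6)

for ante, cons, sup, conf in rules:
    print(sorted(ante), "->", sorted(cons), f"support={sup:.2f}", f"confidence={conf:.2f}")
```

Raising `min_support` or `min_confidence` discards weaker wordsets and rules, which is the general mechanism by which only strong wordsets survive; the thresholds used in the paper are not stated in the abstract.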
Pages: 21