Exploring the effect of streamed social media data variations on social network analysis

Cited by: 7
Authors
Weber, Derek [1, 2]
Nasim, Mehwish [3, 4, 5, 6]
Mitchell, Lewis [6, 7]
Falzon, Lucia [8]
Affiliations
[1] Univ Adelaide, Sch Comp Sci, Adelaide, SA, Australia
[2] Def Sci & Technol Grp, Adelaide, SA, Australia
[3] Flinders Univ S Australia, Coll Sci & Engn, Adelaide, SA, Australia
[4] CSIRO, Data61, Adelaide, SA, Australia
[5] Cyber Secur Cooperat Res Ctr, Adelaide, SA, Australia
[6] ARC Ctr Excellence Math & Stat Frontiers, Adelaide, SA, Australia
[7] Univ Adelaide, Sch Math Sci, Adelaide, SA, Australia
[8] Univ Melbourne, Sch Psychol Sci, Melbourne, Vic, Australia
Keywords
Social media analytics; Dataset reliability; Social network analysis; BIG DATA; TWITTER
DOI
10.1007/s13278-021-00770-y
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
To study the effects of online social network (OSN) activity on real-world offline events, researchers need access to OSN data, the reliability of which has particular implications for social network analysis. This relates not only to the completeness of any collected dataset, but also to constructing meaningful social and information networks from it. In this multidisciplinary study, we consider the question of constructing traditional social networks from OSN data and then present several measurement case studies showing how variations in collected OSN data affect social network analyses. To this end, we developed a systematic comparison methodology, which we applied to five pairs of parallel datasets collected from Twitter in four case studies. We found considerable differences in several of the datasets collected with different tools, and these variations significantly alter the results of subsequent analyses. Our results lead to a set of guidelines for researchers planning to collect online data streams to infer social networks.
Pages: 38
Related papers
73 in total
[21]   Falzon Lucia, 2017, ASONAM, P1183, DOI 10.1145/3110025.3122118
[22]   Ferrara E., 2017, FIRST MONDAY, DOI 10.5210/fm.v22i8.8005
[23]   The Rise of Social Bots [J].
Ferrara, Emilio ;
Varol, Onur ;
Davis, Clayton ;
Menczer, Filippo ;
Flammini, Alessandro .
COMMUNICATIONS OF THE ACM, 2016, 59 (07) :96-104
[24]   Risk-Based Data Validation in Machine Learning-Based Software Systems [J].
Foidl, Harald ;
Felderer, Michael .
PROCEEDINGS OF THE 3RD ACM SIGSOFT INTERNATIONAL WORKSHOP ON MACHINE LEARNING TECHNIQUES FOR SOFTWARE QUALITY EVALUATION (MALTESQUE '19), 2019, :13-18
[25]   It takes a village to manipulate the media: coordinated link sharing behavior during 2018 and 2019 Italian elections [J].
Giglietto, Fabio ;
Righetti, Nicola ;
Rossi, Luca ;
Marino, Giada .
INFORMATION COMMUNICATION & SOCIETY, 2020, 23 (06) :867-891
[26]   Assessing the bias in samples of large online networks [J].
Gonzalez-Bailon, Sandra ;
Wang, Ning ;
Rivero, Alejandro ;
Borge-Holthoefer, Javier ;
Moreno, Yamir .
SOCIAL NETWORKS, 2014, 38 :16-27
[27]   #IStandWithDan versus #DictatorDan: the polarised dynamics of Twitter discussions about Victoria's COVID-19 restrictions [J].
Graham, Timothy ;
Bruns, Axel ;
Angus, Daniel ;
Hurcombe, Edward ;
Hames, Sam .
MEDIA INTERNATIONAL AUSTRALIA, 2021, 179 (01) :127-148
[28]   Bayesian Inference of Network Structure From Information Cascades [J].
Gray, Caitlin ;
Mitchell, Lewis ;
Roughan, Matthew .
IEEE TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING OVER NETWORKS, 2020, 6 :371-381
[29]   Changing Perspectives: Is It Sufficient to Detect Social Bots? [J].
Grimme, Christian ;
Assenmacher, Dennis ;
Adam, Lena .
SOCIAL COMPUTING AND SOCIAL MEDIA: USER EXPERIENCE AND BEHAVIOR, SCSM 2018, PT I, 2018, 10913 :445-461
[30]   Imagining Twitter as an Imagined Community [J].
Gruzd, Anatoliy ;
Wellman, Barry ;
Takhteyev, Yuri .
AMERICAN BEHAVIORAL SCIENTIST, 2011, 55 (10) :1294-1318