Upscaling human activity data: A statistical ecology approach

Cited by: 2
Authors
Tovo, Anna [1,2]
Stivanello, Samuele [2]
Maritan, Amos [1]
Suweis, Samir [1,3]
Favaro, Stefano [4]
Formentin, Marco [1,3]
Affiliations
[1] Univ Padua, Ist Nazl Fis Nucleare, Dipartimento Fis & Astron Galileo Galilei, I-35100 Padua, Italy
[2] Univ Padua, Dipartimento Matemat Tullio Levi-Civita, I-35100 Padua, Italy
[3] Univ Padua, Padova Neurosci Ctr, I-35100 Padua, Italy
[4] Univ Turin, Dipartimento Sci Econ Sociali & Matemat Stat, I-10124 Turin, Italy
Funding
European Research Council
Keywords
RELATIVE SPECIES ABUNDANCE; HEAVY TAILS; DYNAMICS; NUMBER; ORIGIN;
DOI
10.1371/journal.pone.0253461
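Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Subject classification codes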
07 ; 0710 ; 09 ;
Abstract
Big data require new techniques to handle the information they come with. Here we consider four datasets (email communication, Twitter posts, Wikipedia articles and Gutenberg books) and propose a novel statistical framework to predict global statistics from random samples. More precisely, from a small sample of sent emails per sender, posts per hashtag and word occurrences, we infer the number of senders, hashtags and words in the whole dataset and how their abundances (e.g. the popularity of a hashtag) change across scales. Our approach is grounded in statistical ecology, as we map the inference of human activities onto the unseen species problem in biodiversity. Our findings may have applications to resource management in emails, collective attention monitoring on Twitter and language learning processes in word databases.
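To give a feel for the unseen species problem mentioned in the abstract, the following is a minimal sketch of the classical Chao1 richness estimator applied to a toy sample of hashtags. It is not the framework proposed in the paper, whose details are not given in this record; the function name `chao1_richness` and the toy data are assumptions made purely for illustration.

```python
from collections import Counter


def chao1_richness(sample):
    """Chao1 lower-bound estimate of the number of distinct types.

    `sample` is any iterable of observed items (for example hashtags).
    Illustrative only: this is a textbook estimator, not the paper's method.
    """
    abundances = Counter(sample)                  # occurrences per observed type
    freq_of_freq = Counter(abundances.values())   # how many types occur k times
    s_obs = len(abundances)                       # types seen in the sample
    f1 = freq_of_freq.get(1, 0)                   # types seen exactly once
    f2 = freq_of_freq.get(2, 0)                   # types seen exactly twice
    if f2 > 0:
        return s_obs + f1 * f1 / (2 * f2)
    # Bias-corrected form used when no type is seen exactly twice.
    return s_obs + f1 * (f1 - 1) / 2


# Hypothetical toy sample: 5 observed hashtags, 3 seen once, 1 seen twice.
sample_hashtags = ["#a", "#a", "#a", "#b", "#b", "#c", "#d", "#e"]
estimate = chao1_richness(sample_hashtags)
print(estimate)  # 5 + 3*3 / (2*1) = 9.5
```

The estimate exceeds the number of observed types because many singletons relative to doubletons suggest that further types remain unobserved in the full dataset.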
Pages: 13