Assessing the Quality of Home Detection from Mobile Phone Data for Official Statistics

Cited by: 45
Authors
Vanhoof, Maarten [1]
Reis, Fernando [2]
Ploetz, Thomas [1, 3]
Smoreda, Zbigniew [4]
Affiliations
[1] Open Lab, Sci Cent, Urban Sci Bldg,1 Sci Sq, Newcastle Upon Tyne NE4 5TG, Tyne & Wear, England
[2] European Commiss, Eurostat, Task Force Big Data, Joseph Bech Bldg,5,Rue Alphonse Weicker, L-2721 Luxembourg, Luxembourg
[3] Georgia Inst Technol, Sch Interact Comp, Atlanta, GA 30332 USA
[4] Orange Labs, Dept SENSE, Orange Gardens, 44 Ave Republ, F-92320 Chatillon, France
Keywords
Mobile phone data; home location; home detection algorithms; official statistics; big data; POSITIONING DATA; BIG DATA; LOCATIONS; TRAVEL; MOVEMENT; PATTERNS; SYSTEM; GPS;
DOI
10.2478/JOS-2018-0046
Chinese Library Classification
O1 [Mathematics]; C [Social Sciences, General];
Subject classification codes
03; 0303; 0701; 070101;
Abstract
Mobile phone data are an interesting new data source for official statistics. However, multiple problems and uncertainties need to be solved before these data can inform, support or even become an integral part of statistical production processes. In this article, we focus on arguably the most important problem hindering the application of mobile phone data in official statistics: detecting home locations. We argue that current efforts to detect home locations suffer from a blind deployment of criteria to define a place of residence and from limited validation possibilities. We support our argument by analysing the performance of five home detection algorithms (HDAs) that have been applied to a large, French, Call Detailed Record (CDR) data set (~18 million users, five months). Our results show that criteria choice in HDAs influences the detection of home locations for up to about 40% of users, that HDAs perform poorly when compared with a validation data set (resulting in 35°-gap), and that their performance is sensitive to the time period and the duration of observation. Based on our findings and experiences, we offer several recommendations for official statistics. If adopted, our recommendations would help ensure more reliable use of mobile phone data vis-a-vis official statistics.
Pages: 935-960
Page count: 26
Related papers
46 entries in total
[1]   Evaluating passive mobile positioning data for tourism surveys: An Estonian case study [J].
Ahas, Rein ;
Aasa, Anto ;
Roose, Antti ;
Mark, Uelar ;
Silm, Siiri .
TOURISM MANAGEMENT, 2008, 29 (03) :469-486
[2]   Using Mobile Positioning Data to Model Locations Meaningful to Users of Mobile Phones [J].
Ahas, Rein ;
Silm, Siiri ;
Jarv, Olle ;
Saluveer, Erki ;
Tiru, Margus .
JOURNAL OF URBAN TECHNOLOGY, 2010, 17 (01) :3-27
[3]  
[Anonymous], WHAT DOES BIG DATA M
[4]  
[Anonymous], 2014, ARXIV14074885
[5]  
ARCEP, 2008, SUIV IND MOB CHIFFR
[6]   Using GPS to learn significant locations and predict movement across multiple users [J].
Ashbrook, Daniel ;
Starner, Thad .
PERSONAL AND UBIQUITOUS COMPUTING, 2003, 7 (05) :275-286
[7]   A survey of results on mobile phone datasets analysis [J].
Blondel, Vincent D. ;
Decuyper, Adeline ;
Krings, Gautier .
EPJ DATA SCIENCE, 2015, 4 (01) :1-55
[8]  
Blondel Vincent D., 2012, ARXIV12100137
[9]   Inferring patterns of internal migration from mobile phone call records: evidence from Rwanda [J].
Blumenstock, Joshua E. .
INFORMATION TECHNOLOGY FOR DEVELOPMENT, 2012, 18 (02) :107-125
[10]   Choosing the Right Home Location Definition Method for the Given Dataset [J].
Bojic, Iva ;
Massaro, Emanuele ;
Belyi, Alexander ;
Sobolevsky, Stanislav ;
Ratti, Carlo .
SOCIAL INFORMATICS (SOCINFO 2015), 2015, 9471 :194-208