Exploiting Foursquare and Cellular Data to Infer User Activity in Urban Environments

Cited by: 83
Authors
Noulas, Anastasios [1]
Mascolo, Cecilia [1]
Frias-Martinez, Enrique [2]
Affiliations
[1] Univ Cambridge, Comp Lab, Cambridge CB2 1TN, England
[2] Telefon Res, Madrid, Spain
Source
2013 IEEE 14TH INTERNATIONAL CONFERENCE ON MOBILE DATA MANAGEMENT (MDM 2013), VOL 1 | 2013
DOI
10.1109/MDM.2013.27
Chinese Library Classification number
TN [Electronic technology, communication technology]
Discipline classification code
0809
Abstract
Inferring the type of activities in neighborhoods of urban centers may be helpful in a number of contexts, including urban planning, content delivery, and activity recommendations for mobile web users, and may even lead to a deeper understanding of the geographical evolution of social life in the city. During the past few years, the analysis of mobile phone usage patterns, or of social media with longitudinal attributes, has aided the automatic characterization of the dynamics of the urban environment. In this work, we combine a dataset sourced from a telecommunication provider in Spain with a database of millions of geo-tagged venues from Foursquare, and we formulate the problem of urban activity inference in a supervised learning framework. In particular, we exploit user communication patterns observed at the base station level in order to predict the activity of Foursquare users who check in at nearby venues. First, we mine a set of machine learning features that allow us to encode the input telecommunication signal of a tower. Subsequently, we evaluate a diverse set of supervised learning algorithms using labels extracted from Foursquare place categories, and we consider two application scenarios. In the first, we assess how hard it is to predict a specific urban activity of an area, showing that Nightlife and Entertainment spots are the easiest to infer, whereas College and Shopping areas feature the lowest accuracy rates. In the second, given a candidate set of activity types in a geographic area, we aim to select the most prominent one. We demonstrate that the difficulty of the problem increases with the number of classes incorporated in the prediction task, yet the classifiers perform considerably better than a random guess even as the set of candidate classes grows.
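The abstract describes labels taken from Foursquare place categories near a base station, and a comparison against random guessing. The following is a minimal sketch of how such a labeling step and baseline might look. It is not the authors' implementation: the function names, the `(lat, lon, category)` venue layout, and the 0.5 km radius are all assumptions made for illustration.

```python
import math
from collections import Counter


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def dominant_category(tower, venues, radius_km=0.5):
    """Most frequent venue category within radius_km of a tower.

    `tower` is a (lat, lon) pair; `venues` is a list of (lat, lon, category)
    tuples. Both layouts and the default radius are hypothetical.
    Returns None when no venue lies within the radius.
    """
    lat, lon = tower
    nearby = [
        cat
        for (vlat, vlon, cat) in venues
        if haversine_km(lat, lon, vlat, vlon) <= radius_km
    ]
    if not nearby:
        return None
    return Counter(nearby).most_common(1)[0][0]


def random_guess_accuracy(num_classes):
    """Expected accuracy of a uniform random guess over num_classes labels."""
    return 1.0 / num_classes


# Toy data, invented for this example only.
venues = [
    (40.4170, -3.7040, "Nightlife"),
    (40.4165, -3.7035, "Nightlife"),
    (40.4172, -3.7030, "Shopping"),
    (40.5000, -3.6000, "College"),  # far from the tower, so excluded
]
label = dominant_category((40.4168, -3.7038), venues)
```

A label produced this way could serve as the target for any supervised classifier trained on tower-level features. The uniform baseline of 1/k falls as the number of candidate classes k grows, which is the reference point the abstract uses when it compares classifiers to a random guess.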
Pages: 167-176
Number of pages: 10