On temporal-constrained sub-trajectory cluster analysis

被引:16
作者
Pelekis, Nikos [1 ]
Tampakis, Panagiotis [2 ]
Vodas, Marios [2 ]
Doulkeridis, Christos [3 ]
Theodoridis, Yannis [2 ]
机构
[1] Univ Piraeus, Dept Stat & Insurance Sci, Piraeus, Greece
[2] Univ Piraeus, Dept Informat, Piraeus, Greece
[3] Univ Piraeus, Dept Digital Syst, Piraeus, Greece
基金
欧盟地平线“2020”;
关键词
Cluster analysis; Temporal-constrained (sub-)trajectory clustering; Moving objects; Indexing; PATTERNS;
D O I
10.1007/s10618-017-0503-4
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Cluster analysis over Moving Object Databases (MODs) is a challenging research topic that has attracted the attention of the mobility data mining community. In this paper, we study the temporal-constrained sub-trajectory cluster analysis problem, where the aim is to discover clusters of sub-trajectories given an ad-hoc, user-specified temporal constraint within the dataset's lifetime. The problem is challenging because: (a) the time window is not known in advance, instead it is specified at query time, and (b) the MOD is continuously updated with new trajectories. Existing solutions first filter the trajectory database according to the temporal constraint, and then apply a clustering algorithm from scratch on the filtered data. However, this approach is extremely inefficient, when considering explorative data analysis where multiple clustering tasks need to be performed over different temporal subsets of the database, while the database is updated with new trajectories. To address this problem, we propose an incremental and scalable solution to the problem, which is built upon a novel indexing structure, called Representative Trajectory Tree (ReTraTree). ReTraTree acts as an effective spatio-temporal partitioning technique; partitions in ReTraTree correspond to groupings of sub-trajectories, which are incrementally maintained and assigned to representative (sub-)trajectories. Due to the proposed organization of sub-trajectories, the problem under study can be efficiently solved as simply as executing a query operator on ReTraTree, while insertion of new trajectories is supported. Our extensive experimental study performed on real and synthetic datasets shows that our approach outperforms a state-of-the-art in-DBMS solution supported by PostgreSQL by orders of magnitude.
引用
收藏
页码:1294 / 1330
页数:37
相关论文
共 37 条
[1]  
Ankerst M., 1999, SIGMOD Record, V28, P49, DOI 10.1145/304181.304187
[2]  
Benkert M, 2006, LECT NOTES COMPUT SC, V4168, P660
[3]  
Buchin M., 2010, Proc. 18th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM GIS), P202, DOI DOI 10.1145/1869790.1869821
[4]  
Cudre-Mauroux P, 2010, P IEEE 26 INT C DAT
[5]  
Ester M, 1996, P 2 INT C KNOWLEDGE, DOI DOI 10.5555/3001460.3001507
[6]   Vector Field k-Means: Clustering Trajectories by Fitting Multiple Vector Fields [J].
Ferreira, Nivan ;
Klosowski, James T. ;
Scheidegger, Carlos E. ;
Silva, Claudio T. .
COMPUTER GRAPHICS FORUM, 2013, 32 (03) :201-210
[7]  
Frentzos E, 2007, P IEEE 23 INT C DAT
[8]  
Gaffney S., 1999, P 5 ACM SIGKDD INT C, P63, DOI [DOI 10.1145/312129.312198, 10.1145/312129.312198]
[9]   Unveiling the complexity of human mobility by querying and mining massive trajectory data [J].
Giannotti, Fosca ;
Nanni, Mirco ;
Pedreschi, Dino ;
Pinelli, Fabio ;
Renso, Chiara ;
Rinzivillo, Salvatore ;
Trasarti, Roberto .
VLDB JOURNAL, 2011, 20 (05) :695-719
[10]  
Guha S., 1998, SIGMOD Record, V27, P73, DOI 10.1145/276305.276312