ST-Hadoop: a MapReduce framework for spatio-temporal data

Cited by: 53
Authors
Alarabi, Louai [1]
Mokbel, Mohamed F. [1]
Musleh, Mashaal [1]
Affiliations
[1] Univ Minnesota, Dept Comp Sci & Engn, Minneapolis, MN 55455 USA
Funding
National Science Foundation (USA);
Keywords
MapReduce-based systems; Spatio-temporal systems; Spatio-temporal range query; Spatio-temporal nearest neighbor query; Spatio-temporal join query;
DOI
10.1007/s10707-018-0325-6
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper presents ST-Hadoop, the first full-fledged open-source MapReduce framework with native support for spatio-temporal data. ST-Hadoop is a comprehensive extension to Hadoop and SpatialHadoop that injects spatio-temporal data awareness inside each of their layers, mainly the language, indexing, and operations layers. In the language layer, ST-Hadoop provides built-in spatio-temporal data types and operations. In the indexing layer, ST-Hadoop spatiotemporally loads and divides data across computation nodes in the Hadoop Distributed File System in a way that mimics spatio-temporal index structures, which results in orders of magnitude better performance than Hadoop and SpatialHadoop when dealing with spatio-temporal data and queries. In the operations layer, ST-Hadoop ships with support for three fundamental spatio-temporal queries, namely, spatio-temporal range, top-k nearest neighbor, and join queries. The extensibility of ST-Hadoop allows others to extend its features and operations easily using approaches similar to those described in the paper. Extensive experiments conducted on a large-scale 10 TB dataset containing over 1 billion spatio-temporal records show that ST-Hadoop achieves orders of magnitude better performance than Hadoop and SpatialHadoop when dealing with spatio-temporal data and operations. The key idea behind the performance gain in ST-Hadoop is its ability to index spatio-temporal data within the Hadoop Distributed File System.
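The abstract attributes the speedup to dividing data so that partitions mimic a spatio-temporal index. The following is a minimal illustrative sketch of that general idea only, not ST-Hadoop's actual index or API: an in-memory dictionary stands in for partitions in the distributed file system, and the one-day time slice, the one-degree grid cell, and all function names are assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime

CELL_DEG = 1.0  # assumed spatial cell size in degrees (hypothetical choice)


def partition_key(lon, lat, ts):
    """Map a record to a (day, grid-x, grid-y) partition key."""
    day = ts.date().toordinal()  # assumed temporal slice: one calendar day
    return (day, int(lon // CELL_DEG), int(lat // CELL_DEG))


def build_index(records):
    """Group (lon, lat, timestamp) records by partition key."""
    index = defaultdict(list)
    for lon, lat, ts in records:
        index[partition_key(lon, lat, ts)].append((lon, lat, ts))
    return index


def range_query(index, lon_min, lon_max, lat_min, lat_max, t_start, t_end):
    """Return matching records and the number of partitions scanned.

    Partitions whose key falls outside the query's key range are skipped
    without reading their records; only the remaining ones are refined.
    """
    lo = partition_key(lon_min, lat_min, t_start)
    hi = partition_key(lon_max, lat_max, t_end)
    hits, scanned = [], 0
    for key, bucket in index.items():
        if all(lo[i] <= key[i] <= hi[i] for i in range(3)):
            scanned += 1
            hits.extend(
                r for r in bucket
                if lon_min <= r[0] <= lon_max
                and lat_min <= r[1] <= lat_max
                and t_start <= r[2] <= t_end
            )
    return hits, scanned


if __name__ == "__main__":
    records = [
        (-93.2, 44.9, datetime(2018, 1, 1, 10)),
        (-93.3, 44.95, datetime(2018, 1, 1, 12)),
        (-93.2, 44.9, datetime(2018, 1, 5, 10)),  # same place, other day
        (10.0, 50.0, datetime(2018, 1, 1, 10)),   # same day, other place
    ]
    index = build_index(records)
    hits, scanned = range_query(
        index, -94.0, -93.0, 44.0, 45.0,
        datetime(2018, 1, 1), datetime(2018, 1, 1, 23, 59),
    )
    print(len(hits), "hits from", scanned, "of", len(index), "partitions")
```

In this toy setting a range query reads only the partitions that overlap its spatial and temporal extent, whereas an unindexed scan would touch every record. A real system would still enumerate or look up partition metadata rather than iterate over a dictionary.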
Pages: 785-813
Number of pages: 29