ST-Hadoop: a MapReduce framework for spatio-temporal data

Cited by: 53
Authors
Alarabi, Louai [1]
Mokbel, Mohamed F. [1]
Musleh, Mashaal [1]
Affiliations
[1] Univ Minnesota, Dept Comp Sci & Engn, Minneapolis, MN 55455 USA
Funding
National Science Foundation (USA);
Keywords
MapReduce-based systems; Spatio-temporal systems; Spatio-temporal range query; Spatio-temporal nearest neighbor query; Spatio-temporal join query;
DOI
10.1007/s10707-018-0325-6
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper presents ST-Hadoop, the first full-fledged open-source MapReduce framework with native support for spatio-temporal data. ST-Hadoop is a comprehensive extension to Hadoop and SpatialHadoop that injects spatio-temporal data awareness inside each of their layers, mainly the language, indexing, and operations layers. In the language layer, ST-Hadoop provides built-in spatio-temporal data types and operations. In the indexing layer, ST-Hadoop spatio-temporally loads and divides data across computation nodes in the Hadoop Distributed File System in a way that mimics spatio-temporal index structures, which results in orders of magnitude better performance than Hadoop and SpatialHadoop when dealing with spatio-temporal data and queries. In the operations layer, ST-Hadoop ships with support for three fundamental spatio-temporal queries, namely spatio-temporal range, top-k nearest neighbor, and join queries. The extensibility of ST-Hadoop allows others to extend its features and operations easily using approaches similar to those described in the paper. Extensive experiments conducted on a large-scale dataset of 10 TB that contains over 1 billion spatio-temporal records show that ST-Hadoop achieves orders of magnitude better performance than Hadoop and SpatialHadoop when dealing with spatio-temporal data and operations. The key idea behind the performance gain in ST-Hadoop is its ability to index spatio-temporal data within the Hadoop Distributed File System.
Pages: 785 - 813
Number of pages: 29
Related papers
50 in total
  • [21] SHAHED: A MapReduce-based System for Querying and Visualizing Spatio-temporal Satellite Data
    Eldawy, Ahmed
    Mokbel, Mohamed F.
    Alharthi, Saif
    Alzaidy, Abdulhadi
    Tarek, Kareem
    Ghani, Sohaib
    2015 IEEE 31ST INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2015, : 1585 - 1596
  • [22] Hadoop-based spatio-temporal analysis of urban public transportation big data
    Ni, Yan
    Huang, Yijie
    Li, Aidi
    Zhang, Jianqin
    Ding, Ying
    Zhao, Ming
    INTERNATIONAL CONFERENCE ON INTELLIGENT TRAFFIC SYSTEMS AND SMART CITY (ITSSC 2021), 2022, 12165
  • [23] SAFAL: A MapReduce Spatio-temporal Analyzer for UNAVCO FTP Logs
    Hodgkinson, Kathleen
    Rezgui, Abdelmounaam
    2013 IEEE 16TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND ENGINEERING (CSE 2013), 2013, : 1083 - 1090
  • [24] Mining spatio-temporal data
    Gennady Andrienko
    Donato Malerba
    Michael May
    Maguelonne Teisseire
    Journal of Intelligent Information Systems, 2006, 27 : 187 - 190
  • [25] Statistics for Spatio-Temporal Data
    Mills, Jeff
    JOURNAL OF REGIONAL SCIENCE, 2012, 52 (03) : 512 - 513
  • [26] Statistics for Spatio-Temporal Data
    Haining, Robert P.
    GEOGRAPHICAL ANALYSIS, 2012, 44 (04) : 411 - 412
  • [27] On Robustness for Spatio-Temporal Data
    Garcia-Perez, Alfonso
    MATHEMATICS, 2022, 10 (10)
  • [28] Spatio-Temporal Data Construction
    Le, Hai Ha
    ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION, 2013, 2 (03) : 837 - 853
  • [29] Mining spatio-temporal data
    Andrienko, Gennady
    Malerba, Donato
    May, Michael
    Teisseire, Maguelonne
    JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2006, 27 (03) : 187 - 190
  • [30] An Expressive Hadoop MapReduce Framework
    Shah, Nathar
    Messom, Christopher
    ADVANCED SCIENCE LETTERS, 2017, 23 (11) : 11197 - 11201