GISQAF: MapReduce guided spatial query processing and analytics system

Cited by: 3
Authors
Al-Naami, Khaled Mohammed [1]
Seker, Sadi Evren [2]
Khan, Latifur [1]
Affiliations
[1] Univ Texas Dallas, Dept Comp Sci, Dallas, TX USA
[2] Istanbul Medeniyet Univ, Dept Business, Istanbul, Turkey
Funding
National Science Foundation (USA);
Keywords
big data; MapReduce; Hadoop; spatial query processing; data analytics; spatial co-occurring events;
DOI
10.1002/spe.2383
Chinese Library Classification
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
The Global Database of Events, Language, and Tone (GDELT) is the only global political georeferenced event dataset, with more than 250 million observations covering all countries in the world since January 1, 1979. TABARI and CAMEO are the tools used to collect and code events from all international news coverage. Traditional RDBMSs can no longer be used to query geospatial data at this scale, so parallel distributed solutions have become a necessity. The MapReduce paradigm has proven to be a scalable platform for processing and analyzing Big Data in the cloud. Hadoop, an open-source implementation of MapReduce, has been widely used and accepted in academia and industry. However, Hadoop is not well equipped for spatial data and does not process it efficiently. SpatialHadoop is an extension of Hadoop that adds support for spatial data. In this paper, we present the Geographic Information System Query and Analytics Framework (GISQAF), which is built on top of SpatialHadoop. GISQAF focuses on two parts: query processing and data analytics. For the query processing part, we show how this solution outperforms Hadoop query processing by orders of magnitude when applying queries to a 60 GB GDELT dataset. We show the results for various types of queries. For the data analytics part, we present an approach for finding spatial co-occurring events. We show how GISQAF is suitable and efficient for handling data analytics techniques. Copyright (c) 2015 John Wiley & Sons, Ltd.
Pages: 1329-1349
Page count: 21