GISQAF: MapReduce guided spatial query processing and analytics system

Cited by: 3
Authors
Al-Naami, Khaled Mohammed [1]
Seker, Sadi Evren [2]
Khan, Latifur [1]
Affiliations
[1] Univ Texas Dallas, Dept Comp Sci, Dallas, TX USA
[2] Istanbul Medeniyet Univ, Dept Business, Istanbul, Turkey
Funding
US National Science Foundation
Keywords
big data; MapReduce; Hadoop; spatial query processing; data analytics; spatial co-occurring events
DOI
10.1002/spe.2383
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
The Global Database of Events, Language, and Tone (GDELT) is the only global political georeferenced event dataset, with more than 250 million observations covering all countries in the world since January 1, 1979. TABARI and CAMEO are the tools used to collect and code events from all international news coverage. Traditional RDBMSs can no longer query geospatial data at this scale, so parallel distributed solutions have become a necessity. The MapReduce paradigm has proven to be a scalable platform for processing and analyzing big data in the cloud. Hadoop, an open-source implementation of MapReduce, has been widely used and accepted in academia and industry. However, Hadoop is not well equipped for spatial data and does not process it efficiently. SpatialHadoop is an extension of Hadoop that adds support for spatial data. In this paper, we present the Geographic Information System Query and Analytics Framework (GISQAF), which is built on top of SpatialHadoop. GISQAF has two parts: query processing and data analytics. For query processing, we show that GISQAF outperforms Hadoop query processing by orders of magnitude when applying queries to a 60 GB GDELT dataset, and we report results for various types of queries. For data analytics, we present an approach for finding spatial co-occurring events and show that GISQAF handles data analytics techniques suitably and efficiently. Copyright (c) 2015 John Wiley & Sons, Ltd.
Pages: 1329-1349
Number of pages: 21