Accelerating Spatial Cross-Matching on CPU-GPU Hybrid Platform With CUDA and OpenACC

Cited by: 5
Authors
Baig, Furqan [1]
Gao, Chao [2]
Teng, Dejun [1]
Kong, Jun [3]
Wang, Fusheng [1, 4]
Affiliations
[1] SUNY Stony Brook, Dept Comp Sci, Stony Brook, NY 11794 USA
[2] NYU, Dept Comp Sci, 550 1st Ave, New York, NY 10003 USA
[3] Georgia State Univ, Math & Stat Dept, Atlanta, GA USA
[4] SUNY Stony Brook, Biomed Informat Dept, Stony Brook, NY 11794 USA
Source
Frontiers in Big Data | 2020 / Vol. 3
Funding
National Science Foundation (USA)
Keywords
spatial-cross-matching; spatial-join; gpu; gpgpu; cpugpu-hybrid; geospatial; openacc
DOI
10.3389/fdata.2020.00014
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Spatial cross-matching over geospatial polygonal datasets is a highly compute-intensive yet essential task for a wide array of real-world applications. At the same time, modern computing systems are typically equipped with multiple processing units capable of task parallelization and optimization at various levels. This calls for exploring novel strategies in the geospatial domain that focus on efficient utilization of computing resources such as CPUs and GPUs. In this paper, we present a CPU-GPU hybrid platform to accelerate the cross-matching operation on geospatial datasets. We propose a pipeline of geospatial subtasks that are dynamically scheduled for execution on either the CPU or the GPU. To accommodate processing of geospatial datasets on the GPU using a pixelization approach, we convert floating-point-valued vertices into integer-valued vertices with an adaptive scaling factor that is a function of the area of the minimum bounding box. We present a comparative analysis of GPU-enabled cross-matching algorithm implementations in CUDA and OpenACC-accelerated C++. We test our implementations on Natural Earth Data, and our results indicate that although CUDA-based implementations provide better performance, OpenACC-accelerated implementations are more portable and extensible while still providing a considerable performance gain over the CPU. We also investigate the effect of input data size on the IO/computation ratio and note that a larger dataset compensates for the IO overheads associated with GPU computations. Finally, we demonstrate that efficient cross-matching comparison can be achieved with a cost-effective GPU.
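The vertex conversion mentioned in the abstract can be illustrated with a minimal sketch. The abstract states only that the scaling factor is a function of the area of the minimum bounding box; the specific rule below (pick the scale so that the box covers roughly `target_pixels` integer cells) and the names `mbb`, `adaptive_scale`, `to_integer_vertices`, and `target_pixels` are assumptions made for illustration, not the authors' formula or code.

```python
import math


def mbb(vertices):
    """Return the minimum bounding box (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return min(xs), min(ys), max(xs), max(ys)


def adaptive_scale(vertices, target_pixels=1_000_000):
    """Hypothetical rule: scale so the MBB spans about target_pixels cells.

    Small boxes get a large factor and large boxes a small one, so the
    integer grid has a similar resolution regardless of polygon size.
    """
    min_x, min_y, max_x, max_y = mbb(vertices)
    area = (max_x - min_x) * (max_y - min_y)
    if area <= 0:
        return 1.0  # degenerate box: leave coordinates unscaled
    return math.sqrt(target_pixels / area)


def to_integer_vertices(vertices, scale):
    """Shift vertices to the MBB origin, scale, and round to integers."""
    min_x, min_y, _, _ = mbb(vertices)
    return [
        (round((x - min_x) * scale), round((y - min_y) * scale))
        for x, y in vertices
    ]


# Usage: a 0.5 x 0.5 square mapped onto roughly 10,000 integer cells.
square = [(0.0, 0.0), (0.5, 0.0), (0.5, 0.5), (0.0, 0.5)]
scale = adaptive_scale(square, target_pixels=10_000)
pixels = to_integer_vertices(square, scale)
```

When two polygons are compared, both would need to share one scale and one origin, for example by passing the combined vertex list of the pair to `adaptive_scale` and `mbb`.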
Pages: 14