High-Performance Spatial Query Processing on Big Taxi Trip Data using GPGPUs

Cited by: 10
Authors
Zhang, Jianting [1]
You, Simin [2]
Gruenwald, Le [3]
Affiliations
[1] CUNY, Dept Comp Sci, New York, NY 10021 USA
[2] CUNY, Grad Ctr, Dept Comp Sci, New York, NY USA
[3] Univ Oklahoma, Sch Comp Sci, Norman, OK 73019 USA
Source
2014 IEEE INTERNATIONAL CONGRESS ON BIG DATA (BIGDATA CONGRESS) | 2014
Keywords
High Performance; Spatial Query; Big Data; Taxi Trip; GPGPU;
DOI
10.1109/BigData.Congress.2014.20
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
City-wide GPS-recorded taxi trip data contain rich information for traffic and travel analysis that can facilitate transportation planning and urban studies. However, traditional data management techniques are largely incapable of processing big taxi trip data at the scale of hundreds of millions of trips. In this study, we aim to use General Purpose computing on Graphics Processing Units (GPGPU) technologies to speed up the processing of complex spatial queries on big taxi data using inexpensive commodity GPUs. By using the land use types of tax lot polygons as a proxy for trip purposes at the pickup and drop-off locations, we formulate a taxi trip data analysis problem as a large-scale nearest neighbor spatial query problem based on point-to-polygon distance. Experiments on nearly 170 million taxi trips in New York City (NYC) in 2009 and 735,488 tax lot polygons with 4,698,986 vertices have demonstrated the efficiency of the proposed techniques: the GPU implementation is about 10-20X faster than the host system and completes the spatial query in about a minute on a low-end workstation equipped with an Nvidia GTX Titan GPU device, with a total equipment cost below $2,000. We further discuss several interesting patterns discovered from the query results that warrant further study. The proposed approach can be an interesting alternative to traditional MapReduce/Hadoop-based approaches to processing big data with respect to performance and cost.
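The query formulated in the abstract, finding for each pickup or drop-off point the tax lot polygon at the smallest point-to-polygon distance, can be illustrated with a minimal brute-force sketch in Python. This is not the authors' GPU implementation, and it uses no spatial indexing or parallelism. The function names, the `lots` dictionary layout, the land use labels, and the convention that a point inside a polygon has distance 0 are all assumptions made for illustration.

```python
import math


def point_segment_distance(px, py, ax, ay, bx, by):
    """Euclidean distance from point P to the line segment AB."""
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: A and B coincide.
        return math.hypot(px - ax, py - ay)
    # Project P onto the line through A and B, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def point_in_polygon(px, py, ring):
    """Even-odd ray casting test; ring is a list of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        ax, ay = ring[i]
        bx, by = ring[(i + 1) % n]
        if (ay > py) != (by > py):
            x_cross = ax + (py - ay) * (bx - ax) / (by - ay)
            if px < x_cross:
                inside = not inside
    return inside


def point_polygon_distance(px, py, ring):
    """Distance from a point to a polygon (assumed 0 if the point is inside)."""
    if point_in_polygon(px, py, ring):
        return 0.0
    n = len(ring)
    return min(
        point_segment_distance(px, py, *ring[i], *ring[(i + 1) % n])
        for i in range(n)
    )


def nearest_lot(point, lots):
    """Return (lot_key, distance) for the polygon nearest to point.

    lots is a hypothetical dict mapping a lot key to its vertex ring.
    """
    px, py = point
    return min(
        ((key, point_polygon_distance(px, py, ring)) for key, ring in lots.items()),
        key=lambda pair: pair[1],
    )


# Two hypothetical square tax lots labelled by land use type.
lots = {
    "residential": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "commercial": [(3, 0), (4, 0), (4, 1), (3, 1)],
}
print(nearest_lot((2.8, 0.5), lots))
```

Each point is compared against every polygon edge, so the cost grows with the product of the number of points and the number of vertices. At the scale quoted in the abstract, that cost is what makes spatial filtering and parallel hardware attractive.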
Pages: 72-79
Number of pages: 8