A survey on graphic processing unit computing for large-scale data mining

Cited by: 62
Authors
Cano, Alberto [1]
Affiliations
[1] Virginia Commonwealth Univ, Dept Comp Sci, Richmond, VA 23284 USA
Keywords
NEURAL-NETWORK; FINGERPRINT IDENTIFICATION; MAPREDUCE FRAMEWORK; K-MEANS; GPU; CLASSIFICATION; ALGORITHMS; PERFORMANCE; CLUSTER; SYSTEM
DOI
10.1002/widm.1232
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
General purpose computation using Graphic Processing Units (GPUs) is a well-established research area focusing on high-performance computing solutions for massively parallelizable and time-consuming problems. Classical methodologies in machine learning and data mining cannot handle processing of massive and high-speed volumes of information in the context of the big data era. GPUs have successfully improved the scalability of data mining algorithms to address significantly larger dataset sizes in many application areas. The popularization of distributed computing frameworks for big data mining opens up new opportunities for transformative solutions combining GPUs and distributed frameworks. This survey analyzes current trends in the use of GPU computing for large-scale data mining, discusses GPU architecture advantages for handling volume and velocity of data, identifies limitation factors hampering the scalability of the problems, and discusses open issues and future directions. (c) 2017 Wiley Periodicals, Inc.
Pages: 24