A formally based parallelization of data mining algorithms for multi-core systems

Cited by: 2
Authors
Kholod, Ivan [1]
Shorov, Andrey [1]
Titkov, Evgenii [1]
Gorlatch, Sergei [2]
Affiliations
[1] St Petersburg Electrotech Univ LETI, St Petersburg, Russia
[2] Univ Munster, Munster, Germany
Keywords
Parallel algorithms; Data mining; Parallel data mining; Program transformation; Functional programming; Parallel programming
DOI
10.1007/s11227-018-2473-8
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
We describe a novel, systematic approach to efficiently parallelizing data mining algorithms: starting with the representation of an algorithm as a sequential composition of functions, we formally transform it into a parallel form using higher-order functions for specifying parallelism. We implement the approach as an extension of the industrial-strength Java-based library Xelopes, and we illustrate its use by developing a multi-threaded Java program for the popular naive Bayes classification algorithm. In comparison with the popular MapReduce programming model, our resulting programs support not only data-parallel but also task-parallel implementations, as well as a combination of both. Our experiments demonstrate efficient parallelization and good scalability on multi-core processors.
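The idea in the abstract, a sequential composition of functions whose parallel form is specified by a higher-order function, can be illustrated with a minimal Java sketch. It is not the authors' implementation and does not use the Xelopes API. The names `NaiveBayesSketch`, `parallel`, `count`, `mergeCounts`, and `train` are hypothetical, and the model is reduced to the frequency counts that naive Bayes training needs. Only the data-parallel case is shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.BinaryOperator;
import java.util.function.Function;

public class NaiveBayesSketch {

    // Hypothetical higher-order function: applies `step` to every data
    // partition in its own thread and folds the partial models with `merge`.
    static <D, M> M parallel(List<D> parts, Function<D, M> step,
                             BinaryOperator<M> merge, M identity) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, parts.size()));
        try {
            List<Future<M>> futures = new ArrayList<>();
            for (D part : parts) {
                Callable<M> task = () -> step.apply(part);
                futures.add(pool.submit(task));
            }
            M result = identity;
            for (Future<M> f : futures) {
                result = merge.apply(result, f.get());
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }

    // Sequential step: counts class labels and (attribute value, class) pairs.
    // Assumption: the last element of each row is the class label.
    static Map<String, Integer> count(List<String[]> rows) {
        Map<String, Integer> counts = new HashMap<>();
        for (String[] row : rows) {
            String label = row[row.length - 1];
            counts.merge("class=" + label, 1, Integer::sum);
            for (int i = 0; i < row.length - 1; i++) {
                counts.merge(i + "=" + row[i] + "|" + label, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Combines two partial models by adding their counts.
    static Map<String, Integer> mergeCounts(Map<String, Integer> a, Map<String, Integer> b) {
        Map<String, Integer> out = new HashMap<>(a);
        b.forEach((k, v) -> out.merge(k, v, Integer::sum));
        return out;
    }

    // Data-parallel training: split the rows, count each part, merge the counts.
    static Map<String, Integer> train(List<String[]> rows, int threads) throws Exception {
        int workers = Math.max(1, threads);
        int chunk = (rows.size() + workers - 1) / workers;
        List<List<String[]>> parts = new ArrayList<>();
        for (int i = 0; i < rows.size(); i += chunk) {
            parts.add(rows.subList(i, Math.min(rows.size(), i + chunk)));
        }
        Map<String, Integer> empty = new HashMap<>();
        return parallel(parts, NaiveBayesSketch::count, NaiveBayesSketch::mergeCounts, empty);
    }

    public static void main(String[] args) throws Exception {
        List<String[]> rows = Arrays.asList(
            new String[] {"sunny", "hot", "no"},
            new String[] {"rainy", "mild", "yes"},
            new String[] {"sunny", "mild", "no"},
            new String[] {"rainy", "hot", "yes"});
        // The parallel result should equal the sequential one.
        System.out.println(count(rows).equals(train(rows, 2)));
    }
}
```

Because merging counts by addition is associative and commutative, the partitioned run yields the same model as the sequential `count`. Task parallelism, which the abstract also mentions, would need a different combinator and is not covered by this sketch.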
Pages: 7909-7920
Number of pages: 12