High-performance data mining with intelligent SSD

Cited by: 2

Authors:

Jo, Yong-Yeon [1]

Kim, Sang-Wook [1]

Cho, Sung-Woo [1]

Bae, Duck-Ho [1]

Oh, Hyunok [2]

Affiliations:

[1] Hanyang Univ, Dept Comp & Software, Seoul, South Korea

[2] Hanyang Univ, Dept Informat Syst, Seoul, South Korea

Source:

CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS | 2017, Vol. 20, No. 2

Funding:

National Research Foundation, Singapore;

Keywords:

Intelligent SSD; Simulator-based evaluation; Collaborative processing; Heterogeneous scheduling;

DOI:

10.1007/s10586-017-0789-4

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

An intuitive way to process big data efficiently is to reduce the volume of data transferred over the storage interface to a host system. This is why the notion of intelligent SSD (iSSD) was proposed: to give processing power to the SSD. There is a rich literature on iSSD; however, no real implementation has been made available to the public yet. Most prior work aims to quantify the benefits of iSSD with analytical modeling. In this paper, we first develop an iSSD simulator and use it to present the potential of iSSD in data mining. Our iSSD simulator runs on top of the gem5 simulator and fully simulates all the processes of data mining algorithms running in iSSD with cycle-level accuracy. We then address how to exploit all the computing resources for efficient processing of data mining algorithms. These days, CPU, GPU, and SSD are equipped together in most computing environments. If the SSD is later replaced with an iSSD, we have a new computing environment in which the three computing resources collaborate with one another to process big data effectively. For this, scheduling is required to decide which computing resource runs which function at which time. Our heterogeneous scheduling considers the types of computing resources, the memory sizes of the computing resources, and inter-processor communication times, including I/O time in the SSD. Our scheduling results show that processing in the collaborative environment outperforms that in the traditional one by up to about 10 times.
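The scheduling problem described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it is a simple greedy earliest-finish-time assignment, written under the assumptions that functions are independent, that they are scheduled in the given order, and that a data transfer occupies the receiving resource. All names (`Resource`, `Task`, `schedule`) and all timings and memory sizes are hypothetical values chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    memory_mb: int
    free_at: float = 0.0  # time at which the resource becomes idle


@dataclass
class Task:
    name: str
    memory_mb: int
    run_time: dict       # resource name -> estimated execution time (s)
    transfer_time: dict  # resource name -> time to move input data there (s)


def schedule(tasks, resources):
    """Greedy sketch: give each task to the resource that finishes it earliest."""
    plan = []
    for task in tasks:
        best = None
        for res in resources:
            # Skip resources that lack the memory or cannot run this function.
            if task.memory_mb > res.memory_mb or res.name not in task.run_time:
                continue
            start = res.free_at + task.transfer_time.get(res.name, 0.0)
            finish = start + task.run_time[res.name]
            if best is None or finish < best[1]:
                best = (res, finish)
        if best is None:
            raise ValueError(f"no resource can run {task.name}")
        res, finish = best
        res.free_at = finish
        plan.append((task.name, res.name, finish))
    return plan


# Hypothetical numbers: the iSSD computes more slowly but needs no data transfer.
resources = [Resource("CPU", 16384), Resource("GPU", 4096), Resource("iSSD", 512)]
tasks = [
    Task("scan", 256, {"CPU": 4.0, "GPU": 2.0, "iSSD": 3.0},
         {"CPU": 5.0, "GPU": 6.0, "iSSD": 0.0}),
    Task("count", 8192, {"CPU": 3.0, "GPU": 1.0}, {"CPU": 1.0, "GPU": 2.0}),
    Task("filter", 128, {"CPU": 2.0, "GPU": 1.0, "iSSD": 2.0},
         {"CPU": 5.0, "GPU": 6.0, "iSSD": 0.0}),
]
plan = schedule(tasks, resources)
for name, resource, finish in plan:
    print(f"{name} -> {resource}, finishes at t={finish}")
```

With these made-up numbers, the data-heavy `scan` and `filter` functions land on the iSSD because avoiding the transfer outweighs its slower compute, while `count` goes to the CPU because it does not fit in GPU or iSSD memory. A scheduler for real workloads would also need to model dependencies between functions and overlapping of transfers with computation.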

Citation

Pages: 1155-1166

Page count: 12
