Design of SPRINT Parallelization of Data Mining Algorithms Based on Cloud Computing

Cited by: 0
Authors
Song, Lei [1]
Zhang, Huajie [2]
Feng, Dongdong [3]
Affiliations
[1] Kaifeng Vocat Coll Culture & Arts, Modern Educ Ctr, Kaifeng 475004, Peoples R China
[2] Zhengzhou Univ Technol, Engn Training Ctr, Zhengzhou 450044, Peoples R China
[3] Henan Univ, Sch Software, Kaifeng 475004, Peoples R China
Keywords
data mining; cloud computing; SPRINT algorithm; parallel design; ENERGY; PREDICTION;
DOI
Not available
Chinese Library Classification (CLC) number
T [Industrial Technology];
Discipline classification code
08;
Abstract
Traditional data mining loads all data into memory for analysis and calculation. This stand-alone computing mode has low computational efficiency and a high mining failure rate during processing. With the rapid development of data storage and computer technology, storing and processing big data effectively has become an important problem. Cloud computing can quickly obtain resources from a computing resource pool and supports parallelized versions of data mining algorithms; combining a cloud computing platform with data mining in this way effectively addresses the bottlenecks of traditional data mining. Therefore, based on the Hadoop cloud computing platform, this paper makes full use of the characteristics of the MapReduce programming framework and proposes a parallel design of decision tree nodes, node attribute metrics, and Gini index ranking for the SPRINT decision tree algorithm. The parallelized SPRINT algorithm is tested for classification accuracy, scalability, and speedup ratio. The results indicate that the parallel design achieves good scalability and parallel speedup while preserving classification accuracy, which verifies the feasibility of parallelizing data mining algorithms on a cloud computing platform.
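As a rough illustration of the idea in the abstract, the sketch below evaluates Gini-index splits per attribute in a map/reduce style. It is not the authors' implementation and does not use Hadoop: the map and reduce steps are simulated in-process, and every name here (`map_attribute_list`, `reduce_best_split`, `best_split`, the toy `age`/`income` data) is a hypothetical chosen for the example. It assumes numeric attributes and binary splits of the form `value <= threshold`.

```python
from collections import Counter, defaultdict


def gini(counts):
    """Gini impurity of a class-count mapping: 1 - sum(p_c ** 2)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def map_attribute_list(attr_name, records):
    """Map step (hypothetical): key each (value, label) pair by its attribute."""
    for value, label in records:
        yield attr_name, (value, label)


def reduce_best_split(attr_name, pairs):
    """Reduce step: sort one attribute list, scan it once, and return
    (attr_name, (weighted_gini, threshold)) for the best binary split."""
    pairs = sorted(pairs)
    n = len(pairs)
    above = Counter(label for _, label in pairs)  # class counts right of the cut
    below = Counter()                             # class counts left of the cut
    best = (float("inf"), None)
    for i in range(n - 1):
        value, label = pairs[i]
        below[label] += 1
        above[label] -= 1
        next_value = pairs[i + 1][0]
        if value == next_value:
            continue  # no valid cut between equal values
        n_below = i + 1
        split_gini = (n_below / n) * gini(below) + ((n - n_below) / n) * gini(above)
        if split_gini < best[0]:
            best = (split_gini, (value + next_value) / 2)
    return attr_name, best


def best_split(dataset):
    """Simulate map -> shuffle -> reduce, then pick the attribute whose
    best split has the lowest weighted Gini index."""
    grouped = defaultdict(list)
    for attr, records in dataset.items():
        for key, pair in map_attribute_list(attr, records):
            grouped[key].append(pair)  # shuffle: group pairs by attribute key
    results = [reduce_best_split(attr, pairs) for attr, pairs in grouped.items()]
    return min(results, key=lambda r: r[1][0])


# Toy data (made up for illustration): one attribute list per attribute.
toy = {
    "age": [(23, "high"), (17, "high"), (43, "high"),
            (68, "low"), (32, "low"), (20, "high")],
    "income": [(30, "high"), (15, "high"), (60, "high"),
               (45, "low"), (20, "low"), (50, "high")],
}
attr, (score, threshold) = best_split(toy)
print(attr, threshold, round(score, 4))
```

Because each attribute list is reduced independently, the per-attribute scans could in principle run on separate workers, which is the property a MapReduce formulation would rely on; how the paper actually partitions the work is not stated in the abstract.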
Pages: 399-405
Page count: 7