Design of SPRINT Parallelization of Data Mining Algorithms Based on Cloud Computing

Cited by: 0
Authors
Song, Lei [1]
Zhang, Huajie [2]
Feng, Dongdong [3]
Affiliations
[1] Kaifeng Vocat Coll Culture & Arts, Modern Educ Ctr, Kaifeng 475004, Peoples R China
[2] Zhengzhou Univ Technol, Engn Training Ctr, Zhengzhou 450044, Peoples R China
[3] Henan Univ, Sch Software, Kaifeng 475004, Peoples R China
Keywords
data mining; cloud computing; SPRINT algorithm; parallel design; ENERGY; PREDICTION;
DOI
Not available
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract
Traditional data mining loads all data into memory for analysis and calculation. This stand-alone computing mode has low computational efficiency and a high failure rate during mining. With the rapid development of data storage and computer technology, how to store and process big data effectively has become an important problem. Cloud computing can quickly obtain resources from a computing resource pool and supports parallelized versions of data mining algorithms. Combining a cloud computing platform with data mining in this way helps overcome the bottlenecks of traditional data mining. Therefore, based on the Hadoop cloud computing platform, this paper makes full use of the characteristics of the MapReduce programming framework and proposes a parallel design of decision tree nodes, node attribute metrics, and Gini index ranking for the SPRINT decision tree algorithm. The parallelized SPRINT algorithm is tested for classification accuracy, scalability, and speedup ratio. The results indicate that the parallel design achieves good scalability and parallel speedup while preserving classification accuracy, which verifies the feasibility of parallelizing data mining algorithms on a cloud computing platform.
Pages: 399-405
Number of pages: 7