Hybrid Method for Cluster Analysis of Big Data

Cited by: 0

Authors:

Dabas, Chetna [1]

Nigam, Gaurav Kumar [1]

Affiliations:

[1] Jaypee Inst Informat Technol, Noida, India

Source:

INTELLIGENT COMPUTING TECHNIQUES FOR SMART ENERGY SYSTEMS | 2020 / Vol. 607

Keywords:

Analysis; Big data; Clustering; Hybrid method; GENE-EXPRESSION DATA; MODEL;

DOI:

10.1007/978-981-15-0214-9_17

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

In big data analytics, there has been growing interest in computer-mediated communication. Traditional techniques struggle to handle data at this scale, so improved methods are needed, since earlier methods do not suit every situation. Handling big data typically involves several steps, including acquisition, preprocessing, processing, and analysis, in order to extract meaningful semantics from the data. In this context, clustering has emerged as a popular approach for organizing and analyzing big data. This work proposes a hybrid method for the analysis of big data that blends K-means and Ward hierarchical clustering with an interpolation technique. The proposed approach is evaluated and validated on the city dataset in the R language, with the number of clusters and the size of the data varied across experiments. The results show favorable execution times for the proposed method compared with existing methods. The proposed method also offers possible recommendations for extracting specific semantics that provide insights for business recommendations.


Pages: 133-139

Page count: 7
