Hybrid Method for Cluster Analysis of Big Data

Cited by: 0
Authors
Dabas, Chetna [1]
Nigam, Gaurav Kumar [1]
Affiliation
[1] Jaypee Inst Informat Technol, Noida, India
Source
INTELLIGENT COMPUTING TECHNIQUES FOR SMART ENERGY SYSTEMS | 2020 / Vol. 607
Keywords
Analysis; Big data; Clustering; Hybrid method; GENE-EXPRESSION DATA; MODEL;
DOI
10.1007/978-981-15-0214-9_17
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
In big data analytics, strong interest in computer-mediated communication has emerged. Traditional techniques struggle to handle data at this massive scale, and earlier methods do not fit every situation, so improved methods are needed. Handling big data normally involves several steps: acquisition, preprocessing, processing, and analysis, with the aim of extracting meaningful semantics from the data. In this context, clustering has evolved into a popular approach for organizing and analyzing big data. This work proposes a hybrid method for the analysis of big data that blends K-means and Ward hierarchical clustering with an interpolation technique. The proposed approach is evaluated and validated on the city dataset in the R language, with both the number of clusters and the size of the data varied across the experiments. The results indicate favorable execution times for the proposed method compared with existing ones. The proposed method also offers possible recommendations for extracting specific semantics that can provide insights for business recommendations.
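The abstract does not say how the three components are combined, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes interpolation is used to fill missing values, K-means reduces the data to a few weighted micro-clusters, and Ward's criterion then merges those micro-clusters. It is written in plain Python rather than R, and every function name, parameter, and the synthetic two-blob data are hypothetical.

```python
import random

def interpolate_missing(series):
    """Fill None gaps by linear interpolation between the nearest known values."""
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no known values")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:            # gap at the start: copy the first known value
            filled[i] = series[right]
        elif right is None:         # gap at the end: copy the last known value
            filled[i] = series[left]
        else:
            t = (i - left) / (right - left)
            filled[i] = series[left] + t * (series[right] - series[left])
    return filled

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd iterations; returns (centroid, size) for each non-empty cluster."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            groups[nearest].append(p)
        centroids = [mean(g) if g else centroids[j] for j, g in enumerate(groups)]
    return [(c, len(g)) for c, g in zip(centroids, groups) if g]

def ward_merge(clusters, n_clusters):
    """Repeatedly merge the pair of (centroid, size) clusters whose merge adds
    the least within-cluster variance (Ward's criterion)."""
    clusters = list(clusters)
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                (ci, ni), (cj, nj) = clusters[i], clusters[j]
                cost = ni * nj / (ni + nj) * sq_dist(ci, cj)
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        (ci, ni), (cj, nj) = clusters[i], clusters[j]
        merged_centroid = tuple((ni * a + nj * b) / (ni + nj) for a, b in zip(ci, cj))
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append((merged_centroid, ni + nj))
    return clusters

# Synthetic data (an assumption for illustration): two blobs with two missing y values.
rng = random.Random(1)
xs = [rng.gauss(0, 0.5) for _ in range(50)] + [rng.gauss(10, 0.5) for _ in range(50)]
ys = [rng.gauss(0, 0.5) for _ in range(50)] + [rng.gauss(10, 0.5) for _ in range(50)]
ys[10] = ys[60] = None
points = list(zip(xs, interpolate_missing(ys)))

micro = kmeans(points, k=8)               # cheap first pass over all points
final = ward_merge(micro, n_clusters=2)   # Ward merging on the few micro-clusters
for centroid, size in final:
    print(size, [round(c, 2) for c in centroid])
```

One reason such a combination could reduce execution time is that the pairwise Ward step runs over a handful of micro-clusters rather than over every data point; whether this matches the reasoning in the paper is not stated in the abstract.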
Pages: 133-139
Page count: 7