IMPROVEMENT IN K-MEANS CLUSTERING ALGORITHM FOR DATA CLUSTERING

Cited by: 8

Authors:

Rajeswari, K. [1]

Acharya, Omkar [1]

Sharma, Mayur [1]

Kopnar, Mahesh [1]

Karandikar, Kiran [1]

Affiliations:

[1] Pimpri Chinchwad Coll Engn, Pune 411044, Maharashtra, India

Source:

1ST INTERNATIONAL CONFERENCE ON COMPUTING COMMUNICATION CONTROL AND AUTOMATION ICCUBEA 2015 | 2015

Keywords:

Data Clustering; K-Means; unsupervised learning; centroid

DOI:

10.1109/ICCUBEA.2015.205

Chinese Library Classification number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Data clustering organizes objects that share the same characteristics into groups, called clusters. It is an unsupervised learning technique for classification of data. The K-Means algorithm is a well-known and widely used algorithm for cluster analysis. In this algorithm, n data points are divided into k clusters based on a similarity measurement criterion. The K-Means algorithm is fast and is therefore a commonly used clustering algorithm. Vector quantization, cluster analysis, and feature learning are some of the applications of K-Means. However, the results generated by this algorithm depend mainly on the choice of initial cluster centroids. The main shortcoming of this algorithm is that an appropriate number of clusters must be provided. Providing the number of clusters before applying the algorithm is highly impractical and requires deep knowledge of the clustering field. In this project, we propose an algorithm that improves the initialization of the centroids for the K-Means algorithm. We will work on numerical data sets as well as categorical data sets with n dimensions. For similarity measurement, we will consider the Manhattan distance, Dice distance, and cosine distance. The results of the proposed algorithm will be compared with those of the original K-Means. The quality and complexity of the proposed algorithm will also be checked against the existing algorithm.
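
The abstract names three similarity measures and a baseline whose result depends on the initial centroids. The sketch below illustrates those pieces in plain Python. It is not the authors' proposed initialization method, which this record does not describe. All function names and parameters are hypothetical. Treating Dice distance as a set-based dissimilarity over categorical values, and keeping the mean as the centroid update even under Manhattan distance, are both assumptions made for illustration.

```python
import math


def manhattan_distance(a, b):
    """Sum of absolute coordinate differences between two numeric points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    """One minus the cosine similarity of two numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # convention assumed here for zero vectors
    return 1.0 - dot / (norm_a * norm_b)


def dice_distance(a, b):
    """Dice dissimilarity between two collections of categorical values."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # two empty sets are treated as identical
    return 1.0 - 2.0 * len(a & b) / (len(a) + len(b))


def k_means(points, initial_centroids, distance=manhattan_distance, max_iter=100):
    """Baseline K-Means on numeric points, starting from given centroids.

    The caller supplies the initial centroids, so the dependence of the
    result on initialization is explicit.
    """
    centroids = [list(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda j: distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


points = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels, centroids = k_means(points, initial_centroids=[(0, 0), (10, 10)])
```

Passing different `initial_centroids` to the same data is a simple way to observe the sensitivity to initialization that the abstract describes.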

Pages: 367 - 369

Page count: 3

[1] Improvement of K-Means Algorithm for Accelerated Big Data Clustering
Wu, Chunqiong
Yan, Bingwen
Yu, Rongrui
Huang, Zhangshu
Yu, Baoqin
Yu, Yanliang
Chen, Na
Zhou, Xiukao
INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGIES AND SYSTEMS APPROACH, 2021, 14 (02) : 99 - 119
[2] Research and Improvement on K-Means Clustering Algorithm
Wang, Xue-mei
Wang, Jin-bo
PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION APPLICATIONS (ICCIA 2012), 2012, : 1138 - 1141
[3] Improvement of the k-means clustering filtering algorithm
Lai, Jim Z. C.
Liaw, Yi-Ching
PATTERN RECOGNITION, 2008, 41 (12) : 3677 - 3681
[4] The Improvement and Application of a K-Means Clustering Algorithm
Tao, Li Jun
Hong, Liu Yin
Yan, Hao
PROCEEDINGS OF 2016 IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA ANALYSIS (ICCCBDA 2016), 2016, : 93 - 96
[5] Improvement and Parallelism of k-Means Clustering Algorithm
田金兰
朱林
张素琴
刘璐
Tsinghua Science and Technology, 2005, (03) : 277 - 281
[6] Research on k-means Clustering Algorithm: An Improved k-means Clustering Algorithm
Shi Na
Liu Xumin
Guan Yong
2010 THIRD INTERNATIONAL SYMPOSIUM ON INTELLIGENT INFORMATION TECHNOLOGY AND SECURITY INFORMATICS (IITSI 2010), 2010, : 63 - 67
[7] Improvement of the Fast Clustering Algorithm Improved by K-Means in the Big Data
Xie, Ting
Liu, Ruihua
Wei, Zhengyuan
APPLIED MATHEMATICS AND NONLINEAR SCIENCES, 2020, 5 (01) : 1 - 10
[8] On K-means Data Clustering Algorithm with Genetic Algorithm
Kapil, Shruti
Chawla, Meenu
Ansari, Mohd Dilshad
2016 FOURTH INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED AND GRID COMPUTING (PDGC), 2016, : 202 - 206
[9] Soil data clustering by using K-means and fuzzy K-means algorithm
Hot, Elma
Popovic-Bugarin, Vesna
2015 23RD TELECOMMUNICATIONS FORUM TELFOR (TELFOR), 2015, : 890 - 893
[10] An efficient K-means clustering algorithm for tall data
Capo, Marco
Perez, Aritz
Lozano, Jose A.
DATA MINING AND KNOWLEDGE DISCOVERY, 2020, 34 (03) : 776 - 811