IMPROVEMENT IN K-MEANS CLUSTERING ALGORITHM FOR DATA CLUSTERING

Cited by: 8
Authors
Rajeswari, K. [1]
Acharya, Omkar [1]
Sharma, Mayur [1]
Kopnar, Mahesh [1]
Karandikar, Kiran [1]
Affiliation
[1] Pimpri Chinchwad Coll Engn, Pune 411044, Maharashtra, India
Source
1ST INTERNATIONAL CONFERENCE ON COMPUTING COMMUNICATION CONTROL AND AUTOMATION ICCUBEA 2015 | 2015
Keywords
Data Clustering; K-Means; unsupervised learning; centroid;
DOI
10.1109/ICCUBEA.2015.205
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Data clustering organizes objects that share the same characteristics into groups, called clusters. It is an unsupervised learning technique for classifying data. K-Means is a widely used and well-known algorithm for cluster analysis. It divides n data points into k clusters according to a similarity measure. Because K-Means is fast, it is commonly used; its applications include vector quantization, cluster analysis, and feature learning. However, the results it produces depend mainly on the choice of initial cluster centroids. Its main shortcoming is that the number of clusters must be supplied in advance, which is highly impractical and requires deep knowledge of the clustering field. In this project, we propose an algorithm that improves the initialization of centroids for K-Means. We work on n-dimensional numerical and categorical data sets. For similarity measurement we consider the Manhattan distance, Dice distance, and cosine distance. The results of the proposed algorithm will be compared with the original K-Means, and its quality and complexity will be checked against the existing algorithm.
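For orientation, the baseline that the abstract refers to can be sketched as follows. This is a minimal illustration of standard K-Means with random initial centroids and a pluggable distance function, not the authors' proposed initialization, which this record does not describe. The function names (`manhattan`, `cosine_distance`, `dice_distance`, `kmeans`) and the parameters `max_iter` and `seed` are assumptions chosen for the example. The Dice distance here assumes binary (0/1) encoded vectors, and the mean-based centroid update is only a heuristic when the distance is not Euclidean.

```python
import math
import random


def manhattan(a, b):
    """Manhattan (L1) distance between two numeric vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros (a convention chosen here)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def dice_distance(a, b):
    """Dice dissimilarity between two binary (0/1) vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(1 for x in a if x) + sum(1 for y in b if y)
    if total == 0:
        return 0.0  # two empty vectors are treated as identical
    return 1.0 - 2.0 * both / total


def kmeans(points, k, distance=manhattan, max_iter=100, seed=0):
    """Standard K-Means loop with randomly sampled initial centroids.

    Assumption: this is the plain baseline, not the improved
    initialization proposed in the paper.
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so stop
        labels = new_labels
        # Update step: move each centroid to the mean of its members;
        # an empty cluster keeps its previous centroid.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


if __name__ == "__main__":
    data = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
    print(kmeans(data, k=2))
```

Because the initial centroids are drawn at random, different values of `seed` can lead to different final clusters on less well-separated data, which is the sensitivity the abstract identifies.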
Pages: 367-369
Page count: 3