A new semi-supervised clustering technique using multi-objective optimization

Cited by: 15
Authors
Alok, Abhay Kumar [1]
Saha, Sriparna [1]
Ekbal, Asif [1]
Affiliations
[1] Indian Inst Technol, Dept Comp Sci & Engn, Patna, Bihar, India
Keywords
Semi-supervised clustering; Multiobjective optimization; Cluster validity index; AMOSA; PERFORMANCE EVALUATION; AUTOMATIC EVOLUTION; CLASSIFICATION; ALGORITHMS
DOI
10.1007/s10489-015-0656-z
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Semi-supervised clustering techniques have been proposed in the literature to overcome the problems associated with unsupervised and supervised classification. They consider a small amount of labeled data together with the whole data distribution while clustering a data set. In this paper, a new approach to semi-supervised clustering is implemented using a multiobjective optimization (MOO) framework. Four objective functions are optimized using the search capability of a multiobjective simulated annealing based technique, AMOSA. These objective functions are based on both unsupervised and supervised information. The first three objective functions represent, respectively, the goodness of the partitioning in terms of Euclidean distance, the total symmetry present in the clusters, and the cluster connectedness. For the last objective function, we have considered different external cluster validity indices, including the adjusted Rand index, the Rand index, a newly developed min-max distance based MMI index, the NMMI index, and the Minkowski Score. We develop five different versions of the Semi-GenClustMOO clustering technique by varying the external cluster validity index. Twenty-four artificial and five real-life data sets have been used in the evaluation. Results show that the proposed semi-supervised clustering technique can effectively detect the appropriate number of clusters as well as the appropriate partitioning from data sets having either well-separated clusters of any shape or symmetrical clusters with or without overlaps. The obtained partitioning results are compared with those of another recently developed multiobjective semi-supervised clustering technique, Mock-Semi. At the end of the paper, the effectiveness of the proposed Semi-GenClustMOO clustering technique is shown in segmenting a remote sensing satellite image of a part of the city of Kolkata.
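As an illustration of the supervised objective mentioned in the abstract, the following is a minimal sketch of three of the named external validity indices (Rand index, adjusted Rand index, Minkowski Score) computed from pair counts over the labeled points only. It uses the standard textbook definitions; the function names, the use of `None` to mark unlabeled points, and the way unlabeled points are skipped are assumptions made for this example and are not taken from the paper. The MMI and NMMI indices are not reproduced because the abstract does not define them.

```python
from itertools import combinations
from math import sqrt


def pair_counts(known, assigned):
    """Classify every pair of labeled points by agreement.

    known    -- class label per point, or None if the point is unlabeled (assumed convention)
    assigned -- cluster label per point from some partitioning
    Returns (a, b, c, d): same/same, same-known only, same-assigned only, different/different.
    """
    labeled = [i for i, lab in enumerate(known) if lab is not None]
    a = b = c = d = 0
    for i, j in combinations(labeled, 2):
        same_known = known[i] == known[j]
        same_assigned = assigned[i] == assigned[j]
        if same_known and same_assigned:
            a += 1
        elif same_known:
            b += 1
        elif same_assigned:
            c += 1
        else:
            d += 1
    return a, b, c, d


def rand_index(a, b, c, d):
    """Fraction of pairs on which the two labelings agree (higher is better)."""
    total = a + b + c + d
    if total == 0:
        raise ValueError("need at least two labeled points")
    return (a + d) / total


def adjusted_rand_index(a, b, c, d):
    """Chance-corrected Rand index in its pair-count form (higher is better)."""
    denom = (a + b) * (b + d) + (a + c) * (c + d)
    if denom == 0:
        # Only happens when both labelings are the same trivial partition.
        return 1.0
    return 2.0 * (a * d - b * c) / denom


def minkowski_score(a, b, c, d):
    """sqrt((n01 + n10) / (n11 + n10)); 0 means a perfect match (lower is better)."""
    if a + b == 0:
        raise ValueError("no same-class pairs among the labeled points")
    return sqrt((b + c) / (a + b))


if __name__ == "__main__":
    known = [0, 0, None, 1, 1, None]   # two points are unlabeled
    assigned = [5, 5, 5, 7, 7, 7]      # hypothetical clustering result
    counts = pair_counts(known, assigned)
    print(rand_index(*counts), adjusted_rand_index(*counts), minkowski_score(*counts))
```

In a multiobjective search, a score like one of these would form the fourth objective alongside the three unsupervised ones; note that the Minkowski Score is minimized while the Rand-type indices are maximized.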
Pages: 633-661
Number of pages: 29