Semi-supervised classification method based on spectral clustering

Cited by: 1
Author
Chen, Xi [1]
Affiliation
[1] School of Mathematics and Computer, Yangtze Normal University, Chongqing
Keywords
Labeled Data; Semi-Supervised Classification; Spectral Clustering; Unlabeled Data
DOI
10.4304/jnw.9.2.384-392
Abstract
With the rapid development of data collection and storage technology, real-world applications offer plentiful unlabeled data but very little labeled data, which is often expensive to obtain. Semi-supervised learning algorithms have therefore attracted much attention. In this paper, we propose a new semi-supervised classification algorithm based on spectral clustering, called SC-SSL. First, we use spectral clustering to partition all labeled and unlabeled data into clusters. Second, we build a classifier from all labeled data and, for each cluster, predict the probabilities (weights) with which each unlabeled instance belongs to each class. Third, for each cluster, we add to the labeled data those unlabeled instances whose maximum-weight label is the same as the cluster label. Fourth, we rebuild the classifier on the new labeled data set. We repeat steps 2 and 3 until the stopping condition is met. Extensive experiments show that SC-SSL makes effective use of the information in unlabeled data, via spectral clustering, to obtain a robust classifier, and that it achieves higher classification accuracy than several well-known semi-supervised algorithms. © 2014 Academy Publisher.
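The steps in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation, and it rests on several assumptions the abstract does not state: the base classifier is a stand-in (softmax over negative squared distance to class centroids), the "cluster label" is taken to be the majority class among the labeled points in a cluster, clusters with no labeled points are skipped, and the stopping condition is "no new instances added, or `max_iter` reached". The function names (`spectral_clusters`, `class_probs`, `sc_ssl`) and the parameters `gamma` and `max_iter` are hypothetical.

```python
import numpy as np

def spectral_clusters(X, k, gamma=1.0):
    """Normalized spectral clustering: RBF affinity, then k-means on eigenvectors."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-gamma * sq)
    np.fill_diagonal(W, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(1))
    L = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)            # eigenvalues come back in ascending order
    U = vecs[:, :k]
    U = U / np.linalg.norm(U, axis=1, keepdims=True)
    # Deterministic farthest-point initialisation for k-means.
    centers = [U[0]]
    for _ in range(1, k):
        d2 = np.min([((U - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(U[d2.argmax()])
    centers = np.array(centers)
    for _ in range(50):
        assign = ((U[:, None] - centers[None]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = U[assign == j].mean(0)
    return assign

def class_probs(X_lab, y_lab, X, classes):
    """Stand-in classifier (assumption): softmax over -squared distance to centroids."""
    cents = np.stack([X_lab[y_lab == c].mean(0) for c in classes])
    z = -((X[:, None] - cents[None]) ** 2).sum(-1)
    z -= z.max(1, keepdims=True)
    p = np.exp(z)
    return p / p.sum(1, keepdims=True)

def sc_ssl(X, y, k, max_iter=10, gamma=1.0):
    """y holds class ids, with -1 marking unlabeled points."""
    y = y.copy()
    classes = np.unique(y[y >= 0])
    clusters = spectral_clusters(X, k, gamma)          # step 1
    for _ in range(max_iter):
        lab = y >= 0
        if lab.all():
            break
        P = class_probs(X[lab], y[lab], X, classes)    # step 2
        pred = classes[P.argmax(1)]
        added = 0
        for c in range(k):                             # step 3
            in_c = clusters == c
            if not (in_c & lab).any():
                continue  # assumption: no labeled members, so no cluster label
            vals, counts = np.unique(y[in_c & lab], return_counts=True)
            cluster_label = vals[counts.argmax()]
            take = in_c & ~lab & (pred == cluster_label)
            y[take] = cluster_label
            added += int(take.sum())
        if added == 0:                                 # assumed stopping condition
            break
    lab = y >= 0
    P = class_probs(X[lab], y[lab], X, classes)        # step 4: final classifier
    return classes[P.argmax(1)], y

# Toy usage: two well-separated blobs with one labeled point each.
rng = np.random.default_rng(0)
X_demo = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(5, 0.3, (20, 2))])
y_demo = np.full(40, -1)
y_demo[0], y_demo[20] = 0, 1
pred_demo, y_grown = sc_ssl(X_demo, y_demo, k=2)
```

In this sketch the clustering is computed once and only the classifier is retrained each round, which matches the abstract's statement that steps 2 and 3 are the ones repeated. A practical version would likely swap in a stronger probabilistic classifier and a tuned affinity.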
Pages: 384-392
Number of pages: 8