Deep clustering based on embedded auto-encoder

Cited by: 17
Authors
Huang, Xuan [1 ,2 ]
Hu, Zhenlong [3 ,4 ]
Lin, Lin [5 ]
Affiliations
[1] Southwest Jiaotong Univ, Sch Informat Sci & Technol, Chengdu 610031, Peoples R China
[2] Chengdu Coll Univ Elect Sci & Technol China, Chengdu 611731, Peoples R China
[3] Zhejiang A&F Univ, Jiyang Coll, Zhuji 311800, Zhejiang, Peoples R China
[4] Zhejiang Yuexiu Univ, Shaoxing 312000, Zhejiang, Peoples R China
[5] Chengdu Aeronaut Polytech, Coll Informat Engn, Chengdu 610100, Peoples R China
Keywords
Deep clustering; Embedded auto-encoder; Feature representation;
DOI
10.1007/s00500-021-05934-8
CLC number
TP18 [Theory of Artificial Intelligence];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep clustering is a new research direction that combines deep learning and clustering. It performs feature representation and cluster assignment simultaneously, and its clustering performance is significantly superior to that of traditional clustering algorithms. The auto-encoder is a neural network model that can learn hidden features of the input object to achieve nonlinear dimensionality reduction. This paper proposes an embedded auto-encoder network model; specifically, an auto-encoder is embedded into the encoder unit and the decoder unit of the prototype auto-encoder, respectively. To effectively cluster high-dimensional objects, the encoder of the model first encodes the raw features of the input objects and obtains a cluster-friendly feature representation. Then, in the model training stage, smoothness constraints are added to the objective function of the encoder, which significantly improves the representation capability of the hidden-layer coding. Finally, an adaptive self-paced learning threshold is determined from the median distance between each object and its corresponding centroid, and the fine-tuning samples of the model are selected automatically. Experimental results on multiple image datasets show that our model has fewer parameters and higher efficiency, and that its overall clustering performance is significantly superior to state-of-the-art clustering methods.
Pages: 1075-1090
Page count: 16