Information bottleneck fusion for deep multi-view clustering

Cited by: 3
Authors
Hu, Jie [1 ,2 ,3 ,4 ]
Yang, Chenghao [1 ]
Huang, Kai [1 ]
Wang, Hongjun [1 ,2 ,3 ,4 ]
Peng, Bo [1 ,2 ,3 ,4 ]
Li, Tianrui [1 ,2 ,3 ,4 ]
Affiliations
[1] Southwest Jiaotong Univ, Sch Comp & Artificial Intelligence, Chengdu 611756, Peoples R China
[2] Minist Educ, Engn Res Ctr Sustainable Urban Intelligent Transpo, Chengdu 611756, Peoples R China
[3] Southwest Jiaotong Univ, Natl Engn Lab Integrated Transportat Big Data Appl, Chengdu 611756, Peoples R China
[4] Southwest Jiaotong Univ, Mfg Ind Chains Collaborat & Informat Support Techn, Chengdu 611756, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Multi-view clustering; Information bottleneck; Contrastive learning; Linear encoding; Collaborative training; FRAMEWORK;
DOI
10.1016/j.knosys.2024.111551
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multi-view clustering aims to exploit semantic information from multiple perspectives to accomplish the clustering task. However, a crucial concern in this domain is the selection of distinctive features. Most existing methods map data into a single feature space and then construct a similarity matrix, which often leads to insufficient utilisation of the intrinsic information in the data while neglecting the impact of noise, resulting in poor representation learning performance. The information bottleneck (IB) is a theoretical model grounded in information theory whose core idea is to extract the information useful for a given task by selecting an appropriate representation and discarding redundant, irrelevant information. In this study, we propose an innovative IB fusion model for deep multi-view clustering (IBFDMVC), which operates on two distinct feature spaces and reconstructs semantic information in a parallel manner. IBFDMVC consists of three modules. The encoder module uses two linear encoding layers to learn embeddings with different dimensions. The fusion module adopts a collaborative training concept, where contrastive learning is first employed to enhance the representation and IB theory is then used to reduce representation noise. Finally, clustering is performed with k-means in the clustering module. Compared with state-of-the-art multi-view clustering methods, IBFDMVC achieves better results, verifying the significant role of IB theory in providing a robust framework for feature selection and semantic information extraction in multi-view data analysis.
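The three-module pipeline described in the abstract (a linear encoder per view, contrastive alignment during fusion, then k-means on the fused embedding) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the InfoNCE-style contrastive loss, the simple averaging fusion, the synthetic two-view data, and all dimensions are assumptions for the sketch, and the IB denoising term of the fusion module is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear_encode(X, W):
    # one linear encoding layer: project a view into an embedding space
    return X @ W

def infonce_loss(Z1, Z2, tau=0.5):
    # contrastive (InfoNCE-style) alignment between two view embeddings;
    # matching rows are positives, all other rows serve as negatives
    Z1 = Z1 / np.linalg.norm(Z1, axis=1, keepdims=True)
    Z2 = Z2 / np.linalg.norm(Z2, axis=1, keepdims=True)
    sim = Z1 @ Z2.T / tau
    logits = sim - sim.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

def kmeans(Z, k, iters=50):
    # plain k-means on the fused embedding (the clustering module)
    centers = Z[rng.choice(len(Z), k, replace=False)].copy()
    for _ in range(iters):
        labels = np.argmin(((Z[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(axis=0)
    return labels

# two synthetic "views" of the same 2-cluster data (illustrative only)
n, d, emb = 40, 8, 4
base = np.vstack([rng.normal(0, 1, (n // 2, d)), rng.normal(5, 1, (n // 2, d))])
view1 = base + rng.normal(0, 0.1, base.shape)
view2 = base + rng.normal(0, 0.1, base.shape)

W1, W2 = rng.normal(size=(d, emb)), rng.normal(size=(d, emb))
Z1, Z2 = linear_encode(view1, W1), linear_encode(view2, W2)
loss = infonce_loss(Z1, Z2)          # alignment term a trainer would minimise
labels = kmeans((Z1 + Z2) / 2, k=2)  # cluster a naively fused embedding
```

In the actual model the encoder weights would be trained by minimising the fusion-module objective (contrastive alignment plus the IB compression term) before the final k-means step; here the weights stay random purely to show the data flow.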
Pages: 12