A Clustering-Guided Contrastive Fusion for Multi-View Representation Learning

Cited by: 19

Authors:

Ke, Guanzhou [1]

Chao, Guoqing [2]

Wang, Xiaoli [3]

Xu, Chenyang [4]

Zhu, Yongqi [1]

Yu, Yang [1]

Affiliations:

[1] Beijing Jiaotong Univ, Inst Data Sci & Intelligent Decis Support, Beijing Inst Big Data Res, Beijing 100080, Peoples R China

[2] Harbin Inst Technol, Sch Comp Sci & Technol, Weihai 264209, Peoples R China

[3] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210000, Peoples R China

[4] Wuyi Univ, Fac Intelligent Mfg, Jiangmen 529000, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2024 / Vol. 34 / No. 04

Keywords:

Task analysis; Semantics; Robustness; Representation learning; Image reconstruction; Data models; Learning systems; Multi-view representation learning; contrastive learning; fusion; clustering; incomplete view; ENHANCEMENT;

DOI:

10.1109/TCSVT.2023.3300319

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Multi-view representation learning aims to extract comprehensive information from multiple sources. It has achieved significant success in applications such as video understanding and 3D rendering. However, how to improve the robustness and generalization of multi-view representations in unsupervised and incomplete scenarios remains an open question in this field. In this study, we discovered a positive correlation between the semantic distance of multi-view representations and the tolerance for data corruption. Moreover, we found that the information ratio of consistency and complementarity significantly impacts the performance of discriminative and generative tasks related to multi-view representations. Based on these observations, we propose an end-to-end CLustering-guided cOntrastiVE fusioN (CLOVEN) method, which enhances the robustness and generalization of multi-view representations simultaneously. To balance consistency and complementarity, we design an asymmetric contrastive fusion module. The module first combines all view-specific representations into a comprehensive representation through a scaling fusion layer. Then, the information of the comprehensive representation and view-specific representations is aligned via a contrastive learning loss function, resulting in a view-common representation that includes both consistent and complementary information. We prevent the module from learning suboptimal solutions by not allowing information alignment between view-specific representations. We design a clustering-guided module that encourages the aggregation of semantically similar views. This action reduces the semantic distance of the view-common representation. We quantitatively and qualitatively evaluate CLOVEN on five datasets, demonstrating its superiority over 13 other competitive multi-view learning methods in terms of clustering and classification performance. In the data-corrupted scenario, our proposed method resists noise interference better than competitors. Additionally, the visualization demonstrates that CLOVEN succeeds in preserving the intrinsic structure of view-specific representations and improves the compactness of view-common representations. Our code can be found at https://github.com/guanzhou-ke/cloven.
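
The asymmetric contrastive fusion idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository linked above for that). It rests on assumptions not stated in this record: the "scaling fusion layer" is approximated by a softmax-weighted sum of the view-specific representations, the contrastive loss is a generic InfoNCE loss, and all function names (`scaling_fusion`, `info_nce`, `asymmetric_contrastive_loss`) and the `temperature` parameter are hypothetical. The one property taken from the abstract is the asymmetry: only fused-versus-view pairs are aligned, never view-versus-view pairs.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def scaling_fusion(views, weights):
    """Assumed stand-in for a 'scaling fusion layer':
    softmax-weighted sum of view-specific representations."""
    w = np.exp(weights - np.max(weights))
    w = w / w.sum()
    return sum(wi * v for wi, v in zip(w, views))


def info_nce(anchor, positive, temperature=0.5):
    """Generic InfoNCE loss: row i of `anchor` should match row i of `positive`."""
    a = l2_normalize(anchor)
    p = l2_normalize(positive)
    logits = a @ p.T / temperature                       # (n, n) similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))


def asymmetric_contrastive_loss(views, weights, temperature=0.5):
    """Align the fused representation with each view.
    There is deliberately no view-vs-view term."""
    fused = scaling_fusion(views, weights)
    return sum(info_nce(fused, v, temperature) for v in views) / len(views)


# Toy usage: two views of four samples, equal fusion weights.
x = np.eye(4)
aligned = asymmetric_contrastive_loss([x, x], np.zeros(2))
misaligned = asymmetric_contrastive_loss([x, x[::-1]], np.zeros(2))
```

In this toy setup the loss is lower when both views agree sample by sample (`aligned`) than when the second view is reversed (`misaligned`), which is the behavior a consistency-seeking alignment term should show.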

Pages: 2056-2069

Page count: 14

