Information theory-guided heuristic progressive multi-view coding

Cited by: 2
Authors
Li, Jiangmeng [1 ,2 ]
Gao, Hang [1 ,2 ]
Qiang, Wenwen [1 ,2 ]
Zheng, Changwen [1 ]
Affiliations
[1] Institute of Software, Chinese Academy of Sciences, Science & Technology on Integrated Information System Laboratory, Beijing, China
[2] University of Chinese Academy of Sciences, Beijing, China
Keywords
Self-supervised learning; Representation learning; Multi-view; Wasserstein distance; Information theory; Deep network
DOI
10.1016/j.neunet.2023.08.027
CLC Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Multi-view representation learning aims to capture comprehensive information from multiple views of a shared context. Recent works intuitively apply contrastive learning to different views in a pairwise manner, which still has notable limitations: view-specific noise is not filtered out when learning view-shared representations; false negative pairs, in which the negative terms actually belong to the same class as the positive, are treated the same as true negative pairs; and measuring the similarities between terms uniformly may interfere with optimization. Importantly, few works study the theoretical framework of generalized self-supervised multi-view learning, especially for more than two views. To this end, we rethink the existing multi-view learning paradigm from the perspective of information theory and then propose a novel information-theoretical framework for generalized multi-view learning. Guided by it, we build a multi-view coding method with a three-tier progressive architecture, namely Information theory-guided heuristic Progressive Multi-view Coding (IPMC). In the distribution-tier, IPMC aligns the distributions between views to reduce view-specific noise. In the set-tier, IPMC constructs self-adjusted contrasting pools, which are adaptively modified by a view filter. Lastly, in the instance-tier, we adopt a designed unified loss to learn representations and reduce gradient interference. Theoretically and empirically, we demonstrate the superiority of IPMC over state-of-the-art methods. © 2023 Elsevier Ltd. All rights reserved.
Pages: 415-432
Number of pages: 18
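
The abstract's description of the pairwise contrastive paradigm and the Wasserstein-based distribution alignment can be made concrete with a minimal sketch. The code below is an illustration, not the authors' IPMC implementation: it assumes PyTorch, and the function names (`pairwise_multiview_infonce`, `sliced_wasserstein`) are hypothetical. The first function applies InfoNCE to every ordered pair of views, i.e., the pairwise scheme the abstract critiques, in which all non-matching samples are treated as negatives, including potential false negatives. The second estimates a sliced 1-Wasserstein distance between two views' embedding distributions, in the spirit of the distribution-tier alignment.

```python
import torch
import torch.nn.functional as F


def pairwise_multiview_infonce(views, temperature=0.1):
    """InfoNCE applied to every ordered pair of views (illustrative only).

    views: list of [batch, dim] embedding tensors, one tensor per view.
    All non-matching samples are treated as negatives, so false negatives
    (same-class samples) are penalized exactly like true negatives.
    """
    loss, num_pairs = 0.0, 0
    for i in range(len(views)):
        for j in range(len(views)):
            if i == j:
                continue
            z_i = F.normalize(views[i], dim=1)
            z_j = F.normalize(views[j], dim=1)
            logits = z_i @ z_j.t() / temperature  # [batch, batch] similarities
            targets = torch.arange(z_i.size(0), device=z_i.device)
            loss = loss + F.cross_entropy(logits, targets)
            num_pairs += 1
    return loss / num_pairs


def sliced_wasserstein(x, y, num_projections=64):
    """Sliced 1-Wasserstein estimate between two equal-size embedding batches.

    Random unit directions project both batches to 1-D, where the Wasserstein
    distance reduces to comparing sorted values; averaging over directions
    gives a cheap distribution-alignment term between two views.
    """
    directions = F.normalize(
        torch.randn(x.size(1), num_projections, device=x.device), dim=0)
    x_proj = torch.sort(x @ directions, dim=0).values
    y_proj = torch.sort(y @ directions, dim=0).values
    return (x_proj - y_proj).abs().mean()
```

In a training loop these terms could be combined, e.g. the contrastive loss plus a weighted sum of `sliced_wasserstein` over view pairs; the paper's actual unified loss, view filter, and self-adjusted contrasting pools are specified in the article itself.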