Information theory-guided heuristic progressive multi-view coding

Cited by: 2
Authors
Li, Jiangmeng [1, 2]
Gao, Hang [1, 2]
Qiang, Wenwen [1, 2]
Zheng, Changwen [1]
Affiliations
[1] Chinese Acad Sci, Inst Software, Sci & Technol Integrated Informat Syst Lab, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
Keywords
Self-supervised learning; Representation learning; Multi-view; Wasserstein distance; Information theory; Deep network
DOI
10.1016/j.neunet.2023.08.027
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multi-view representation learning aims to capture comprehensive information from multiple views of a shared context. Recent works intuitively apply contrastive learning to different views in a pairwise manner, which is still scalable: view-specific noise is not filtered in learning view-shared representations; the fake negative pairs, where the negative terms are actually within the same class as the positive, and the real negative pairs are coequally treated; evenly measuring the similarities between terms might interfere with optimization. Importantly, few works study the theoretical framework of generalized self-supervised multi-view learning, especially for more than two views. To this end, we rethink the existing multi-view learning paradigm from the perspective of information theory and then propose a novel information theoretical framework for generalized multi-view learning. Guided by it, we build a multi-view coding method with a three-tier progressive architecture, namely Information theory-guided heuristic Progressive Multi-view Coding (IPMC). In the distribution-tier, IPMC aligns the distribution between views to reduce view-specific noise. In the set-tier, IPMC constructs self-adjusted contrasting pools, which are adaptively modified by a view filter. Lastly, in the instance-tier, we adopt a designed unified loss to learn representations and reduce the gradient interference. Theoretically and empirically, we demonstrate the superiority of IPMC over state-of-the-art methods. © 2023 Elsevier Ltd. All rights reserved.
Pages: 415-432
Number of pages: 18