SCAE: Structural Contrastive Auto-Encoder for Incomplete Multi-View Representation Learning

Times cited: 0
Authors
Li, Mengran [1]
Zhang, Ronghui [1]
Zhang, Yong [2]
Piao, Xinglin [2]
Zhao, Shiyu [2]
Yin, Baocai [2]
Affiliations
[1] Sun Yat Sen Univ, Sch Intelligent Syst Engn, Guangdong Prov Key Lab Intelligent Transport Syst, Guangzhou 510006, Peoples R China
[2] Beijing Univ Technol, Beijing Inst Artificial Intelligence, Dept Informat Sci, Beijing Key Lab Multimedia & Intelligent Software, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Incomplete multi-view representation learning; MC-VAE; Dirichlet energy; mutual information maximization; contrastive learning; CLASSIFICATION;
DOI
10.1145/3672078
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Describing an object from multiple perspectives often leads to incomplete data representation. Consequently, learning consistent representations for missing data from multiple views has emerged as a key focus in the realm of Incomplete Multi-view Representation Learning (IMRL). In recent years, various strategies, such as subspace learning, matrix decomposition, and deep learning, have been harnessed to develop numerous IMRL methods. In this article, our primary research revolves around IMRL, with a particular emphasis on addressing two main challenges. Firstly, we delve into the effective integration of intra-view similarity and contextual structure into a unified framework. Secondly, we explore the effective facilitation of information exchange and fusion across multiple views. To tackle these issues, we propose a deep learning approach known as Structural Contrastive Auto-Encoder (SCAE) to solve the challenges of IMRL. SCAE comprises two major components: intra-view structural representation learning and inter-view contrastive representation learning. The former involves capturing intra-view similarity by minimizing the Dirichlet energy of the feature matrix, while also applying spatial dispersion regularization to capture intra-view contextual structure. The latter encourages maximizing the mutual information of inter-view representations, facilitating information exchange and fusion across views. Experimental results demonstrate the efficacy of our approach in significantly enhancing model accuracy and robustly addressing IMRL problems. The code is available at https://github.com/limengran98/SCAE.
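The two quantities named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not taken from the SCAE repository. The function names `dirichlet_energy` and `info_nce`, the toy graph, and the temperature value are all hypothetical. InfoNCE is swapped in here as one common contrastive objective that maximizes a lower bound on mutual information; the abstract does not say which estimator SCAE uses.

```python
import math


def dirichlet_energy(features, adjacency):
    """Return 0.5 * sum_ij A_ij * ||x_i - x_j||^2.

    For a symmetric adjacency matrix A this equals tr(X^T L X)
    with the graph Laplacian L = D - A.
    """
    n = len(features)
    energy = 0.0
    for i in range(n):
        for j in range(n):
            weight = adjacency[i][j]
            if weight:
                sq_dist = sum((a - b) ** 2
                              for a, b in zip(features[i], features[j]))
                energy += 0.5 * weight * sq_dist
    return energy


def _cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(view_a, view_b, temperature=0.5):
    """InfoNCE loss: row i of view_a is the positive pair of row i of view_b.

    Lower loss means matching rows are more similar to each other
    than to the other rows (assumed setup, not the paper's exact loss).
    """
    n = len(view_a)
    total = 0.0
    for i in range(n):
        logits = [_cosine(view_a[i], view_b[j]) / temperature
                  for j in range(n)]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_denominator = peak + math.log(
            sum(math.exp(value - peak) for value in logits))
        total += log_denominator - logits[i]
    return total / n


# Toy path graph 0 - 1 - 2 (hypothetical data).
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
smooth = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]  # neighbors agree
rough = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]   # neighbors disagree

view = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]             # positives misaligned

print(dirichlet_energy(smooth, adjacency), dirichlet_energy(rough, adjacency))
print(info_nce(view, view), info_nce(view, swapped))
```

In this toy setup, features that agree across graph edges have lower Dirichlet energy than features that alternate, and aligned views have lower InfoNCE loss than misaligned ones. This matches the intuition in the abstract: minimizing the former encourages intra-view similarity, and minimizing the latter pulls matching representations from different views together.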
Pages: 24