Tri-factorized Modular Hypergraph Autoencoder for Multimodal Semantic Analysis

Cited by: 0
Authors
Shaily Malik [1]
Geetika Dhand [1]
Kavita Sheoran [1]
Divya Jatain [1]
Vaani Garg [1]
Affiliations
[1] Maharaja Surajmal Institute of Technology, Department of Computer Science and Engineering
Keywords
TriNMF; Multimodal retrieval; Multimodality; Wiki dataset; Semantic analysis; Nonnegative matrix factorization
DOI
10.1007/s42979-024-03210-8
Abstract
For image-to-text and text-to-image classification, the features of data collected from various imaging devices and sensors, together with their text descriptions, must be mapped into a common latent space of reduced dimension. The low-dimensional features should retain as much information as possible with minimal loss. In this paper we propose a cross-modal semantic autoencoder that uses nonnegative matrix factorization (NMF) to factorize the features into a lower rank. Because traditional NMF factorizes the data into only two matrices, it cannot carry all of the information into the lower-dimensional space. We address this with a tri-factorized NMF with hypergraph regularization. Instead of the feature adjacency matrix, the hypergraph regularization uses the more information-rich modularity matrix. The tri-factorized, hypergraph-regularized multimodal autoencoder is evaluated on the Wiki dataset for image-to-text and text-to-image conversion. The autoencoder also enables Multimodal Conditional Principal Label Space Transformation (MCPLST) to reduce the feature dimension. Compared with the semantic autoencoder, the proposed autoencoder improves classification accuracy by up to 1.8%.
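The two building blocks named in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes the standard unregularized tri-factorization objective ||X - F S G^T||_F^2 with multiplicative updates, and the usual graph modularity matrix B = A - d d^T / (2m). The hypergraph regularization term and the autoencoder itself are omitted, and the function names `tri_nmf` and `modularity_matrix` are hypothetical.

```python
import numpy as np


def modularity_matrix(A):
    """Modularity matrix B = A - d d^T / (2m) of a symmetric adjacency matrix A.

    Assumption: the standard graph definition, where d holds the node degrees
    and 2m is the total degree. The paper's exact construction may differ.
    """
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)          # node degrees
    two_m = d.sum()            # total degree (twice the edge weight)
    return A - np.outer(d, d) / two_m


def tri_nmf(X, k1, k2, n_iter=200, eps=1e-9, seed=0):
    """Approximate a nonnegative X (n x m) as F @ S @ G.T with F, S, G >= 0.

    Plain multiplicative updates for the Frobenius loss; no hypergraph or
    orthogonality terms (a simplification, not the published algorithm).
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    F = rng.random((n, k1))
    S = rng.random((k1, k2))
    G = rng.random((m, k2))
    errors = []
    for _ in range(n_iter):
        # Each factor is scaled by (negative gradient part) / (positive part);
        # eps guards against division by zero.
        F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
        errors.append(np.linalg.norm(X - F @ S @ G.T))
    return F, S, G, errors


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.random((20, 15))                 # toy nonnegative feature matrix
    F, S, G, errors = tri_nmf(X, k1=4, k2=3)
    print("reconstruction error:", errors[0], "->", errors[-1])

    A = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]])  # toy 3-node graph
    print(modularity_matrix(A))
```

The middle factor S is what separates this from two-factor NMF: it lets the row and column factors have different ranks (k1 and k2), which is one common motivation for tri-factorization.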
Related papers
50 in total
  • [1] Multimodal semantic analysis with regularized semantic autoencoder
    Malik, Shaily
    Bansal, Poonam
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2022, 42 (02) : 909 - 917
  • [2] Hypergraph Variational Autoencoder for Multimodal Semi-supervised Representation Learning
    Liu, Jingquan
    Du, Xiaoyong
    Li, Yuanzhe
    Hu, Weidong
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2022, PT IV, 2022, 13532 : 395 - 406
  • [3] SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs
    Yu, Lijun
    Cheng, Yong
    Wang, Zhiruo
    Kumar, Vivek
    Macherey, Wolfgang
    Huang, Yanping
    Ross, David A.
    Essa, Irfan
    Bisk, Yonatan
    Yang, Ming-Hsuan
    Murphy, Kevin
    Hauptmann, Alexander G.
    Jiang, Lu
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [4] Deep supervised multimodal semantic autoencoder for cross-modal retrieval
    Tian, Yu
    Yang, Wenjing
    Liu, Qingsong
    Yang, Qiong
    COMPUTER ANIMATION AND VIRTUAL WORLDS, 2020, 31 (4-5)
  • [5] Multimodal hypergraph network with contrastive learning for sentiment analysis
    Huang, Jian
    Jiang, Kun
    Pu, Yuanyuan
    Zhao, Zhengpeng
    Yang, Qiuxia
    Gu, Jinjing
    Xu, Dan
    NEUROCOMPUTING, 2025, 627
  • [6] Dynamic hypergraph convolutional network for multimodal sentiment analysis
    Huang, Jian
    Pu, Yuanyuan
    Zhou, Dongming
    Cao, Jinde
    Gu, Jinjing
    Zhao, Zhengpeng
    Xu, Dan
    NEUROCOMPUTING, 2024, 565
  • [7] MMP-MSH: Multimodal Mortality Prediction Based on a Multilevel Semantic Hypergraph Network
    Niu, Ke
    Zhang, Ke
    Pan, Yijie
    Tai, Wenjuan
    Cai, Jiuyun
    IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, 2024
  • [8] DHHNN: A Dynamic Hypergraph Hyperbolic Neural Network based on variational autoencoder for multimodal data integration and node classification
    Mei, Zhangyu
    Bi, Xiao
    Li, Dianguo
    Xia, Wen
    Yang, Fan
    Wu, Hao
    INFORMATION FUSION, 2025, 119
  • [9] CHAMPS: Cardiac health Hypergraph Analysis using Multimodal Physiological Signals
    Choudhury, Anirban Dutta
    Chowdhury, Ananda S.
    2019 41ST ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC), 2019 : 4640 - 4645
  • [10] Directed Hypergraph-Based STEP Product Semantic Visual Analysis
    Jian, Chengfeng
    Pang, Chunxia
    Tao, Meng
    JOURNAL OF DISCRETE MATHEMATICAL SCIENCES & CRYPTOGRAPHY, 2014, 17 (04) : 363 - 378