Geometric Multimodal Deep Learning With Multiscaled Graph Wavelet Convolutional Network

Cited by: 10
Authors
Behmanesh, Maysam [1]
Adibi, Peyman [1]
Ehsani, Sayyed Mohammad Saeed [1]
Chanussot, Jocelyn [2]
Affiliations
[1] Univ Isfahan, Fac Comp Engn, Artificial Intelligence Dept, Esfahan 8174673441, Iran
[2] Univ Grenoble Alpes, GIPSA Lab, CNRS, Grenoble INP, Grenoble, France
Keywords
Wavelet transforms; Convolution; Wavelet domain; Manifolds; Learning systems; Laplace equations; Deep learning; Geometric deep learning; graph convolution neural networks; graph wavelet transform; multimodal learning; spectral approaches; NEURAL-NETWORK;
DOI
10.1109/TNNLS.2022.3213589
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multimodal data provide complementary information about a natural phenomenon by integrating data from various domains with very different statistical properties. Capturing the intramodality and cross-modality information of multimodal data is the essential capability of multimodal learning methods. Geometry-aware data analysis approaches provide these capabilities by implicitly representing data in various modalities based on their underlying geometric structures. Also, in many applications, data are explicitly defined on an intrinsic geometric structure. Generalizing deep learning methods to non-Euclidean domains is an emerging research field that has recently been investigated in many studies. Most of the popular methods are developed for unimodal data. In this article, a multimodal graph wavelet convolutional network (M-GWCN) is proposed as an end-to-end network. M-GWCN simultaneously finds an intramodality representation, by applying the multiscale graph wavelet transform to provide helpful localization properties in the graph domain of each modality, and a cross-modality representation, by learning permutations that encode correlations among various modalities. M-GWCN is not limited to homogeneous modalities with the same number of data points, nor does it require prior knowledge indicating correspondences between modalities. Several semisupervised node classification experiments have been conducted on three popular unimodal explicit graph-based datasets and five multimodal implicit ones. The experimental results indicate the superiority and effectiveness of the proposed method compared with both spectral graph domain convolutional neural networks and state-of-the-art multimodal methods.
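The multiscale graph wavelet transform mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a heat-kernel wavelet, exp(-s * lambda), applied to the spectrum of the symmetric normalized Laplacian, which is a common choice in spectral graph wavelet networks; the abstract itself does not state the kernel. The function names, the toy path graph, and the scale values are hypothetical and chosen only for illustration. The learned cross-modality permutations are not covered here.

```python
import numpy as np


def normalized_laplacian(adj):
    """Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    d_inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    return np.eye(adj.shape[0]) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]


def heat_wavelets(adj, scales):
    """Return one (psi_s, psi_s_inverse) pair per scale s.

    Assumption: heat kernel g_s(lambda) = exp(-s * lambda); the source
    abstract does not specify which kernel M-GWCN uses.
    """
    eigvals, eigvecs = np.linalg.eigh(normalized_laplacian(adj))
    bases = []
    for s in scales:
        psi = eigvecs @ np.diag(np.exp(-s * eigvals)) @ eigvecs.T
        psi_inv = eigvecs @ np.diag(np.exp(s * eigvals)) @ eigvecs.T
        bases.append((psi, psi_inv))
    return bases


def multiscale_transform(adj, signal, scales):
    """Wavelet coefficients of `signal` at each scale, stacked row-wise."""
    return np.stack([psi_inv @ signal for _, psi_inv in heat_wavelets(adj, scales)])


# Toy example (hypothetical data): a 4-node path graph and one signal.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])
signal = np.array([1.0, 0.0, 0.0, 0.0])
scales = [0.5, 2.0]
coeffs = multiscale_transform(adj, signal, scales)
print(coeffs.shape)
```

Small scales keep each wavelet concentrated near its center node, while larger scales spread it over a wider neighborhood, which is the localization property the abstract refers to. The explicit eigendecomposition used here costs O(n^3) and is only practical for small graphs; polynomial approximations of the kernel are the usual alternative for larger ones.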
Pages: 6991-7005
Number of pages: 15