A unified multimodal classification framework based on deep metric learning

Authors
Peng, Liwen [1, 2]
Jian, Songlei [2]
Li, Minne [1]
Kan, Zhigang [1]
Qiao, Linbo [2]
Li, Dongsheng [2]
Affiliations
[1] Intelligent Game & Decis Lab, Beijing 100080, Peoples R China
[2] Natl Univ Def Technol, Coll Comp, Changsha 410073, Hunan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Multimodal classification; Deep metric learning; Multimodal learning; Fake news detection; Sentiment analysis; FUSION
DOI
10.1016/j.neunet.2024.106747
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multimodal classification algorithms play an essential role in multimodal machine learning, aiming to categorize data points by analyzing characteristics from multiple modalities. Extensive research has been conducted on distilling multimodal attributes and devising specialized fusion strategies for targeted classification tasks. Nevertheless, current algorithms mainly concentrate on a specific classification task and process data only from the corresponding modalities. To address these limitations, we propose a unified multimodal classification framework (UMCF) capable of handling diverse multimodal classification tasks and processing data from disparate modalities. UMCF is task-independent, and its unimodal feature extraction module can be adaptively substituted to accommodate data from diverse modalities. Moreover, we construct the multimodal learning scheme based on deep metric learning to mine latent characteristics within multimodal data. Specifically, we design metric-based triplet learning to extract the intra-modal relationships within each modality and contrastive pairwise learning to capture the inter-modal relationships across modalities. Extensive experiments on two multimodal classification tasks, fake news detection and sentiment analysis, demonstrate that UMCF can extract multimodal data features and achieve better classification performance than task-specific benchmarks. UMCF outperforms the best fake news detection baselines by 2.3% on average in F1 score.
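The abstract names two metric-learning objectives, triplet learning within a modality and contrastive pairwise learning across modalities, without giving their formulas. The sketch below shows the textbook forms of both losses in NumPy, for illustration only. It is not the authors' implementation: the function names, the Euclidean distance, the margin of 1.0, and the way pairs are labelled as matching or non-matching are all assumptions made here.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Textbook triplet margin loss (assumed form, not taken from the paper).

    Each row is one embedding from a single modality. The loss is zero once the
    anchor is closer to the positive than to the negative by at least `margin`.
    """
    d_pos = np.linalg.norm(anchor - positive, axis=1)
    d_neg = np.linalg.norm(anchor - negative, axis=1)
    return float(np.mean(np.maximum(d_pos - d_neg + margin, 0.0)))


def contrastive_pair_loss(x, y, same, margin=1.0):
    """Textbook pairwise contrastive loss (assumed form, not taken from the paper).

    `x` and `y` hold embeddings from two different modalities, row-aligned into
    pairs. `same` is 1.0 for pairs that should match and 0.0 otherwise.
    """
    d = np.linalg.norm(x - y, axis=1)
    pull = same * d ** 2                                   # matching pairs: shrink distance
    push = (1.0 - same) * np.maximum(margin - d, 0.0) ** 2  # others: push beyond the margin
    return float(np.mean(pull + push))


if __name__ == "__main__":
    # Hypothetical 2-D embeddings, chosen only to make the arithmetic easy to follow.
    anchor = np.array([[0.0, 0.0]])
    positive = np.array([[0.0, 1.0]])
    negative = np.array([[0.0, 0.5]])
    print("intra-modal triplet loss:", triplet_loss(anchor, positive, negative))

    text_emb = np.array([[0.0, 0.0], [0.0, 0.0]])
    image_emb = np.array([[0.0, 0.5], [0.0, 0.5]])
    same = np.array([1.0, 0.0])
    print("inter-modal contrastive loss:", contrastive_pair_loss(text_emb, image_emb, same))
```

In a full training setup these two terms would typically be combined with a classification loss; how UMCF weights or combines them is not stated in the abstract.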
Pages: 12