Comprehensive Semi-Supervised Multi-Modal Learning

Cited by: 0
Authors
Yang, Yang [1]
Wang, Ke-Tao [1]
Zhan, De-Chuan [1]
Xiong, Hui [2]
Jiang, Yuan [1]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing, Jiangsu, Peoples R China
[2] Rutgers State Univ, New Brunswick, NJ USA
Funding
National Key Research and Development Program of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multi-modal learning refers to learning a precise model that captures the joint representation of different modalities. Despite its promise for multi-modal learning, the co-regularization method rests on the consistency principle together with a sufficiency assumption, which usually does not hold for real-world multi-modal data. Indeed, because modalities in real-world applications are often insufficient, heterogeneous modalities diverge from one another, which poses a critical challenge for multi-modal learning. To this end, we propose a novel Comprehensive Multi-Modal Learning (CMML) framework, which strikes a balance between consistency and divergence among modalities by accounting for insufficiency in one unified framework. Specifically, we use an instance-level attention mechanism to weight the sufficiency of each instance on different modalities. Moreover, novel diversity regularization and robust consistency metrics are designed to discover insufficient modalities. Our empirical studies show the superior performance of CMML on real-world data under various criteria.
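The instance-level attention idea in the abstract can be illustrated with a minimal sketch. This is not the CMML formulation from the paper: the function names, the softmax weighting over per-modality sufficiency scores, and the capped squared distance used as a stand-in for a "robust consistency" measure are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_instance(modal_preds: Sequence[Sequence[float]],
                  sufficiency_scores: Sequence[float]) -> List[float]:
    """Fuse per-modality predictions for ONE instance.

    modal_preds[m] is the class-probability vector from modality m.
    sufficiency_scores[m] is a hypothetical unnormalized score of how
    informative modality m is for this particular instance.
    """
    # Attention weights are computed per instance, so a modality can
    # dominate for one example and be down-weighted for another.
    weights = softmax(sufficiency_scores)
    n_classes = len(modal_preds[0])
    return [sum(w * p[c] for w, p in zip(weights, modal_preds))
            for c in range(n_classes)]


def capped_disagreement(p: Sequence[float], q: Sequence[float],
                        cap: float = 0.5) -> float:
    """Squared distance between two predictions, capped at `cap`.

    The cap (an assumed design choice) limits how strongly one badly
    diverging modality pair can drive a consistency penalty.
    """
    dist = sum((a - b) ** 2 for a, b in zip(p, q))
    return min(dist, cap)


# Hypothetical two-modality instance: the image modality is scored as
# more sufficient than the text modality.
image_pred = [0.9, 0.1]
text_pred = [0.4, 0.6]
fused = fuse_instance([image_pred, text_pred], sufficiency_scores=[2.0, 0.0])
penalty = capped_disagreement(image_pred, text_pred, cap=0.3)
```

Because the weights are normalized per instance, the fused vector remains a probability distribution, and it leans toward whichever modality receives the higher sufficiency score for that instance.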
Pages: 4092-4098
Page count: 7