Supervised cross-modal factor analysis for multiple modal data classification

Cited by: 13
Authors
Wang, Jingbin [1 ,2 ]
Zhou, Yihua [3 ]
Duan, Kanghong [4 ]
Wang, Jim Jing-Yan [5 ]
Bensmail, Halima [6 ]
Affiliations
[1] Chinese Acad Sci, Natl Time Serv Ctr, Xian 710600, Peoples R China
[2] Chinese Acad Sci, Grad Univ, Beijing 100039, Peoples R China
[3] Lehigh Univ, Dept Mech Engn & Mech, Bethlehem, PA 18015 USA
[4] State Ocean Adm, North China Sea Marine Tech Support Ctr, Qingdao 266033, Peoples R China
[5] King Abdullah Univ Sci & Technol, Comp Elect & Math Sci & Engn Div, Thuwal 23955, Saudi Arabia
[6] Qatar Comp Res Inst, Doha 5825, Qatar
Source
2015 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC 2015): BIG DATA ANALYTICS FOR HUMAN-CENTRIC SYSTEMS | 2015
Keywords
Multiple modal learning; Cross-modal factor analysis; Supervised learning; SPARSE REPRESENTATION; TEXT CLASSIFICATION; SURFACE; ACTIVATION; NETWORK;
DOI
10.1109/SMC.2015.329
CLC classification
TP3 [Computing Technology, Computer Technology]
Discipline code
0812
Abstract
In this paper we study the problem of learning from multi-modal data for the purpose of document classification. In this problem, each document is composed of two different modalities of data, i.e., an image and a text. Cross-modal factor analysis (CFA) has been proposed to project the two modalities into a shared data space, so that the classification of an image or a text can be performed directly in this space. A disadvantage of CFA is that it ignores the supervision information. In this paper, we improve CFA by incorporating the supervision information to represent and classify both the image and text modalities of documents. We project both image and text data into a shared data space by factor analysis, and then train a class label predictor in the shared space to exploit the class label information. The factor analysis parameters and the predictor parameter are learned jointly by solving a single objective function. With this objective function, we minimize both the distance between the projections of the image and text of the same document, and the classification error of the projections as measured by the hinge loss function. The objective function is optimized with an alternating optimization strategy in an iterative algorithm. Experiments on two multi-modal document data sets show the advantage of the proposed algorithm over other CFA methods.
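The abstract's objective, a cross-modal coupling term plus a hinge-loss classifier in the shared space, optimized by alternating updates, can be sketched as follows. This is an illustrative reconstruction, not the paper's exact algorithm: the variable names, the subgradient update rule, the 1/n scaling, and all hyperparameter values are my assumptions.

```python
import numpy as np

def supervised_cfa(X_img, X_txt, y, k=3, lam=1.0, gamma=0.1,
                   lr=0.05, iters=300, seed=0):
    """Sketch of supervised cross-modal factor analysis.

    X_img: (d1, n) image features; X_txt: (d2, n) text features;
    y: (n,) labels in {-1, +1}.  Learns projections U, V into a shared
    k-dimensional space and a linear predictor w by alternating
    (sub)gradient steps on
        (1/n) * ||U^T X_img - V^T X_txt||_F^2      (cross-modal coupling)
      + (lam/n) * sum_i hinge(y_i, w^T U^T x_i)    (supervised hinge loss)
      + gamma * (||U||^2 + ||V||^2 + ||w||^2)      (regularization)
    """
    rng = np.random.default_rng(seed)
    d1, n = X_img.shape
    d2, _ = X_txt.shape
    U = rng.normal(scale=0.1, size=(d1, k))
    V = rng.normal(scale=0.1, size=(d2, k))
    w = rng.normal(scale=0.1, size=k)
    for _ in range(iters):
        # Step 1: update the projections U, V with the predictor w fixed.
        P = U.T @ X_img                 # (k, n) image projections
        Q = V.T @ X_txt                 # (k, n) text projections
        diff = P - Q
        active = ((y * (w @ P)) < 1).astype(float)  # margin violators
        g_U = (2 / n) * X_img @ diff.T \
              - (lam / n) * np.outer(X_img @ (active * y), w) \
              + 2 * gamma * U
        g_V = -(2 / n) * X_txt @ diff.T + 2 * gamma * V
        U -= lr * g_U
        V -= lr * g_V
        # Step 2: update the predictor w with U, V fixed.
        P = U.T @ X_img
        active = ((y * (w @ P)) < 1).astype(float)
        g_w = -(lam / n) * P @ (active * y) + 2 * gamma * w
        w -= lr * g_w
    return U, V, w
```

After training, either modality can be classified directly in the shared space, e.g. `np.sign(w @ (U.T @ X_img))` for images or `np.sign(w @ (V.T @ X_txt))` for texts, which is the point of coupling the two projections.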
Pages: 1882 - 1888
Page count: 7
Related papers
50 records
  • [41] Metaplasticity framework for cross-modal synaptic plasticity in adults
    Lee, Hey-Kyoung
    FRONTIERS IN SYNAPTIC NEUROSCIENCE, 2023, 14
  • [42] Cross-Modal Transformers for Infrared and Visible Image Fusion
    Park, Seonghyun
    Vien, An Gia
    Lee, Chul
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (02) : 770 - 785
  • [43] Semantic Collaborative Learning for Cross-Modal Moment Localization
    Hu, Yupeng
    Wang, Kun
    Liu, Meng
    Tang, Haoyu
    Nie, Liqiang
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2024, 42 (02)
  • [44] Adaptive Adversarial Learning based cross-modal retrieval
    Li, Zhuoyi
    Lu, Huibin
    Fu, Hao
    Wang, Zhongrui
    Gu, Guanghun
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 123
  • [45] Unsupervised Cross-Modal Hashing With Modality-Interaction
    Tu, Rong-Cheng
    Jiang, Jie
    Lin, Qinghong
    Cai, Chengfei
    Tian, Shangxuan
    Wang, Hongfa
    Liu, Wei
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (09) : 5296 - 5308
  • [46] Masked cross-modal priming turns on a glimpse of the prime
    Davis, Chris
    Kim, Jeesun
    CONSCIOUSNESS AND COGNITION, 2015, 33 : 457 - 471
  • [47] Semantic Disentanglement Adversarial Hashing for Cross-Modal Retrieval
    Meng, Min
    Sun, Jiaxuan
    Liu, Jigang
    Yu, Jun
    Wu, Jigang
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (03) : 1914 - 1926
  • [48] Cross-modal learning for optical flow estimation with events
    Zhang, Chi
    Jiang, Chenxu
    Yu, Lei
    SIGNAL PROCESSING, 2024, 223
  • [49] Combination subspace graph learning for cross-modal retrieval
    Xu, Gongwen
    Li, Xiaomei
    Shi, Lin
    Zhang, Zhijun
    Zhai, Aidong
    ALEXANDRIA ENGINEERING JOURNAL, 2020, 59 (03) : 1333 - 1343
  • [50] Cross-Modal Omni Interaction Modeling for Phrase Grounding
    Yu, Tianyu
    Hui, Tianrui
    Yu, Zhihao
    Liao, Yue
    Yu, Sansi
    Zhang, Faxi
    Liu, Si
    MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 1725 - 1734