Supervised cross-modal factor analysis for multiple modal data classification

Cited by: 13

Authors:

Wang, Jingbin [1,2]

Zhou, Yihua [3]

Duan, Kanghong [4]

Wang, Jim Jing-Yan [5]

Bensmail, Halima [6]

Affiliations:

[1] Chinese Acad Sci, Natl Time Serv Ctr, Xian 710600, Peoples R China

[2] Chinese Acad Sci, Grad Univ, Beijing 100039, Peoples R China

[3] Lehigh Univ, Dept Mech Engn & Mech, Bethlehem, PA 18015 USA

[4] State Ocean Adm, North China Sea Marine Tech Support Ctr, Qingdao 266033, Peoples R China

[5] King Abdullah Univ Sci & Technol, Comp Elect & Math Sci & Engn Div, Thuwal 23955, Saudi Arabia

[6] Qatar Comp Res Inst, Doha 5825, Qatar

Source:

2015 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC 2015): BIG DATA ANALYTICS FOR HUMAN-CENTRIC SYSTEMS | 2015

Keywords:

Multiple modal learning; Cross-modal factor analysis; Supervised learning; SPARSE REPRESENTATION; TEXT CLASSIFICATION; SURFACE; ACTIVATION; NETWORK;

DOI:

10.1109/SMC.2015.329

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

In this paper, we study the problem of learning from multi-modal data for the purpose of document classification. In this problem, each document is composed of two different modalities of data, i.e., an image and a text. Cross-modal factor analysis (CFA) has been proposed to project the two modalities to a shared data space, so that the classification of an image or a text can be performed directly in this space. A disadvantage of CFA is that it ignores the supervision information. In this paper, we improve CFA by incorporating the supervision information to represent and classify both the image and text modalities of documents. We project both image and text data to a shared data space by factor analysis, and then train a class label predictor in the shared space to use the class label information. The factor analysis parameter and the predictor parameter are learned jointly by solving a single objective function. With this objective function, we minimize both the distance between the projections of the image and text of the same document and the classification error of the projections, measured by the hinge loss function. The objective function is optimized by an alternating optimization strategy in an iterative algorithm. Experiments on two different multi-modal document data sets show the advantage of the proposed algorithm over other CFA methods.
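The abstract does not state the exact form of the objective, so the following is only a minimal sketch of the idea it describes, not the authors' algorithm. It assumes binary labels in {-1, +1}, linear projections U (image) and V (text), one linear predictor w shared by both modalities, an added L2 penalty, and plain subgradient steps standing in for the paper's update rules. The names `objective`, `fit`, `C`, `lam`, and `lr` are hypothetical.

```python
import numpy as np

def objective(X, T, y, U, V, w, C=1.0, lam=1e-2):
    """Assumed objective: projection distance + C * hinge loss + L2 penalty."""
    PX, PT = X @ U, T @ V                      # image / text projections
    dist = np.mean(np.sum((PX - PT) ** 2, axis=1))
    hinge = (np.mean(np.maximum(0.0, 1.0 - y * (PX @ w)))
             + np.mean(np.maximum(0.0, 1.0 - y * (PT @ w))))
    reg = lam * (np.sum(U ** 2) + np.sum(V ** 2) + np.sum(w ** 2))
    return dist + C * hinge + reg

def fit(X, T, y, k=2, C=1.0, lam=1e-2, lr=1e-2, iters=300, seed=0):
    """Alternate between updating the projections (U, V) and the predictor w."""
    n = X.shape[0]
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((X.shape[1], k))
    V = 0.1 * rng.standard_normal((T.shape[1], k))
    w = np.zeros(k)
    history = [objective(X, T, y, U, V, w, C, lam)]
    for _ in range(iters):
        # Step 1: hold w fixed, take one subgradient step on U and V.
        PX, PT = X @ U, T @ V
        diff = PX - PT
        ax = y * (y * (PX @ w) < 1.0)          # signed mask of active hinge terms (image)
        at = y * (y * (PT @ w) < 1.0)          # signed mask of active hinge terms (text)
        gU = (2.0 * X.T @ diff - C * np.outer(X.T @ ax, w)) / n + 2.0 * lam * U
        gV = (-2.0 * T.T @ diff - C * np.outer(T.T @ at, w)) / n + 2.0 * lam * V
        U = U - lr * gU
        V = V - lr * gV
        # Step 2: hold U and V fixed, take one subgradient step on w.
        PX, PT = X @ U, T @ V
        ax = y * (y * (PX @ w) < 1.0)
        at = y * (y * (PT @ w) < 1.0)
        gw = -C * (PX.T @ ax + PT.T @ at) / n + 2.0 * lam * w
        w = w - lr * gw
        history.append(objective(X, T, y, U, V, w, C, lam))
    return U, V, w, history
```

Under these assumptions, a new image `x` or text `t` would be classified in the shared space by the sign of `(x @ U) @ w` or `(t @ V) @ w`, respectively. Extending the sketch to more than two classes (for example, one predictor per class) is left open here, since the abstract does not say how the paper handles it.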

Pages: 1882-1888

Page count: 7
