Learning Joint Multimodal Representation with Adversarial Attention Networks

Cited by: 17

Authors:

Huang, Feiran [1]

Zhang, Xiaoming [2]

Li, Zhoujun [1]

Affiliations:

[1] Beihang Univ, State Key Lab Software Dev Environm, Beijing, Peoples R China

[2] Beihang Univ, Sch Cyber Sci & Technol, Beijing, Peoples R China

Source:

PROCEEDINGS OF THE 2018 ACM MULTIMEDIA CONFERENCE (MM'18) | 2018

Funding:

Beijing Natural Science Foundation; National Natural Science Foundation of China;

Keywords:

multimodal; representation learning; adversarial networks; attention model; siamese learning;

DOI:

10.1145/3240508.3240614

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202 ;

Abstract:

Recently, learning a joint representation for multimodal data (e.g., data containing both visual content and a text description) has attracted extensive research interest. The features of different modalities are usually correlated and compositive, so a joint representation that captures the correlation is more effective than a subset of the features. Most existing multimodal representation learning methods lack additional constraints to enhance the robustness of the learned representations. In this paper, a novel model, Adversarial Attention Networks (AAN), is proposed that incorporates both the attention mechanism and adversarial networks for effective and robust multimodal representation learning. Specifically, a visual-semantic attention model with a siamese learning strategy is proposed to encode the fine-grained correlation between the visual and textual modalities. Meanwhile, an adversarial learning model is employed to regularize the generated representation by matching the posterior distribution of the representation to the given priors. The two modules are then incorporated into an integrated learning framework to learn the joint multimodal representation. Experimental results on two tasks, multi-label classification and tag recommendation, show that the proposed model outperforms state-of-the-art representation learning methods.


Pages: 1874-1882

Page count: 9

50 items in total

[41] Feature Equilibrium: An Adversarial Training Method to Improve Representation Learning
Minghui Liu
Meiyi Yang
Jiali Deng
Xuan Cheng
Tianshu Xie
Pan Deng
Haigang Gong
Ming Liu
Xiaomin Wang
International Journal of Computational Intelligence Systems, 16
[42] PRAAD: Pseudo representation adversarial learning for unsupervised anomaly detection
Xi, Liang
He, Dong
Liu, Han
JOURNAL OF INFORMATION SECURITY AND APPLICATIONS, 2025, 89
[43] Multimodal Adversarial Learning Based Unsupervised Time Series Anomaly Detection
Huang X.
Zhang F.
Fan H.
Xi L.
Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2021, 58 (08) : 1655 - 1667
[44] Adversarial Representation Learning for Intelligent Condition Monitoring of Complex Machinery
Sun, Shilin
Wang, Tianyang
Yang, Hongxing
Chu, Fulei
IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2023, 70 (05) : 5255 - 5265
[45] Attentive Representation Learning With Adversarial Training for Short Text Clustering
Zhang, Wei
Dong, Chao
Yin, Jianhua
Wang, Jianyong
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (11) : 5196 - 5210
[46] Establishing joint attention with multimodal resources in lingua franca guided tours
Hosoda, Yuri
Aline, David
LEARNING CULTURE AND SOCIAL INTERACTION, 2021, 31
[47] Adversarial Adaptive Interpolation in Autoencoders for Dually Regularizing Representation Learning
Li, Guanyue
Wei, Xiwen
Wu, Si
Yu, Zhiwen
Qian, Sheng
Wong, Hau-San
IEEE MULTIMEDIA, 2022, 29 (03) : 57 - 67
[48] Incremental Unit Networks for Distributed, Symbolic Multimodal Processing and Representation
Imtiaz, Mir Tahsin
Kennington, Casey
DIGITAL HUMAN MODELING AND APPLICATIONS IN HEALTH, SAFETY, ERGONOMICS AND RISK MANAGEMENT: HEALTH, OPERATIONS MANAGEMENT, AND DESIGN, PT II, 2022, 13320 : 344 - 363
[49] Joint compression and despeckling by SAR representation learning
Amao-Oliva, Joel
Foix-Colonier, Nils
Sica, Francescopaolo
ISPRS JOURNAL OF PHOTOGRAMMETRY AND REMOTE SENSING, 2025, 220 : 524 - 534
[50] Learning a Joint Representation for Classification of Networked Documents
You, Zhenni
Qian, Tieyun
NEURAL INFORMATION PROCESSING (ICONIP 2018), PT V, 2018, 11305 : 199 - 209
