Adversarial Multimodal Representation Learning for Click-Through Rate Prediction

Times cited: 32

Authors:

Li, Xiang [1,2]

Wang, Chao [1,2]

Tan, Jiwei [1,2]

Zeng, Xiaoyi [1,2]

Ou, Dan [1,2]

Zheng, Bo [1,2]

Affiliations:

[1] Alibaba Group, Hangzhou, People's Republic of China

[2] Alibaba Group, Beijing, People's Republic of China

Source:

WEB CONFERENCE 2020: PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE (WWW 2020) | 2020

Keywords:

multimodal learning; adversarial learning; recurrent neural network; attention; representation learning; e-commerce search;

DOI:

10.1145/3366423.3380163

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

To improve user experience and business effectiveness, Click-Through Rate (CTR) prediction has been one of the most important tasks in E-commerce. Although many CTR prediction models have been proposed, learning good item representations from multimodal features remains under-investigated, even though an item in E-commerce usually contains multiple heterogeneous modalities. Previous works either concatenate the features of multiple modalities, which is equivalent to giving each modality a fixed importance weight, or learn dynamic weights of different modalities for different items through techniques such as attention mechanisms. However, redundant information is usually shared across modalities, and dynamic weights computed from this redundant information may not correctly reflect the importance of each modality. To address this, we explore the complementarity and redundancy of modalities by treating modality-specific and modality-invariant features differently. We propose a novel Multimodal Adversarial Representation Network (MARN) for the CTR prediction task. A multimodal attention network first calculates the weights of multiple modalities for each item according to its modality-specific features. Then a multimodal adversarial network learns modality-invariant representations, for which a double-discriminator strategy is introduced. Finally, we obtain the multimodal item representations by combining the modality-specific and modality-invariant representations. We conduct extensive experiments on both public and industrial datasets, and the proposed method consistently achieves remarkable improvements over state-of-the-art methods. Moreover, the approach has been deployed in an operational E-commerce system, and online A/B testing further demonstrates its effectiveness.
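
The fusion step described in the abstract (attention weights computed from modality-specific features, then combined with a modality-invariant representation) can be illustrated with a minimal sketch. This is not the authors' implementation: the dot-product scoring against a `query` vector, the element-wise sum used to combine the two representations, and all function names are assumptions made for illustration, and the adversarial training with double discriminators is not modeled here.

```python
import math
from typing import Dict, List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def modality_weights(specific: Dict[str, Vector], query: Vector) -> Dict[str, float]:
    """Per-item attention weights, one per modality.

    Assumption: each modality is scored by a dot product between its
    modality-specific feature vector and a (hypothetical) learned query.
    """
    names = list(specific)
    scores = [dot(specific[n], query) for n in names]
    return dict(zip(names, softmax(scores)))


def fuse(specific: Dict[str, Vector], invariant: Vector, query: Vector) -> Vector:
    """Combine modality-specific and modality-invariant representations.

    Assumption: the weighted sum of modality-specific vectors is added
    element-wise to the invariant vector; the abstract only says the two
    are "combined".
    """
    weights = modality_weights(specific, query)
    weighted = [
        sum(weights[n] * specific[n][i] for n in specific)
        for i in range(len(invariant))
    ]
    return [w + v for w, v in zip(weighted, invariant)]


if __name__ == "__main__":
    # Toy two-dimensional features for two hypothetical modalities.
    item = {"image": [1.0, 0.0], "title": [0.0, 1.0]}
    print(modality_weights(item, query=[2.0, 0.0]))
    print(fuse(item, invariant=[0.1, 0.2], query=[2.0, 0.0]))
```

Because the weights come only from the modality-specific vectors, information shared across modalities (carried separately in `invariant`) does not influence how much weight each modality receives, which is the motivation stated in the abstract.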

Pages: 827-836

Number of pages: 10

50 records in total

[31] Adversarial Substructured Representation Learning for Mobile User Profiling
Wang, Pengyang
Fu, Yanjie
Xiong, Hui
Li, Xiaolin
KDD'19: PROCEEDINGS OF THE 25TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2019 : 130 - 138
[32] Adversarial Bootstrapped Question Representation Learning for Knowledge Tracing
Sun, Jianwen
Yu, Fenghua
Liu, Sannyuya
Luo, Yawei
Liang, Ruxia
Shen, Xiaoxuan
PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023 : 8016 - 8025
[33] Adversarial Knowledge Representation Learning Without External Model
Lei, Jingpei
Ouyang, Dantong
Liu, Ying
IEEE ACCESS, 2019, 7 : 3512 - 3524
[34] CROSS-CULTURE MULTIMODAL EMOTION RECOGNITION WITH ADVERSARIAL LEARNING
Liang, Jingjun
Chen, Shizhe
Zhao, Jinming
Jin, Qin
Liu, Haibo
Lu, Li
2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019 : 4000 - 4004
[35] Fixation Prediction through Multimodal Analysis
Min, Xiongkuo J
Zhai, Guangtao
Gu, Ke
Yang, Xiaokang
ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2017, 13 (01)
[36] MALN: Multimodal Adversarial Learning Network for Conversational Emotion Recognition
Ren, Minjie
Huang, Xiangdong
Liu, Jing
Liu, Ming
Li, Xuanya
Liu, An-An
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (11) : 6965 - 6980
[37] Multi-Task Deep Learning with Task Attention for Post-Click Conversion Rate Prediction
Luo, Hongxin
Zhou, Xiaobing
Ding, Haiyan
Wang, Liqing
INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2023, 36 (03) : 3583 - 3593
[38] Randomized Prediction Games for Adversarial Machine Learning
Bulo, Samuel Rota
Biggio, Battista
Pillai, Ignazio
Pelillo, Marcello
Roli, Fabio
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2017, 28 (11) : 2466 - 2478
[39] Hybrid Contrastive Learning of Tri-Modal Representation for Multimodal Sentiment Analysis
Mai, Sijie
Zeng, Ying
Zheng, Shuangjia
Hu, Haifeng
IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2023, 14 (03) : 2276 - 2289
[40] Adversarial evasion attacks detection for tree-based ensembles: A representation learning approach
Braun, Gal
Cohen, Seffi
Rokach, Lior
INFORMATION FUSION, 2025, 118