Adversarial Multimodal Representation Learning for Click-Through Rate Prediction

Cited by: 32
Authors
Li, Xiang [1, 2]
Wang, Chao [1, 2]
Tan, Jiwei [1, 2]
Zeng, Xiaoyi [1, 2]
Ou, Dan [1, 2]
Zheng, Bo [1, 2]
Affiliations
[1] Alibaba Group, Hangzhou, People's Republic of China
[2] Alibaba Group, Beijing, People's Republic of China
Source
WEB CONFERENCE 2020: PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE (WWW 2020) | 2020
Keywords
multimodal learning; adversarial learning; recurrent neural network; attention; representation learning; e-commerce search
DOI
10.1145/3366423.3380163
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
For better user experience and business effectiveness, Click-Through Rate (CTR) prediction has been one of the most important tasks in E-commerce. Although many CTR prediction models have been proposed, learning good item representations from multimodal features remains under-investigated, even though an item in E-commerce usually contains multiple heterogeneous modalities. Previous works either concatenate the features of multiple modalities, which is equivalent to giving each modality a fixed importance weight, or learn dynamic weights of different modalities for different items through techniques such as attention mechanisms. However, redundant information is usually shared across modalities, and dynamic weights computed from this redundant information may not correctly reflect the importance of each modality. To address this, we explore the complementarity and redundancy of modalities by treating modality-specific and modality-invariant features differently. We propose a novel Multimodal Adversarial Representation Network (MARN) for the CTR prediction task. A multimodal attention network first calculates the weights of multiple modalities for each item according to its modality-specific features. Then a multimodal adversarial network learns modality-invariant representations, for which a double-discriminator strategy is introduced. Finally, we obtain the multimodal item representations by combining the modality-specific and modality-invariant representations. We conduct extensive experiments on both public and industrial datasets, and the proposed method consistently achieves remarkable improvements over state-of-the-art methods. Moreover, the approach has been deployed in an operational E-commerce system, and online A/B testing further demonstrates its effectiveness.
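The attention-then-combine flow described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the scoring function (`v · tanh(W s)`), the mean-pooling of modality-invariant features, the additive combination, and all names (`combine_modalities`, `W`, `v`) are assumptions chosen for illustration, and the adversarial training with discriminators is omitted entirely.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def combine_modalities(specific, invariant, W, v):
    """Hypothetical combination step for one item.

    specific:  (M, d) modality-specific features, one row per modality
    invariant: (M, d) modality-invariant features, one row per modality
    W: (d, h) and v: (h,) are assumed attention parameters.
    """
    # Attention weights are computed from modality-specific features only,
    # so shared (redundant) information does not influence them.
    scores = np.tanh(specific @ W) @ v          # shape (M,)
    weights = softmax(scores)                   # shape (M,), sums to 1
    weighted_specific = weights @ specific      # shape (d,)
    # Assumption: invariant features are averaged across modalities.
    shared = invariant.mean(axis=0)             # shape (d,)
    # Assumption: the two parts are combined by element-wise addition.
    return weights, weighted_specific + shared

rng = np.random.default_rng(0)
M, d, h = 3, 4, 5  # e.g. 3 modalities, 4-dim features, 5-dim attention space
specific = rng.normal(size=(M, d))
invariant = rng.normal(size=(M, d))
W = rng.normal(size=(d, h))
v = rng.normal(size=h)

weights, item_repr = combine_modalities(specific, invariant, W, v)
print(weights.shape, item_repr.shape)
```

The point of the sketch is only the separation of roles: the weights depend on the modality-specific rows alone, while the invariant rows contribute to the final vector without affecting the weighting.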
Pages: 827-836
Page count: 10
Related papers
50 in total
  • [31] Adversarial Substructured Representation Learning for Mobile User Profiling
    Wang, Pengyang
    Fu, Yanjie
    Xiong, Hui
    Li, Xiaolin
    KDD'19: PROCEEDINGS OF THE 25TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2019, : 130 - 138
  • [32] Adversarial Bootstrapped Question Representation Learning for Knowledge Tracing
    Sun, Jianwen
    Yu, Fenghua
    Liu, Sannyuya
    Luo, Yawei
    Liang, Ruxia
    Shen, Xiaoxuan
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 8016 - 8025
  • [33] Adversarial Knowledge Representation Learning Without External Model
    Lei, Jingpei
    Ouyang, Dantong
    Liu, Ying
    IEEE ACCESS, 2019, 7 : 3512 - 3524
  • [34] CROSS-CULTURE MULTIMODAL EMOTION RECOGNITION WITH ADVERSARIAL LEARNING
    Liang, Jingjun
    Chen, Shizhe
    Zhao, Jinming
    Jin, Qin
    Liu, Haibo
    Lu, Li
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 4000 - 4004
  • [35] Fixation Prediction through Multimodal Analysis
    Min, Xiongkuo J
    Zhai, Guangtao
    Gu, Ke
    Yang, Xiaokang
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2017, 13 (01)
  • [36] MALN: Multimodal Adversarial Learning Network for Conversational Emotion Recognition
    Ren, Minjie
    Huang, Xiangdong
    Liu, Jing
    Liu, Ming
    Li, Xuanya
    Liu, An-An
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (11) : 6965 - 6980
  • [37] Multi-Task Deep Learning with Task Attention for Post-Click Conversion Rate Prediction
    Luo, Hongxin
    Zhou, Xiaobing
    Ding, Haiyan
    Wang, Liqing
    INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2023, 36 (03) : 3583 - 3593
  • [38] Randomized Prediction Games for Adversarial Machine Learning
    Bulo, Samuel Rota
    Biggio, Battista
    Pillai, Ignazio
    Pelillo, Marcello
    Roli, Fabio
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2017, 28 (11) : 2466 - 2478
  • [39] Hybrid Contrastive Learning of Tri-Modal Representation for Multimodal Sentiment Analysis
    Mai, Sijie
    Zeng, Ying
    Zheng, Shuangjia
    Hu, Haifeng
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2023, 14 (03) : 2276 - 2289
  • [40] Adversarial evasion attacks detection for tree-based ensembles: A representation learning approach
    Braun, Gal
    Cohen, Seffi
    Rokach, Lior
    INFORMATION FUSION, 2025, 118