Multi-modal Mixture of Experts Representation Learning for Sequential Recommendation

Cited by: 7
Authors
Bian, Shuqing [1, 3]
Pan, Xingyu [1]
Zhao, Wayne Xin [2, 4]
Wang, Jinpeng [3]
Wang, Chuyuan [3]
Wen, Ji-Rong [1, 2]
Affiliations
[1] Renmin Univ China, Sch Informat, Beijing, Peoples R China
[2] Renmin Univ China, Gaoling Sch Artificial Intelligence, Beijing, Peoples R China
[3] Meituan Grp, Beijing, Peoples R China
[4] Beijing Key Lab Big Data Management & Anal Method, Beijing, Peoples R China
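Source
PROCEEDINGS OF THE 32ND ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2023 | 2023
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China
Keywords
User Behavior Modeling; Multi-modal Recommendation
DOI
10.1145/3583780.3614978
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract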
On online platforms, it is critical to capture dynamic user preferences from sequential interaction behaviors in order to make accurate recommendations over time. Recently, significant progress has been made in sequential recommendation with deep learning. However, existing neural sequential recommenders often suffer from data sparsity in real-world applications. To tackle this problem, we propose a Multi-Modal Mixture of experts model for Sequential Recommendation, named M3SRec, which leverages rich multi-modal interaction data to improve sequential recommendation. Unlike existing multi-modal recommendation models, our approach jointly reduces the semantic gap across modalities and adapts multi-modal semantics to fit recommender systems. For this purpose, we make two technical contributions, one in architecture and one in training. First, we design a novel multi-modal mixture-of-experts (MoE) fusion network, which deeply fuses cross-modal semantics and largely enhances the capacity to model complex user intents. Second, for training, we design specific pre-training tasks that mimic the goal of recommendation and help the model learn the semantic relatedness between the multi-modal sequential context and the target item. Extensive experiments on both public and industry datasets demonstrate the superiority of the proposed method over existing state-of-the-art methods, especially when only limited training data is available.
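The abstract does not give the details of the fusion network, so the following is only a minimal sketch of the general mixture-of-experts idea, not the M3SRec architecture. It assumes two modality embeddings per item (text and image), linear experts over their concatenation, and a softmax gate; the class name `MoEFusion`, all dimensions, and the random weights are hypothetical and chosen purely for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, vector):
    """Multiply a (rows x cols) weight matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def random_matrix(rows, cols, rng):
    """Untrained placeholder weights; a real model would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class MoEFusion:
    """Hypothetical softmax-gated mixture of linear experts (illustration only)."""

    def __init__(self, text_dim, image_dim, out_dim, num_experts, seed=0):
        rng = random.Random(seed)
        in_dim = text_dim + image_dim
        # Each expert maps the concatenated modalities to the fused space.
        self.experts = [random_matrix(out_dim, in_dim, rng)
                        for _ in range(num_experts)]
        # The gate scores every expert from the same concatenated input.
        self.gate = random_matrix(num_experts, in_dim, rng)

    def gate_weights(self, text_vec, image_vec):
        """Return one non-negative weight per expert, summing to 1."""
        return softmax(linear(self.gate, text_vec + image_vec))

    def fuse(self, text_vec, image_vec):
        """Return the gate-weighted sum of the expert outputs."""
        joint = text_vec + image_vec  # list concatenation
        weights = self.gate_weights(text_vec, image_vec)
        outputs = [linear(expert, joint) for expert in self.experts]
        out_dim = len(outputs[0])
        return [sum(w * out[i] for w, out in zip(weights, outputs))
                for i in range(out_dim)]


fusion = MoEFusion(text_dim=4, image_dim=3, out_dim=2, num_experts=3)
text_embedding = [0.1, -0.2, 0.3, 0.0]
image_embedding = [0.5, 0.1, -0.4]
fused = fusion.fuse(text_embedding, image_embedding)
weights = fusion.gate_weights(text_embedding, image_embedding)
print(len(fused), round(sum(weights), 6))
```

In a sequential recommender, a fused vector of this kind would be computed for each item in the interaction sequence and then passed to a sequence encoder; how M3SRec does this is described in the paper itself, not in this record.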
Pages: 110-119
Page count: 10