Online Distillation-enhanced Multi-modal Transformer for Sequential Recommendation

Times Cited: 3
Authors
Ji, Wei [1 ]
Liu, Xiangyan [1 ]
Zhang, An [1 ]
Wei, Yinwei [2 ]
Ni, Yongxin [1 ]
Wang, Xiang [3 ,4 ]
Affiliations
[1] Natl Univ Singapore, Singapore, Singapore
[2] Monash Univ, Melbourne, Vic, Australia
[3] Univ Sci & Technol China, Hefei, Peoples R China
[4] Hefei Comprehens Natl Sci Ctr, Inst Artificial Intelligence, Inst Dataspace, Hefei, Peoples R China
Source
PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023 | 2023
Funding
National Natural Science Foundation of China (NSFC);
Keywords
Multi-modal Recommendation; Knowledge Distillation; Sequential Recommendation;
DOI
10.1145/3581783.3612091
CLC Number (Chinese Library Classification)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multi-modal recommendation systems, which integrate diverse types of information, have gained widespread attention in recent years. However, compared to traditional collaborative-filtering-based multi-modal recommendation systems, research on multi-modal sequential recommendation is still in its nascent stages. Unlike traditional sequential recommendation models, which rely solely on item identifier (ID) information and focus on network structure design, multi-modal recommendation models need to emphasize item representation learning and the fusion of heterogeneous data sources. This paper investigates the impact of item representation learning on downstream recommendation tasks and examines the disparities in information fusion at different stages. Empirical experiments demonstrate the need for a framework suited to the collaborative learning and fusion of diverse information. Based on this, we propose a new model-agnostic framework for multi-modal sequential recommendation tasks, called Online Distillation-enhanced Multi-modal Transformer (ODMT), to enhance feature interaction and mutual learning among multi-source inputs (ID, text, and image) while avoiding conflicts among different features during training, thereby improving recommendation accuracy. Specifically, we first introduce an ID-aware Multi-modal Transformer module in the item representation learning stage to facilitate information interaction among the different features. Second, we employ an online distillation training strategy in the prediction optimization stage so that the multi-source data learn from each other, improving prediction robustness. Experimental results on a streaming media recommendation dataset and three e-commerce recommendation datasets demonstrate the effectiveness of the two proposed modules, which yield approximately a 10% performance improvement over baseline models. Our code will be released at: https://github.com/xyliugo/ODMT.
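The abstract does not spell out the online distillation objective, so the following is a minimal sketch, not the authors' released implementation: it assumes a deep-mutual-learning-style pairwise KL loss among the ID, text, and image prediction branches, combined with a supervised next-item loss. The function name, the branch naming, the temperature, and the weighting coefficient alpha are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def online_distillation_loss(logits_per_branch, targets,
                             temperature=2.0, alpha=0.5):
    """Supervised loss plus pairwise mutual distillation (hypothetical sketch).

    logits_per_branch: dict mapping a branch name ('id', 'text', 'image')
        to next-item logits of shape (batch, num_items).
    targets: ground-truth next-item indices of shape (batch,).
    """
    branches = list(logits_per_branch.values())

    # Supervised cross-entropy averaged over all branches.
    ce = sum(F.cross_entropy(z, targets) for z in branches) / len(branches)

    # Each branch distills from every other branch's softened predictions;
    # detaching the "teacher" side is one common stabilization choice.
    kl, pairs = 0.0, 0
    for i, student in enumerate(branches):
        for j, teacher in enumerate(branches):
            if i == j:
                continue
            log_p_student = F.log_softmax(student / temperature, dim=-1)
            p_teacher = F.softmax(teacher.detach() / temperature, dim=-1)
            kl = kl + F.kl_div(log_p_student, p_teacher,
                               reduction="batchmean") * temperature ** 2
            pairs += 1
    kl = kl / pairs

    return ce + alpha * kl
```

Under these assumptions, all branches act simultaneously as students and teachers during training (online, rather than with a fixed pre-trained teacher), which matches the abstract's description of multi-source data "learning from each other".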
Pages: 955-965
Number of Pages: 11