Denoising Self-Attentive Sequential Recommendation

Cited by: 28
Authors
Chen, Huiyuan [1]
Lin, Yusan [1]
Pan, Menghai [1]
Wang, Lan [1]
Yeh, Chin-Chia Michael [1]
Li, Xiaoting [1]
Zheng, Yan [1]
Wang, Fei [1]
Yang, Hao [1]
Affiliations
[1] Visa Research, Foster City, CA 94404, USA
Source
PROCEEDINGS OF THE 16TH ACM CONFERENCE ON RECOMMENDER SYSTEMS, RECSYS 2022 | 2022
Keywords
Sequential Recommendation; Sparse Transformer; Noise Analysis; Differentiable Mask
DOI
10.1145/3523227.3546788
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Transformer-based sequential recommenders are very powerful for capturing both short-term and long-term sequential item dependencies. This is mainly attributed to their unique self-attention networks, which exploit pairwise item-item interactions within the sequence. However, real-world item sequences are often noisy, which is particularly true for implicit feedback. For example, a large portion of clicks do not align well with user preferences, and many products end up with negative reviews or being returned. As such, the current user action depends only on a subset of items, not on the entire sequence. Many existing Transformer-based models use full attention distributions, which inevitably assign some credit to irrelevant items. This may lead to sub-optimal performance if the Transformer is not regularized properly. Here we propose the Rec-denoiser model for better training of self-attentive recommender systems. In Rec-denoiser, we aim to adaptively prune noisy items that are unrelated to the next-item prediction. To achieve this, we simply attach a trainable binary mask to each self-attention layer to prune noisy attentions, resulting in sparse and clean attention distributions. This largely purifies item-item dependencies and provides better model interpretability. In addition, the self-attention network is typically not Lipschitz continuous and is vulnerable to small perturbations. Jacobian regularization is further applied to the Transformer blocks to improve the robustness of Transformers on noisy sequences. Our Rec-denoiser is a general plugin that is compatible with many Transformers. Quantitative results on real-world datasets show that our Rec-denoiser outperforms state-of-the-art baselines.
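The two mechanisms in the abstract lend themselves to a compact illustration. Below is a minimal PyTorch sketch, not the authors' released code: it pairs a single-head self-attention layer with a trainable binary mask relaxed via the straight-through estimator (hard {0,1} values in the forward pass, sigmoid gradients in the backward pass), and adds a Hutchinson-style random-projection estimate of the Jacobian's Frobenius norm as a robustness penalty. The module names, the single-head simplification, and the mask parameterization are all illustrative assumptions.

```python
import torch
import torch.nn as nn


class MaskedSelfAttention(nn.Module):
    """Single-head self-attention with a learnable binary mask over item pairs
    (a hypothetical simplification of the paper's differentiable mask)."""

    def __init__(self, d_model: int, max_len: int):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        # Logits of the binary mask Z in {0,1}^{L x L}, one per item pair.
        self.mask_logits = nn.Parameter(torch.zeros(max_len, max_len))
        self.scale = d_model ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, L, d_model)
        L = x.size(1)
        scores = self.q(x) @ self.k(x).transpose(-2, -1) * self.scale
        attn = scores.softmax(dim=-1)                     # full attention (batch, L, L)

        # Straight-through estimator: binarize in the forward pass while
        # letting gradients flow through the sigmoid probabilities.
        probs = torch.sigmoid(self.mask_logits[:L, :L])
        z = (probs > 0.5).float() + probs - probs.detach()

        attn = attn * z                                   # prune noisy attentions
        attn = attn / attn.sum(dim=-1, keepdim=True).clamp_min(1e-8)
        return attn @ self.v(x)

    def sparsity_loss(self) -> torch.Tensor:
        # Push mask probabilities toward 0 so irrelevant item pairs drop out.
        return torch.sigmoid(self.mask_logits).mean()


def jacobian_penalty(block: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Hutchinson-style estimate of ||J||_F^2 = E_u ||J^T u||^2, u ~ N(0, I),
    penalizing the block's sensitivity to small input perturbations."""
    x = x.detach().requires_grad_(True)
    y = block(x)
    u = torch.randn_like(y)                               # random projection of the output
    (vjp,) = torch.autograd.grad((y * u).sum(), x, create_graph=True)
    return vjp.pow(2).sum(dim=(-2, -1)).mean()
```

In training, both auxiliary terms would be added to the next-item prediction loss, e.g. `loss = rec_loss + lam1 * layer.sparsity_loss() + lam2 * jacobian_penalty(layer, x)`, where `lam1` and `lam2` are assumed trade-off weights.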
Pages: 92-101
Page count: 10