Sequential Recommendation via Stochastic Self-Attention

Cited by: 90
Authors
Fan, Ziwei [1 ,5 ]
Liu, Zhiwei [1 ]
Wang, Yu [1 ]
Wang, Alice [2 ]
Nazari, Zahra [2 ]
Zheng, Lei [3 ]
Peng, Hao [4 ]
Yu, Philip S. [1 ]
Affiliations
[1] Univ Illinois, Dept Comp Sci, Chicago, IL 60680 USA
[2] Spotify, New York, NY USA
[3] Pinterest Inc, Chicago, IL USA
[4] Beihang Univ, Sch Cyber Sci & Technol, Beijing, Peoples R China
[5] Spotify Res, New York, NY USA
Source
PROCEEDINGS OF THE ACM WEB CONFERENCE 2022 (WWW'22) | 2022
Keywords
Sequential Recommendation; Transformer; Self-Attention; Uncertainty
DOI
10.1145/3485447.3512077
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology]
Discipline Code
0812
Abstract
Sequential recommendation models the dynamics of a user's previous behaviors in order to forecast the next item, and has attracted considerable attention. Transformer-based approaches, which embed items as vectors and use dot-product self-attention to measure the relationships between items, demonstrate superior capabilities among existing sequential methods. However, users' real-world sequential behaviors are uncertain rather than deterministic, posing a significant challenge to existing techniques. We further argue that dot-product-based approaches cannot fully capture collaborative transitivity, which can be derived from item-item transitions within sequences and is beneficial for cold-start items. We also argue that the BPR loss imposes no constraint on positive and sampled negative items, which misleads the optimization. To overcome these issues, we propose a novel STOchastic Self-Attention (STOSA) model. In particular, STOSA embeds each item as a stochastic Gaussian distribution whose covariance encodes the uncertainty. We devise a novel Wasserstein Self-Attention module to characterize item-item position-wise relationships in sequences, which effectively incorporates uncertainty into model training. Wasserstein attention also facilitates collaborative transitivity learning, because the Wasserstein distance satisfies the triangle inequality. Moreover, we introduce a novel regularization term to the ranking loss, which ensures dissimilarity between the positive item and sampled negative items. Extensive experiments on five real-world benchmark datasets demonstrate the superiority of the proposed model over state-of-the-art baselines, especially on cold-start items. The code is available at https://github.com/zfan20/STOSA.
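To make the mechanism described in the abstract concrete, the sketch below illustrates one way a Wasserstein self-attention score and the regularized ranking loss could be implemented, assuming diagonal Gaussian covariances. All tensor names, shapes, and hyperparameters (the scaling factor, `margin`, `weight`) are illustrative assumptions for this sketch, not a verbatim excerpt of the authors' released code; see the repository linked above for the official implementation.

```python
# Minimal sketch of Wasserstein self-attention over stochastic
# (Gaussian) item embeddings, assuming diagonal covariances.
import torch
import torch.nn.functional as F

def wasserstein2_sq(mean_q, cov_q, mean_k, cov_k):
    """Pairwise squared 2-Wasserstein distance between diagonal Gaussians.
    mean_*, cov_*: (batch, seq_len, dim); cov entries must be positive.
    Returns (batch, seq_len, seq_len)."""
    mean_term = torch.cdist(mean_q, mean_k, p=2) ** 2
    # For diagonal covariances the Bures term reduces to the squared
    # Euclidean distance between standard-deviation vectors.
    std_term = torch.cdist(cov_q.sqrt(), cov_k.sqrt(), p=2) ** 2
    return mean_term + std_term

def wasserstein_attention(mean_q, cov_q, mean_k, cov_k,
                          mean_v, cov_v, mask=None):
    """Attention weights from negative distances: closer distributions
    (more related, less uncertain transitions) attend more strongly."""
    d = mean_q.size(-1)
    scores = -wasserstein2_sq(mean_q, cov_q, mean_k, cov_k) / (d ** 0.5)
    if mask is not None:  # e.g. a causal mask for next-item prediction
        scores = scores.masked_fill(mask == 0, float("-inf"))
    attn = F.softmax(scores, dim=-1)
    # Aggregate mean and covariance sequences with the same weights;
    # a convex combination of positive covariances stays positive.
    return attn @ mean_v, attn @ cov_v

def regularized_bpr_loss(pos_dist, neg_dist, margin=1.0, weight=0.1):
    """BPR over distances (smaller distance = higher preference score),
    plus a hinge regularizer keeping the sampled negative at least
    `margin` farther than the positive. `margin` and `weight` are
    illustrative, not the paper's reported values."""
    bpr = -F.logsigmoid(neg_dist - pos_dist).mean()
    reg = F.relu(pos_dist - neg_dist + margin).mean()
    return bpr + weight * reg
```

Because the 2-Wasserstein distance satisfies the triangle inequality, short distances from item A to B and from B to C bound the distance from A to C; this is the collaborative-transitivity property the abstract credits with helping cold-start items.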
Pages: 2036-2047
Number of pages: 12
Related Papers
50 items in total
  • [41] PROXIMITY-AWARE SELF-ATTENTION-BASED SEQUENTIAL LOCATION RECOMMENDATION
    Luo, Xuan
    Huang, Mingqing
    Lv, Rui
    Zhao, Hui
INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, 2024, 20 (05): 1277 - 1299
  • [42] MGSAN: A Multi-granularity Self-attention Network for Next POI Recommendation
    Li, Yepeng
    Xian, Xuefeng
    Zhao, Pengpeng
    Liu, Yanchi
    Sheng, Victor S.
    WEB INFORMATION SYSTEMS ENGINEERING - WISE 2021, PT II, 2021, 13081 : 193 - 208
  • [43] Dynamic Network Embedding in Hyperbolic Space via Self-attention
    Duan, Dingyang
    Zha, Daren
    Yang, Xiao
    Mu, Nan
    Shen, Jiahui
    WEB ENGINEERING (ICWE 2022), 2022, 13362 : 189 - 203
  • [44] Dynamic Graph Embedding via Self-Attention in the Lorentz Space
    Duan, Dingyang
    Zha, Daren
    Lie, Zeyi
    Chen, Yu
PROCEEDINGS OF THE 2024 27TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN, CSCWD 2024, 2024: 199 - 204
  • [45] Capturing Dynamic Interests of Similar Users for POI Recommendation Using Self-Attention Mechanism
    Fan, Xinhua
    Hua, Yixin
    Cao, Yibing
    Zhao, Xinke
    SUSTAINABILITY, 2023, 15 (06)
  • [46] GSSA: Pay attention to graph feature importance for GCN via statistical self-attention
    Zheng, Jin
    Wang, Yang
    Xu, Wanjun
    Gan, Zilu
    Li, Ping
    Lv, Jiancheng
    NEUROCOMPUTING, 2020, 417 : 458 - 470
  • [48] Attention Calibration for Transformer-based Sequential Recommendation
    Zhou, Peilin
    Ye, Qichen
    Xie, Yueqi
    Gao, Jingqi
    Wang, Shoujin
    Kim, Jae Boum
    You, Chenyu
    Kim, Sunghun
PROCEEDINGS OF THE 32ND ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2023, 2023: 3595 - 3605
  • [49] Frequency Enhanced Hybrid Attention Network for Sequential Recommendation
    Du, Xinyu
    Yuan, Huanhuan
    Zhao, Pengpeng
    Qu, Jianfeng
    Zhuang, Fuzhen
    Liu, Guanfeng
    Liu, Yanchi
    Sheng, Victor S.
PROCEEDINGS OF THE 46TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2023, 2023: 78 - 88
  • [50] Assessing the Impact of Attention and Self-Attention Mechanisms on the Classification of Skin Lesions
    Pedro, Rafael
    Oliveira, Arlindo L.
2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022