BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer

Cited by: 1417
Authors
Sun, Fei [1]
Liu, Jun [1]
Wu, Jian [1]
Pei, Changhua [1]
Lin, Xiao [1]
Ou, Wenwu [1]
Jiang, Peng [1]
Affiliations
[1] Alibaba Group, Beijing, People's Republic of China
Source
PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT (CIKM '19) | 2019
Keywords
Sequential Recommendation; Bidirectional Sequential Model; Cloze
DOI
10.1145/3357384.3357895
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Modeling users' dynamic preferences from their historical behaviors is challenging and crucial for recommendation systems. Previous methods employ sequential neural networks to encode users' historical interactions from left to right into hidden representations for making recommendations. Despite their effectiveness, we argue that such left-to-right unidirectional models are sub-optimal due to limitations including: a) unidirectional architectures restrict the power of hidden representations in users' behavior sequences; b) they often assume a rigidly ordered sequence, which is not always practical. To address these limitations, we propose a sequential recommendation model called BERT4Rec, which employs deep bidirectional self-attention to model user behavior sequences. To avoid information leakage and train the bidirectional model efficiently, we adopt the Cloze objective for sequential recommendation, predicting the randomly masked items in the sequence by jointly conditioning on their left and right context. In this way, we learn a bidirectional representation model that makes recommendations by allowing each item in a user's historical behaviors to fuse information from both the left and right sides. Extensive experiments on four benchmark datasets show that our model consistently outperforms various state-of-the-art sequential models.
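The Cloze objective mentioned in the abstract can be illustrated with a minimal sketch of the masking step only. The names `cloze_mask`, `MASK_TOKEN`, and `mask_prob`, the choice of 0 as the mask ID, and the default probability of 0.2 are assumptions made for illustration; they are not taken from the paper, and the bidirectional self-attention model that would consume these inputs is not shown.

```python
import random

# Hypothetical ID reserved for the mask item; real item IDs are assumed to start at 1.
MASK_TOKEN = 0


def cloze_mask(sequence, mask_prob=0.2, rng=None):
    """Randomly hide items in one user's interaction sequence.

    Returns (inputs, labels). At a masked position, inputs holds MASK_TOKEN
    and labels holds the hidden item; elsewhere inputs holds the original
    item and labels holds None, so a loss would only be computed at masked
    positions.
    """
    if rng is None:
        rng = random.Random()
    inputs, labels = [], []
    for item in sequence:
        if rng.random() < mask_prob:
            inputs.append(MASK_TOKEN)   # hide the item from the model
            labels.append(item)         # the model must recover it
        else:
            inputs.append(item)         # visible context (left or right)
            labels.append(None)         # no prediction target here
    return inputs, labels


if __name__ == "__main__":
    history = [12, 7, 33, 5, 18, 41]
    masked, targets = cloze_mask(history, mask_prob=0.3, rng=random.Random(42))
    print(masked)
    print(targets)
```

Because a masked position can fall anywhere in the sequence, the visible items on both sides serve as context for the prediction, which is what distinguishes this setup from left-to-right next-item training.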
Pages: 1441-1450
Page count: 10