Underestimation Refinement: A General Enhancement Strategy for Exploration in Recommendation Systems

Cited by: 4
Authors
Song, Yuhai [1 ]
Wang, Lu [1 ]
Dang, Haoming [1 ]
Zhou, Weiwei [1 ]
Guan, Jing [1 ]
Zhao, Xiwei [1 ]
Peng, Changping [1 ]
Bao, Yongjun [1 ]
Shao, Jingping [1 ]
Affiliation
[1] JD.com, Business Growth BU, Beijing, People's Republic of China
Source
SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL | 2021
Keywords
Recommendation System; Contextual Bandit
DOI
10.1145/3404835.3462983
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Click-through rate (CTR) prediction based on deep neural networks has made significant progress in recommendation systems. However, these methods often underestimate the CTR of long-tail items because those items receive too few impressions. When CTR prediction is formalized as a contextual bandit problem, exploration methods provide a natural solution to this issue. In this paper, we first benchmark state-of-the-art exploration methods in the recommendation system setting. We find that the combination of gradient-based uncertainty modeling and Thompson Sampling achieves a significant advantage. On the basis of the benchmark, we further propose a general enhancement strategy, Underestimation Refinement (UR), which explicitly incorporates the prior knowledge that insufficient impressions likely lead to CTR underestimation. This strategy is applicable to almost all existing exploration methods. Experimental results validate UR's effectiveness: it achieves consistent improvement across all baseline exploration methods.
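To illustrate why exploration helps items with few impressions, the following is a minimal sketch of Thompson Sampling over per-item CTR. It is not the paper's method: it assumes a simple Beta-Bernoulli posterior per item rather than gradient-based uncertainty from a deep network, and it does not implement Underestimation Refinement. The item names, counts, and helper functions are hypothetical.

```python
import random


def posterior_params(clicks, impressions, prior_a=1.0, prior_b=1.0):
    """Beta posterior over an item's CTR (assumption: uniform Beta(1, 1) prior)."""
    return prior_a + clicks, prior_b + (impressions - clicks)


def posterior_std(clicks, impressions):
    """Standard deviation of the Beta posterior; larger means more uncertainty."""
    a, b = posterior_params(clicks, impressions)
    variance = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return variance ** 0.5


def thompson_select(stats, rng):
    """Sample one CTR per item from its posterior and return the best item.

    stats maps item name -> (clicks, impressions).
    """
    best_item, best_sample = None, -1.0
    for item, (clicks, impressions) in stats.items():
        a, b = posterior_params(clicks, impressions)
        sample = rng.betavariate(a, b)
        if sample > best_sample:
            best_item, best_sample = item, sample
    return best_item


# Hypothetical data: a well-observed head item and a long-tail item that
# has 20 impressions and no clicks yet (empirical CTR of 0).
stats = {"head_item": (500, 10000), "tail_item": (0, 20)}

rng = random.Random(0)
picks = [thompson_select(stats, rng) for _ in range(2000)]
tail_share = picks.count("tail_item") / len(picks)

print("posterior std, head:", posterior_std(500, 10000))
print("posterior std, tail:", posterior_std(0, 20))
print("share of rounds that show the tail item:", tail_share)
```

A greedy policy that ranks by empirical CTR would never show `tail_item` here, since its observed CTR is 0. Sampling from the wider posterior still gives it a substantial share of impressions, which is the kind of exploration the abstract refers to.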
Pages: 1818-1822
Page count: 5