49 entries in total
- [1] [Anonymous], 2015, Asian Conference on Machine Learning
- [2] [Anonymous], 2011, P 2 INT WORKSHOP INF, P57, DOI 10.1145/2039320.2039329
- [3] Finite-time analysis of the multiarmed bandit problem [J]. MACHINE LEARNING, 2002, 47 (2-3): 235-256
- [4] Bartlett PL, 2017, 31 ANN C NEURAL INFO, V30
- [5] Blanda Stephanie, 2016, Online Recommender Systems-How Does a Website Know What I Want?, V31
- [6] Ensemble Recommendations via Thompson Sampling: an Experimental Study within e-Commerce [J]. IUI 2018: PROCEEDINGS OF THE 23RD INTERNATIONAL CONFERENCE ON INTELLIGENT USER INTERFACES, 2018: 19-29
- [7] Chapelle O, 2011, Advances in Neural Information Processing Systems, V24
- [8] Off-Policy Actor-critic for Recommender Systems [J]. PROCEEDINGS OF THE 16TH ACM CONFERENCE ON RECOMMENDER SYSTEMS, RECSYS 2022, 2022: 338-349
- [9] Top-K Off-Policy Correction for a REINFORCE Recommender System [J]. PROCEEDINGS OF THE TWELFTH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING (WSDM'19), 2019: 456-464
- [10] Chen XS, 2019, 36 INT C MACHINE LEA, V97