Contextual Bandit Algorithm for Risk-Aware Recommender Systems

Cited by: 0
Authors
Bouneffouf, Djallel [1]
Affiliations
[1] Univ British Columbia, Canada's Michael Smith Genome Sci Ctr, Vancouver, BC, Canada
Source
2016 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC) | 2016
Keywords
Recommender Systems; Context-Aware; exploration/exploitation;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Context-aware recommender systems can be naturally modelled as an exploration/exploitation (exr/exp) trade-off problem, in which the system must choose between maximizing its expected reward given its current knowledge (exploitation) and learning more about the user's unknown preferences to improve that knowledge (exploration). The reinforcement learning community has addressed this problem, but existing approaches do not consider the risk level of the user's current situation: when the risk level is high, it may be dangerous to recommend items that the user does not want in that situation. This paper introduces an algorithm named R-UCB that uses the risk level of the user's situation to adaptively balance exr and exp. A detailed analysis of the experimental results reveals several important findings about the exr/exp behaviour.
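The abstract does not give the R-UCB selection rule, so the following is only a minimal sketch of the general idea of risk-dependent exploration, not the paper's algorithm. It assumes a standard UCB1 index whose exploration bonus is multiplied by (1 - risk), with risk in [0, 1]; the class name `RiskScaledUCB`, the `risk` parameter, and the linear scaling are all hypothetical choices made for illustration.

```python
import math


class RiskScaledUCB:
    """UCB1-style bandit whose exploration bonus shrinks as risk grows.

    Illustrative assumption: the (1 - risk) scaling is NOT taken from the
    paper; it only shows how a risk level could damp exploration.
    """

    def __init__(self, n_arms):
        self.counts = [0] * n_arms   # number of pulls per arm
        self.means = [0.0] * n_arms  # running mean reward per arm

    def select(self, risk):
        """Return an arm index given a risk level in [0, 1] (1 = critical)."""
        if not 0.0 <= risk <= 1.0:
            raise ValueError("risk must be in [0, 1]")
        # Pull every arm once first so each arm has a defined mean.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        total = sum(self.counts)
        scale = 1.0 - risk  # assumed: high risk -> little exploration
        scores = [
            mean + scale * math.sqrt(2.0 * math.log(total) / count)
            for mean, count in zip(self.means, self.counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, reward):
        """Record an observed reward with an incremental mean update."""
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]
```

Under this assumption, a risk of 1 reduces the rule to pure exploitation of the best observed mean, while a risk of 0 recovers plain UCB1. Note that the initial one-pull-per-arm phase ignores risk, which a risk-aware system might not want.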
Pages: 4667-4674
Number of pages: 8