Preference-based online learning with dueling bandits: A survey

Cited by: 0
Authors
Bengs, Viktor [1]
Busa-Fekete, Robert [2]
Mesaoudi-Paul, Adil El [1]
Hüllermeier, Eyke [1]
Affiliations
[1] Heinz Nixdorf Institute, Department of Computer Science, Paderborn University, Germany
[2] Google Research, New York, NY, United States
Related papers
50 entries in total
[31]   Preference-based Teaching [J].
Gao, Ziyuan ;
Ries, Christoph ;
Simon, Hans U. ;
Zilles, Sandra .
JOURNAL OF MACHINE LEARNING RESEARCH, 2017, 18 :1-32
[32]   Preference-based unawareness [J].
Schipper, Burkhard C. .
MATHEMATICAL SOCIAL SCIENCES, 2014, 70 :34-41
[33]   APReL: A Library for Active Preference-based Reward Learning Algorithms [J].
Biyik, Erdem ;
Talati, Aditi ;
Sadigh, Dorsa .
PROCEEDINGS OF THE 2022 17TH ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTION (HRI '22), 2022, :613-617
[34]   Preference-Based Assistance Map Learning With Robust Adaptive Oscillators [J].
Li, Shilei ;
Zou, Wulin ;
Duan, Pu ;
Shi, Ling .
IEEE TRANSACTIONS ON MEDICAL ROBOTICS AND BIONICS, 2022, 4 (04) :1000-1009
[35]   Preference-based decision making for personalised access to Learning Resources [J].
Department of Special Education, University of Thessaly, Argonafton and Filellinon Street, Volos, GR 38221, Greece ;
Unknown ;
Unknown .
Int. J. Auton. Adapt. Commun. Syst., 2008, 3 :356-369
[36]   A Policy Iteration Algorithm for Learning from Preference-Based Feedback [J].
Wirth, Christian ;
Fürnkranz, Johannes .
ADVANCES IN INTELLIGENT DATA ANALYSIS XII, 2013, 8207 :427-437
[37]   Efficient Meta Reinforcement Learning for Preference-based Fast Adaptation [J].
Ren, Zhizhou ;
Liu, Anji ;
Liang, Yitao ;
Peng, Jian ;
Ma, Jianzhu .
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022
[38]   Active Preference-Based Gaussian Process Regression for Reward Learning [J].
Biyik, Erdem ;
Huynh, Nicolas ;
Kochenderfer, Mykel J. ;
Sadigh, Dorsa .
ROBOTICS: SCIENCE AND SYSTEMS XVI, 2020
[39]   Preference-based Reinforcement Learning with Finite-Time Guarantees [J].
Xu, Yichong ;
Wang, Ruosong ;
Yang, Lin F. ;
Singh, Aarti ;
Dubrawski, Artur .
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
[40]   Preference-based valuation of treatment attributes in haemophilia A using web survey [J].
Carlsson, K. Steen ;
Andersson, E. ;
Berntorp, E. .
HAEMOPHILIA, 2017, 23 (06) :894-903